Query External Data with Autonomous Database
Describes packages and tools to query and validate data with Autonomous Database.
External data is not managed by the database; however, you can use DBMS_CLOUD procedures to query your external data. Although queries on external data are not as fast as queries on database tables, this approach lets you quickly start running queries on your external source files and external data.
You can use DBMS_CLOUD procedures to validate the data in the external source files for an external table so that you can identify problems and either correct the data in the external table or exclude invalid data before you use the data.
Note: If you are not using the ADMIN user, ensure the user has the necessary privileges for the operations the user needs to perform. See Manage Database User Privileges for more information.
Query External Data
To query data in files in the Cloud, you need to first store your object storage credentials in your Autonomous Database, and then create an external table using the PL/SQL procedure DBMS_CLOUD.CREATE_EXTERNAL_TABLE.
The source file in this example, channels.txt, has the following contents:

S,Direct Sales,Direct
T,Tele Sales,Direct
C,Catalog,Indirect
I,Internet,Indirect
P,Partners,Others
- Store your Cloud Object Storage credential using the DBMS_CREDENTIAL.CREATE_CREDENTIAL procedure. For example:

  BEGIN
    DBMS_CREDENTIAL.CREATE_CREDENTIAL(
      credential_name => 'DEF_CRED_NAME',
      username        => 'firstname.lastname@example.org',
      password        => 'password'
    );
  END;
  /
The values you provide for username and password depend on the Cloud Object Storage service you are using:
- Oracle Cloud Infrastructure Object Storage: username is your Oracle Cloud Infrastructure user name and password is your Oracle Cloud Infrastructure auth token. See Working with Auth Tokens.
- Oracle Cloud Infrastructure Object Storage Classic: username is your Oracle Cloud Infrastructure Classic user name and password is your Oracle Cloud Infrastructure Classic password.
This operation stores the credentials in the database in an encrypted format. You can use any name for the credential name. Note that this step is required only once unless your object store credentials change. Once you store the credentials you can then use the same credential name for all data loads.
- Create an external table on top of your source files using the procedure DBMS_CLOUD.CREATE_EXTERNAL_TABLE. This procedure supports external files in the supported cloud object storage services. The credential is a table-level property; therefore, the external files must be on the same object store. For example:
  BEGIN
    DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
      table_name      => 'CHANNELS_EXT',
      credential_name => 'DEF_CRED_NAME',
      file_uri_list   => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/v1/idthydc0kinr/mybucket/channels.txt',
      format          => json_object('delimiter' value ','),
      column_list     => 'CHANNEL_ID NUMBER, CHANNEL_DESC VARCHAR2(20), CHANNEL_CLASS VARCHAR2(20)'
    );
  END;
  /
The parameters are:

- table_name: is the external table name.
- credential_name: is the name of the credential created in the previous step.
- file_uri_list: is a comma-delimited list of the source files you want to query. In this example, file_uri_list is an Oracle Cloud Infrastructure Swift URI that specifies the channels.txt file in the mybucket bucket in the us-phoenix-1 region. (idthydc0kinr is the object storage namespace in which the bucket resides.) For information about the supported URI formats, see Cloud Object Storage URI Formats.
- format: defines the options you can specify to describe the format of the source file. For information about the format options you can specify, see Format Parameter.
- column_list: is a comma-delimited list of the column definitions in the source files.
You can now run queries on the external table you created in the previous step. For example:
SELECT count(*) FROM channels_ext;
By default the database expects all rows in the external data file to be valid and to match both the target data type definitions and the format definition of the file(s). If there are any rows in the source files that do not match the format options you specified, the query reports an error. You can use format parameter options like rejectlimit to suppress these errors. As an alternative, you can also validate the external table you created to see the error messages and the rejected rows so that you can change your format options accordingly. See Validate External Data for more information.
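As an illustrative sketch, the table from the example above could be recreated with a rejectlimit format option so that queries tolerate a bounded number of bad rows (the limit of 10 here is an arbitrary value chosen for illustration; table, credential, and file names reuse the earlier example):

```sql
BEGIN
  DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
    table_name      => 'CHANNELS_EXT',
    credential_name => 'DEF_CRED_NAME',
    file_uri_list   => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/v1/idthydc0kinr/mybucket/channels.txt',
    -- Allow up to 10 rejected rows before a query on the table reports an error.
    format          => json_object('delimiter' value ',', 'rejectlimit' value '10'),
    column_list     => 'CHANNEL_ID NUMBER, CHANNEL_DESC VARCHAR2(20), CHANNEL_CLASS VARCHAR2(20)'
  );
END;
/
```

With rejectlimit set, rows that fail conversion are skipped (up to the limit) rather than failing the whole query.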
For detailed information about the parameters, see CREATE_EXTERNAL_TABLE Procedure.
Validate External Data
To validate an external table, you use the procedure DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE. Before validating an external table you need to create the external table using the DBMS_CLOUD.CREATE_EXTERNAL_TABLE procedure. Then use the DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE procedure to validate it. For example:

BEGIN
  DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE(
    table_name => 'CHANNELS_EXT'
  );
END;
/
This procedure scans your source files and validates them using the format options specified when you created the external table.
The validate operation, by default, scans all the rows in your source files and stops when a row is rejected. If you want to validate only a subset of the rows, use the rowcount parameter. When the rowcount parameter is set, the validate operation scans rows and stops either when a row is rejected or when the specified number of rows are validated without errors.
For example, the following validate operation scans 100 rows and stops when a row is rejected or when 100 rows are validated without errors:
BEGIN
  DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE(
    table_name => 'CHANNELS_EXT',
    rowcount   => 100
  );
END;
/
If you do not want the validate to stop when a row is rejected and you want to see all rejected rows, set the stop_on_error parameter to FALSE. In this case VALIDATE_EXTERNAL_TABLE scans all rows and reports all rejected rows.
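A call along these lines (reusing the CHANNELS_EXT table from the earlier example) scans every row and reports all rejections rather than stopping at the first one:

```sql
BEGIN
  DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE(
    table_name    => 'CHANNELS_EXT',
    -- Do not stop at the first rejected row; report them all.
    stop_on_error => FALSE
  );
END;
/
```

This is useful when you want a complete picture of the bad rows before adjusting your format options.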
If you want to validate only a subset of rows, use the rowcount parameter. When rowcount is set and stop_on_error is set to FALSE, the validate operation scans rows and stops either when the specified number of rows are rejected or when the specified number of rows are validated without errors. For example, the following validate operation scans 100 rows and stops when 100 rows are rejected or when 100 rows are validated without errors:

BEGIN
  DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE(
    table_name    => 'CHANNELS_EXT',
    rowcount      => 100,
    stop_on_error => FALSE
  );
END;
/
See VALIDATE_EXTERNAL_TABLE Procedure for detailed information about the parameters. See View Logs for Data Validation to see the results of validate operations in the log tables.
View Logs for Data Validation
After validating an external table, you can see the result of the validate operation by querying a load operations table:
- dba_load_operations: shows all validate operations.
- user_load_operations: shows the validate operations in your schema.
You can use these tables to view load validation information. For example, use this SELECT statement to query user_load_operations:

SELECT table_name, owner_name, type, status, start_time, update_time,
       logfile_table, badfile_table
FROM user_load_operations
WHERE type = 'VALIDATE';

TABLE_NAME    OWNER_NAME TYPE       STATUS     START_TIME     UPDATE_TIME   LOGFILE_TABLE   BADFILE_TABLE
------------- ---------- ---------- ---------- -------------- ------------- --------------- ---------------
CHANNELS_EXT  SH         VALIDATE   COMPLETED  13-NOV-17...   13-NOV-17...  VALIDATE$21_LOG VALIDATE$21_BAD
Using this SQL statement with the WHERE clause on the TYPE column displays all of the load operations with type VALIDATE.

The LOGFILE_TABLE column shows the name of the table you can query to look at the log of a validate operation. For example, the following query shows the log for this validate operation:

SELECT * FROM VALIDATE$21_LOG;

The BADFILE_TABLE column shows the name of the table you can query to look at the rows where there were errors during validation. For example, the following query shows the rejected records for the above validate operation:

SELECT * FROM VALIDATE$21_BAD;
Depending on the errors shown in the log and the rows shown in the BADFILE_TABLE, you can correct the error by dropping the external table using the DROP TABLE command and recreating it by specifying the correct format options in DBMS_CLOUD.CREATE_EXTERNAL_TABLE.

Note: The LOGFILE_TABLE and BADFILE_TABLE tables are stored for two days for each validate operation and then removed automatically.
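As a sketch of that drop-and-recreate cycle (reusing the names from the earlier example; the format value shown is a placeholder for whatever corrected options your validation log suggests):

```sql
-- Drop the external table; the source files in object storage are untouched.
DROP TABLE channels_ext;

-- Recreate it with corrected format options.
BEGIN
  DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
    table_name      => 'CHANNELS_EXT',
    credential_name => 'DEF_CRED_NAME',
    file_uri_list   => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/v1/idthydc0kinr/mybucket/channels.txt',
    format          => json_object('delimiter' value ','),
    column_list     => 'CHANNEL_ID NUMBER, CHANNEL_DESC VARCHAR2(20), CHANNEL_CLASS VARCHAR2(20)'
  );
END;
/
```

Because an external table is only metadata over the source files, dropping and recreating it is cheap and does not move any data.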
Copyright © 2021, Oracle and/or its affiliates.