3 Loading Data with Autonomous Transaction Processing

Describes packages and tools to load data with Autonomous Transaction Processing.

About Data Loading

Traditionally, transaction processing systems ingest data through routine transactions or DML operations. You can also bulk load data into Autonomous Transaction Processing using Oracle Database tools, as well as Oracle and third-party data integration tools.

In general you load data from files local to your client computer or from files stored in a cloud-based object store. For data loading from files in the cloud, Autonomous Transaction Processing provides a new PL/SQL package, DBMS_CLOUD.

For the fastest data loading experience, Oracle recommends uploading the source files to a cloud-based object store, such as Oracle Cloud Infrastructure Object Storage, before loading the data into your Autonomous Transaction Processing database. Oracle provides support for loading files that are located locally in your data center, but when using this method of data loading you should factor in the transmission speeds across the Internet, which may be significantly slower.

For more information on Oracle Cloud Infrastructure Object Storage, see Putting Data into Object Storage and Overview of Object Storage.

For a tutorial on data loading using Oracle Cloud Infrastructure Object Storage, see Loading Your Data.

Note:

If you are not using the ADMIN user, ensure that the user has the necessary privileges for the operations the user needs to perform. See Manage User Privileges with Autonomous Transaction Processing for more information.
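
For example, as the ADMIN user you might grant a data loading user privileges such as the following. This is only an illustrative sketch; the user name ATPC_USER and the exact set of grants depend on the operations the user performs:

-- Run as ADMIN. The user name and the grants below are examples only.
GRANT CREATE SESSION, CREATE TABLE TO atpc_user;
GRANT UNLIMITED TABLESPACE TO atpc_user;
GRANT DWROLE TO atpc_user;                 -- role with common Autonomous Database privileges
GRANT EXECUTE ON DBMS_CLOUD TO atpc_user;  -- required for DBMS_CLOUD data loads if not already granted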

Load Data from Files in the Cloud

The PL/SQL package DBMS_CLOUD provides support for loading data from text, Parquet, and Avro files in the Cloud to your tables in Autonomous Transaction Processing.

The package DBMS_CLOUD supports loading from files in the following cloud services: Oracle Cloud Infrastructure Object Storage, Oracle Cloud Infrastructure Object Storage Classic, Azure Blob Storage, and Amazon S3.

Load Data – Create Credentials and Copy Data into an Existing Table

For data loading from files in the Cloud, you first need to store your object storage credentials in your Autonomous Transaction Processing database and then use the procedure DBMS_CLOUD.COPY_DATA to load data.

The source file in this example, channels.txt, has the following data:

S,Direct Sales,Direct
T,Tele Sales,Direct
C,Catalog,Indirect
I,Internet,Indirect
P,Partners,Others
  1. Store your object store credentials using the procedure DBMS_CLOUD.CREATE_CREDENTIAL. For example:
    SET DEFINE OFF
    BEGIN
      DBMS_CLOUD.CREATE_CREDENTIAL(
        credential_name => 'DEF_CRED_NAME',
        username => 'atpc_user@example.com',
        password => 'password'
      );
    END;
    /

    This operation stores the credentials in the database in an encrypted format. You can use any name for the credential. Note that this step is required only once, unless your object store credentials change. Once you store the credentials, you can use the same credential name for all data loads.

    For detailed information about the parameters, see CREATE_CREDENTIAL Procedure.

    Note:

    Some tools like SQL*Plus and SQL Developer use the ampersand character (&) as a special character. If your password contains the ampersand character, use the SET DEFINE OFF command in those tools, as shown in the example, so that the special character is not interpreted and the credential is created properly.
  2. Load data into an existing table using the procedure DBMS_CLOUD.COPY_DATA. For example:
    CREATE TABLE CHANNELS
       (channel_id CHAR(1),
        channel_desc VARCHAR2(20),
        channel_class VARCHAR2(20)
       );
    
    BEGIN
     DBMS_CLOUD.COPY_DATA(
        table_name =>'CHANNELS',
        credential_name =>'DEF_CRED_NAME',
        file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/atpc/b/atpc_user/o/channels.txt',
        format => json_object('delimiter' value ',')
     );
    END;
    /
    

    The parameters are:

    • table_name: is the target table's name.

    • credential_name: is the name of the credential created in the previous step.

    • file_uri_list: is a comma delimited list of the source files you want to load.

    • format: defines the options you can specify to describe the format of the source file, including whether the file is of type text, Parquet, or Avro. A Parquet example is shown after this list.

    For detailed information about the parameters, see COPY_DATA Procedure and COPY_DATA Procedure for Parquet or Avro Files.
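
    For example, to load a Parquet file you set the format type accordingly. The following is a minimal sketch that assumes an existing target table named SALES_PARQUET whose columns match the columns in the Parquet file; the table and file names are hypothetical:

    BEGIN
     DBMS_CLOUD.COPY_DATA(
        table_name      => 'SALES_PARQUET',
        credential_name => 'DEF_CRED_NAME',
        file_uri_list   => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/atpc/b/atpc_user/o/sales.parquet',
        format          => json_object('type' value 'parquet')
     );
    END;
    /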

Load Data – Monitor and Troubleshoot Loads

All data load operations done using the PL/SQL package DBMS_CLOUD are logged in the tables dba_load_operations and user_load_operations:

  • dba_load_operations: shows all load operations.

  • user_load_operations: shows the load operations in your schema.

Query these tables to see information about ongoing and completed data loads. For example:


SELECT table_name, owner_name, type, status, start_time, update_time, logfile_table, badfile_table 
   FROM user_load_operations WHERE type = 'COPY';

TABLE_NAME OWNER_NAME  TYPE   STATUS     START_TIME              UPDATE_TIME           LOGFILE_TABLE   BADFILE_TABLE
---------- ----------- ------- ---------- ---------------------- --------------------- --------------- -------------
CHANNELS   SH          COPY   COMPLETED  06-NOV-18 01.55.19.3    06-NOV-18 01.55.28.2  COPY$21_LOG     COPY$21_BAD

This SELECT statement with a WHERE clause predicate on the TYPE column shows load operations of type COPY.

The LOGFILE_TABLE column shows the name of the table you can query to look at the log of a load operation. For example, the following query shows the log of the load operation:

select * from COPY$21_LOG;

The column BADFILE_TABLE shows the name of the table you can query to look at the rows that got errors during loading. For example, the following query shows the rejected records for the load operation:

select * from COPY$21_BAD;

Depending on the errors shown in the log and the rows shown in the specified BADFILE_TABLE table, you can correct the error by specifying the correct format options in DBMS_CLOUD.COPY_DATA.
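
For example, if the log shows that the file contains a header line and you want to tolerate a small number of bad rows while you investigate, you might re-run DBMS_CLOUD.COPY_DATA with additional format options (after cleaning up any partially loaded rows). The option values below are illustrative:

BEGIN
 DBMS_CLOUD.COPY_DATA(
    table_name      => 'CHANNELS',
    credential_name => 'DEF_CRED_NAME',
    file_uri_list   => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/atpc/b/atpc_user/o/channels.txt',
    format          => json_object('delimiter'   value ',',
                                   'skipheaders' value '1',
                                   'rejectlimit' value '10')
 );
END;
/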

Note:

The LOGFILE_TABLE and BADFILE_TABLE tables are stored for two days for each load operation and then removed automatically.

See DELETE_ALL_OPERATIONS Procedure for information on clearing the user_load_operations table.
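
For example, a minimal sketch of clearing the logged COPY operations for your schema with DBMS_CLOUD.DELETE_ALL_OPERATIONS:

BEGIN
  DBMS_CLOUD.DELETE_ALL_OPERATIONS(type => 'COPY');
END;
/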

Monitor and Troubleshoot Parquet and Avro File Loading

As with other data files, Parquet and Avro data loads generate logs that are viewable in the tables dba_load_operations and user_load_operations. Each load operation adds a record to these tables that indicates the table containing the logs.

The log table provides summary information about the load.

Note:

For Parquet and Avro files, when the format parameter type is set to the value parquet or avro, the BADFILE_TABLE table is always empty.

  • PRIMARY KEY constraint errors throw an ORA error.
  • If data for a column encounters a conversion error (for example, the target column is not large enough to hold the converted value), the value for the column is set to NULL. This does not produce a rejected record; see the example query after this list.
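
Because such conversion errors do not produce rejected records, you can check the target table directly after the load. The table and column names in this sketch are hypothetical:

SELECT COUNT(*) FROM sales_parquet WHERE amount_sold IS NULL;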

Load Data – List Credentials

The PL/SQL package DBMS_CLOUD provides the ability to store your object storage credentials in the database using the procedure DBMS_CLOUD.CREATE_CREDENTIAL. You can list credentials from the view ALL_CREDENTIALS.

For example, to list credentials, run the following command:

SELECT credential_name, username, comments FROM all_credentials;

CREDENTIAL_NAME    USERNAME               COMMENTS
------------------ ---------------------- --------------------------------------------------------
ADB_TOKEN          user_name@example.com  {"comments":"Created via DBMS_CLOUD.create_credential"}
DEF_CRED_NAME      user_name@example.com  {"comments":"Created via DBMS_CLOUD.create_credential"}

See ALL_CREDENTIALS for more information.

Load Data – Delete Credentials

The PL/SQL package DBMS_CLOUD provides the ability to store your object storage credentials in the database using the procedure DBMS_CLOUD.CREATE_CREDENTIAL. You remove credentials with DBMS_CLOUD.DROP_CREDENTIAL.

For example, to remove the credential named DEF_CRED_NAME, run the following command:

BEGIN
   DBMS_CLOUD.DROP_CREDENTIAL('DEF_CRED_NAME');
END;
/

For more information about the DBMS_CLOUD procedures and parameters, see Summary of DBMS_CLOUD Subprograms.

Import Data Using Oracle Data Pump on Autonomous Transaction Processing

Oracle Data Pump offers very fast bulk data and metadata movement between Oracle databases and Autonomous Transaction Processing.

Data Pump Import lets you import data from Data Pump files residing on Oracle Cloud Infrastructure Object Storage, Microsoft Azure, AWS S3, and Oracle Cloud Infrastructure Object Storage Classic. You can save your data to your Cloud Object Store and use Oracle Data Pump to load data to Autonomous Transaction Processing.

Export Your Existing Oracle Database to Import into Autonomous Transaction Processing

To migrate your existing Oracle Database schemas to Autonomous Transaction Processing, export them with Oracle Data Pump Export and then import them with Oracle Data Pump Import.

Oracle recommends using the following Data Pump Export parameters for faster and easier migration to Autonomous Transaction Processing:

exclude=cluster,db_link
parallel=n
schemas=schema_name
dumpfile=export%u.dmp

Oracle Data Pump Export provides several export modes; Oracle recommends using the schema mode for migrating to Autonomous Transaction Processing. You can list the schemas you want to export by using the schemas parameter.

For a faster migration, export your schemas into multiple Data Pump files and use parallelism. You can specify the dump file name format you want to use with the dumpfile parameter. Set the parallel parameter to at least the number of CPUs you have in your Autonomous Transaction Processing database.

The exclude parameter ensures that object types not available in Autonomous Transaction Processing, such as clusters and database links, are not exported. If you also specify the data_options=group_partition_table_data parameter, data for all table partitions is grouped together so that it can be imported faster during the import to Autonomous Transaction Processing.

The following example exports the SH schema from a source Oracle Database for migration to an Autonomous Transaction Processing database with 16 CPUs:

expdp sh/sh@orcl \
exclude=cluster,db_link \
parallel=16 \
schemas=sh \
dumpfile=export%u.dmp

You can use other Data Pump Export parameters, like compression, depending on your requirements. For more information on Oracle Data Pump Export see Oracle Database Utilities.

Import Data Using Oracle Data Pump Version 18.3 or Later

Oracle recommends using the latest Oracle Data Pump version for importing data from Data Pump files into your Autonomous Transaction Processing database, as it contains enhancements and fixes for a better experience.

Download the latest version of Oracle Instant Client, which includes Oracle Data Pump, for your platform from Oracle Instant Client Downloads. See the installation instructions on the platform install download page for the installation steps required after you download Oracle Instant Client.

In Oracle Data Pump version 18.3 and later, the credential argument authenticates Data Pump to the Cloud Object Storage service you are using for your source files. The dumpfile argument is a comma delimited list of URLs for your Data Pump files.

In Oracle Data Pump, if your source files reside on Oracle Cloud Infrastructure Object Storage, you can use either Oracle Cloud Infrastructure native URIs or Swift URIs. See DBMS_CLOUD Package File URI Formats for details on these file URI formats.
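
For example, the same dump file in Oracle Cloud Infrastructure Object Storage can be referenced with either URI style; the region, namespace (atpc), bucket (atpc_user), and file name below are placeholders:

Native URI: https://objectstorage.us-ashburn-1.oraclecloud.com/n/atpc/b/atpc_user/o/export01.dmp
Swift URI:  https://swiftobjectstorage.us-ashburn-1.oraclecloud.com/v1/atpc/atpc_user/export01.dmp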

Importing with Oracle Data Pump and Setting credential Parameter

  1. Store your Cloud Object Storage credential using DBMS_CLOUD.CREATE_CREDENTIAL. For example:
    BEGIN
      DBMS_CLOUD.CREATE_CREDENTIAL(
        credential_name => 'DEF_CRED_NAME',
        username => 'atpc_user@example.com',
        password => 'password'
      );
    END;
    /

    For more information on the credentials for different Cloud Object Storage services, see CREATE_CREDENTIAL Procedure.

  2. Run Data Pump Import with the dumpfile parameter set to the list of file URLs on your Cloud Object Storage and the credential parameter set to the name of the credential you created in the previous step. For example:
    impdp admin/password@ATPC1_high \
         directory=data_pump_dir \
         credential=def_cred_name \
         dumpfile=https://objectstorage.us-ashburn-1.oraclecloud.com/n/atpc/b/atpc_user/o/export%u.dmp \
         parallel=16 \
         transform=segment_attributes:n \
         transform=dwcs_cvt_iots:y transform=constraint_use_default_index:y \
         exclude=cluster,db_link

    For the best import performance use the HIGH database service for your import connection and set the PARALLEL parameter to the number of CPUs in your Autonomous Transaction Processing database, as shown in the example.

    For information on which database service name to connect to run Data Pump Import, see Managing Priorities on Autonomous Transaction Processing.

    For the dump file URL format for different Cloud Object Storage services, see DBMS_CLOUD Package File URI Formats.

    This example shows the recommended parameters for importing into your Autonomous Transaction Processing.

    In this example:

    • Storage attributes for tables are ignored.

    • Index-organized tables are converted into regular heap tables with a primary key index.

    • Indexes created for primary key and unique key constraints will be renamed to the constraint name.

    • Clusters are excluded during Data Pump Import.

    For information on disallowed objects in Autonomous Transaction Processing, see Restrictions for SQL Commands.

    For detailed information on Oracle Data Pump Import parameters see Oracle Database Utilities.

Import Data Using Oracle Data Pump (Versions 12.2.0.1 and Earlier)

You can import data from Data Pump files into your Autonomous Transaction Processing database using Data Pump client versions 12.2.0.1 and earlier by setting the default_credential database property and referencing it in the dumpfile parameter.

Data Pump Import versions 12.2.0.1 and earlier do not have the credential parameter. If you are using an older version of Data Pump Import you need to define a default credential property for Autonomous Transaction Processing and use the default_credential keyword in the dumpfile parameter.

Importing with Older Oracle Data Pump Versions and Setting default_credential

  1. Store your Cloud Object Storage credential using DBMS_CLOUD.CREATE_CREDENTIAL. For example:
    BEGIN
      DBMS_CLOUD.CREATE_CREDENTIAL(
        credential_name => 'DEF_CRED_NAME',
        username => 'atpc_user@example.com',
        password => 'password'
      );
    END;
    /

    For more information on the credentials for different Cloud Object Storage services, see CREATE_CREDENTIAL Procedure.

  2. Set the credential as the default credential for your Autonomous Transaction Processing, as the ADMIN user. For example:
    alter database property set default_credential = 'ADMIN.DEF_CRED_NAME';
  3. Run Data Pump Import with the dumpfile parameter set to the list of file URLs on your Cloud Object Storage, and set the default_credential keyword. For example:
    impdp admin/password@ATPC1_high \
         directory=data_pump_dir \
         dumpfile=default_credential:https://objectstorage.us-ashburn-1.oraclecloud.com/n/atpc/b/atpc_user/o/export%u.dmp \
         parallel=16 \
         transform=segment_attributes:n \
         exclude=cluster,db_link
    

    For the best import performance use the HIGH database service for your import connection and set the PARALLEL parameter to the number of CPUs in your Autonomous Transaction Processing database, as shown in the example.

    For information on which database service name to connect to run Data Pump Import, see Managing Priorities on Autonomous Transaction Processing.

    For the dump file URL format for different Cloud Object Storage services, see DBMS_CLOUD Package File URI Formats.

    This example shows the recommended parameters for importing into your Autonomous Transaction Processing.

    In this example:

    • Storage attributes for tables are ignored.

    • Clusters and database links are excluded during Data Pump Import.

    For information on disallowed objects in Autonomous Transaction Processing, see Restrictions for SQL Commands.

    For detailed information on Oracle Data Pump Import parameters see Oracle Database Utilities.

Access Log Files for Data Pump Import

The log files for Data Pump Import operations are stored in the directory DATA_PUMP_DIR; this is the only directory you can specify for the Data Pump directory parameter.
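
For example, to find the name of the log file you can list the contents of DATA_PUMP_DIR. This sketch assumes the DBMS_CLOUD.LIST_FILES table function is available in your database:

SELECT object_name, bytes
  FROM DBMS_CLOUD.LIST_FILES('DATA_PUMP_DIR');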

To access the log file you need to move the log file to your Cloud Object Storage using the procedure DBMS_CLOUD.PUT_OBJECT. For example, the following PL/SQL block moves the file import.log to your Cloud Object Storage:

BEGIN
  DBMS_CLOUD.PUT_OBJECT(
    credential_name => 'DEF_CRED_NAME',
    object_uri => 'https://objectstorage.us-ashburn-1.oraclecloud.com/n/atpc/b/atpc_user/o/import.log',
    directory_name  => 'DATA_PUMP_DIR',
    file_name => 'import.log');
END;
/

For more information, see Summary of DBMS_CLOUD Subprograms.

Use Oracle GoldenGate to Replicate Data to Autonomous Transaction Processing

You can replicate data to Autonomous Transaction Processing using Oracle GoldenGate On Premises and Oracle GoldenGate Cloud Service.

Load Data from Local Files Using SQL*Loader

You can use Oracle SQL*Loader to load data from local files in your client machine into Autonomous Transaction Processing.

Using SQL*Loader may be suitable for loading small amounts of data, because the load performance depends on the network bandwidth between your client and Autonomous Transaction Processing. For large amounts of data, Oracle recommends loading data from Cloud Object Storage. (For information on loading from Cloud Object Storage, see Load Data from Files in the Cloud.)

Oracle recommends using the following SQL*Loader parameters for the best load performance:

readsize=100M
bindsize=100M
direct=N

For detailed information on SQL*Loader parameters see Oracle Database Utilities.

To load multiple files at the same time, invoke a separate SQL*Loader session for each file, as shown in the example below.
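
For example, a SQL*Loader invocation using the recommended parameters might look like the following; the connect string, control file, data file, and log file names are placeholders, and you would start one such session per data file:

sqlldr userid=atpc_user@ATPC1_high \
       control=channels.ctl \
       data=channels.txt \
       log=channels.log \
       readsize=100M bindsize=100M direct=N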

Autonomous Transaction Processing gathers optimizer statistics for your tables during bulk load operations if you use the recommended parameters. If you do not use the recommended parameters, then you need to gather optimizer statistics manually, as explained in Manage Optimizer Statistics on Autonomous Transaction Processing, or wait for the automatic statistics gathering task to run.
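
For example, a sketch of gathering statistics manually with DBMS_STATS; the schema and table names are illustrative:

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'SH',
    tabname => 'CHANNELS');
END;
/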

For detailed information on SQL*Loader, see Oracle Database Utilities.