8.2.27 OCI Autonomous Data Warehouse

Oracle Autonomous Data Warehouse (ADW) is a fully managed database tuned and optimized for data warehouse workloads with the market-leading performance of Oracle Database.

8.2.27.1 Detailed Functionality

The ADW Event handler is used as a downstream Event handler connected to the output of the OCI Object Storage Event handler. The OCI Event handler loads files generated by the File Writer Handler into Oracle OCI Object storage. All the SQL operations are performed in batches providing better throughput.

8.2.27.2 ADW Database Credential to Access OCI ObjectStore File

To access the OCI ObjectStore File:

  1. A PL/SQL procedure needs to be run to create a credential to access Oracle Cloud Infrastructure (OCI) Object store files.
  2. An OCI authentication token needs to be generated under User settings from the OCI console. See CREATE_CREDENTIAL in Using Oracle Autonomous Data WareHouse on Shared Exadata Infrastructure. For example:
    BEGIN  DBMS_CLOUD.create_credential
          (    credential_name =>
            'OGGBD-CREDENTIAL',    username => 'oci-user',    password =>
            'oci-user');
            END;
            /
  3. The credential name can be configured using the followng property: gg.eventhandler.adw.objectStoreCredential. For example: gg.eventhandler.adw.objectStoreCredential=OGGBD-CREDENTIAL.

8.2.27.3 ADW Database User Privileges

ADW databases come with a predefined database role named DWROLE. If the ADW 'admin' user is not being used, then the database user needs to be granted the role DWROLE.

This role provides the privileges required for data warehouse operations. For example, the following command grants DWROLE to the user dbuser-1:

GRANT DWROLE TO dbuser-1;

Note:

Ensure that you do not use Oracle-created database user ggadmin for ADW replication, because this user lacks the INHERIT privilege.

8.2.27.4 Unsupported Operations/ Limitations

  • DDL changes are not supported.
  • Replication of Oracle Object data types are not supported.
  • If the GoldenGate trail is generated by Oracle Integrated capture, then for the UPDATE operations on the source LOB column, only the changed portion of the LOB is written to the trail file. Oracle GoldenGate for Big Data Autonomous Data Warehouse (ADW) apply doesn't support replication of partial LOB columns in the trail file.

8.2.27.5 Troubleshooting and Diagnostics

  • Connectivity Issues to ADW
  • DDL not applied on the target table: The ADW handler will ignore DDL.
  • Target table existence: It is expected that the ADW target table exists before starting the apply process. Target tables need to be designed with appropriate primary keys, indexes and partitions. Approximations based on the column metadata in the trail file may not be always correct. Therefore, replicat will ABEND if the target table is missing.
  • Diagnostic throughput information on the apply process is logged into the handler log file.

    For example:

    File Writer finalized 29525834 records
            (rate: 31714) (start time: 2020-02-10 01:25:32.000579) (end time: 2020-02-10
            01:41:03.000606).

    In this sample log message:

    • This message provides details about the end-end throughput of File Writer handler and the downstream event handlers (OCI Event handler and ADW event handler).
    • The throughput rate also takes into account the wait-times incurred before rolling over files.
    • The throughput rate also takes into account the time taken by the OCI event handler and the ADW event handler to process operations.
    • The above examples indicates that 29525834 operations were finalized at the rate of 31714 operations per second between start time: [2020-02-10 01:25:32.000579] and end time: [2020-02-10 01:41:03.000606].
    Example:
     
    INFO 2019-10-01 00:36:49.000490 [pool-8-thread-1] – Begin DWH Apply stage and load statistics
    ********START*********************************
         
    INFO 2019-10-01 00:36:49.000490 [pool-8-thread-1] - Time spent for staging process [2074 ms] 
    INFO 2019-10-01 00:36:49.000490 [pool-8-thread-1] - Time spent for merge process [992550 ms] 
    INFO 2019-10-01 00:36:49.000490 [pool-8-thread-1] - [31195516] operations processed, rate[31,364]operations/sec. 
            
    INFO 2019-10-01 00:36:49.000490 [pool-8-thread-1] – End DWH Apply stage and load statistics 
    ********END*********************************** 
    INFO 2019-10-01 00:37:18.000230 [pool-6-thread-1] – Begin OCI Event handler upload statistics 
    ********START********************************* 
    INFO 2019-10-01 00:37:18.000230 [pool-6-thread-1] - Time spent loading files into ObjectStore [71789 ms]
    INFO 2019-10-01 00:37:18.000230 [pool-6-thread-1] - [31195516] operations processed, rate[434,545] operations/sec. 
    INFO 2019-10-01 00:37:18.000230 [pool-6-thread-1] – End OCI Event handler upload statistics 
    ********END***********************************

    In this example:

    ADW Event handler throughput:

    • In the above log message, the statistics for the ADW event handler is reported as DWH Apply stage and load statistics. ADW is classified as a Data Ware House (DWH), and therefore, this name.
    • Here 31195516 operations from the source trail file were applied to ADW database at the rate of 31364 operations per second.
    • ADW uses stage and merge. The time spent on staging is 2074 milliseconds and the time spent on executing merge SQL is 992550 milliseconds.
    OCI Event handler throughput:
    • In the above log message, the statistics for the OCI event handler is reported as OCI Event handler upload statistics.
    • Here 31195516 operations from the source trail file were uploaded to the OCI object store at the rate of 434545 operations per second.
  • Errors due to ADW credential missing grants to read OCI object store files:
    • A SQL exception indicating authorization failure is logged in the handler log file. For example:
      java.sql.SQLException: ORA-20401: 
      Authorization failed for URI - 
      https://objectstorage.us-ashburn-1.oraclecloud.com/n/some_namespace/b/some_bucket/o/ADMIN.NLS_AllTypes/ADMIN.NLS_AllTypes_2019-12-16_11-44-01.237.avro
  • Errors in file format/column data:

    In case the ADW Event handler is unable to read data from the external staging table due to column data errors, the Oracle GoldenGate for Big Data handler log file provides diagnostic information to debug the issue.

    The following details are available in the log file:

    • JOB ID
    • SID
    • SERIAL #
    • ROWS_LOADED
    • START_TIME
    • UPDATE_TIME
    • STATUS
    • TABLE_NAME
    • OWNER_NAME
    • FILE_URI_LIST
    • LOGFILE_TABLE
    • BADFILE_TABLE

    The contents of the LOGFILE_TABLE and BADFILE_TABLE should indicate the specific record and the column(s) in the record which have error and the cause of the error. This information is also queried automatically by the ADW Event handler and logged into the OGGBD FW handler log file. Based on the root cause of the error, customer can take action. In many cases, customers would have to modify the target table definition based on the source column data types and restart replicat. In other cases, customers may also want to modify the mapping in the replicat prm file. For this, Oracle recommends that they re-position replicat to start from the beginning.

  • Any other SQL Errors:

    In case there are any errors while executing any SQL, the entire SQL statement along with the bind parameter values are logged into the OGGBD handler log file.

  • Co-existence of the components:

    The location/region of the machine where replicat process is running, OCI Objects storage bucket region and the ADW region would impact the overall throughput of the apply process. Data flow is as follows: GoldenGate  OCI Object store  ADW. For best throughput, the components need to located as close as possible.

  • Debugging row count mismatch on the target table

    For better throughput, ADW event handler does not validate the row counts modified on the target table. We can enable row count matching by using the Java System property: disable.row.count.validation. To enable row count validation, provide this property in the jvm.bootoptions as follows: jvm.bootoptions=-Xmx512m -Xms32m -Djava.class.path=.:ggjava/ggjava.jar:./dirprm -Ddisable.row.count.validation=false

  • Replicat ABEND due to partial LOB records in the trail file:

    Oracle GoldenGate for Big Data ADW apply does not support replication of partial LOB. The trail file needs to be regenerated by Oracle Integrated capture using TRANLOGOPTIONS FETCHPARTIALLOB option in the extract parameter file.

  • Throughput gain with uncompressed UPDATE trails:

    If the source trail files contain the full image (all the column values of the respective table) of the row being updated, then you can include the JVM boot option -Dcompressed.update=false in the configuration property jvm.bootoptions.

    For certain workloads and ADW instance shapes, this configuration may provide a better throughput. You may need to test the throughput gain on your environment.

8.2.27.6 Classpath

ADW apply relies on the upstream File Writer handler and the OCI Event handler. Include the required jars needed to run the OCI Event handler in gg.classpath.

ADW Event handler uses the Oracle JDBC driver and its dependencies. The Autonomous Data Warehouse JDBC driver and other required dependencies are packaged with Oracle GoldenGate for Big Data.

For example: gg.classpath=./oci-java-sdk/lib/*:./oci-java-sdk/third-party/lib/*

8.2.27.7 Configuration

8.2.27.7.1 Automatic Configuration

Autonomous Data Warehouse (ADW) replication involves configuring of multiple components, such as file writer handler, OCI event handler and ADW event handler.

The Automatic Configuration functionality helps to auto configure these components so that the user configuration is minimal. The properties modified by auto configuration will also be logged in the handler log file.

To enable auto configuration to replicate to ADW target we need to set the parameter

gg.target=adw

gg.target
Required
Legal Value: adw
Default:  None
Explanation: Enables replication to ADW target

When replicating to ADW target, customization of OCI event hander name and ADW event handler name is not allowed.

8.2.27.7.2 File Writer Handler Configuration

File writer handler name is pre-set to the value adw. The following is an example to edit a property of file writer handler: gg.handler.adw.pathMappingTemplate=./dirout

8.2.27.7.3 OCI Event Handler Configuration

OCI event handler name is pre-set to the value ‘oci’.

The following is an example to edit a property of the OCI event handler: gg.eventhandler.oci.profile=DEFAULT

8.2.27.7.4 ADW Event Handler Configuration

ADW event handler name is pre-set to the value adw.

The following are the ADW event handler configurations:

Property Required/Optional Legal Values Default Explanationtes
gg.eventhandler.adw.connectionURL Required ADW None Sets the ADW JDBC connection URL. Example: jdbc:oracle:thin:@adw20190410ns_medium?TNS_ADMIN=/home/sanav/projects/adw/wallet
gg.eventhandler.adw.UserName Required JDBC User name None Sets the ADW database user name.
gg.eventhandler.adw.Password Required JDBC Password None Sets the ADW database password.
gg.eventhandler.adw.maxStatements Optional Integer value between 1 to 250. The default value is 250. Use this parameter to control the number of prepared SQL statements that can be used.
gg.eventhandler.adw.maxConnnections Optional Integer value. 10 Use this parameter to control the number of concurrent JDBC database connections to the target ADW database.
gg.eventhandler.adw.dropStagingTablesOnShutdown Optional true | false false If set to true, the temporary staging tables created by the ADW event handler is dropped on replicat graceful stop.
gg.eventhandler.adw.objectStoreCredential Required A database credential name. None ADW Database credential to access OCI object-store files.
gg.initialLoad Optional true | false false If set to true, initial load mode is enabled. See INSERTALLRECORDS Support.
gg.operation.aggregator.validate.keyupdate Optional true or false false If set to true, Operation Aggregator will validate key update operations (optype 115) and correct to normal update if no key values have changed. Compressed key update operations do not qualify for merge.
gg.compressed.update Optional true or false true If set the true, then this indicates that the source trail files contain compressed update operations. If set to true, then the source trail files are expected to contain uncompressed update operations.
gg.eventhandler.adw.connectionRetries Optional Integer Value 3 Specifies the number of times connections to the target data warehouse will be retried.
gg.eventhandler.adw.connectionRetryIntervalSeconds Optional Integer Value 30 Specifies the delay in seconds between connection retry attempts.

8.2.27.7.5 INSERTALLRECORDS Support

Stage and merge targets supports INSERTALLRECORDS parameter.

See INSERTALLRECORDS in Reference for Oracle GoldenGate. Set the INSERTALLRECORDS parameter in the Replicat parameter file (.prm). Set the INSERTALLRECORDS parameter in the Replicat parameter file (.prm)

Setting this property directs the Replicat process to use bulk insert operations to load operation data into the target table.

You can tune the batch size of bulk inserts using the File writer property gg.handler.adw.maxFileSize. The default value is set to 1GB. The frequency of bulk inserts can be tuned using the File writer property gg.handler.adw.fileRollInterval, the default value is set to 3m (three minutes).
To process initial load trail files, set the INSERTALLRECORDS parameter in the Replicat parameter file (.prm). Setting this property directs the Replicat process to use bulk insert operations to load operation data into the target table.

You can tune the batch size of bulk inserts using the File Writer property gg.handler.adw.maxFileSize. The default value is set to 1GB. The frequency of bulk inserts can be tuned using the File Writer property gg.handler.adw.fileRollInterval, the default value is set to 3m (three minutes).

8.2.27.7.6 End-to-End Configuration

The following is an end-end configuration example which uses auto configuration for FW handler, OCI and ADW Event handlers. The sample properties file is available at the following location:
  • In an Oracle GoldenGate Classic install: <oggbd_install_dir>/AdapterExamples/big-data/adw-via-oci/adw.props.
  • In an Oracle GoldenGate Microservices install: <oggbd_install_dir>/opt/AdapterExamples/big-data/adw-via-oci/adw.props.
# Configuration to load GoldenGate trail operation records
# into Autonomous Data Warehouse (ADW) by chaining
# File writer handler -> OCI Event handler -> ADW Event handler.
# Note: Recommended to only edit the configuration marked as TODO
gg.target=adw
##The OCI Event handler
# TODO: Edit the OCI config file path.
gg.eventhandler.oci.configFilePath=<path/to/oci/config>
# TODO: Edit the OCI profile name.
gg.eventhandler.oci.profile=DEFAULT
# TODO: Edit the OCI namespace.
gg.eventhandler.oci.namespace=<OCI namespace>
# TODO: Edit the OCI region.
gg.eventhandler.oci.region=<oci-region>
# TODO: Edit the OCI compartment identifier.
gg.eventhandler.oci.compartmentID=<OCI compartment id>
gg.eventhandler.oci.pathMappingTemplate=${fullyQualifiedTableName}
# TODO: Edit the OCI bucket name.
gg.eventhandler.oci.bucketMappingTemplate=<ogg-bucket>
##The ADW Event Handler
# TODO: Edit the ADW JDBC connectionURL
gg.eventhandler.adw.connectionURL=jdbc:oracle:thin:@adw20190410ns_medium?TNS_ADMIN=/path/to/ /adw/wallet
# TODO: Edit the ADW JDBC user
gg.eventhandler.adw.UserName=<db user>
# TODO: Edit the ADW JDBC password
gg.eventhandler.adw.Password=<db password>
# TODO: Edit the ADW Credential that can access the OCI Object Store.
gg.eventhandler.adw.objectStoreCredential=<ADW Object Store credential>
# TODO:Set the classpath to include OCI Java SDK.
gg.classpath=./oci-java-sdk/lib/*:./oci-java-sdk/third-party/lib/*
#TODO: Edit to provide sufficient memory (at least 8GB).
jvm.bootoptions=-Xmx8g -Xms8g

8.2.27.7.7 Compressed Update Handling

A compressed update record contains values for the key columns and the modified columns.

An uncompressed update record contains values for all the columns.

Oracle GoldenGate trails may contain compressed or uncompressed update records. The default extract configuration writes compressed updates to the trails.

The parameter gg.compressed.update can be set to true or false to indicate compressed/uncompressed update records.

8.2.27.7.7.1 MERGE Statement with Uncompressed Updates

In some use cases, if the trail contains uncompressed update records, then the MERGE SQL statement can be optimized for better performance by setting gg.compressed.update=false.