Replicate data from Autonomous AI Transaction Processing to Apache Iceberg

Learn how to replicate data from Autonomous AI Transaction Processing to Apache Iceberg using OCI GoldenGate.

Before you begin

To successfully complete this quickstart, you must have:

Environment set up: Autonomous AI Transaction Processing

If you don’t already have a source database set up for replication, you can follow these steps to load a sample schema to use for this quickstart. This quickstart uses Autonomous AI Transaction Processing for the source database.

To set up the source Autonomous AI Transaction Processing:

  1. Download and unzip the sample database schema.

  2. In the Oracle Cloud console, select your Autonomous AI Transaction Processing (ATP) instance from the Autonomous AI Databases page to view its details and access Database Actions.

  3. Unlock the GGADMIN user:

    1. Select Database actions, then select Database Users.

    2. Locate GGADMIN, select its ellipsis menu (three dots), and then select Edit.

    3. In the Edit User panel, enter the GGADMIN password, confirm the password, and then deselect Account is Locked.

    4. Select Apply Changes.

  4. Load the source sample schema and data:

    1. From the Database actions menu, under Development, select SQL.

    2. Copy and paste the script from OCIGGLL_OCIGGS_SETUP_USERS_ATP.sql into the SQL worksheet.

    3. Select Run Script. The Script Output tab displays confirmation messages.

    4. Clear the SQL worksheet and then copy and paste the SQL script from OCIGGLL_OCIGGS_SRC_USER_SEED_DATA.sql.

      Tip: You may need to run each statement separately for the SQL tool to execute the scripts successfully.

    5. To verify that the tables were created successfully, close the SQL window and reopen it. In the Navigator tab, select the SRC_OCIGGLL schema and then select Tables from the respective dropdowns.

  5. Enable supplemental logging:

    1. Clear the SQL Worksheet.

    2. Enter the following statement, and then select Run Statement:

      ALTER PLUGGABLE DATABASE ADD SUPPLEMENTAL LOG DATA;
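
The setup steps above can be spot-checked from the same SQL worksheet. The following is a minimal sketch, not part of the official procedure. It assumes you are connected as a user (such as ADMIN) that can read the DBA_USERS and DBA_SUPPLEMENTAL_LOGGING dictionary views. Whether these views are available to your user in your instance is an assumption.

```sql
-- Confirm that GGADMIN is unlocked (ACCOUNT_STATUS should be OPEN).
SELECT username, account_status
  FROM dba_users
 WHERE username = 'GGADMIN';

-- List the tables loaded into the sample schema.
SELECT table_name
  FROM all_tables
 WHERE owner = 'SRC_OCIGGLL'
 ORDER BY table_name;

-- Check that minimal supplemental logging is enabled (MINIMAL should be YES).
-- Assumption: DBA_SUPPLEMENTAL_LOGGING is queryable in your instance.
SELECT minimal
  FROM dba_supplemental_logging;
```

If the second query returns no rows, rerun the seed data script before continuing to Task 1.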

Task 1: Create the resources

This quickstart example requires deployments and connections for both the source and target.

  1. Create an Oracle deployment for the source Autonomous AI Transaction Processing instance.

  2. Create a Big Data deployment for the Apache Iceberg target.

  3. Create an Autonomous AI Transaction Processing connection.

  4. Create an Apache Iceberg connection.

  5. Create a GoldenGate server connection and assign it to the Oracle deployment.

  6. Assign the Autonomous AI Transaction Processing connection to the Oracle deployment.

  7. Assign the Apache Iceberg connection to the Big Data deployment.

Task 2: Add the Extract

  1. On the Deployments page, select the source Autonomous AI Transaction Processing deployment.

  2. On the deployment details page, select Launch Console.

  3. Log in with the source deployment's administrator username and password.

  4. Add an Extract.

Task 3: Add and run the Distribution Path

  1. If you use the GoldenGate credential store, create a user for the Distribution Path in the target Big Data deployment. Otherwise, skip to Step 3.

  2. In the source GoldenGate deployment console, add a Path Connection for the user created in Step 1.

    1. In the source GoldenGate deployment console, select Path Connections in the left navigation.

    2. Select Add Path Connection (plus icon), and then complete the following:

      1. For Credential Alias, enter GGSNetwork.

      2. For User ID, enter the name of the user created in Step 1.

      3. Enter the user's password twice for verification.

    3. Select Submit.

      The path connection appears in the Path Connections list.

  3. In the source deployment console, add a Distribution Path with the following values:

    1. On the Source Options page:

      • For Source Extract, select the Extract created in Task 2.

      • For Trail Name, enter a two-character name, such as E1.

    2. On the Target Options page:

      • For Target Host, enter the host domain of the target deployment.

      • For Port Number, enter 443.

      • For Trail Name, enter a two-character name, such as E1.

      • For Alias, enter the Credential Alias created in Step 2.

  4. In the target Big Data deployment console, review the Receiver Path created as a result of the Distribution Path.

    1. In the target Big Data deployment console, select Receiver Service.

    2. Review the path details.

Task 4: Add and run the Replicat

To add and run a Replicat:

  1. In the target Big Data deployment console navigation menu, select Replicats, then Add Replicat (plus icon).

  2. In the Add Replicat panel, on the Replicat Information page, complete the fields as needed, and then select Next:

    • For Replicat Type, select Classic Replicat.

    • Enter a Process Name that is no more than 5 characters long.

    • Enter a Description to help distinguish this process from others.

  3. On the Replicat Options page, complete the fields as needed, and then select Next:

    1. For Replicat Trail, enter the Extract trail name.

    2. For Target, select Apache Iceberg.

    3. For Format, select the file format to write to Apache Iceberg. The default is Parquet.

    4. For Available Alias, select the Apache Iceberg connection from the dropdown.

  4. On the Managed Options page, leave the default settings and select Next.

  5. On the Replicat Parameters page, leave the default settings, and select Next.

  6. On the Replicat Properties page, update the fields marked TODO, and then select Create and Run.

    See Apache Iceberg target details for more information.

Task 5: Verify the replication

To verify the replication, insert rows into the source ATP instance and confirm that they reach the target.

  1. In the Oracle Cloud console, open the navigation menu, select Oracle AI Database, and then select Autonomous AI Transaction Processing.

  2. In the list of Autonomous AI Transaction Processing instances, select your source instance to view its details.

  3. On the database details page, select Database actions.

    Note: You should be automatically logged in. If not, log in with the database credentials.

  4. On the Database actions home page, select SQL.

  5. Enter the following into the worksheet and select Run Script.

    Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1000,'Houston',20,743113);
    Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1001,'Dallas',20,822416);
    Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1002,'San Francisco',21,157574);
    Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1003,'Los Angeles',21,743878);
    Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1004,'San Diego',21,840689);
    Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1005,'Chicago',23,616472);
    Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1006,'Memphis',23,580075);
    Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1007,'New York City',22,124434);
    Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1008,'Boston',22,275581);
    Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1009,'Washington D.C.',22,688002);

  6. In the source ATP deployment console, select the Extract name, and then select Statistics. Verify that SRC_OCIGGLL.SRC_CITY has 10 inserts.

  7. In the target Big Data deployment console, select the Replicat name, and then select Statistics. Verify that SRC_OCIGGLL.SRC_CITY has 10 inserts.

  8. In the Oracle Cloud console, navigate to the Oracle Object Storage bucket and check its contents for the replicated data.
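
If the Extract or Replicat statistics do not show the expected counts, it can help to confirm the source rows directly. The following is a minimal sketch, assuming the ten inserts from step 5 ran in the source SQL worksheet and that no other rows use CITY_ID values 1000 through 1009:

```sql
-- Count the rows inserted in step 5. This returns 10 if all inserts succeeded
-- and no other rows fall in this CITY_ID range.
SELECT COUNT(*) AS inserted_rows
  FROM SRC_OCIGGLL.SRC_CITY
 WHERE CITY_ID BETWEEN 1000 AND 1009;

-- Extract captures only committed transactions. If your session does not
-- commit automatically, commit the inserts explicitly.
COMMIT;
```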