Discover OCI GoldenGate Data Transforms
Learn how to use OCI GoldenGate Data Replication and Data Transforms deployments together to load and transform data between two Autonomous AI Databases.
Before you begin
To successfully complete this quickstart you need:
-
A source Autonomous AI Database with sample data loaded, and supplemental logging enabled.
Tip:
You can download the OCI GoldenGate sample data, if you need sample data to work with.
-
Use the Autonomous AI Database's Database actions SQL tool to run the two scripts to create the user schema and tables.
-
Use the SQL tool to enable supplemental logging.
Follow the steps in Lab 1, Task 3: Load the ATP schema for more details.
-
-
Unlock the GGADMIN user on the source Autonomous AI Database instance
-
On the Autonomous AI Database Details page, select Database Users from the Database actions menu.
Tip: Use the Autonomous AI Database administrator credentials provided when you created the instance to log in, if prompted.
-
Locate the GGADMIN user, and then select Edit from its ellipsis (three dots) menu.
-
In the Edit User panel, enter a password, confirm that password, and then deselect Account is Locked.
-
Select Apply Changes.
-
Task 1: Set up the environment
-
Create a source OracleAutonomous AI Transaction Processing(ATP) connection.
-
Use the SQL tool to enable supplemental logging:
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA -
Run the following query in the SQL tool to ensure that
support_mode=FULLfor all tables in the source database:select * from DBA_GOLDENGATE_SUPPORT_MODE where owner = 'SRC_OCIGGLL';
Task 2: Create the Integrated Extract
An Integrated Extract captures ongoing changes to source database.
-
On the deployment Details page, select Launch console.
-
If required, enter oggadmin for User name and the password you used when creating the deployment, and then select Sign In.
-
Add Transaction Data and a Checkpoint Table:
-
Open the navigation menu and then select DB Connections.
-
Select Connect to database SourceDB.
-
In the navigation menu, select Trandata, and then select Add Trandata (plus icon).
-
For Schema Name, enter
SRC_OCIGGLL, and then select Submit. -
To verify, enter
SRC_OCIGGLLinto the Search field and select Search. -
Open the navigation menu and then select DB Connections.
-
Select Connect to database TargetDB.
-
In the navigation menu, select Checkpoint, and then select Add Checkpoint (plus icon).
-
For Checkpoint Table, enter
"SRCMIRROR_OCIGGLL"."CHECKTABLE", and then select Submit.
-
-
Note: See additional extract parameter options for more information about parameters that you can use to specify source tables.
On the Extract Parameters page, append the following lines under
EXTTRAIL <trail-name>:-- Capture DDL operations for listed schema tables ddl include mapped -- Add step-by-step history of -- to the report file. Very useful when troubleshooting. ddloptions report -- Write capture stats per table to the report file daily. report at 00:01 -- Rollover the report file weekly. Useful when IE runs -- without being stopped/started for long periods of time to -- keep the report files from becoming too large. reportrollover at 00:01 on Sunday -- Report total operations captured, and operations per second -- every 10 minutes. reportcount every 10 minutes, rate -- Table list for capture table SRC_OCIGGLL.*; -
Check for long running transactions. Run the following script on your source database:
select start_scn, start_time from gv$transaction where start_scn < (select max(start_scn) from dba_capture);If the query returns any rows, then you must locate the transaction's SCN and then either commit or rollback the transaction.
Task 3: Export data using Oracle Data Pump (ExpDP)
Use Oracle Data Pump (ExpDP) to export data from the source database to Oracle Object Store.
-
Create an Oracle Object Store bucket.
Take note of the namespace and bucket name for use with the Export and Import scripts.
-
Create an Auth Token, and then copy and paste the token string to a text editor for later use.
-
Create a credential in your source database, replacing the
<user-name>and<token>with your Oracle Cloud account username and the token string you created in the previous step:BEGIN DBMS_CLOUD.CREATE_CREDENTIAL( credential_name => 'ADB_OBJECTSTORE', username => '<user-name>', password => '<token>' ); END; -
Run the following script in your source database to create the Export Data job. Ensure that you replace the
<region>,<namespace>, and<bucket-name>in Object Store URI accordingly.SRC_OCIGGLL.dmpis a file that will be created when this script runs.DECLARE ind NUMBER; -- Loop index h1 NUMBER; -- Data Pump job handle percent_done NUMBER; -- Percentage of job complete job_state VARCHAR2(30); -- To keep track of job state le ku$_LogEntry; -- For WIP and error messages js ku$_JobStatus; -- The job status from get_status jd ku$_JobDesc; -- The job description from get_status sts ku$_Status; -- The status object returned by get_status BEGIN -- Create a (user-named) Data Pump job to do a schema export. h1 := DBMS_DATAPUMP.OPEN('EXPORT','SCHEMA',NULL,'SRC_OCIGGLL_EXPORT','LATEST'); -- Specify a single dump file for the job (using the handle just returned) -- and a directory object, which must already be defined and accessible -- to the user running this procedure. DBMS_DATAPUMP.ADD_FILE(h1,'https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/<bucket-name>/o/SRC_OCIGGLL.dmp','ADB_OBJECTSTORE','100MB',DBMS_DATAPUMP.KU$_FILE_TYPE_URIDUMP_FILE,1); -- A metadata filter is used to specify the schema that will be exported. DBMS_DATAPUMP.METADATA_FILTER(h1,'SCHEMA_EXPR','IN (''SRC_OCIGGLL'')'); -- Start the job. An exception will be generated if something is not set up properly. DBMS_DATAPUMP.START_JOB(h1); -- The export job should now be running. In the following loop, the job -- is monitored until it completes. In the meantime, progress information is displayed. percent_done := 0; job_state := 'UNDEFINED'; while (job_state != 'COMPLETED') and (job_state != 'STOPPED') loop dbms_datapump.get_status(h1,dbms_datapump.ku$_status_job_error + dbms_datapump.ku$_status_job_status + dbms_datapump.ku$_status_wip,-1,job_state,sts); js := sts.job_status; -- If the percentage done changed, display the new value. if js.percent_done != percent_done then dbms_output.put_line('*** Job percent done = ' \|\| to_char(js.percent_done)); percent_done := js.percent_done; end if; -- If any work-in-progress (WIP) or error messages were received for the job, display them. if (bitand(sts.mask,dbms_datapump.ku$_status_wip) != 0) then le := sts.wip; else if (bitand(sts.mask,dbms_datapump.ku$_status_job_error) != 0) then le := sts.error; else le := null; end if; end if; if le is not null then ind := le.FIRST; while ind is not null loop dbms_output.put_line(le(ind).LogText); ind := le.NEXT(ind); end loop; end if; end loop; -- Indicate that the job finished and detach from it. dbms_output.put_line('Job has completed'); dbms_output.put_line('Final job state = ' \|\| job_state); dbms_datapump.detach(h1); END;
Task 4: Instantiate the target database using Oracle Data Pump (ImpDP)
Use Oracle Data Pump (ImpDP) to import data into the target database from the SRC_OCIGGLL.dmp that was exported from the source database.
-
Create a credential in your target database to access Oracle Object Store (using the same information in the preceding section).
BEGIN DBMS_CLOUD.CREATE_CREDENTIAL( credential_name => 'ADB_OBJECTSTORE', username => '<user-name>', password => '<token>' ); END; -
Run the following script in your target database to import data from the
SRC_OCIGGLL.dmp. Ensure that you replace the<region>,<namespace>, and<bucket-name>in Object Store URI accordingly:DECLARE ind NUMBER; -- Loop index h1 NUMBER; -- Data Pump job handle percent_done NUMBER; -- Percentage of job complete job_state VARCHAR2(30); -- To keep track of job state le ku$_LogEntry; -- For WIP and error messages js ku$_JobStatus; -- The job status from get_status jd ku$_JobDesc; -- The job description from get_status sts ku$_Status; -- The status object returned by get_status BEGIN -- Create a (user-named) Data Pump job to do a "full" import (everything -- in the dump file without filtering). h1 := DBMS_DATAPUMP.OPEN('IMPORT','FULL',NULL,'SRCMIRROR_OCIGGLL_IMPORT'); -- Specify the single dump file for the job (using the handle just returned) -- and directory object, which must already be defined and accessible -- to the user running this procedure. This is the dump file created by -- the export operation in the first example. DBMS_DATAPUMP.ADD_FILE(h1,'https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/<bucket-name>/o/SRC_OCIGGLL.dmp','ADB_OBJECTSTORE',null,DBMS_DATAPUMP.KU$_FILE_TYPE_URIDUMP_FILE); -- A metadata remap will map all schema objects from SRC_OCIGGLL to SRCMIRROR_OCIGGLL. DBMS_DATAPUMP.METADATA_REMAP(h1,'REMAP_SCHEMA','SRC_OCIGGLL','SRCMIRROR_OCIGGLL'); -- If a table already exists in the destination schema, skip it (leave -- the preexisting table alone). This is the default, but it does not hurt -- to specify it explicitly. DBMS_DATAPUMP.SET_PARAMETER(h1,'TABLE_EXISTS_ACTION','SKIP'); -- Start the job. An exception is returned if something is not set up properly. DBMS_DATAPUMP.START_JOB(h1); -- The import job should now be running. In the following loop, the job is -- monitored until it completes. In the meantime, progress information is -- displayed. Note: this is identical to the export example. percent_done := 0; job_state := 'UNDEFINED'; while (job_state != 'COMPLETED') and (job_state != 'STOPPED') loop dbms_datapump.get_status(h1, dbms_datapump.ku$_status_job_error + dbms_datapump.ku$_status_job_status + dbms_datapump.ku$_status_wip,-1,job_state,sts); js := sts.job_status; -- If the percentage done changed, display the new value. if js.percent_done != percent_done then dbms_output.put_line('*** Job percent done = ' \|\| to_char(js.percent_done)); percent_done := js.percent_done; end if; -- If any work-in-progress (WIP) or Error messages were received for the job, display them. if (bitand(sts.mask,dbms_datapump.ku$_status_wip) != 0) then le := sts.wip; else if (bitand(sts.mask,dbms_datapump.ku$_status_job_error) != 0) then le := sts.error; else le := null; end if; end if; if le is not null then ind := le.FIRST; while ind is not null loop dbms_output.put_line(le(ind).LogText); ind := le.NEXT(ind); end loop; end if; end loop; -- Indicate that the job finished and gracefully detach from it. dbms_output.put_line('Job has completed'); dbms_output.put_line('Final job state = ' \|\| job_state); dbms_datapump.detach(h1); END;
Task 5: Add and run a Non-integrated Replicat
-
On the Parameter File screen, replace
MAP *.*, TARGET *.*;with the following script:-- Capture DDL operations for listed schema tables ddl include mapped -- Add step-by-step history of ddl operations captured -- to the report file. Very useful when troubleshooting. ddloptions report -- Write capture stats per table to the report file daily. report at 00:01 -- Rollover the report file weekly. Useful when PR runs -- without being stopped/started for long periods of time to -- keep the report files from becoming too large. reportrollover at 00:01 on Sunday -- Report total operations captured, and operations per second -- every 10 minutes. reportcount every 10 minutes, rate -- Table map list for apply DBOPTIONS ENABLE_INSTANTIATION_FILTERING; MAP SRC_OCIGGLL.*, TARGET SRCMIRROR_OCIGGLL.*;Note:
DBOPTIONS ENABLE_INSTANTIATION_FILTERINGenables CSN filtering on tables imported using Oracle Data Pump. For more information, see DBOPTIONS Reference. -
Perform Inserts to the source database:
-
Return to the Oracle Cloud console and use the navigation menu to navigate back to Oracle AI Database, Autonomous AI Transaction Processing, and then SourceDB.
-
On the SourceDB Details page, select Database actions, and then select SQL.
-
Enter the following inserts, and then select Run Script:
Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1000,'Houston',20,743113); Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1001,'Dallas',20,822416); Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1002,'San Francisco',21,157574); Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1003,'Los Angeles',21,743878); Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1004,'San Diego',21,840689); Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1005,'Chicago',23,616472); Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1006,'Memphis',23,580075); Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1007,'New York City',22,124434); Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1008,'Boston',22,275581); Insert into SRC_OCIGGLL.SRC_CITY (CITY_ID,CITY,REGION_ID,POPULATION) values (1009,'Washington D.C.',22,688002); -
In the OCI GoldenGate Deployment Console, select the Extract name (UAEXT), and then select Statistics. Verify that SRC_OCIGGLL.SRC_CITY is listed with 10 inserts.
-
Go back to the Overview screen, select the Replicat name (REP), and then select Statistics. Verify thatSRCMIRROR_OCIGGLL.SRC_CITY is listed with 10 inserts
-
Task 6: Create the Data Transforms resources
-
Note: For example, for Autonomous AI Databases in US East (Ashburn) use the following values:
-
For Host, enter
adb.us-ashburn-1.oraclecloud.com:1522. -
For Subnet, select the same subnet as the deployment from the dropdown.
-
-
Create TRG_CUSTOMER in SRCMIRROR_OCIGGLL in your Autonomous AI Lakehouse (ADW) instance:
-
In the Oracle Cloud console, open the navigation menu, navigate to Oracle AI Database, and then select Autonomous AI Lakehouse.
-
On the Autonomous Databases page, select on your ADW instance.
-
On the ADW Autonomous AI Database details page, select Database actions, and then select SQL from the dropdown. If the Database actions menu takes too long to load, you can select Database actions directly, and then select SQL from the Database actions page.
-
Enter the following into the Worksheet, and then select Run Statement.
create table SRCMIRROR_OCIGGLL.TRG_CUSTOMER ( CUST_ID NUMBER(10,0) not null, DEAR VARCHAR2(4 BYTE), CUST_NAME VARCHAR2(50 BYTE), ADDRESS VARCHAR2(100 BYTE), CITY_ID NUMBER(10,0), PHONE VARCHAR2(50 BYTE), AGE NUMBER(3,0), AGE_RANGE VARCHAR2(50 BYTE), SALES_PERS VARCHAR2(50 BYTE), CRE_DATE DATE, UPD_DATE DATE, constraint PK_TRG_CUSTOMER primary key (CUST_ID) );
-
-
Launch the Data Transforms deployment console:
-
Navigate back to the Deployments page, and then select the deployment you created in Task 6.
-
On the Deployment details page, select Launch console.
-
Log in to the Data Transforms deployment console.
-
-
Create ADW Connection:
-
Open the navigation menu, select Connections, and then select Create Connection.
-
On the Select Type page, under Databases select Oracle, and then select Next.
-
On the Connection details page, complete the form fields as follows, and then select Create:
-
For Name, enter ADW_IAD.
-
Select Use Credential File.
-
For Wallet File, upload your (ADW) wallet file.
Note: To download your ADW wallet file, select Database Connection on the ADW details page.
-
From the Services dropdown, select <name>_low.
-
For User, enter ADMIN.
-
For Password, enter your ADW password.
-
-
-
Import Data Entity:
-
Open the navigation menu, select Data Entities, and then select Import Data Entity.
-
For Connection, select ADW_IAD from the dropdown.
-
For Schema, select SRCMIRROR_OCIGGLL from the dropdown.
-
Select Start.
-
-
Create Project:
-
Open the navigation menu, select Projects.
-
On the Projects page, select Create Project.
-
In the Create Project dialog, for Name, enter demo-pipeline, and then select Create.
-
Task 7: Create and run a workflow
-
Create Data Flow:
Note: Learn more about the Data Flow Editor.
-
Select the name of your project.
-
On the Project Details page, under Resources, select Data Flows, and then select Create Data Flow.
-
In the Create Data Flow dialog, for Name, enter Load TRG_CUSTOMER, and optionally, a description. Select Create. The design canvas opens.
-
On the Add a Schema dialog, complete the form fields as follows, and then select OK:
-
For Connection, select ADW_IAD from the dropdown.
-
For Schema, select SRCMIRROR_OCIGGLL from the dropdown.
-
-
Drag the following data entities and components onto the design canvas:
-
In the Data Entities panel, expand the SRCMIRROR_OCIGGLL schema. Drag the SRC_AGE_GROUP data entity to the design canvas.
-
In the Data Entities panel, expand the SRCMIRROR_OCIGGLL schema. Drag the SRC_SALES_PERSON data entity to the design canvas.
-
From the Data Transform toolbar, drag the Lookup component to the design canvas.
-
From the Data Transform toolbar, drag the Join component to the design canvas.
-
In the Data Entities panel, under SRCMIRROR_OCIGGLL, drag the SRC_CUSTOMER data entity to the design canvas.
-
-
Connect the following data entities to the Lookup component:
-
Select on the SRC_AGE_GROUP Connector icon and drag the icon to Lookup component.
-
Select on the SRC_CUSTOMER Connector icon and drag the icon to the Lookup component.
-
-
On the design canvas, select Lookup to open the Lookup panel. In the Lookup panel, switch to the Attributes tab and then paste the following query into Lookup Condition:
SRC_CUSTOMER.AGE between SRC_AGE_GROUP.AGE_MIN and SRC_AGE_GROUP.AGE_MAX -
Connect the following components to the Join component:
-
Select on the SRC_SALES_PERSON Connector icon and drag the icon to the Join component.
-
Select on the Lookup Connector icon and drag the icon to the Join component.
-
-
On the design canvas, select Join to open the Join panel. In the Join panel, switch to the Attributes tab and then paste the following query into Join Condition:
SRC_CUSTOMER.SALES_PERS_ID=SRC_SALES_PERSON.SALES_PERS_ID -
Drag the following data entity and component onto the design canvas:
-
In the Data Entities panel, under SRCMIRROR_OCIGGLL, drag the TRG_CUSTOMER data entity to the design canvas.
-
Select on the Join Connector icon and drag the icon to the TRG_CUSTOMER data entity.
-
On the design canvas, select TRG_CUSTOMER and expand the data entity.
-
-
On the Attributes tab, enable Key for CUST_ID, and disable Update for CRE_DATE and disable Insert for UPD_DATE.
-
On the Column Mapping tab, ensure the Name matches the Expression:
-
CUST_ID
SRC_CUSTOMER.CUSTID -
DEAR
CASE WHEN SRC_CUSTOMER.DEAR = 0 THEN 'Mr' WHEN SRC_CUSTOMER.DEAR = 1 THEN 'Mrs' ELSE 'Ms' END -
CUST_NAME
SRC_CUSTOMER.FIRST_NAME \|\| ' ' \|\| UPPER(SRC_CUSTOMER.LAST_NAME) -
SALES_PERS
SRC_SALES_PERSON.FIRST_NAME \|\| ' ' \|\|UPPER(SRC_SALES_PERSON.LAST_NAME) -
CRE_DATE
SYSDATE -
UPD_DATE
SYSDATE -
Use other mappings as-is.
-
-
On the Options tab, for Mode, select Incremental Update from the dropdown.
-
Collapse TRG_CUSTOMER.
-
Select Save Data Flow.
-
-
Create Workflow:
-
Select the name of your project, select Workflows, and then select Create Workflow.
-
For name, enter Orchestrate Data Warehouse Load. Select Create.
-
Drag the SQL icon on the design canvas.
-
Double select the SQL step in the editor to open the step properties page.
-
On the General tab, for name, enter Data Cleansing.
-
Select the Attributes tab, for Connection, select ADW_IAD from the dropdown.
-
For SQL, copy the following query:
delete from SRCMIRROR_OCIGGLL.TRG_CUSTOMER where CITY_ID > 110 -
Collapse SQL.
-
Under Data Flows, drag the TRG_CUSTOMER Data Flow to the design canvas.
-
Select on the Data Cleansing SQL workflow line and drag the ok (green arrow) icon to the TRG_CUSTOMER Data Flow.
-
Select Save Workflow, and then select Start Workflow.
-