Replicate data to Stream analytics
Learn to replicate data from OCI GoldenGate into Stream Analytics.
GoldenGate Stream Analytics began as a Complex Event Processing engine that evolved to run atop of runtime frameworks with Apache Spark and Apache Kafka. Stream Analytics can ingest a stream of data from any source such as a database, GoldenGate, Kafka, JMS, REST, or even a file system file. After the data is ingested, you can run analytics on live data.
Before you begin
To successfully complete this quickstart you need:
-
A source Autonomous AI Database with sample data loaded, and supplemental logging enabled.
Tip:
You can download the OCI GoldenGate sample data, if you need sample data to work with.
-
First, edit
SETUP_USERS_ATP.sqland modify theSRC_OCIGGLLuser's password to remove the special characters. -
Use the Autonomous AI Database's Database actions SQL tool to run the two scripts to create the user schema and tables.
-
Use the SQL tool to enable supplemental logging.
Follow the steps in Lab 1, Task 3: Load the ATP schema for more details.
-
-
Unlock the GGADMIN user on the source Autonomous AI Database instance
-
On the Autonomous AI Database Details page, select Database Users from the Database actions menu.
Tip: Use the Autonomous AI Database administrator credentials provided when you created the instance to log in, if prompted.
-
Locate the GGADMIN user, and then select Edit from its ellipsis (three dots) menu.
-
In the Edit User panel, enter a password, confirm that password, and then deselect Account is Locked.
-
Select Apply Changes.
-
Task 1: Create the OCI GoldenGate resources
-
Create the OCI GoldenGate deployment for Data replication.
-
Create a connection for the source database.
Task 2: Create the Stream Analytics resources
-
Create the Stream Analytics deployment.
-
Create a Kafka connection using the Kafka instance's public IP, and select Plaintext for Security protocol.
-
Create a GoldenGate connection.
-
Assign the connections to the Stream Analytics deployment.
Task 3: Create and run the pipelines
-
Launch the Stream Analytics deployment console.
-
Review the connections in the Stream Analytics deployment console.
-
In the Stream Analytics deployment console, select Catalog.
-
On the Catalog page, review the list of connections. You should see the GoldenGate connection, the Autonomous AI Database connection, and the Kafka connection.
-
-
Start the GoldenGate Big Data cluster:
-
In the OCI GoldenGate Stream Analytics deployment console, select System settings from the ossaadmin user menu.
-
In the System Setting dialog, select Manage Clusters, and then expand GGDB Cluster.
-
Select Start Cluster. Wait until the cluster status is Running, and then close the dialog window.
-
-
Update the GoldenGate connection credentials:
Although the GoldenGate connection is available in the Stream Analytics deployment console, the GoldenGate credentials don't carry over. Update the password and test the connection.
-
Select Catalog, and then select the GoldenGate connection.
-
In the Edit Connection dialog, select Next.
-
For GG Username, enter
oggadmin. -
For GG Password, select Change password, and then enter the password provided when you created the OCI GoldenGate deployment for Data Replication in Task 1.
-
Select Test connection. If successful, select Save.
-
-
Use the GoldenGate Extract to create and start GoldenGate Change Data.
Ensure that you use the Extract details provided in Task 1 on the GG Change Data Details page.
-
Update the Autonomous AI Database user name.
Database connections are created with the default user,
ggadmin. Update the username toSRC_OCIGGLL(if you used the sample data provided) to access its schema and tables.-
Select Catalog, and then select the Autonomous AI Database connection.
-
In the Edit Connection dialog, select Next.
-
For Username, enter
SRC_OCIGGLL. -
For Password, enter the
SRC_OCIGGLLpassword you modified in the Before you begin steps at the start of this quickstart. -
Select Test connection. If successful, select Save.
-
-
Use the Autonomous AI Database lookup tables to create References for Customers and Orders.
-
Use the Kafka connection to create Kafka Streams for Customers and Orders.
-
Use the Autonomous AI Database SQL tool to perform inserts on the source database.
For example, you can run the following inserts:
Insert into SRC_OCIGGLL.SRC_ORDERS (ORDER_ID,STATUS,CUST_ID,ORDER_DATE,CUSTOMER) values (11,'COM',101,to_date('16-AUG-2023','DD-MON-YYYY'),null); Insert into SRC_OCIGGLL.SRC_ORDERS (ORDER_ID,STATUS,CUST_ID,ORDER_DATE,CUSTOMER) values (12,'COM',102,to_date('16-AUG-2023','DD-MON-YYYY'),null); Insert into SRC_OCIGGLL.SRC_ORDERS (ORDER_ID,STATUS,CUST_ID,ORDER_DATE,CUSTOMER) values (13,'COM',103,to_date('16-AUG-2023','DD-MON-YYYY'),null); Insert into SRC_OCIGGLL.SRC_ORDERS (ORDER_ID,STATUS,CUST_ID,ORDER_DATE,CUSTOMER) values (14,'COM',104,to_date('16-AUG-2023','DD-MON-YYYY'),null); Insert into SRC_OCIGGLL.SRC_ORDERS (ORDER_ID,STATUS,CUST_ID,ORDER_DATE,CUSTOMER) values (15,'COM',105,to_date('16-AUG-2023','DD-MON-YYYY'),null); Insert into SRC_OCIGGLL.SRC_ORDERS (ORDER_ID,STATUS,CUST_ID,ORDER_DATE,CUSTOMER) values (16,'COM',106,to_date('16-AUG-2023','DD-MON-YYYY'),null); Insert into SRC_OCIGGLL.SRC_ORDERS (ORDER_ID,STATUS,CUST_ID,ORDER_DATE,CUSTOMER) values (17,'COM',107,to_date('16-AUG-2023','DD-MON-YYYY'),null); Insert into SRC_OCIGGLL.SRC_ORDERS (ORDER_ID,STATUS,CUST_ID,ORDER_DATE,CUSTOMER) values (18,'COM',201,to_date('16-AUG-2023','DD-MON-YYYY'),null); Insert into SRC_OCIGGLL.SRC_ORDERS (ORDER_ID,STATUS,CUST_ID,ORDER_DATE,CUSTOMER) values (19,'COM',202,to_date('16-AUG-2023','DD-MON-YYYY'),null); -
Create a pipeline that uses the Kafka stream created in Step 8.
-
Add a Query stage, and then add a Filter, to return only orders where the CUST_ID of the Orders stream match the CUSTID of the Customers stream.