7.2 Realtime Data Ingestion into Snowflake with Oracle GoldenGate for Distributed Applications and Analytics
Overview
This Quickstart covers a step-by-step process showing how to ingest real-time data into Snowflake with Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA).
Snowflake is a cloud-based data warehousing platform that provides a fully managed service for storing, processing, and analysing data.
GG for DAA Snowflake handler uses the stage and merge data flow. It supports both external and internal staging options.
In stage and merge, the change data is staged in a temporary location in microbatches and eventually merged into to the target table. If external staging is selected, storage location is cloud storage service of the cloud provider (Azure Storage for Azure, S3 for AWS and Cloud Storage for GCP). If internal staging is selected, then external storage location will not be needed.
All replication process is automatically handled by Snowflake Stage and Merge Handler handler.
This quickstart covers the steps for internal staging. External staging requires a storage service and a Snowflake storage integration.
- Prerequisites for Internal Staging
- Install Dependency Files
- Create a Credential Store Entry
- Create a Replicat in Oracle GoldenGate for Distributed Applications and Analytics
Parent topic: Quickstarts
7.2.1 Prerequisites for Internal Staging
- A Snowflake account.
- Target schema and database in Snowflake
- JDBC URL for Snowflake access.
- Snowflake username/ password
In this Quickstart, a sample trail file (named tr), which is shipped with
GG for DAA is used. The sample trail file is located at
GG_HOME/opt/AdapterExamples/trail/
in your GG for DAA instance.
GG for DAA creates the target Snowflake tables automatically.
7.2.2 Install Dependency Files
GG for DAA uses Java SDK provided by Snowflake. You can download the SDKs using Dependency Downloader utility shipped with GG for DAA. Dependency downloader is a set of shell scripts that downloads dependency jar files from Maven and other repositories.
To install the required dependency files:
- In your GoldenGate for VM, go to dependency downloader utility. It
is located at
GG_HOME/opt/DependencyDownloader/
and locatesnowflake.sh.
- Execute
snowflake.sh
with the required version. You can check the version and reported vulnerabilities in https://mvnrepository.com/artifact/net.snowflake/snowflake-jdbc. This document uses 3.15.0 which is the latest version when this quickstart is published.Figure 7-1 Snowflake Ingestion - Install dependency
- A new directory gets created in
GG_HOME/opt/DependencyDownloader/dependencies
named as<snowflake_version>
. Make a note of this directory as it will be used in the replicat properties. For example:/u01/app/ogg/opt/DependencyDownloader/dependencies/snowflake_3.15.0
Figure 7-2 Snowflake jar new directory
7.2.3 Create a Credential Store Entry
To create a credential store entry for securing Snowflake username and password:
- In Administration Service, click DB Connections and then click Add DB Connection.
- Under Database, click Add Credential.
- Provide the following details:
- Credential Domain: Domain name of your choice
- Credential Alias: Alias name of your choice
- User ID: Snowflake Username
- Password: Snowflake Password
- Verify Password: Snowflake Password
Figure 7-3 Create a credential entry
7.2.4 Create a Replicat in Oracle GoldenGate for Distributed Applications and Analytics
To create a replicat in Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA):
- In the Oracle GoldenGate for Distributed Applications and Analytics
UI, in the Administration Service tab, click the + sign to add a
replicat.
Figure 7-4 Click + in the Administration Service tab.
- Select the Classic Replicat Replicat Type and click
Next There are two different Replicat types available: Classic and
Coordinated. Classic Replicat is a single threaded process whereas Coordinated
Replicat is a multithreaded one that applies transactions in parallel.
Figure 7-5 Add a Replicat
- Enter the basic information, and click Next:
- Trail Name: Name of the required trail file (if using
sample trail, provide as
tr
) - Subdirectory: Provide as
GG_HOME/opt/AdapterExamples/trail/
if using the sample trail. - Target: Snowflake Data Warehouse
Figure 7-6 Add Replicat - Replicat Options
- Trail Name: Name of the required trail file (if using
sample trail, provide as
- Leave Managed Options as is and click Next.
- Enter Parameter File details and click Next. In the
Parameter File, you can specify source to target mapping. If you are using the
sample trail file (tr), then enter as follows:
MAP QASOURCE.*, TARGET <your_schema_name>.*;
If Coordinated Replicat is selected as the Replicat Type, then an additional parameter needs to be provided:
TARGETDB LIBFILE libggjava.so SET property=<ggbd-deployment_home>/etc/conf/ogg/your_replicat_name.properties
Figure 7-7 Enter parameter details
- In the next screen, update the properties only tagged as
TO DO
and click Create. Edit your Snowflake JDBC URL:#TODO: Edit JDBC ConnectionUrl gg.eventhandler.snowflake.connectionURL=jdbc:snowflake://<account_name>.snowflakecomputing.com/?<connection_params>
Edit Snowflake username and password:#TODO: Edit JDBC user name with alias and domain from step 2. gg.eventhandler.snowflake.UserName=ORACLEWALLETUSERNAME [alias domain] #TODO: Edit JDBC password with alias and domain from step 2. gg.eventhandler.snowflake.Password=ORACLEWALLETPASSWORD [alias domain]
Provide path to dependency jar files that you downloaded in prerequisites:#TODO: Edit to include the GCS Java SDK and BQ Java SDK. gg.classpath=/path/to/Snowflake-dependencies/*
Figure 7-8 Properties File
- If replicat starts successfully, it will be in running state. You
can go to action/details/statistics to see the replication statistics.
Figure 7-9 Replicat Statistics
Figure 7-10 Replicat Table Statistics
- You can go to Snowflake console and check the tables. It may take a short while for tables to be created and loaded.
Note:
- You can run an initial load with Snowflake replicat. For more details, see Using OCI GoldenGate for Snowflake Initial Load and Real-time Data Sync
- For all Snowflake handler configuration details, see Snowflake.