9.3 Snowflake

The Snowflake Replicat delivers changed data from Oracle GoldenGate trail files into Snowflake target tables.

This chapter describes Oracle GoldenGate for Snowflake Replication Best Practices.

9.3.1 Introduction

Oracle GoldenGate for Distributed Applications and Analytics (GoldenGate for DAA) supports replication into Snowflake, enabling changed data from GoldenGate trail files to be delivered into Snowflake target tables. Snowflake can run on AWS, Microsoft Azure, or Google Cloud Platform, and GoldenGate provides two replication options depending on the workload and latency requirements.

The first option is the Snowflake Stage and Merge HandlerThis is the general-purpose approach for Snowflake replication and supports workloads that include inserts, updates, deletes, and other change operations. GoldenGate formats trail records into staged files, places them in a staging location, and then applies the changes to Snowflake target tables using SQL operations such as COPY INTO and MERGE. The staging location can be a Snowflake internal stage or an external object store such as Amazon S3, Azure Data Lake Storage Gen2, or Google Cloud Storage.

The second option is the Snowflake Streaming Handler. This handler uses the Snowpipe Streaming API to load rows directly into Snowflake with lower latency and without an intermediate staging area. It is designed for insert-only workloads. If the replication use case includes updates or deletes, the Stage and Merge Handler is usually the better fit unless those operations are intentionally converted into insert records.

At a high level, use Stage and Merge when you need full change data capture behavior and broader workload support. Use Streaming when the workload is insert-only and the priority is lower-latency delivery into Snowflake.

Table 9-6 Differences between Stage and Merge, and Streaming Handlers

Details Stage and Merge Streaming Handler
Workload Any (insert, update, delete, truncate) Inserts-only
Latency Micro-batch Real-time
Snowflake Compute Customer-managed virtual warehouse Snowflake-managed serverless compute
API JDBC+SQL commands (PUT/COPY INTO/MERGE) Snowpipe Streaming using snowflake-ingest-sdk
Load method Requires intermediary loading step (internal or external stage). Direct loading into Snowflake

9.3.2 Architecture Overview

Oracle GoldenGate supports exactly-once semantics for both Snowflake Stage and Merge and Snowflake Streaming replication. For Stage and Merge, Oracle GoldenGate uses checkpointed micro-batch processing and merge application to recover without duplicating committed work. For Snowflake Streaming, Oracle GoldenGate uses Snowpipe Streaming for supported insert-only workloads, where offset-based recovery preserves exactly-once ingestion semantics.

Snowflake Stage and Merge Replication:

The Snowflake Stage and Merge Handler is the primary GoldenGate for DAA architecture for replicating full change data capture workloads into Snowflake. It supports insert, update, delete, and truncate operations — including key changes — by reading Oracle GoldenGate trail records, staging changed data in micro-batches, and then applying those changes to Snowflake target tables through merge processing.

Before data is staged, Oracle GoldenGate performs operation aggregation. Multiple operations on the same source record can be compressed into the final effective operation to reduce unnecessary work and improve apply performance. For example, an insert followed by several updates on the same row can be collapsed into a single insert containing the final row image. The operation aggregation window is 3 minutes by default and can be configured based on latency, memory, and throughput requirements.

The staged data is written as Avro files to either a Snowflake internal stage or an external cloud object store such as Amazon S3, Azure Data Lake Storage Gen2, or Google Cloud Storage. Staged files are rolled — that is, closed and made available for processing — on a configurable micro-batch interval. For external stages, Oracle GoldenGate creates external tables over the staged Avro files so Snowflake can read the staged records before applying them to the target tables.

The high-level flow is:

  1. Oracle GoldenGate Replicat reads changed records from trail files.
  2. Operation aggregation combines multiple changes on the same record where possible.
  3. Changed records are staged as Avro files in an internal or external stage.
  4. For external stages, external tables are created over the staged files.
  5. Snowflake merge operations apply the staged changes to the target tables.

This architecture is best suited for general-purpose Snowflake replication where source changes must preserve normal CDC semantics, including updates and deletes.

Snowflake Streaming

The Snowflake Streaming Handler provides a lower-latency architecture for insert-only replication into Snowflake. Instead of writing staged files and later merging them into target tables, Oracle GoldenGate reads records from the trail and streams them directly to Snowflake using the Snowpipe Streaming API.

Because this flow loads rows directly into Snowflake, it removes the intermediate staging step used by Stage and Merge. This makes it a better fit for append-only workloads where near real-time delivery is more important than full update/delete merge semantics.

For supported insert-only workloads, Oracle GoldenGate Snowflake Streaming supports exactly-once semantics by using Snowpipe Streaming and offset-based recovery behavior during retry and recovery scenarios.

The high-level flow is:

  1. Oracle GoldenGate Replicat reads records from trail files.
  2. The Snowflake Streaming Handler sends rows through the Snowpipe Streaming API.
  3. Snowflake ingests the records directly into the target table.

This architecture is best suited for insert-only or append-only workloads where near real-time latency is the priority.

9.3.3 Configuration Considerations and Best Practices

The configuration of Oracle GoldenGate Snowflake replication involves two core elements:

  • Configuration Properties: Key-value settings that govern how data is formatted, mapped, staged, and delivered from GoldenGate to Snowflake tables. These control the Snowflake JDBC connection, the number of concurrent JDBC connections to the target Snowflake database, the operation aggregation window, connection retry behavior, merge behavior (SQL MERGE or delete-and-insert processing), and the Snowflake table type (native Snowflake tables or Iceberg tables).
  • Parameters Configuration: Defines source-to-target table mappings and applies Oracle GoldenGate parameters for data reformatting use cases such as column filtering, token substitution, and SQL conversion.
Careful coordination between these two elements is essential. For example, a mismatch between a column mapping defined in the parameters file and the actual target table schema can cause apply errors at merge time that are difficult to diagnose without clear logging.

9.3.3.1 Snowflake Session and Connection Stability

Oracle GoldenGate uses Snowflake JDBC for connections. This connection might be interrupted for several reasons, and it can be controlled Snowflake JDBC options and replicat properties.

Implementation and Configuration

On the Snowflake JDBC level, it is recommended to useCLIENT_SESSION_KEEP_ALIVE parameter to keep the session active even if there is no activity. In the Snowflake JDBC string, it should be set to true. For example:
jdbc:snowflake://<account>.snowflakecomputing.com/?warehouse=<warehouse>&db=<database>&schema=<schema>&CLIENT_SESSION_KEEP_ALIVE=true

For Snowflake Stage and Merge replication, optionally you can set configuration properties for retries.

gg.eventhandler.snowflake.connectionRetries property allows you to set the number of times connections to the target data warehouse will be retried. gg.eventhandler.snowflake.connectionRetryIntervalSeconds. connectionRetries sets the delay in seconds between connection retry attempts.

For example:

gg.eventhandler.snowflake.connectionRetries=5
gg.eventhandler.snowflake.connectionRetryIntervalSeconds=30

Considerations and Best Practices

  • Set CLIENT_SESSION_KEEP_ALIVE=true in the Snowflake JDBC connection string. GoldenGate Replicat processes can run for long periods with limited Snowflake activity, especially during low-volume periods or while the operation aggregation buffer is filling. This setting helps keep the Snowflake JDBC session active and reduces the risk of session timeout before the next apply operation.
  • Configure connection retries for production deployments: Transient Snowflake connectivity issues, such as warehouse scaling events, brief network interruptions, or cloud platform maintenance, can cause temporary connection failures that do not require operator intervention. A starting point of 5 retries with a 30-second interval is appropriate for most environments, but these values should be adjusted based on observed Snowflake warehouse restart times and network stability.
  • Monitor GoldenGate Replicat reports, logs, and operational alerts for Snowflake connection errors and retry attempts: Repeated retries may indicate a persistent network, warehouse, authentication, or session-management issue. Configure alerting for Replicat abends so connectivity problems are detected quickly and do not silently increase replication lag.

9.3.3.2 Snowflake Authentication

Oracle GoldenGate supports both username-password and key-pair authentication for Snowflake Stage and Merge replication. For Snowflake Streaming replication, key-pair authentication is required.

Oracle GoldenGate supports both username-password and key-pair authentication for Snowflake Stage and Merge replication. For Snowflake Streaming replication, key-pair authentication is required.

Implementation and Configuration

Username-Password Authentication

Username-password authentication is supported for Snowflake Stage and Merge replication only. It is suitable for development and testing, but key-pair authentication is recommended for production deployments.

The Snowflake username and password are configured using the following Replicat properties:

gg.eventhandler.snowflake.UserName=<snowflake_username>
gg.eventhandler.snowflake.Password=<snowflake_password>

Never store credentials in plaintext in the Replicat properties file. Instead, store them securely in the Using Identities in Oracle GoldenGate Credential Store. You can create the Credential Store entry from the Oracle GoldenGate UI by navigating to Administration Service , DB Connections and then Add DB Connection or by using the Oracle GoldenGate Rest API.

When creating the entry, provide a Credential Domain and Credential Alias, enter the Snowflake username as the User ID, and the Snowflake password as the Password. Oracle GoldenGate stores these values securely.

Once stored, reference the credentials in the Replicat properties file using the domain and alias you defined:

gg.eventhandler.snowflake.UserName=ORACLEWALLETUSERNAME[alias domain_name]
gg.eventhandler.snowflake.Password=ORACLEWALLETPASSWORD[alias domain_name]

Where <alias> is the Credential Alias and <domain> is the Credential Domain you assigned when creating the entry.

Key-Pair Authentication

Key-pair authentication is supported for both Snowflake Stage and Merge and Snowflake Streaming replication and is the recommended authentication method for production deployments.

Oracle GoldenGate uses Snowflake-recommended PKCS#8 .p8 private keys. For supported cryptographic key types and algorithms, refer to the Snowflake key-pair authentication documentation.

For Snowflake Stage and Merge replication, key-pair authentication is configured in the Snowflake JDBC connection string. The Snowflake user and private-key file password can be referenced securely from the Using Identities in Oracle GoldenGate Credential Store. You can create the Credential Store entry from the Oracle GoldenGate UI by navigating to Administration Service , DB Connections and then Add DB Connection or by using the Oracle GoldenGate Rest API.

When creating the entry, provide a Credential Domain and Credential Alias. Enter the user as the User ID, and the private_key file password as the Password. Oracle GoldenGate stores these values securely.

Once stored, reference the credentials in the In Snowflake JDBC string:
jdbc:snowflake://<account_name>.snowflakecomputing.com/?warehouse=<warehouse-name>&db=<database-name>&role=<role>&
user=ORACLEWALLETUSERNAME[alias domain]
&private_key_file=/path/to/key/file/rsa_key.p8&
private_key_file_pwd=ORACLEWALLETPASSWORD[alias domain]

For more information on the configuration steps, see,Snowflake Key Pair Authentication and Snowflake Streaming Handler Key Pair Authentication.

Considerations and Best Practices

  • Use key-pair authentication for production deployments: Username-password authentication should be reserved for development and testing environments. Key-pair authentication eliminates password expiry as a source of replication outages and better aligns with enterprise security policies.
  • Restrict private-key file permissions: The private-key file should be readable only by the OS user running the GoldenGate Replicat process.
  • Never store plaintext credentials in the Replicat properties file: Use the Oracle GoldenGate Credential Store for username-password authentication. For key-pair authentication, ensure the referenced private-key file has restricted OS-level permissions.
  • Verify JDBC driver compatibility: Many key-pair authentication issues stem from JDBC driver version mismatches or unsupported key algorithms. Confirm that the Snowflake JDBC driver bundled with your GoldenGate installation supports the key type and algorithm before deploying to production.
  • Plan for zero-downtime ,Snowflake key rotation: Snowflake allows a secondary public key to be assigned to a user, enabling key rotation without interrupting replication. Register the new public key as the secondary key in Snowflake, update Oracle GoldenGate to reference the new private key, confirm replication, and then remove the old public key from Snowflake.

9.3.3.3 Precise Instantiation for Snowflake

Oracle GoldenGate supports precise instantiation for databases that supportOracle GoldenGate Initial Load Extract. For Snowflake targets, precise instantiation enables a seamless transition from initial load to CDC replication without downtime or data loss. The configuration depends on the source database type. This section covers Oracle Database as the source; for other supported source databases, contact an Oracle engineer or Oracle Support.

Implementation and Configuration

Oracle GoldenGate Snowflake Replicat supports automated table creation in Snowflake. During startup, the Replicat checks whether the target table exists. If the target table does not exist, Oracle GoldenGate automatically creates it using the source table definition and the default Oracle-to-Snowflake data type mapping. If the target Snowflake table already exists, the Replicat uses the existing target table definition and data mapping.

Precise instantiation from Oracle Database to Snowflake can be configured with the following steps using the Oracle GoldenGate UI with the following steps:

Create and run the Initial Load Extract using SQLPREDICATE “AS OF SCN” in the parameter file. For example:

EXTRACT <process_name> EXTFILE <trail_name> USERIDALIAS <alias> DOMAIN <domain>
TABLE <schema>.*; SQLPREDICATE “AS OF SCN <scn_value>
  • Create a Distribution Path for the Initial Load Extract.
  • Create and run the Initial Load Replicat for Snowflake.
    • Initial Load Replicat should use the trail file generated by the Initial Load Extract.
    • Use INSERTALLRECORDS Support to improve initial load performance.
  • Create the CDC Replicat for Snowflake using the trail file generated by CDC Extract.

    Create the CDC Replicat, but do not start it yet

  • Start the CDC replicat using Start with Options:
    • Start Position: After SCN
    • CSN: The SCN value used in Initial Load Extract

Considerations and Best Practices

  • Run the CDC Extract before the Initial Load Extract: The CDC Extract must be capturing changes before the Initial Load begins. This ensures no changes made during the initial load window are missed. Starting the CDC Extract after the initial load risks a gap between the instantiation SCN and the point where CDC begins, leading to data loss or requiring a restart.
  • Validate the instantiation SCN before proceeding: Before creating the Initial Load Extract, confirm that the SCN retrieved from SQL*Plus is usable for flashback query and that sufficient undo retention is available for the expected duration of the initial load. The Initial Load Extract uses this SCN to read a consistent point-in-time image of the source data, and the CDC Replicat must later be started from the same value to avoid gaps or duplicates. For more information, see Refer to Oracle GoldenGate documentationfor details.
  • Do not start the CDC Replicat until the Initial Load Replicat has finished: Starting the CDC Replicat before the Initial Load Replicat completes can apply change records before the corresponding base rows are loaded, which may cause missing-row conditions or incorrect target state. Monitor the Initial Load Replicat until it reaches end of trail before starting CDC.
  • Use automated table creation with caution: The auto-create feature is convenient but relies on Oracle-to-Snowflake default type mappings, which may not match your target schema requirements, for example, precision on numeric types, VARCHAR sizing, or timestamp handling. If your Snowflake tables have been pre-created with custom definitions, ensure the Replicat is pointed at an existing table and validate the column mapping before starting replication.
  • Document and retain the SCN value used for each instantiation: The SCN is the key reference point linking the initial load to CDC. Store it alongside the instantiation date, source table scope, and Oracle GoldenGate process names involved. This record is essential for troubleshooting, auditing, or repeating the instantiation for additional tables at a later stage.

9.3.4 Snowflake Schema Change Handling

Oracle GoldenGate Snowflake Streaming Replicat and Snowflake Stage Merge Replicat support propagation of source schema changes to target Snowflake tables with Automatic Schema Evolution functionality.

Automatic schema evolution is the capability to detect, manage, and apply changes to a database schema — such as tables, columns, indexes, or relationships — without requiring manual intervention.

Implementation and Configuration

Automatic Schema Evolution for Snowflake automatically creates and evolves Snowflake target tables based on the source database schema when AUTOSCHEMA is enabled.

When AUTOSCHEMA is enabled, Oracle GoldenGate automatically creates the target Snowflake using the source table definition and the default Oracle-to-Snowflake data type mapping. If your Snowflake tables have been pre-created with custom definitions, Oracle GoldenGate Snowflake Replicat will use the target Snowflake table definition for mapping.

As an alternative to automatic schema evolution, source schema changes can be monitored by enabling DDL capture in Extract and using DDL with EVENTACTIONS to report the DDL event, stop the process, or trigger an operational response. For example, DDL INCLUDE ALL EVENTACTIONS (REPORT, STOP) captures source DDL, writes it to the report file, and stops the process so the target schema can be reviewed before replication continues.

Considerations and Best Practices

  • Enable Automatic Schema Evolution carefully because schema changes can affect downstream applications: Source changes such as adding, renaming, or dropping columns may be propagated to Snowflake without manual review. Downstream consumers such as BI tools, dbt models, and data pipelines may fail if a column they depend on is changed or removed.
  • Use DDL with `EVENTACTIONS` as a controlled alternative to AUTOSCHEMA for production environments.
  • Align schema change procedures with Snowflake governance policies: In environments where schemas are managed through a data catalog, schema registry, or infrastructure-as-code process, automatically propagated schema changes may create drift from the governed definition.

9.3.5 LOB Type Replication

Oracle GoldenGate supports replication of Large Object (LOB) columns, including BLOB, CLOB, and NCLOB, from Oracle Database to Snowflake targets.

Implementation and Configuration

For LOB column replication to Snowflake, Oracle GoldenGateIntegrated Extract must be configured with TRANLOGOPTIONS FETCHPARTIALLOB in the Extract parameter file. Oracle GoldenGate Snowflake Replicat does not support partial LOB data. Without this option, transactions containing LOB columns that were not fully captured in the redo log can cause the Replicat to abend.

LOB columns that exceed the inline LOB threshold are processed out-of-band, which can lead to Replicat performance degradation. The gg.maxInlineLobSize property controls the maximum size, in bytes, of LOB data processed inline. The default value is 16000 bytes. Increasing this value allows larger LOB column updates to be processed within the standard micro-batch flow rather than individually out-of-band.

Before setting this value, check the trail files containing LOB data or review the source table definitions to determine the actual maximum LOB size in your dataset. For example, to process LOBs up to 24 MB inline: gg.maxInlineLobSize=24000000

For details about default Oracle LOB type mappings to Snowflake types, refer to Oracle-to-Snowflake data type mapping documentation.

Considerations and Best Practices

  • Always configure Oracle Extract with TRANLOGOPTIONS FETCHPARTIALLOB when replicating LOB columns.
  • Measure actual LOB sizes before configuring the Snowflake replicat with LOB type and tune it using gg.maxInlineLobSize: The inline LOB buffer is allocated per Replicat process, and setting it significantly higher than the actual maximum LOB size in your dataset wastes memory. Query the source table or inspect trail file statistics to determine the realistic maximum LOB size, then set gg.maxInlineLobSize to match or slightly exceed that value.
  • Validate target Snowflake column types for LOB data before go-live: BLOB data maps to BINARY(67108864) and CLOB/NCLOB map to STRING(134217728) by default. Verify that downstream consumers of the Snowflake table can handle these types correctly.
  • Exclude LOB columns from replication if they are not required at the target. LOB replication adds overhead to both the Extract (due to out-of-row fetching) and the Replicat (due to increased trail file size and apply time). If downstream consumers do not require LOB data, useCOLSEXCEPT in the Exract parameter file to exclude LOB columns from replication.
  • Consider whether LOB-heavy tables warrant a dedicated Extract and Replicat process to isolate their impact on the broader replication pipeline.

9.3.6 Snowflake Apache Iceberg Tables

Oracle GoldenGate Snowflake Streaming Replicat and Snowflake Stage Merge Replicat support Snowflake managed Apache Iceberg table type. Snowflake Open Catalog is supported Oracle GoldenGate Apache Iceberg Rest Catalog Replicat.

Implementation and Configuration

Snowflake Managed Iceberg Tables

For Snowflake-managed Apache Iceberg tables, the tableType property must be set to iceberg.

For example, in the Snowflake Stage and Merge Replicat:

gg.eventhandler.snowflake.tableType=iceberg

When tableType is set to iceberg, the following additional properties are required:

  • icebergExternalVolume: The name of the external volume that connects Snowflake to your external cloud storage for Iceberg tables. The external volume must be configured in Snowflake before any Iceberg tables can be created.
  • icebergBaseLocation: The base path in external storage where Iceberg table data and metadata will be written. Oracle GoldenGate appends the qualified table name to this path when creating each Iceberg table.

Example:

gg.eventhandler.snowflake.tableType=iceberg
 gg.eventhandler.snowflake.icebergExternalVolume=<external_volume_name>
 gg.eventhandler.snowflake.icebergBaseLocation=<base_location_path>

Snowflake Open Catalog

For replicating Apache Iceberg tables managed by Snowflake Open Catalog, use Compressed Update Handling. Snowflake Open Catalog does not require storage authentication at the Replicat level.

Configure the following properties in the Replicat properties file:

gg.eventhandler.iceberg.restCatalogUri=<open_catalog_rest_uri>

gg.eventhandler.iceberg.restCatalogProperties=/path/to/opencat.properties 
gg.classpath=/path/to/required/dependencies/*
Create a catalog.properties file (for example, opencat.properties) to be referenced in the Replicat properties file:
type=rest warehouse=<warehouse_name> 
credential=secret:key scope=PRINCIPAL_ROLE:ALL

For enhanced security, secure the catalog properties file using the encryption capabilities supported by Oracle GoldenGate Apache Iceberg Rest Catalog Replicat.

Considerations and Best Practices

  • Configure and validate the external volume before starting the Replicat: If the external volume is missing or misconfigured, the Replicat will fail to create Iceberg tables during startup. Validate the external volume in Snowflake using SHOW EXTERNAL VOLUMES and confirm that Snowflake has the necessary permissions to read and write to the underlying cloud storage location before running the Replicat for the first time.
  • Set icebergBaseLocation carefully — it cannot be changed after tables are created. Oracle GoldenGate appends the qualified table name to the base location path to construct the storage path for each Iceberg table. Once an Iceberg table has been created at a given path, changing the base location requires dropping and recreating the table and performing a full re-instantiation of the affected tables. Choose a base location path that is consistent with your cloud storage naming conventions and organisation structure before go-live.
  • For Open Catalog deployments, secure the catalog.properties file: Restrict file permissions so the file is readable only by the operating system user running the Replicat process. If the file contains catalog credentials, use Oracle GoldenGate-supported encryption or secret-management controls.
  • Monitor Iceberg metadata and snapshot growth: Iceberg tables generate metadata and snapshot files over time. Establish a maintenance strategy with the Snowflake administration team to manage metadata growth, snapshot retention, and storage costs for Oracle GoldenGate-replicated Iceberg tables

9.3.7 Handling PK Updates for Snowflake Stage and Merge Replicat

In Snowflake Stage and Merge Replicat, primary key updates require special attention because the key value used to identify the target row changes during the operation.

For primary key update operations, if the trail record is compressed or does not contain all required column values, Oracle GoldenGate may need to apply the PK update as a singleton operation — a row-by-row update applied outside the MERGE batch. This bypasses operation aggregation and significantly impacts performance in PK update-heavy workloads.

Implementation and Configuration

For uncompressed update records, configure the source extract usingLOGALLSUPCOLS for Oracle Database extracts andCOMPRESSUPDATES | NOCOMPRESSUPDATES for Non-Oracle Database extracts.

In the Snowflake Stage and Merge replicat properties, set the following properties:

gg.compressed.update=false
gg.validate.keyupdate=true

These settings will allow replicat to aggregate PK update operations and improve the performance, especially for PK update heavy workloads.

For workloads with frequent primary key updates, consider enabling SQL-based operation aggregation when the trail contains uncompressed update records. SQL Aggregation allows Oracle GoldenGate to aggregate operations using SQL before the Snowflake MERGE, which can improve throughput and reduce Replicat JVM memory pressure. To enable SQL based operation aggregation, set the following property:
gg.aggregate.operations.using.sql=true

Considerations and Best Practices

  • Ensure the trail format has uncompressed operations before starting the Replicat: If gg.compressed.update=false is set but the trail contains compressed updates, the Replicat will encounter column mapping mismatches at apply time. Validate the trail format using logdump before changing this property in a running deployment.
  • Enable gg.validate.keyupdate=true whenever PK updates are present in the source workload: Some source applications generate PK update operations where the key value does not actually change — for example, as a result of application-level row refresh logic. Without this property, these false PK updates are treated as genuine key changes and cannot be aggregated into the MERGE batch, degrading performance unnecessarily. This property corrects the operation type before aggregation, improving both accuracy and throughput.
  • Assess the source workload for PK update frequency before tuning: PK updates are relatively rare in most OLTP workloads. If PK updates represent a small fraction of total DML, the performance impact of singleton fallback may be negligible and the overhead of changing the trail format (re-running Extract with LOGALLSUPCOLS or NOCOMPRESSUPDATES) may not be justified. Measure the proportion of PK update operations in the trail using logdump or GoldenGate statistics before committing to uncompressed trail configuration.

9.3.8 Oracle GoldenGate Snowflake Performance Considerations

Oracle GoldenGate Snowflake replication performance depends on the selected delivery mode, Snowflake warehouse capacity, target table design, micro-batch sizing, connection parallelism, workload characteristics, and Oracle GoldenGate/Snowflake JDBC driver versions. Most performance tuning requirements are related to frequent small batches, stage-file backlog, throttling, LOB processing, insufficient JDBC connection concurrency, or metadata-query overhead in older software versions.

Implementation and Configuration

Snowflake Stage and Merge Micro-batch window control

Oracle GoldenGate Snowflake Stage and Merge Replicat is a micro-batch process. Change data from Oracle GoldenGate trail files is staged in micro-batches at a temporary staging location, either an internal Snowflake stage or an external cloud object store such as S3, ADLS Gen2, or GCS. The staged data is then merged into Snowflake target tables using SQL MERGE statements.

The micro-batch window is controlled by the gg.handler.snowflake.fileRollInterval property. The default value is 3 minutes.

For high-volume replication, increasing the micro-batch window is recommended. Small staged files result in many small Snowflake MERGE statements, which can increase both latency and cost because each MERGE may scan more data than the number of changed rows. A larger fileRollInterval allows Stage and Merge Replicat to aggregate more operations per batch, resulting in fewer and more efficient MERGE operations.

For example, to set a 15-minute micro-batch window:
gg.handler.snowflake.fileRollInterval=15m

As part of the micro-batch process, GoldenGate Snowflake Stage and Merge Replicat runs operation aggregation. Operation aggregation reduces the number of rows written to the stage file and merged into Snowflake by collapsing multiple operations on the same row within a micro-batch into a single net operation. For example, multiple updates to the same row within the batch window are collapsed into one operation, reducing the number of rows in the staged file and the number of rows processed by the MERGE statement.

Operation aggregation requires additional JVM heap allocation. Configure the JVM heap size in the replicat properties file to accommodate the expected volume of in-flight operations within the micro-batch window.

For example:

jvm.bootoptions=-Xmx8g -Xms8g

The appropriate heap size depends on the number of tables, the volume of changes per micro-batch, and the average row size. Monitor JVM garbage collection behavior after tuning, because GC pauses during micro-batch aggregation can appear as apply-lag spikes.

Considerations and Best Practices

  • Choose the appropriate Snowflake delivery mode that works best for your requirement : Use Snowflake Stage and Merge replication for mix or update intense workloads. For low-latency and insert only requirements, evaluate Snowflake Streaming replication when the workload and supported operations fit the use case.
  • Tune fileRollInterval based on workload volume and latency requirements to avoid frequent small MERGE operations.: The default 3-minute interval is a starting point, not a production target. For high-volume workloads, a longer interval — 10 to 30 minutes — produces larger, fewer stage files and more efficient MERGE operations. For low-latency requirements, a shorter interval reduces end-to-end replication lag but increases the number of MERGE statements and Snowflake warehouse compute consumption. Balance the two based on your SLA and Snowflake cost targets. Avoid values below 30 seconds in production — very short flush intervals generate excessive small Snowflake transactions, which can paradoxically increase latency under load.
  • Parallelize large initial loads by splitting the workload across multiple Initial Load Extract and Replicat processes.: This is particularly effective for large schemas or tables with high row counts. For configuration details, refer to the Precise Instantiation section.
  • Tune LOB handling intentionally: If your replicat includes LOB types, tune Oracle GoldenGate Snowflake Replicat to handle LOB type replication more efficiently. For more details, see LOB Type Replication.
  • Measure end-to-end latency. It is possible to measure the end-to-end latency between when a transaction was committed on the source and when it was committed on the target is a common requirement using extra columns on the target to store the operation type performed by GoldenGate, the source commit timestamp from the trail, and the target commit timestamp using the CURRENT_TIMESTAMP() function as its default value. For configuration details and sample codes, refer to measuring-the-endtoend-lag-using-the-snowflake-handler.
  • Handle PK updates carefully: For PK update-heavy workloads, always use uncompressed update records and SQL-based operation aggregation to enable efficient MERGE processing and avoid singleton row-by-row apply. For configuration information, see Handling PK Updates for Snowflake Stage and Merge Replicat.
  • Co-locate the Oracle GoldenGate Snowflake Replicat process and target Snowflake instance in the same cloud provider and in the same region: Co-locating Oracle GoldenGate Snowflake Replicat and the Snowflake instance in the same cloud provider and in the same region will optimize network latency.
  • Upgrade to GoldenGate for DAA 26ai to take advantage of optimized aggregation: The redesigned aggregation engine in 26ai delivers better throughput with lower resource consumption and enables larger batch windows without proportionally increasing JVM memory requirements.