22 Preparing the Database for Oracle GoldenGate — CDC Capture

Learn how to enable supplemental logging in the source database tables that are to be used for capture by the Extract for SQL Server and how to purge older CDC staging data.

You can learn more about CDC Capture with this Oracle By Example:

Using the Oracle GoldenGate for SQL Server CDC Capture Replication http://www.oracle.com/webfolder/technetwork/tutorials/obe/fmw/goldengate/12c/sql_cdcrep/sql_cdcrep.html.

Topics:

Enabling CDC Supplemental Logging

With the CDC Extract, the method of capturing change data is via SQL Server Change Data Capture tables, so it is imperative that you follow the procedures and requirements below, so that change data is correctly logged, maintained, and captured by Extract.

You will enable supplemental logging with the ADD TRANDATA command so that Extract can capture the information that is required to reconstruct transactions.

ADD TRANDATA must be issued for all tables that are to be captured by Oracle GoldenGate, and to do so requires that a valid schema be used in order to create the necessary Oracle GoldenGate tables and stored procedures.

Enabling supplemental logging for a CDC Extract does the following:

  • Enables SQL Server Change Data Capture at the database level, if it’s not already enabled.

    • EXECUTE sys.sp_cdc_enable_db

  • Creates a Change Data Capture staging table for each base table enabled with supplemental logging by running EXECUTE sys.sp_cdc_enable_table, and creates a trigger for each CDC table. The CDC table exists as part of the system tables within the database and has a naming convention like, cdc.OracleGG_basetableobjectid_CT.

  • Creates a tracking table of naming convention schema.OracleGGTranTables. This table is used to store transaction indicators for the CDC tables, and is populated when the trigger for a CDC table is fired. The table will be owned by the schema listed in the GLOBALS file’s, GGSCHEMA parameter.

  • Creates a unique fetch stored procedure for each CDC table, as well as several other stored procedures that are required for Extract to function. These stored procedures will be owned by the schema listed in the GLOBALS file’s, GGSCHEMA parameter.

  • Also, as part of enabling CDC for tables, SQL Server creates two jobs per database:

    cdc.dbname_capture

    cdc.dbname_cleanup

  • The CDC Capture job is the job that reads the SQL Server transaction log and populates the data into the CDC tables, and it is from those CDC tables that the Extract will capture the transactions. So it is of extreme importance that the CDC capture job be running at all times. This too requires that SQL Server Agent be set to run at all times and enabled to run automatically when SQL Server starts.

  • Important tuning information of the CDC Capture job can be found in CDC Capture Method Operational Considerations.

  • The CDC Cleanup job that is created by Microsoft does not have any dependencies on whether the Oracle GoldenGate Extract has captured data in the CDC tables or not. Therefore, extra steps need to be followed in order to disable or delete the CDC cleanup job immediately after TRANDATA is enabled, and to enable Oracle GoldenGate's own CDC cleanup job. See Retaining the CDC Table History Data for more information.

The following steps require a database user who is a member of the SQL Server System Administrators (sysadmin) role.

  1. In the source Oracle GoldenGate installation, ensure that a GLOBALS (all CAPS and no extension) file has been created with the parameter GGSCHEMA schemaname. Ensure that the schema name used has been created (CREATE SCHEMA schemaname) in the source database. This schema will be used by all subsequent Oracle GoldenGate components created in the database, therefore it is recommended to create a unique schema that is solely used by Oracle GoldenGate, such as ‘ogg’. It is recommended not to use the SQL Server schema cdc and to create a new schema specific to Oracle GoldenGate.
  2. On the source system, run GGSCI
  3. Issue the following command to log into the database:
    DBLOGIN SOURCEDB DSN [,{USERID user, PASSWORD password | USERIDALIAS alias}]

    Where:

    • SOURCEDB DSN is the name of the SQL Server data source.

    • USERID user is the database login and PASSWORD password is the password that is required if the data source connects via SQL Server authentication. Alternatively, USERIDALIAS alias is the alias for the credentials if they are stored in a credentials store. If using DBLOGIN with a DSN that is using Integrated Windows authentication, the connection to the database for the GGSCI session will be that of the user running GGSCI. In order to issue ADD TRANDATA or DELETE TRANDATA, this user must be a member of the SQL Server sysadmin server role.

  4. In GGSCI, issue the following command for each table that is, or will be, in the Extract configuration. You can use a wildcard to specify multiple table names.

    ADD TRANDATA owner.table

    ADD TRANDATA owner.*

    Optionally, you can designate the filegroup in which the SQL Server Change Data Capture staging tables will be placed, by using the FILEGROUP option with an existing filegroup name.

    ADD TRANDATA owner.table FILEGROUP cdctables

    See ADD TRANDATA

Purging CDC Staging Data

When enabling supplemental logging, data that is required by Extract to reconstruct transactions are stored in a series of SQL Server CDC system tables, as well Oracle GoldenGate objects that are used to track operations within a transaction. And as part of enabling supplemental logging, SQL Server will create its own Change Data Capture Cleanup job that runs nightly by default, and purges data older than 72 hours. The SQL Server CDC Cleanup job is unaware that an Extract may still require data from these CDC system tables and can remove that data before the Extract has a chance to capture it.

If data that Extract needs during processing has been deleted from the CDC system tables, then one of the following corrective actions might be required:

  • Alter Extract to capture from a later point in time for which CDC data is available (and accept possible data loss on the target).

  • Resynchronize the source and target tables, and then start the Oracle GoldenGate environment over again.

To remedy this situation, Oracle GoldenGate for SQL Server includes Oracle GoldenGate for SQL Server includes the ogg_cdc_cleanup_setup.bat program that is used to create an Oracle GoldenGate Cleanup job associated stored procedures and tables.

The Extract, upon startup, will expect, by default, that those Oracle GoldenGate Cleanup task objects exist and will stop if they do not. Extract will issue a warning if the SQL Server CDC Cleanup job exists alongside the Oracle GoldenGate Cleanup job.

The default checks by Extract for the Oracle GoldenGate CDC Cleanup task objects can be overwritten by using the TRANLOGOPTIONS NOMANAGECDCCLEANUP in the Extract, but this would only be recommended for development and testing purposes.

Use the following steps immediately after enabling supplemental logging and prior to starting the Extract, to create the Oracle GoldenGate CDC Cleanup job and associated objects. You can re-run these steps to re-enable this feature should any of the objects get manually deleted.

To create the Oracle GoldenGate CDC Cleanup job and objects:

The ogg_cdc_cleanup_setup file is located in the the home directory for Classic Architecture.

The script uses the Microsoft sqlcmd utility, so ensure that sqlcmd is installed on the system where Oracle GoldenGate is installed.

This requires an SQL Server authenticated database user who is a member of the SQL Server System Administrators (sysadmin) role. Windows authentication is not supported for the .bat script.

  1. Stop and disable the database’s SQL Server cdc.dbname_cleanup job from SQL Server Agent. Alternatively, you can drop it from the source database with the following command.

    EXECUTE sys.sp_cdc_drop_job 'cleanup'
  2. Run the ogg_cdc_cleanup_setup.bat file, providing the following variable values.

    For Windows:

    ogg_cdc_cleanup_setup.bat createJob userid password databasename servername\instancename schema

    For Linux:

    ./ogg_cdc_cleanup_setup.sh createJob userid password databasename "servername,port" schema

    In the preceding examples, USER ID and password should be a valid SQL Server login and password for a user, which has sysadmin rights. The databasename, servername\instancename, or servername, port, are the source database name, server, and instance, or server and TCP/IP port where SQL Server is running. If only the server name is listed, then the default instance will be connected to. The schema is the schema name listed in the GLOBALS file, with the GGSCHEMA parameter. This schema should be the same for all Oracle GoldenGate objects, including supplemental logging, checkpoint tables, heartbeat tables, and the Oracle GoldenGate CDC Cleanup job.

    For example:

    ogg_cdc_cleanup_setup.bat createJob ggsuser ggspword db1 server1\inst1 ogg
    When using server,port in the connection string, enclose the string in double quotes, for example:
    ogg_cdc_cleanup_setup.bat createJob login password source database "sql2016.samplestring.us-west-1.rds.amazonaws.com,1433" OGG schema name

The Oracle GoldenGate CDC Cleanup job when created, is scheduled to run every ten minutes, with a default retention period of seventy two hours. The job will not purge data for an Extract’s recovery checkpoint however, regardless of the retention period.

Additional information of the Oracle GoldenGate CDC Cleanup job can be found in CDC Capture Method Operational Considerations.

Enabling Bi-Directional Loop Detection

Loop detection is a requirement for bi-directional implementations of Oracle GoldenGate, so that an Extract for one source database does not recapture transactions sent by a Replicat from another source database.

With the CDC Extract capture method, by default, any transaction committed by a Replicat into a database where an Extract is configured, will recapture that transaction from the Replicat as long as supplemental logging is enabled for those tables that the Replicat is delivering to.

In order to ignore recapturing transactions that are applied by a Replicat, you must use the TRANLOGOPTIONS FILTERTABLE parameter for the CDC Extract. The table used as the filtering table will be the Oracle GoldenGate checkpoint table that you must create for the Replicat.

To create a Filter Table and enable Supplemental Logging:

The steps below require a database user who is a member of the SQL Server System Administrators (sysadmin) role.

  1. On the source system, run GGSCI

  2. Issue the following command to log into the database.

    DBLOGIN SOURCEDB DSN [,{USERID user, PASSWORD password | USERIDALIAS alias}]

    In the preceding example, the SOURCEDB DSN is the name of the SQL Server data source. The USERID user is the database login and PASSWORD password is the password that is required if the data source connects through SQL Server authentication. Alternatively, USERIDALIAS alias is the alias for the credentials if they are stored in a credentials store. If using DBLOGIN with a DSN that is using Integrated Windows authentication, the connection to the database for the GGSCI session is that of the user running GGSCI. In order to issue ADD TRANDATA or DELETE TRANDATA, this user must be a member of the SQL Server sysadmin server role.

  3. Create the Oracle GoldenGate checkpoint table that is used by the Replicat to deliver data to the source database.

    Example: ADD CHECKPOINTTABLE ogg.ggchkpt

    It is recommended that you use the same schema name as used in the GGSCHEMA parameter of the GLOBALS file.

  4. Enable supplemental logging for the newly created checkpoint table.

    Example: ADD TRANDATA ogg.ggchkpt

  5. Add the Replicat with the checkpoint table information.

    Example: ADD REPLICAT reptgt1, EXTTRAIL ./dirdat/e2,checkpointtable ogg.ggchkpt

  6. Configure the Extract with the IGNOREREPLICATES (on by default) and FILTERTABLE parameters, using the Replicat’s checkpoint table for the filtering table.

    TRANLOGOPTIONS IGNOREREPLICATES

    TRANLOGOPTIONS FILTERTABLE ogg.ggchkpt