13 Preparing and Configuring Your System for Oracle GoldenGate

Learn about how to prepare your system for running Oracle GoldenGate and how to configure it with your MySQL database.

Topics:

13.1 Ensuring Data Availability

Retain enough binary log data so that if you stop Extract or there is an unplanned outage, Extract can start again from its checkpoints. Extract must have access to the binary log that contains the start of the oldest uncommitted unit of work, and all binary logs thereafter. The recommended retention period is at least 24 hours worth of transaction data, including both active and archived information. You might need to do some testing to determine the best retention time given your data volume and business requirements.

If data that Extract needs during processing was not retained, either in active or backup logs, one of the following corrective actions might be required:

  • Alter Extract to capture from a later point in time for which binary log data is available (and accept possible data loss on the target).

  • Resynchronize the source and target tables, and then start the Oracle GoldenGate environment over again.

To determine where the Extract checkpoints are, use the INFO EXTRACT command. For more information, see INFO EXTRACT in Reference for Oracle GoldenGate

13.2 Setting Logging Parameters

To capture from the MySQL transaction logs, the Oracle GoldenGate Extract process must be able to find the index file. index file in turn contains the paths of all binary log files.

Note:

Extract expects that all of the table columns are in the binary log. As a result, only binlog_row_image set as full is supported and this is the default. Other values of binlog_row_image are not supported.

Extract checks the following parameter settings to get this index file path:

  1. Extract TRANLOGOPTIONS parameter with the ALTLOGDEST option: If this parameter specifies a location for the log index file, Extract accepts this location over any default that is specified in the MySQL Server configuration file. When ALTLOGDEST is used, the binary log index file must also be stored in the specified directory. This parameter should be used if the MySQL configuration file does not specify the full index file path name, specifies an incorrect location, or if there are multiple installations of MySQL on the same machine

    To specify the index file path with TRANLOGICOPTIONS with ALTLOGDEST, use the following command format on Windows:

    TRANLOGOPTIONS ALTLOGDEST "C:\\Program Files\\MySQL\\logs\\binlog.index"
    

    On Linux, use this format:

    TRANLOGOPTIONS ALTLOGDEST "/mnt/rdbms/mysql/data/logs/binlog.index" 
    
  2. The MySQL Server configuration file: The configuration file stores default startup options for the MySQL server and clients. On Windows, the name of the configuration file is my.ini. On other platforms, it is my.cnf. In the absence of TRANLOGOPTIONS with ALTLOGDEST, Extract gets information about the location of the log files from the configuration file; however, even with ALTLOGDEST, these Extract parameters must be set correctly:
    • binlog-ignore-db=oggddl: This prevents DDL logging history table entries in the binlog and is set in the my.cnf or my.ini file. This is required only if DDL replication is enabled.

    • log-bin: This parameter is used to enable binary logging. This parameter also specifies the location of the binary log index file and is a required parameter for Oracle GoldenGate, even if ALTLOGDEST is used. If log-bin is not specified, binary logging will be disabled and Extract returns an error.

    • log-bin-index: This parameter specifies the location of the binary log index. If it is not used, Extract assumes that the index file is in the same location as the log files. If this parameter is used and specifies a different directory from the one that contains the binary logs, the binary logs must not be moved once Extract is started.

    • binlog_format: This parameter sets the format of the logs. It must be set to the value of ROW, which directs the database to log DML statements in binary format. Any other log format (MIXED or STATEMENT) causes Extract to abend.

      Note:

      MySQL binary logging does not allow logging to be enabled or disabled for specific tables. It applies globally to all tables in the database.

    To locate the configuration file, Extract checks the MYSQL_HOME environment variable: If MYSQL_HOME is set, Extract uses the configuration file in the specified directory. If MYSQL_HOME is not set, Extract queries the information_schema.global_variables table to determine the MySQL installation directory. If a configuration file exists in that directory, Extract uses it.

13.3 Adding Host Names

Oracle GoldenGate gets the name of the database it is supposed to connect to from the SOURCEDB parameter. A successful connection depends on the localhost entry being properly configured in the system host file. To avoid issues that arise from improper local host configuration, you can use SOURCEDB in the following format:

SOURCEDB database_name@host_name

Where: database_name is the name of the MySQL instance, and host_name is the name or IP address of the local host. If using an unqualified host name, that name must be properly configured in the DNS database. Otherwise, use the fully qualified host name, for example myhost.company.com.

13.4 Setting the Session Character Set

The GGSCI, Extract and Replicat processes use a session character set when connecting to the database. For MySQL, the session character set is taken from the SESSIONCHARSET option of SOURCEDB and TARGETDB. Make certain you specify a session character set in one of these ways when you configure Oracle GoldenGate.

13.5 Preparing Tables for Processing

This section describes how to prepare the tables for processing. Table preparation requires these tasks:

13.5.1 Assigning Row Identifiers

Oracle GoldenGate requires some form of unique row identifier on the source and target tables to locate the correct target rows for replicated updates and deletes.

13.5.1.1 How Oracle GoldenGate Determines the Kind of Row Identifier to Use

Unless a KEYCOLS clause is used in the TABLE or MAP statement, Oracle GoldenGate selects a row identifier to use in the following order of priority:

  1. Primary key
  2. First unique key alphanumerically that does not contain a timestamp or non-materialized computed column.
  3. If none of the preceding key types exist (even though there might be other types of keys defined on the table) Oracle GoldenGate constructs a pseudo key of all columns that the database allows to be used in a unique key, excluding those that are not supported by Oracle GoldenGate in a key or those that are excluded from the Oracle GoldenGate configuration.

    Note:

    If there are other, non-usable keys on a table or if there are no keys at all on the table, Oracle GoldenGate logs an appropriate message to the report file. Constructing a key from all of the columns impedes the performance of Oracle GoldenGate on the source system. On the target, this key causes Replicat to use a larger, less efficient WHERE clause.

13.5.1.2 Tables with a Primary Key Derived from a Unique Index

In the absence of a primary key on a table, MySQL will promote a unique index to primary key if the indexed column is NOT NULL. If there are more than one of these not-null indexes, the first one that was created becomes the primary key. To avoid Replicat errors, create these indexes in the same order on the source and target tables.

For example, assume that source and target tables named ggvam.emp each have columns named first, middle, and last, and all are defined as NOT NULL. If you create unique indexes in the following order, Oracle GoldenGate will abend on the target because the table definitions do not match.

Source:

mysql> create unique index uq1 on ggvam.emp(first); 
mysql> create unique index uq2 on ggvam.emp(middle); 
mysql> create unique index uq3 on ggvam.emp(last); 

Target:

mysql> create unique index uq1 on ggvam.emp(last); 
mysql> create unique index uq2 on ggvam.emp(first); 
mysql> create unique index uq3 on ggvam.emp(middle); 

The result of this sequence is that MySQL promotes the index on the source "first" column to primary key, and it promotes the index on the target "last" column to primary key. Oracle GoldenGate will select the primary keys as identifiers when it builds its metadata record, and the metadata will not match. To avoid this error, decide which column you want to promote to primary key, and create that index first on the source and target.

13.5.1.3 How to Specify Your Own Key for Oracle GoldenGate to Use

If a table does not have one of the preceding types of row identifiers, or if you prefer those identifiers not to be used, you can define a substitute key if the table has columns that always contain unique values. You define this substitute key by including a KEYCOLS clause within the Extract TABLE parameter and the Replicat MAP parameter. The specified key will override any existing primary or unique key that Oracle GoldenGate finds.

13.5.2 Limiting Row Changes in Tables That Do Not Have a Key

If a target table does not have a primary key or a unique key, duplicate rows can exist. In this case, Oracle GoldenGate could update or delete too many target rows, causing the source and target data to go out of synchronization without error messages to alert you. To limit the number of rows that are updated, use the DBOPTIONS parameter with the LIMITROWS option in the Replicat parameter file. LIMITROWS can increase the performance of Oracle GoldenGate on the target system because only one row is processed.

13.5.3 Disabling Triggers and Cascade Constraints

Disable triggers, cascade delete constraints, and cascade update constraints on the target tables, or alter them to ignore changes made by the Oracle GoldenGate database user. Oracle GoldenGate replicates DML that results from a trigger or cascade constraint. If the same trigger or constraint gets activated on the target table, it becomes redundant because of the replicated version, and the database returns an error. Consider the following example, where the source tables are emp_src and salary_src and the target tables are emp_targ and salary_targ.

  1. A delete is issued for emp_src.

  2. It cascades a delete to salary_src.

  3. Oracle GoldenGate sends both deletes to the target.

  4. The parent delete arrives first and is applied to emp_targ.

  5. The parent delete cascades a delete to salary_targ.

  6. The cascaded delete from salary_src is applied to salary_targ.

  7. The row cannot be located because it was already deleted in step 5.

13.6 Changing the Log-Bin Location

Modifying the binary log location by using the log-bin variable in the MySQL configuration file might result in two different path entries inside the index file, which could result in errors. To avoid any potential errors, change the log-bin location by doing the following:

  1. Stop any new DML operations.
  2. Let the extract finish processing all of the existing binary logs. You can verify this by noting when the checkpoint position reaches the offset of the last log.
  3. After Extract finishes processing the data, stop the Extract group and, if necessary, back up the binary logs.
  4. Stop the MySQL database.
  5. Modify the log-bin path for the new location.
  6. Start the MySQL database.
  7. To clean the old log name entries from index file, use flush master or reset master (based on your MySQL version).
  8. Start Extract.

13.7 Configuring Bi-Directional Replication

In a bi-directional configuration, there are Extract and Replicat processes on both the source and target systems to support the replication of transactional changes on each system to the other system. To support this configuration, each Extract must be able to filter the transactions applied by the local Replicat, so that they are not recaptured and sent back to their source in a continuous loop. Additionally, AUTO_INCREMENT columns must be set so that there is no conflict between the values on each system.

  1. Configure Oracle GoldenGate for high availability or active-active replication according to the instructions in the Overview of Replicat in Administering Oracle GoldenGate
  2. To filter out Replicat operations in a bi-directional configuration so that the applied operations are not captured and looped back to the source again, take the following steps on each MySQL database:
    • Configure each Replicat process to use a checkpoint table. Replicat writes a checkpoint to this table at the end of each transaction. You can use one global checkpoint table or one per Replicat process See Overview of Replicat in Administering Oracle GoldenGate.

    • Specify the name of the checkpoint table with the FILTERTABLE option of the TRANLOGOPTIONS parameter in the Extract parameter file. The Extract process will ignore transactions that end with an operation to the specified table, which should only be those of Replicat.

      Note:

      Although optional for other supported databases as a means of enhancing recovery, the use of a checkpoint table is required for MySQL when using bi-directional replication (and likewise, will enhance recovery).

  3. Edit the MySQL server configuration file to set the auto_increment_increment and auto_increment_offset parameters to avoid discrepancies that could be caused by the bi-directional operations. The following illustrates these parameters, assuming two servers: ServerA and ServerB.

    ServerA:

    auto-increment-increment = 2
    auto-increment-offset = 1
    

    ServerB:

    auto-increment-increment = 2
    auto-increment-offset = 2
    

13.8 Capturing using a MySQL Replication Slave

You can configure a MySQL replication slave to capture the master's binary log events from the slave.

Typically, the transactions applied by the slave are logged into the relay logs and not into the slave's binlog. For the slave to write transactions in its binlog, that it receives from the master , you must start the replication slave with the log-slave-updates option as 1 in my.cnf in conjunction with the other binary logging parameters for Oracle GoldenGate. After the master's transactions are in the slave's binlog , you can set up a regular Oracle GoldenGate capture on the slave to capture and process the slave's binlog.

13.9 Establishing a Secure Database Connection to AWS Aurora MySQL

Oracle GoldenGate uses SSH tunneling to connect to the AWS Aurora instance using the .pem file and then runs the Oracle GoldenGate MySQL delivery from on-premise.

To set up SSH tunneling for a secure database connection to AWS Aurora MySQL, perform the following tasks:

Task1: Configuring SSH Tunneling

To configure SSH Tunneling:
ssh -i your_key.pem -v -f -L local port number:cluster end point:end 
point port number ec2 host name  > mysql_remote_forward.log 2>&1 

In the following example, note that the local port number is set to 3308 because the port number 3306 may be used by MySQL that's installed locally.

bash-4.1$ ssh -i "test.pem" -v -N -f -L 
3308:test-cluster.cluster-copyxiqzdjjl.us-west-2.rds.amazonaws.com:3306 
ec2-user@ec2-52-11-244-17.us-west-2.compute.amazonaws.com > 
mysql_remote_forward.log 2>&1 
If Oracle GoldenGate is running behind a proxy, then perform the following configuration:
ssh -i your_key.pem -v -f -L local port number:cluster end point:end point 
port number ec2 host name -o command to bypass proxy in double quotes> > 
mysql_remote_forward.log 2>&1 
For example:
bash-4.1$ ssh -i "test.pem" -v -N -f -L 3308:test-cluster.cluster-copyxiqzdjjl.us-west-2.rds.amazonaws.com:3306 
ec2-user@ec2-52-11-244-17.us-west-2.compute.amazonaws.com -o "ProxyCommand=nc 
-X connect -x www-proxy.us.oracle.com:80 %h %p" > mysql_remote_forward.log 2>&1

Task 2: Provide the Connection String in the Extract Parameter File

The remote user must have all the permissions. The following is used in both the source and target databases:
remotedb@127.0.0.1:3308  userid 
remote-user, password 
remote-user-password
Also, to prevent a connection timeout, add the following variable to ~/.ssh/config file, because AWS automatically disconnects connection after 60 secs of inactivity:
bash-4.1$ cat ~/.ssh/config 
ServerAliveInterval 50 

13.10 Other Oracle GoldenGate Parameters for MySQL

The following parameters may be of use in MySQL installations, and might be required if non-default settings are used for the MySQL database. Other Oracle GoldenGate parameters will be required in addition to these, depending on your intended business use and configuration.

Table 13-1 Other Parameters for Oracle GoldenGate for MySQL

Parameter Description

DBOPTIONS with CONNECTIONPORT port_number

Required to specify to the VAM the TCP/IP connection port number of the MySQL instance to which an Oracle GoldenGate process must connect if MySQL is not running on the default of 3306.

DBOPTIONS CONNECTIONPORT 3307 

DBOPTIONS with HOST host_id

Specifies the DNS name or IP address of the system hosting MySQL to which Replicat must connect.

DBOPTIONS with ALLOWLOBDATATRUNCATE

Prevents Replicat from abending when replicated LOB data is too large for a target MySQL CHAR, VARCHAR, BINARY or VARBINARY column.

SOURCEDB with USERID and PASSWORD

Specifies database connection information consisting of the database, user name and password to use by an Oracle GoldenGate process that connects to a MySQL database. If MySQL is not running on the default port of 3306, you must specify a complete connection string that includes the port number: SOURCEDB dbname@hostname:port, USERID user, PASSWORD password.Example:

SOURCEDB mydb@mymachine:3307, USERID myuser, PASSWORD mypassword

If you are not running the MySQL database on port 3306, you must also specify the connection port of the MySQL database in the DBLOGIN command when issuing commands that affect the database through GGSCI:

DBLOGIN SOURCEDB dbname@hostname:port, USERID user, PASSWORD password

For example:

GGSCI> DBLOGIN SOURCEDB mydb@mymachine:3307, USERID myuser, PASSWORD mypassword 

SQLEXEC

To enable Replicat to bypass the MySQL connection timeout, configure the following command in a SQLEXEC statement in the Replicat parameter file.

SQLEXEC "select CURRENT_TIME();" EVERY n MINUTES

Where: n is the maximum interval after which you want Replicat to reconnect. The recommended connection timeout 31536000 seconds (365 days).

13.11 Positioning Extract to a Specific Start Point

You can position the ADD EXTRACT and ALTER EXTRACT commands to a specific start point in the transaction logs with the following command.

{ADD | ALTER EXTRACT} group, VAM, LOGNUM log_num, LOGPOS log_pos

  • group is the name of the Oracle GoldenGate Extract group for which the start position is required.

  • log_num is the log file number. For example, if the required log file name is test.000034, this value is 34. Extract will search for this log file.

  • log_pos is an event offset value within the log file that identifies a specific transaction record. Event offset values are stored in the header section of a log record. To position at the beginning of a binlog file, set the log_pos as 4. The log_pos 0 or 1 are not valid offsets to start reading and processing.

In MySQL logs, an event offset value can be unique only within a given binary file. The combination of the position value and a log number will uniquely identify a transaction record and cannot exceed a length of 37. Transactional records available after this position within the specified log will be captured by Extract. In addition, you can position an Extract using a timestamp.