17 Tuning the Performance of Oracle GoldenGate

This chapter contains suggestions for improving the performance of Oracle GoldenGate components.


17.1 Using Multiple Process Groups

Typically, only one Extract group is required to capture efficiently from a database. However, depending on the redo (transactional) volume, or on the data and operation types, you may find that you need to add one or more Extract groups to the configuration.

Similarly, only one Replicat group is typically needed to apply data to a target database if you use Replicat in coordinated mode. (See About Coordinated Replicat Mode for more information.) Even in coordinated mode, however, there are cases where you may need multiple Replicat groups. If you are using Replicat in classic mode and your applications generate a high transaction volume, you will probably need to use parallel Replicat groups.

Because each Oracle GoldenGate component — Extract, data pump, trail, Replicat — is an independent module, you can combine them in ways that suit your needs. You can use multiple trails and parallel Extract and Replicat processes (with or without data pumps) to handle large transaction volume, improve performance, eliminate bottlenecks, reduce latency, or isolate the processing of specific data.

Figure 17-1 shows some of the ways that you can configure Oracle GoldenGate to improve throughput speed and overcome network bandwidth issues.

Figure 17-1 Load-balancing configurations that improve performance

The figure labels indicate the following:
  • A: Parallel Extracts divide the load. For example, by schema or to isolate tables that generate fetches.

  • B: A data pump with a local trail can be used for filtering, conversion, and network fault tolerance.

  • C: Multiple data pumps work around per-process network bandwidth limitations to increase TCP/IP throughput. Divide the TABLE parameter statements among them.

  • D: Parallel Replicats increase throughput to the database. Any trail can be read by one or more Replicats. Divide MAP statements among them.

17.1.1 Considerations for Using Multiple Process Groups

Before configuring multiple processing groups, review the following considerations to ensure that your configuration produces the desired results and maintains data integrity.

17.1.1.1 Maintaining Data Integrity

Not all workloads can be partitioned across multiple groups and still preserve the original transaction atomicity. You must determine whether the objects in one group will ever have dependencies on objects in any other group, transactional or otherwise. For example, tables for which the workload routinely updates the primary key cannot easily be partitioned in this manner. DDL replication (if supported for the database) is not viable in this mode, nor is the use of some SQLEXEC or EVENTACTIONS features that base their actions on a specific record.

If your tables do not have any foreign-key dependencies or updates to primary keys, you may be able to use multiple processes. Keep related DML together in the same process stream to ensure data integrity.

17.1.1.2 Number of Groups

The number of concurrent Extract and Replicat process groups that can run on a system depends on how much system memory is available. Each Classic Extract and Replicat process needs approximately 25-55 MB of memory or more, depending on the size of the transactions and the number of concurrent transactions. The Oracle GoldenGate GGSCI command interface fully supports up to 5,000 concurrent Extract and Replicat groups (combined) per instance of Oracle GoldenGate Manager. At the supported level, all groups can be controlled and viewed in full with GGSCI commands such as the INFO and STATUS commands. Beyond the supported level, group information is not displayed and errors may occur. Oracle GoldenGate recommends keeping the number of Extract and Replicat groups (combined) at a more manageable level, such as 100 or below, in order to manage your environment effectively. The maximum number of groups is controlled by the MAXGROUPS parameter, whose default is 1000.
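
If you need to change the group limit, MAXGROUPS is set in the GLOBALS parameter file. A minimal sketch, assuming you want to allow up to the supported maximum of 5,000 groups (the value shown is illustrative):

    -- GLOBALS parameter file
    -- Allow up to 5,000 process groups in this instance.
    MAXGROUPS 5000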

For Windows Server environments, the number of process groups that can be run is tightly coupled to the non-interactive Windows desktop heap memory settings. The default Windows desktop heap settings may be enough to run a very small number of process groups, but for larger numbers of process groups (more than 60 or so), you must either adjust the non-interactive value of the SharedSection field in the registry, based on the information from Microsoft about Windows desktop heap memory, or increase the number of Oracle GoldenGate homes and spread the total number of desired process groups across those homes.

Note:

For more information on modifying the Windows Desktop Heap memory, review the following Oracle Knowledge Base document (Doc ID 2056225.1).
17.1.1.3 Memory

The system must have sufficient swap space for each Oracle GoldenGate Extract and Replicat process that will be running. To determine the required swap space:

  1. Start up one Extract or Replicat.
  2. Run GGSCI.
  3. View the report file (for example, with the VIEW REPORT command) and find the line PROCESS VM AVAIL FROM OS (min).
  4. Round up the value to the next full gigabyte if needed. For example, round up 1.76GB to 2 GB.
  5. Multiply that value by the number of Extract and Replicat processes that will be running. The result is the maximum amount of swap space that could be required.
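
For example, if the report shows PROCESS VM AVAIL FROM OS (min): 1.76G, round up to 2 GB; if ten Extract and Replicat processes will run, plan for up to 2 GB * 10 = 20 GB of swap.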

See the CACHEMGR parameter in Reference for Oracle GoldenGate for more information about how memory is managed.

17.1.1.4 Isolating Processing-Intensive Tables

You can use multiple process groups to support certain kinds of tables that tend to interfere with normal processing and cause latency to build on the target. For example:

  • Extract may need to perform a fetch from the database because of the data type of the column, because of parameter specifications, or to perform SQL procedures. When data must be fetched from the database, it affects the performance of Extract. You can get fetch statistics from the STATS EXTRACT command if you include the STATOPTIONS REPORTFETCH parameter in the Extract parameter file. You can then isolate those tables into their own Extract groups, assuming that transactional integrity can be maintained (see the sketch after this list).

  • In classic mode, the Replicat process can be a source of performance bottlenecks because it is a single-threaded process that applies operations one at a time by using regular SQL. Even with BATCHSQL enabled (see Reference for Oracle GoldenGate), Replicat may take longer to process tables that have large or long-running transactions, heavy volume, a very large number of columns that change, and LOB data. You can then isolate those tables into their own Replicat groups, assuming that transactional integrity can be maintained.
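
For the fetch case in the first item, a minimal sketch of enabling fetch statistics (the group name ext1 is hypothetical). In the Extract parameter file:

    STATOPTIONS REPORTFETCH

Then, in GGSCI, the regular statistics command includes the fetch counts:

    STATS EXTRACT ext1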

17.1.2 Using Parallel Replicat Groups on a Target System

This section contains instructions for creating a configuration that pairs one Extract group with multiple Replicat groups. Although it is possible for multiple Replicat processes to read a single trail (no more than three of them, to avoid disk contention), it is recommended that you pair each Replicat with its own trail and corresponding Extract process.

For detailed instructions on configuring change synchronization, see Configuring Online Change Synchronization.


17.1.2.1 To Create the Extract Group

Note:

This configuration includes Extract data-pumps.

  1. On the source, use the ADD EXTRACT command to create a primary Extract group.
  2. On the source, use the ADD EXTTRAIL command to specify as many local trails as the number of Replicat groups that you will be creating. All trails must be associated with the primary Extract group.
  3. On the source create a data-pump Extract group.
  4. On the source, use the ADD RMTTRAIL command to specify as many remote trails as the number of Replicat groups that you will be creating. All trails must be associated with the data-pump Extract group.
  5. On the source, use the EDIT PARAMS command to create Extract parameter files, one for the primary Extract and one for the data pump, that contain the parameters required for your database environment. When configuring Extract, do the following:
    • Divide the source tables among different TABLE parameters.

    • Link each TABLE statement to a different trail. This is done by placing the TABLE statements after the EXTTRAIL or RMTTRAIL parameter that specifies the trail you want those statements to be associated with.
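
The following is a minimal sketch of the parameter-file layout that step 5 describes. The group name, credential alias, trail names, and table names are all hypothetical; the same technique applies to the data pump, with RMTTRAIL in place of EXTTRAIL:

    EXTRACT ext1
    USERIDALIAS ogg_src
    -- TABLE statements are linked to the trail named by the
    -- preceding EXTTRAIL (or, in a data pump, RMTTRAIL) parameter.
    EXTTRAIL ./dirdat/aa
    TABLE hr.emp;
    EXTTRAIL ./dirdat/bb
    TABLE hr.departments;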

17.1.2.2 To Create the Replicat Groups
  1. On the target, create a Replicat checkpoint table. For instructions, see Creating a Checkpoint Table. All Replicat groups can use the same checkpoint table.
  2. On the target, use the ADD REPLICAT command to create a Replicat group for each trail that you created. Use the EXTTRAIL argument of ADD REPLICAT to link the Replicat group to the appropriate trail.
  3. On the target, use the EDIT PARAMS command to create a Replicat parameter file for each Replicat group that contains the parameters required for your database environment. All MAP statements for a given Replicat group must specify the same objects that are contained in the trail that is linked to that group.
  4. In the Manager parameter file on the target system, use the PURGEOLDEXTRACTS parameter to control the purging of files from the trails.
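
A minimal GGSCI sketch of these steps, assuming two remote trails (./dirdat/ra and ./dirdat/rb) and hypothetical group, alias, and checkpoint-table names:

    DBLOGIN USERIDALIAS ogg_tgt
    ADD CHECKPOINTTABLE ogg.ckpt
    ADD REPLICAT rep1, EXTTRAIL ./dirdat/ra, CHECKPOINTTABLE ogg.ckpt
    ADD REPLICAT rep2, EXTTRAIL ./dirdat/rb, CHECKPOINTTABLE ogg.ckpt

In the Manager parameter file, the purge rules might look like this (the retention value is illustrative):

    PURGEOLDEXTRACTS ./dirdat/ra, USECHECKPOINTS, MINKEEPHOURS 2
    PURGEOLDEXTRACTS ./dirdat/rb, USECHECKPOINTS, MINKEEPHOURS 2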

17.1.3 Using Multiple Extract Groups with Multiple Replicat Groups

Multiple Extract groups write to their own trails. Each trail is read by a dedicated Replicat group.

For detailed instructions on configuring change synchronization, see Configuring Online Change Synchronization.

17.1.3.1 To Create the Extract Groups

Note:

This configuration includes data pumps.

  1. On the source, use the ADD EXTRACT command to create the primary Extract groups.
  2. On the source, use the ADD EXTTRAIL command to specify a local trail for each of the Extract groups that you created.
  3. On the source create a data-pump Extract group to read each local trail that you created.
  4. On the source, use the ADD RMTTRAIL command to specify a remote trail for each of the data-pumps that you created.
  5. On the source, use the EDIT PARAMS command to create an Extract parameter file for each primary Extract group and each data-pump Extract group.
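
A minimal GGSCI sketch of steps 1 through 4 for two Extract groups, assuming classic (TRANLOG) capture; all group and trail names are hypothetical:

    ADD EXTRACT ext1, TRANLOG, BEGIN NOW
    ADD EXTTRAIL ./dirdat/aa, EXTRACT ext1
    ADD EXTRACT pump1, EXTTRAILSOURCE ./dirdat/aa
    ADD RMTTRAIL ./dirdat/ra, EXTRACT pump1

    ADD EXTRACT ext2, TRANLOG, BEGIN NOW
    ADD EXTTRAIL ./dirdat/bb, EXTRACT ext2
    ADD EXTRACT pump2, EXTTRAILSOURCE ./dirdat/bb
    ADD RMTTRAIL ./dirdat/rb, EXTRACT pump2
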
17.1.3.2 To Create the Replicat Groups
  1. On the target, create a Replicat checkpoint table. For instructions, see Creating a Checkpoint Table. All Replicat groups can use the same checkpoint table.
  2. On the target, use the ADD REPLICAT command to create a Replicat group for each trail. Use the EXTTRAIL argument of ADD REPLICAT to link the group to the trail.
  3. On the target, use the EDIT PARAMS command to create a Replicat parameter file for each Replicat group. All MAP statements for a given Replicat group must specify the same objects that are contained in the trail that is linked to the group.
  4. In the Manager parameter files on the source system and the target system, use the PURGEOLDEXTRACTS parameter to control the purging of files from the trails.

17.2 Splitting Large Tables Into Row Ranges Across Process Groups

You can use the @RANGE function to divide the rows of any table across two or more Oracle GoldenGate processes. It can be used to increase the throughput of large and heavily accessed tables and also can be used to divide data into sets for distribution to different destinations. Specify each range in a FILTER clause in a TABLE or MAP statement.

@RANGE is safe and scalable. It preserves data integrity by guaranteeing that the same row will always be processed by the same process group.

It might be more efficient to use the primary Extract or a data pump to calculate the ranges than to use Replicat. To calculate ranges, Replicat must filter through the entire trail to find data that meets the range specification. However, your business case should determine where this filtering is performed.
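
For example, a minimal sketch that splits one table between two Extract groups and two Replicat groups (the table name hr.emp and the key column emp_id are hypothetical). In the parameter file of each Extract group:

    -- Extract group 1
    TABLE hr.emp, FILTER (@RANGE (1, 2, emp_id));
    -- Extract group 2
    TABLE hr.emp, FILTER (@RANGE (2, 2, emp_id));

In the parameter file of each Replicat group:

    -- Replicat group 1
    MAP hr.emp, TARGET hr.emp, FILTER (@RANGE (1, 2, emp_id));
    -- Replicat group 2
    MAP hr.emp, TARGET hr.emp, FILTER (@RANGE (2, 2, emp_id));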

Figure 17-2 Dividing rows of a table between two Extract groups


Figure 17-3 Dividing rows of a table between two Replicat groups


17.3 Configuring Oracle GoldenGate to Use the Network Efficiently

Inefficiencies in the transfer of data across the network can cause lag in the Extract process and latency on the target. If not corrected, these inefficiencies can eventually cause process failures.

When you first start a new Oracle GoldenGate configuration:

  1. Establish benchmarks for what you consider to be acceptable lag and throughput volume for Extract and for Replicat. Keep in mind that Extract will normally be faster than Replicat because of the kind of tasks that each one performs. Over time you will know whether the difference is normal or one that requires tuning or troubleshooting.

  2. Set a regular schedule to monitor those processes for lag and volume, as compared to the benchmarks. Look for lag that remains constant or is growing, as opposed to occasional spikes. Continuous, excess lag indicates a bottleneck somewhere in the Oracle GoldenGate configuration. It is a critical first indicator that Oracle GoldenGate needs tuning or that there is an error condition.

To view volume statistics, use the STATS EXTRACT or STATS REPLICAT command. To view lag statistics, use the LAG EXTRACT or LAG REPLICAT command.
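
For example (the group names are hypothetical):

    LAG EXTRACT ext1
    STATS EXTRACT ext1, TOTALSONLY *
    LAG REPLICAT rep1
    STATS REPLICAT rep1, TOTALSONLY *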


17.3.1 Detecting a Network Bottleneck that is Affecting Oracle GoldenGate

To detect a network bottleneck that is affecting the throughput of Oracle GoldenGate, follow these steps.

  1. Issue the following command to view the ten most recent Extract checkpoints. If you are using a data-pump Extract on the source system, issue the command for the primary Extract and also for the data pump.
    INFO EXTRACT group, SHOWCH 10
    
  2. Look for the Write Checkpoint statistic. This is the place where Extract is writing to the trail.
    Write Checkpoint #1
    
    GGS Log Trail
    Current Checkpoint (current write position):
       Sequence #: 2
       RBA: 2142224
       Timestamp: 2011-01-09 14:16:50.567638
       Extract Trail: ./dirdat/eh
    
  3. For both the primary Extract and data pump:
    • Determine whether there are more than one or two checkpoints. There can be up to ten.

    • Find the Write Checkpoint n heading that has the highest increment number (for example, Write Checkpoint #8) and make a note of the Sequence, RBA, and Timestamp values. This is the most recent checkpoint.

  4. Refer to the information that you noted, and perform the following checks:
    • Is the primary Extract generating a series of checkpoints, or just the initial checkpoint?

    • If a data pump is in use, is it generating a series of checkpoints, or just one?

  5. Issue INFO EXTRACT for the primary and data pump Extract processes again.
    • Has the most recent write checkpoint increased? Look at the most recent Sequence, RBA, and Timestamp values to see if their values were incremented forward since the previous INFO EXTRACT command.

  6. Issue the following command to view the status of the Replicat process.
    SEND REPLICAT group, STATUS
    
    • The status indicates whether Replicat is delaying (waiting for data to process), processing data, or at the end of the trail (EOF).

There is a network bottleneck if the status of Replicat is either in delay mode or at the end of the trail file and either of the following is true:

  • You are only using a primary Extract and its write checkpoint is not increasing or is increasing too slowly. Because this Extract process is responsible for sending data across the network, it will eventually run out of memory to contain the backlog of extracted data and abend.

  • You are using a data pump, and its write checkpoint is not increasing, but the write checkpoint of the primary Extract is increasing. In this case, the primary Extract can write to its local trail, but the data pump cannot write to the remote trail. The data pump will abend when it runs out of memory to contain the backlog of extracted data. The primary Extract will run until it reaches the last file in the trail sequence and will abend because it cannot make a checkpoint.

Note:

Even when there is a network outage, Replicat will process in a normal manner until it applies all of the remaining data from the trail to the target. Eventually, it will report that it reached the end of the trail file.

17.3.2 Working Around Bandwidth Limitations by Using Data Pumps

Using parallel data pumps may enable you to work around bandwidth limitations that are imposed on a per-process basis in the network configuration. You can use parallel data pumps to send data to the same target system or to different target systems. Data pumps also remove TCP/IP responsibilities from the primary Extract, and their local trails provide fault tolerance.
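
A minimal sketch of two data pumps that read the same local trail and divide the workload between them (all names are hypothetical):

    ADD EXTRACT pump1, EXTTRAILSOURCE ./dirdat/aa
    ADD RMTTRAIL ./dirdat/r1, EXTRACT pump1
    ADD EXTRACT pump2, EXTTRAILSOURCE ./dirdat/aa
    ADD RMTTRAIL ./dirdat/r2, EXTRACT pump2

Each pump's parameter file then specifies its own RMTHOST connection, its own RMTTRAIL, and its share of the TABLE statements.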

17.3.3 Increasing the TCP/IP Packet Size

Use the TCPBUFSIZE option of the RMTHOST parameter to control the size of the TCP socket buffer that Extract maintains. By increasing the size of the buffer, you can send larger packets to the target system. See Reference for Oracle GoldenGate for more information.

Use the following steps as a guideline to determine the optimum buffer size for your network.

  1. Use the ping command from the command shell to obtain the average round-trip time (RTT), as shown in the following example:
    C:\home\ggs>ping ggsoftware.com
    Pinging ggsoftware.com [192.168.116.171] with 32 bytes of data:
    Reply from 192.168.116.171: bytes=32 time=31ms TTL=56
    Reply from 192.168.116.171: bytes=32 time=61ms TTL=56
    Reply from 192.168.116.171: bytes=32 time=32ms TTL=56
    Reply from 192.168.116.171: bytes=32 time=34ms TTL=56
    Ping statistics for 192.168.116.171:
        Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
        Minimum = 31ms, Maximum = 61ms, Average = 39ms
    
  2. Multiply that value by the network bandwidth. For example, if average RTT is .08 seconds, and the bandwidth is 100 megabits per second, then the optimum buffer size is:
    0.08 second * 100 megabits per second = 8 megabits
    
  3. Divide the result by 8 to determine the number of bytes (8 bits to a byte). For example:
    8 megabits / 8 = 1 megabyte
    

    The required unit for TCPBUFSIZE is bytes, so you would set it to a value of 1000000.
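
Continuing the example, a sketch of the resulting RMTHOST parameter (the host name and port are assumptions; TCPFLUSHBYTES is often set to the same value as TCPBUFSIZE):

    RMTHOST target1, MGRPORT 7809, TCPBUFSIZE 1000000, TCPFLUSHBYTES 1000000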

The maximum socket buffer size for non-Windows systems is usually limited by default. Ask your system administrator to increase the default value on the source and target systems so that Oracle GoldenGate can increase the buffer size configured with TCPBUFSIZE.

17.4 Eliminating Disk I/O Bottlenecks

I/O activity can cause bottlenecks for both Extract and Replicat.

  • A regular Extract generates disk writes to a trail and disk reads from a data source.

  • A data pump and Replicat generate disk reads from a local trail.

  • Each process writes a recovery checkpoint to its checkpoint file on a regular schedule.

17.4.1 Improving I/O Performance Within the System Configuration

If there are I/O waits on the disk subsystems that contain the trail files, put the trails on the fastest disk controller possible.

Check the RAID configuration. Because Oracle GoldenGate writes data sequentially, RAID 0+1 (striping and mirroring) is a better choice than RAID 5, which uses checksums that slow down I/O and are not necessary for these types of files.

17.4.2 Improving I/O Performance Within the Oracle GoldenGate Configuration

You can improve I/O performance by making configuration changes within Oracle GoldenGate. Try increasing the values of the following parameters.

  • Use the CHECKPOINTSECS parameter to control how often Extract and Replicat make their routine checkpoints.

    Note:

    CHECKPOINTSECS is not valid for an integrated Replicat on an Oracle database system.

  • Use the GROUPTRANSOPS parameter to control the number of SQL operations that are contained in a Replicat transaction when operating in its normal mode. Increasing the number of operations in a Replicat transaction improves the performance of Oracle GoldenGate by reducing the number of transactions executed by Replicat, and by reducing I/O activity to the checkpoint file and the checkpoint table, if used. Replicat issues a checkpoint whenever it applies a transaction to the target, in addition to its scheduled checkpoints.

    Note:

    GROUPTRANSOPS is not valid for an integrated Replicat on an Oracle database system, unless the inbound server parameter parallelism is set to 1.

  • Use the EOFDELAY or EOFDELAYCSECS parameter to control how often Extract, a data pump, or Replicat checks for new data after it has reached the end of the current data in its data source. You can reduce the system I/O overhead of these reads by increasing the value of this parameter.

Note:

Increasing the values of these parameters improves performance, but it also increases the amount of data that must be reprocessed if the process fails. This has an effect on overall latency between source and target. Some testing will help you determine the optimal balance between recovery and performance.
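
As a sketch, a Replicat parameter file that raises these values might contain the following. The values are illustrative, not recommendations; the defaults are CHECKPOINTSECS 10 and GROUPTRANSOPS 1000, and EOFDELAY is specified in seconds:

    CHECKPOINTSECS 20
    GROUPTRANSOPS 2000
    EOFDELAY 3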

17.5 Managing Virtual Memory and Paging

Because Oracle GoldenGate replicates only committed transactions, it stores the operations of each transaction in a managed virtual-memory pool known as a cache until it receives either a commit or a rollback for that transaction. One global cache operates as a shared resource of an Extract or Replicat process. The Oracle GoldenGate cache manager takes advantage of the memory management functions of the operating system to ensure that Oracle GoldenGate processes work in a sustained and efficient manner. The CACHEMGR parameter controls the amount of virtual memory and temporary disk space that is available for caching uncommitted transaction data that is being processed by Oracle GoldenGate.

When a process starts, the cache manager checks the availability of resources for virtual memory, as shown in the following example:

CACHEMGR virtual memory values (may have been adjusted)
CACHESIZE:                            32G
CACHEPAGEOUTSIZE (normal):            8M
PROCESS VM AVAIL FROM OS (min):       63.97G
CACHESIZEMAX (strict force to disk):  48G

If the current resources are not sufficient, a message like the following may be returned:

2013-11-11 14:16:22  WARNING OGG-01842  CACHESIZE PER DYNAMIC DETERMINATION (32G) LESS THAN RECOMMENDED: 64G (64bit system)
vm found: 63.97G
Check swap space. Recommended swap/extract: 128G (64bit system).

If the system exhibits excessive paging and the performance of critical processes is affected, you can reduce the CACHESIZE option of the CACHEMGR parameter. You can also control the maximum amount of disk space that can be allocated to the swap directory with the CACHEDIRECTORY option. For more information about CACHEMGR, see Reference for Oracle GoldenGate.
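
A sketch of a CACHEMGR specification that limits the cache and directs its paging to a specific directory (the size and path are assumptions for illustration):

    CACHEMGR CACHESIZE 8GB, CACHEDIRECTORY /ogg/dirtmp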

17.6 Optimizing Data Filtering and Conversion

Heavy amounts of data filtering or data conversion add processing overhead. The following are suggestions for minimizing the impact of this overhead on the other processes on the system.

  • Avoid using the primary Extract to filter and convert data. Keep it dedicated to data capture. It will perform better and is less vulnerable to any process failures that result from those activities. The objective is to make certain the primary Extract process is running and keeping pace with the transaction volume.

  • Use Replicat or a data pump to perform filtering and conversion. Consider any of the following configurations:

    • Use a data pump on the source if the system can tolerate the overhead. This configuration works well when there is a high volume of data to be filtered, because it uses less network bandwidth. Only filtered data gets sent to the target, which also can help with security considerations.

    • Use a data pump on an intermediate system. This configuration keeps the source and target systems free of the overhead, but uses more network bandwidth because unfiltered data is sent from the source to the intermediate system.

    • Use a data pump or Replicat on the target if the system can tolerate the overhead, and if there is adequate network bandwidth for sending large amounts of unfiltered data.

  • If you have limited system resources, the least desirable option is to divide the filtering and conversion work between Extract and Replicat.

17.7 Tuning Replicat Transactions

Replicat uses regular SQL, so its performance depends on the performance of the target database and the type of SQL that is being applied (inserts versus updates or deletes). However, you can take certain steps to maximize Replicat efficiency.

17.7.1 Tuning Coordination Performance Against Barrier Transactions

In a coordinated Replicat configuration, barrier transactions such as updates to the primary key cause an increased number of commits to the database, and they interrupt the benefit of the GROUPTRANSOPS feature of Replicat. When there is a high number of barrier transactions in the overall workload of the coordinated Replicat, using a high number of threads can actually degrade Replicat performance.

To maintain high performance when large numbers of barrier transactions are expected, you can do the following:

  • Reduce the number of active threads in the group. This reduces the overall number of commits that Replicat performs.

  • Move the tables that account for the majority of the barrier transactions, and any tables with which they have dependencies, to a separate coordinated Replicat group that has a small number of threads. Keep the tables that have minimal barrier transactions in the original Replicat group with the higher number of threads, so that parallel performance is maintained without interruption by barrier transactions.

  • (Oracle RAC) In a new Replicat configuration, you can increase the PCTFREE attribute of the Replicat checkpoint table. However, this must be done before Replicat is started for the first time. The recommended value of PCTFREE is 90.

17.7.2 Applying Similar SQL Statements in Arrays

Use the BATCHSQL parameter to increase the performance of Replicat. BATCHSQL causes Replicat to organize similar SQL statements into arrays and apply them at an accelerated rate. In its normal mode, Replicat applies one SQL statement at a time.

When Replicat is in BATCHSQL mode, smaller row changes will show a higher gain in performance than larger row changes. At 100 bytes of data per row change, BATCHSQL has been known to improve the performance of Replicat by up to 300 percent, but actual performance benefits will vary, depending on the mix of operations. At around 5,000 bytes of data per row change, the benefits of using BATCHSQL diminish.

The gathering of SQL statements into batches improves efficiency but also consumes memory. To maintain optimum performance, use the following BATCHSQL options:

BATCHESPERQUEUE 
BYTESPERQUEUE 
OPSPERBATCH 
OPSPERQUEUE 

As a benchmark for setting values, assume that a batch of 1,000 SQL statements at 500 bytes each would require less than 10 megabytes of memory.

You can use BATCHSQL with the BATCHTRANSOPS option to tune array sizing. BATCHTRANSOPS controls the maximum number of batch operations that can be grouped into a transaction before requiring a commit. The default for non-integrated Replicat is 1000. The default for integrated Replicat is 50. If there are many wait dependencies when using integrated Replicat, try reducing the value of BATCHTRANSOPS. To determine the number of wait dependencies, view the TOTAL_WAIT_DEPS column of the V$GG_APPLY_COORDINATOR database view in the Oracle database.
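
A sketch of a BATCHSQL specification with explicit sizing (the values are illustrative assumptions, not recommendations):

    BATCHSQL BATCHESPERQUEUE 100 OPSPERBATCH 2000 BATCHTRANSOPS 500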

See Reference for Oracle GoldenGate for additional usage considerations and syntax.

17.7.3 Preventing Full Table Scans in the Absence of Keys

If a target table does not have a primary key, a unique key, or a unique index, Replicat uses all of the columns to build its WHERE clause. This is, essentially, a full table scan.

To make row selection more efficient, use a KEYCOLS clause in the TABLE and MAP statements to identify one or more columns as unique. Replicat will use the specified columns as a key. The following example shows a KEYCOLS clause in a TABLE statement:

TABLE hr.emp, KEYCOLS (FIRST_NAME, LAST_NAME, DOB, ID_NO);

For usage guidelines and syntax, see the TABLE and MAP parameters in Reference for Oracle GoldenGate.

17.7.4 Splitting Large Transactions

If the target database cannot handle large transactions from the source database, you can split them into a series of smaller ones by using the Replicat parameter MAXTRANSOPS. See Reference for Oracle GoldenGate for more information.

Note:

MAXTRANSOPS is not valid for an integrated Replicat on an Oracle database system.

17.7.5 Adjusting Open Cursors

The Replicat process maintains cursors for cached SQL statements and for SQLEXEC operations. Without enough cursors, Replicat must age more statements. By default, Replicat maintains as many cursors as allowed by the MAXSQLSTATEMENTS parameter. You might find that the value of this parameter needs to be increased. If so, you might also need to adjust the maximum number of open cursors that are permitted by the database. See Reference for Oracle GoldenGate for more information.

17.7.6 Improving Update Speed

Excessive block fragmentation causes Replicat to apply SQL statements at a slower than normal speed. Reorganize heavily fragmented tables, and then stop and start Replicat to register the new object ID.

17.7.7 Set a Replicat Transaction Timeout

Use the TRANSACTIONTIMEOUT parameter to prevent an uncommitted Replicat target transaction from holding locks on the target database and consuming its resources unnecessarily. You can change the value of this parameter so that Replicat can work within existing application timeouts and other database requirements on the target.

TRANSACTIONTIMEOUT limits the amount of time that Replicat can hold a target transaction open if it has not received the end-of-transaction record for the last source transaction in that transaction. By default, Replicat groups multiple source transactions into one target transaction to improve performance, but it will not commit a partial source transaction and will wait indefinitely for that last record. The Replicat parameter GROUPTRANSOPS controls the minimum size of a grouped target transaction.
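
A sketch, assuming that a 60-second limit is appropriate for the target environment:

    TRANSACTIONTIMEOUT 60 S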

The following events could last long enough to trigger TRANSACTIONTIMEOUT:

  • Network problems prevent trail data from being delivered to the target system.

  • Running out of disk space on any system, preventing trail data from being written.

  • Collector abends (a rare event).

  • Extract abends or is terminated in the middle of writing records for a transaction.

  • An Extract data pump abends or is terminated.

  • There is a source system failure, such as a power outage or system crash.

See Reference for Oracle GoldenGate for more information.

17.8 Using Healthcheck Scripts to Monitor and Troubleshoot

The Oracle GoldenGate Healthcheck script provides database-side information for Oracle databases to allow monitoring and troubleshooting.

The Healthcheck script gathers all replication-related configuration and performance information from a database in a single run. Within the script, there are many queries about the database instance and database-specific information from Extract and Replicat. You can run the script periodically to obtain the latest database-side performance information about replication.

The output is one of the key pieces of information that support needs for a qualitative analysis of the replication environment.


17.8.1 Installing, Running, and Uninstalling Healthcheck Scripts

The Healthcheck script is available for Oracle GoldenGate Classic and Microservices.

The healthcheck directory contains three files to install, run, and deinstall the Healthcheck script. Once the PL/SQL package is installed with ogghc_install.sql, you can run the ogghc_run.sql script as often as needed to generate an output file. You can deinstall the Healthcheck script with the ogghc_deinstall.sql script.

The Healthcheck script is located in:
  • $OGG_HOME/lib/sql/healthcheck for Microservices Architecture (MA)

  • $OGG_HOME/healthcheck for Classic Architecture (CA)

To gather the most complete information, Oracle recommends that you install and run Healthcheck as the SYS user. However, you can also install and run the script as an Oracle GoldenGate Administration User; in this case, some system information is not available. The Healthcheck output indicates the information that requires SYS privileges.
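
A minimal sketch of an installation and run cycle, assuming the scripts are executed with SQL*Plus from the healthcheck directory:

    sqlplus / as sysdba
    SQL> @ogghc_install.sql
    SQL> @ogghc_run.sql

and, when you no longer need the package:

    SQL> @ogghc_deinstall.sql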

17.8.2 How to Deal with Healthcheck Information?

The output file of the Healthcheck script contains the instance name and a timestamp. By default, information about Integrated Extract and Replicat is gathered. However, you can also retrieve information about the legacy Oracle GoldenGate schema or database profile. To do so, remove the argument of the EXCLUDE_TAG parameter.

Depending on the amount of information being queried, the run time of the script varies (in minutes).

If a query takes too long to process, you can eliminate it from the Healthcheck script and run another query in a parallel session to get the output.

17.8.3 Components of Healthcheck Information

The Healthcheck script generates an HTML file with JSON objects and HTML code, which you can view using a web browser.

The output of the Healthcheck script contains the following sections:
  • Overview

  • Extract

  • Replicat

  • Table Statistics and Errors

  • Tools

  • Report Map (Legacy)

Each of these sections contains menus and sub-menus, depending on the type of data available.

The following list describes the sections and the data available in those sections, based on the query used to generate the Healthcheck output.

Overview

The Overview section contains information about the following:
  • General Findings

  • Database, Extract and Replicat Summary

  • Capture Parameters

  • Apply Parameters

Database

Main Menu (The Main menu already contains the query information)

Database Objects

This sub-menu option displays information about the following:
  • Tables Not Supported By Oracle GoldenGate Integrated Capture

  • Instantiation SCNs for Apply Schema and Database (DDL)

Database Details

This sub-menu provides details about the database's various connections and services, along with key Oracle GoldenGate parameters and manually modified database parameters. The second part shows basic information about the components, software, and patch level of the database. The information is distributed among the following sections:
  • Connection Information

  • Key init.ora parameters

  • Database dictionary and fixed table statistics

  • Software edition

  • Registry information including History

  • Replication bundled patch information

  • NLS Database parameters

  • Registered log files for capture

  • Current Database incarnation

  • Standby redo logs

  • GoldenGate Administrator User

Replication SQL Analysis

This section provides a complete log of Oracle GoldenGate-related information from the session at the run time of the script and from the active session history. It contains complete details about waits, events, I/O, contention, and SQL.

Objects Instantiation

This sub-menu provides details about the schemas and table-level supplemental logging for Oracle GoldenGate:
  • Schemas prepared for capture

  • Table-level supplemental log groups enabled for capture

Extract

Main Menu

Extract Details

The Extract details covered in this sub-menu are:
  • Capture Runtime Information

  • Capture Transaction Processing

  • Capture Processing Information

  • Logminer Session Statistics

  • Capture Rules & Rulesets

Extract Performance

This sub-menu displays information about the Extract performance depending on the type of Extract being used. It displays the progress of the Extract which includes the following details:

  • Capture Name - Name of the Extract

  • Client Name - Name of the client where the Extract is running

  • Client Status - Status of the client where the Extract is running

  • Processed Low SCN - Processed SCN value of the Extract

  • Oldest SCN - Oldest SCN value of the Extract

Extract Logminer

This sub-menu contains information mainly used for debugging issues.

Replicat

Main Menu

Replicat Performance

The Replicat performance provides the following information:
  • Apply Progress

  • GoldenGate Inbound Progress Table

  • Information about Apply Progress table

  • Apply Network Receiver (ANR)

  • Apply Reader

  • Apply Reader - Dequeue Information

  • Apply Coordinator

  • Apply Coordinator Watermarks

  • Open GoldenGate Apply Transactions

  • Open GoldenGate Apply Transactions - Details

  • Apply Server Transactions ordered by Server_id

  • Apply Server Statistics - Summary

  • Apply Server Statistics - Details

  • Apply Server Statistics - Auto Tuning

  • Apply Server Wait Events

  • Apply Server Session Events

  • Apply Reader Processes

  • Apply Coordinator Processes

  • Apply Server Processes

CDR

This sub-menu provides a detailed log of the Replicat error handlers.

Apply Handler

This sub-menu contains information about the Replicat name, DDL handler, and precommit handler.

Error Management

Main Menu

Error Management Details

This sub-menu has details about the Oracle GoldenGate table statistics, sorted by table. It includes information such as server name, source table owner, source table name, destination table owner, destination table name, total operations, inserts, updates, deletes, insert/update/delete collisions, REPERROR discards, REPERROR ignores, WAIT dependencies, and CDR-related updates.

Tools

Main Menu

History

This sub-menu is dependent on the type of query you have run and displays subscriber history, Extract, and Replicat history.

Report Map

Main Menu (Legacy).

It provides information on all the queries, which includes details such as:

  • Run time of the query

  • Number of Returned Rows

  • Information about whether the query succeeded or failed

  • An indication of whether the information is internal and visible only when the script is run as SYS

  • Disabled Queries

The hyperlinks direct you to the appropriate query.

Hints/Description

This sub-menu is a map of all the activities logged in the Healthcheck report.

Alerts

This sub-menu provides a log of the alerts from the general findings about the database, Extract, and Replicat along with general system information.

Truncates

This sub-menu displays the Oracle GoldenGate related information from the V$ACTIVE_SESSION_HISTORY and V$SQLAREA database views.

Note:

You can view only partial results for V$ACTIVE_SESSION_HISTORY, as the original size of the query exceeds the maximum limit.
Config

This sub-menu allows you to add rules to the following sections that are available on this page:
  • Column Rules: There can be any number of rules on a single table; they are applied one after the other in the order they appear in the rules table. A rule can be an expression using the values of the columns in a row.

  • Menu Items Exclusions: This option is used to exclude any menu item from the Healthcheck output. Click Add to set the rule.

  • Group By: This section is used to group by keys and aggregates. You can go to any statistic and click G to add a group-by sorting.

JS Errors

Main Menu. This page displays debugging information for this framework.

Note:

It is visible only in case of errors.