Tuning the HA for Oracle GoldenGate Fault Monitors

Fault monitoring for the HA for Oracle GoldenGate data service is provided by the fault monitor for the Oracle GoldenGate instance.

Each fault monitor is contained in a resource whose resource type is shown in the following table.

Table 3 Resource Type for the Fault Monitors of HA for Oracle GoldenGate

Component	Resource Type
Oracle GoldenGate	`ORCL.GoldenGate`

System properties and extension properties of the resource types control the behavior of the fault monitors. The default values of these properties determine the preset behavior of the fault monitors. The preset behavior should be suitable for most Oracle Solaris Cluster installations. Therefore, you should tune the fault monitors only if you need to modify this preset behavior.

Tuning these fault monitors involves the following tasks:

Setting the interval between fault monitor probes
Setting the timeout for fault monitor probes
Defining the criteria for persistent faults
Specifying the failover behavior of a resource

Perform these tasks when you register and configure HA for Oracle GoldenGate, as described in Registering and Configuring HA for Oracle GoldenGate.

For detailed information about these tasks, see Tuning Fault Monitors for Oracle Solaris Cluster Data Services in Oracle Solaris Cluster Data Services Planning and Administration Guide.
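As a rough illustration of how two of these tasks map onto the cluster command line, the following sketch sets the probe interval and the persistent-fault criteria through standard resource system properties. The resource name gg-rs and all values are hypothetical and chosen only for illustration. The property that controls the probe timeout is not named in this section, so it is left out here.

```shell
# gg-rs is a hypothetical resource name; the values are illustrative only.

# Interval between fault monitor probes, in seconds.
clresource set -p Thorough_probe_interval=120 gg-rs

# Criteria for persistent faults: the retry count that applies
# within a window of Retry_interval seconds.
clresource set -p Retry_count=2 -p Retry_interval=600 gg-rs

# Review the resulting property values.
clresource show -v gg-rs
```

Check the property descriptions for the resource type before changing values on a production cluster.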

Operation of the Fault Monitor for the Oracle GoldenGate Resource Type

To determine whether the Oracle GoldenGate process is operating correctly, the fault monitor for the Oracle GoldenGate resource type probes these resources periodically.

The probe checks whether the manager process is in the process table. If it is, the probe opens a socket to the manager port. If it is not, the probe declares the resource faulted.

If the manager check succeeds, the probe checks whether any extract or replicat processes are running. If none are running, it checks the status of these processes with the following command:

# su - ggowner -c "echo 'info all' | base directory/ggsci"

If the status of every configured extract or replicat process is abended, the probe returns with the value of the property DEGRADED_CODE.
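The "every configured process is abended" condition can be sketched as a small filter over the text that `info all` prints. How the agent itself parses this output is not described here, so treat the following as an illustration only. The function name all_abended and the sample output are hypothetical; the sketch assumes one line per process, with the program type in the first column and the status in the second.

```shell
#!/bin/sh
# Illustrative sketch, not the agent's actual code.
# Reads "info all" style output on stdin and exits 0 only if at least one
# EXTRACT or REPLICAT line exists and every such line has status ABENDED.
all_abended() {
    awk '
        $1 == "EXTRACT" || $1 == "REPLICAT" {
            total++
            if ($2 == "ABENDED") abended++
        }
        END {
            if (total > 0 && abended == total) exit 0
            exit 1
        }
    '
}

# Hypothetical sample output, used only to exercise the function.
sample_output='Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     ABENDED     EXT1        00:00:00      00:12:41
REPLICAT    ABENDED     REP1        00:00:00      00:12:39'

if printf '%s\n' "$sample_output" | all_abended; then
    echo "every configured process is abended"
else
    echo "at least one process is not abended"
fi
```

On a live system, the input would come from the su command shown above instead of the sample text.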

The number of consecutive DEGRADED_CODE probe returns tolerated within retry_interval seconds is 100 divided by DEGRADED_CODE. If this number is greater than retry_interval divided by thorough_probe_interval, then probe return values of DEGRADED_CODE are tolerated indefinitely.
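The arithmetic can be worked through with a short shell sketch. The function name degraded_tolerance and the sample values are hypothetical, and the use of integer division is an assumption, because the rounding behavior is not stated here.

```shell
#!/bin/sh
# Illustrative sketch of the tolerance rule; integer division is assumed.
# Arguments: DEGRADED_CODE, retry_interval, thorough_probe_interval.
degraded_tolerance() {
    degraded_code=$1
    retry_interval=$2
    thorough_probe_interval=$3

    # Guard against division by zero on invalid input.
    if [ "$degraded_code" -le 0 ] || [ "$thorough_probe_interval" -le 0 ]; then
        echo "invalid input" >&2
        return 2
    fi

    # Consecutive DEGRADED_CODE returns tolerated within retry_interval.
    tolerated=$((100 / degraded_code))
    # Number of probes that fit into one retry_interval window.
    probes_per_window=$((retry_interval / thorough_probe_interval))

    if [ "$tolerated" -gt "$probes_per_window" ]; then
        verdict=forever
    else
        verdict=limited
    fi
    echo "tolerated=$tolerated probes_per_window=$probes_per_window verdict=$verdict"
}

degraded_tolerance 10 300 60
degraded_tolerance 50 300 60
```

With these sample values, a DEGRADED_CODE of 10 gives 10 tolerated returns against 5 probes per window, so the degraded state is tolerated indefinitely. A DEGRADED_CODE of 50 gives only 2 tolerated returns, so the limit can be reached within one window.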

Tuning the Process Monitoring

The HA for Oracle GoldenGate agent can be tuned to run with process monitoring or without process monitoring.

You can tune the process monitoring by setting the pmf_managed property to either true or false.

The benefit of running with process monitoring is that if the whole process tree collapses, a resource action happens immediately. The benefit of running without process monitoring is that Oracle Solaris Cluster can take control of an Oracle GoldenGate instance that is already running.

Tuning the Failover Behavior

The default Failover_mode for an Oracle GoldenGate resource is soft. This default mode indicates that the resource can trigger a failover to another cluster node if probe errors happen too often. Such a failover also translates into an outage for the underlying database. If this is not the desired behavior, consider setting the Failover_mode of the Oracle GoldenGate resource to log_only or restart_only.
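As a minimal sketch, the failover mode of an existing resource could be inspected and changed as follows. The resource name gg-rs is hypothetical, and LOG_ONLY is just one of the alternatives mentioned above.

```shell
# gg-rs is a hypothetical resource name.

# Display the current failover mode.
clresource show -p Failover_mode gg-rs

# Log probe failures only, without restarting or failing over the resource.
clresource set -p Failover_mode=LOG_ONLY gg-rs
```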

For more general information about fault monitors, see Tuning Fault Monitors for Oracle Solaris Cluster Data Services in Oracle Solaris Cluster Data Services Planning and Administration Guide.

The Oracle GoldenGate agent inherits the genuine behavior of the ORCL.gds resource type. For more information about the ORCL.gds resource type, see Oracle Solaris Cluster Generic Data Service (GDS) Guide.