Sun Cluster Data Service for SWIFTAlliance Access Guide for Solaris OS

Understanding the Sun Cluster HA for Alliance Access Fault Monitor

This section describes the probing algorithm of the Sun Cluster HA for Alliance Access fault monitor, and states the conditions, messages, and recovery actions associated with unsuccessful probing.

For conceptual information on fault monitors, see the Sun Cluster Concepts Guide.

Resource Properties

The Sun Cluster HA for Alliance Access fault monitor uses the same resource properties as the resource type SUNW.gds. Refer to the SUNW.gds(5) man page for a complete list of the resource properties used.

Probing Algorithm and Functionality

By default, the HA agent provides a fault monitor for the DCE component only when using Alliance Access 5.5. Fault monitoring for the Alliance Access application itself is switched off by default. If the Alliance Access application fails, the agent does not restart it automatically. This behavior was explicitly requested by SWIFT. It lets you operate the application without the probe interfering with the normal behavior of Alliance Access features such as:

operator manually triggering the Alliance Access restart function, for example, to run Alliance Access in housekeeping mode.
automatic or scheduled Alliance Access restart, for example, to run database backup and other maintenance or end-of-day processes.
any graceful Alliance Access restart or recovery, in case of a transient local Alliance Access error.

When the Alliance Access instance is offline, the HA agent does not restart it but updates the resource status message to Degraded - SAA Instance offline.

If an automatic failover occurs with the default setting, the most likely cause is a DCE problem. The Alliance Access application causes a failover only when it fails to start on the current node.

The HA agent provides an option to turn on fault monitoring for Alliance Access at registration time. However, this option is not recommended by SWIFT. The optional probing checks for the existence of the Alliance Access instance by calling the alliance command that is part of the application and by evaluating its return code. If the Alliance Access instance is not running, return code 100 is sent to SUNW.gds, which in turn will perform an automatic restart depending on the configuration of the resource properties.
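The optional probing described above can be pictured with the following minimal sketch. It is not the agent's actual probe, which ships with the data service. It only illustrates the idea of running a status command and mapping its return code to the value handed back to SUNW.gds. The function name probe_instance, the timeout handling, and the convention that the status command exits with 0 when the instance is running are assumptions made for illustration. The exact invocation of the alliance command is not given in this section, so the command is passed in as a parameter rather than spelled out.

```python
import subprocess
import sys
from typing import Sequence

PROBE_OK = 0        # exit status that GDS treats as a successful probe
PROBE_FAILED = 100  # return code named in the text for "instance not running"


def probe_instance(status_cmd: Sequence[str], timeout_s: float = 30.0) -> int:
    """Run a status command and map its outcome to a probe return code.

    status_cmd is a placeholder for the application's own status check.
    Assumption: the command exits with 0 when the instance is running.
    """
    try:
        completed = subprocess.run(
            list(status_cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Assumption: a command that cannot be started or that hangs
        # is treated the same way as an instance that is not running.
        return PROBE_FAILED

    return PROBE_OK if completed.returncode == 0 else PROBE_FAILED
```

A wrapper script would typically end with sys.exit(probe_instance(sys.argv[1:])) so that the return code reaches SUNW.gds as the exit status of the probe. What SUNW.gds then does with a value of 100 (restart or failover) still depends on the resource properties, as stated above.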