Sun Cluster 3.0 Data Services Developers' Guide

Overview of the Sample Application

The sample data service starts, stops, restarts, and switches the DNS application among the nodes of the cluster in response to cluster events such as administrative action, application failure, or node failure.

Application restart is managed by the Sun Cluster 3.0 Process Monitor Facility (PMF). If the number of application deaths exceeds the failure count within the failure time window, the resource group containing the application resource is automatically failed over to another node.
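The restart-or-failover rule described above can be illustrated with a small sliding-window counter. This is a minimal sketch of the decision logic only, not the PMF implementation; the class name FailureWindow and the parameters failure_count and window_seconds are hypothetical names chosen for the example.

```python
from collections import deque


class FailureWindow:
    """Tracks application deaths inside a sliding time window (illustrative only)."""

    def __init__(self, failure_count, window_seconds):
        # Hypothetical parameters: the allowed number of deaths and the
        # length of the window, in seconds, in which they are counted.
        self.failure_count = failure_count
        self.window_seconds = window_seconds
        self.deaths = deque()

    def record_death(self, now):
        """Record a death at time `now` and return 'restart' or 'failover'."""
        self.deaths.append(now)
        # Discard deaths that have aged out of the window.
        while self.deaths and now - self.deaths[0] > self.window_seconds:
            self.deaths.popleft()
        # Fail over only when the deaths in the window exceed the count.
        if len(self.deaths) > self.failure_count:
            return "failover"
        return "restart"
```

With a failure count of 2 and a 60-second window, the first two deaths result in a local restart, and a third death inside the same window results in a failover. A death that occurs after the earlier ones have aged out of the window is treated as a fresh failure.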

The sample data service provides fault monitoring in the form of a PROBE method that uses the nslookup command to verify that the data service is healthy. If the probe detects a hung DNS data service, it tries to correct the situation by restarting the DNS application locally. If the restart does not help and the probe repeatedly detects problems with the data service, the probe attempts to fail over the data service to another node in the cluster.
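The escalation from local restart to failover can be sketched as follows. This is an illustration under stated assumptions, not the sample data service's PROBE method: the function names, the max_local_restarts threshold, and the default host names are hypothetical, and the health check assumes that nslookup exits with a nonzero status on failure, which is not true of every nslookup implementation.

```python
import subprocess


def dns_is_healthy(server="localhost", name="localhost", timeout=5):
    """Return True if nslookup answers for `name` from `server` in time.

    Assumption: nslookup exits nonzero on failure. Some implementations
    do not, in which case the command output must be inspected instead.
    """
    try:
        result = subprocess.run(
            ["nslookup", name, server],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung server (timeout) or a missing command counts as unhealthy.
        return False
    return result.returncode == 0


def handle_probe_result(healthy, local_restarts, max_local_restarts=2):
    """Decide what to do after one probe.

    Returns (action, new_local_restarts), where action is 'none',
    'restart' (restart the application locally), or 'failover'.
    The threshold max_local_restarts is a hypothetical parameter.
    """
    if healthy:
        return "none", 0
    if local_restarts < max_local_restarts:
        return "restart", local_restarts + 1
    return "failover", 0
```

A monitoring loop would call dns_is_healthy at a fixed interval and pass the result, together with the running restart count, to handle_probe_result. Keeping the decision in a pure function separates the escalation policy from the nslookup call, so the policy can be exercised without a running DNS server.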

Specifically, the sample application includes:

A resource type registration file that defines the static properties of the data service.
A START callback method invoked by the RGM to start the in.named daemon when the resource group containing the HA-DNS data service is brought online or when the HA-DNS resource is enabled.
A STOP callback method invoked by the RGM to stop the in.named daemon when the resource group containing HA-DNS goes offline or the resource is disabled.
A fault monitor to check the reliability of the data service by verifying that the DNS server is running. The fault monitor is implemented by a user-defined PROBE method and started and stopped by MONITOR_START and MONITOR_STOP callback methods.
A VALIDATE callback method invoked by the RGM to validate that the configuration directory for the data service is accessible.
An UPDATE callback method invoked by the RGM to restart the fault monitor when the system administrator changes the value of a resource property.
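As an illustration of the kind of check a VALIDATE method performs, the following sketch tests whether a configuration directory is accessible. It is not the sample data service's method: the function name validate_confdir is hypothetical, and the sketch assumes the usual convention that a callback reports success with a return value of 0 and failure with a nonzero value.

```python
import os


def validate_confdir(confdir):
    """Return 0 if `confdir` is an accessible directory, 1 otherwise.

    Hypothetical helper: `confdir` stands for the data service's
    configuration directory.
    """
    if not os.path.isdir(confdir):
        # The path is missing or is not a directory.
        return 1
    if not os.access(confdir, os.R_OK | os.X_OK):
        # The directory exists but cannot be read or searched.
        return 1
    return 0
```

A caller would pass the configured directory and treat any nonzero result as a reason to reject the configuration.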