The Sun Cluster 3.0 system enables applications to be run and administered as highly available and scalable resources (data services). A cluster facility called the Resource Group Manager (RGM) provides the mechanism for high availability. The programming interface to this facility consists of the following elements:
A set of callback methods the RGM uses to control an application on the cluster
API commands and functions that callback methods can use to access information about the elements in the cluster
Process management facilities for monitoring and restarting processes on the cluster
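To make the callback idea concrete, the following is a minimal sketch in Python of a manager invoking methods that an application has registered. It is a toy model only: the names ResourceType, invoke, "start", "stop", and "toy-web-server" are assumptions made for illustration and are not part of the Sun Cluster API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical model: a resource type is a table of named callbacks.
# None of these names come from the Sun Cluster programming interface.
@dataclass
class ResourceType:
    name: str
    callbacks: Dict[str, Callable[[str], None]] = field(default_factory=dict)

    def invoke(self, method: str, node: str) -> bool:
        """Run the named callback for a node; return False if none is registered."""
        callback = self.callbacks.get(method)
        if callback is None:
            return False
        callback(node)
        return True


# Records what the callbacks did, so the effect is easy to inspect.
log: List[str] = []

web_server = ResourceType(
    name="toy-web-server",
    callbacks={
        "start": lambda node: log.append(f"start on {node}"),
        "stop": lambda node: log.append(f"stop on {node}"),
    },
)

# The manager, not the application, decides when each callback runs.
web_server.invoke("start", "node1")
web_server.invoke("stop", "node1")
print(log)
```

The point of the sketch is the inversion of control: the application supplies the methods, and the managing facility chooses when and on which node to call them.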
The RGM runs as a daemon on each cluster node and automatically starts and stops resources on selected nodes according to preconfigured policies. If a node fails or reboots, the RGM keeps a resource highly available by stopping the resource on the affected node and starting it on another node. The RGM also automatically starts and stops resource-specific monitors. These monitors can detect resource failures and relocate failing resources to another node, or they can monitor other aspects of resource performance.
The RGM supports both failover resources, which can be online on at most one node at a time, and scalable resources, which can be online on multiple nodes simultaneously.
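The difference between the two resource kinds can be sketched as a small placement model in Python. This is an illustration under stated assumptions, not the RGM's actual policy logic: the functions place and plan, the max_online parameter, and the rule of choosing nodes in list order are all hypothetical.

```python
from typing import List, Set, Tuple


def place(kind: str, healthy_nodes: List[str], max_online: int = 0) -> Set[str]:
    """Choose the nodes a resource should be online on.

    Assumed rules: a "failover" resource gets at most one node; a "scalable"
    resource gets up to max_online nodes (0 means every healthy node).
    """
    if kind == "failover":
        return set(healthy_nodes[:1])
    if kind == "scalable":
        limit = max_online if max_online > 0 else len(healthy_nodes)
        return set(healthy_nodes[:limit])
    raise ValueError(f"unknown resource kind: {kind}")


def plan(current: Set[str], target: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (nodes to stop on, nodes to start on) to move from current to target."""
    return sorted(current - target), sorted(target - current)


nodes = ["node1", "node2", "node3"]
failover_online = place("failover", nodes)
scalable_online = place("scalable", nodes)

# Simulate the loss of node1 and recompute the placement for each kind.
healthy = [n for n in nodes if n != "node1"]
failover_plan = plan(failover_online, place("failover", healthy))
scalable_plan = plan(scalable_online, place("scalable", healthy))

print(failover_plan)  # (['node1'], ['node2'])
print(scalable_plan)  # (['node1'], [])
```

In this model, losing node1 moves the failover resource to node2, while the scalable resource simply continues on the two remaining nodes.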