Oracle Solaris Cluster Data Services Developer's Guide
The DSDL absorbs much of the complexity of implementing a fault monitor by providing a predetermined model. A Monitor_start method starts the fault monitor, under the control of the PMF, when the resource starts on a node. The fault monitor runs in a loop as long as the resource is running on the node.
The high-level logic of a DSDL fault monitor is as follows:
The scds_fm_sleep() function uses the Thorough_probe_interval property to determine the amount of time between probes. Any application process failures that are detected by the PMF during this interval lead to a restart of the resource.
The probe itself returns a value that indicates the severity of the failure, from 0 (no failure) to 100 (complete failure).
The probe return value is passed to the scds_fm_action() function, which maintains a cumulative failure history within the interval specified by the Retry_interval property.
The scds_fm_action() function determines what to do in the event of a failure, as follows:
If the cumulative failure is below 100, do nothing.
If the cumulative failure reaches 100 (complete failure), restart the data service. If the time specified by Retry_interval has elapsed, reset the failure history.
If the number of restarts exceeds the value of the Retry_count property, within the time specified by Retry_interval, fail over the data service.
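The decision rules above can be illustrated with a small, self-contained model. This is only a sketch and does not use the DSDL: the names fm_history_t, fm_history_init(), and fm_decide() are hypothetical, time is passed in as plain seconds, and the history is assumed to be a simple window that is cleared once Retry_interval has elapsed. The actual library may track its history differently.

```c
#include <assert.h>
#include <stddef.h>

#define FM_COMPLETE_FAILURE 100

/* Hypothetical outcome of one decision; not a DSDL type. */
typedef enum { FM_NONE, FM_RESTART, FM_FAILOVER } fm_decision_t;

/* Hypothetical failure history; models Retry_count and Retry_interval. */
typedef struct {
    int  retry_count;     /* models the Retry_count property */
    long retry_interval;  /* models the Retry_interval property, in seconds */
    long window_start;    /* time at which the current history began */
    int  cumulative;      /* accumulated probe failure in this window */
    int  restarts;        /* restarts requested in this window */
} fm_history_t;

void fm_history_init(fm_history_t *h, int retry_count, long retry_interval)
{
    assert(h != NULL && retry_count >= 0 && retry_interval > 0);
    h->retry_count = retry_count;
    h->retry_interval = retry_interval;
    h->window_start = 0;
    h->cumulative = 0;
    h->restarts = 0;
}

/*
 * Record one probe result (0 to 100) observed at time `now` and
 * return the action that the rules in the text call for.
 */
fm_decision_t fm_decide(fm_history_t *h, int probe_status, long now)
{
    /* Assumption: once Retry_interval has elapsed, the history is reset. */
    if (now - h->window_start > h->retry_interval) {
        h->window_start = now;
        h->cumulative = 0;
        h->restarts = 0;
    }

    /* Clamp the probe result to the documented 0..100 range. */
    if (probe_status < 0)
        probe_status = 0;
    if (probe_status > FM_COMPLETE_FAILURE)
        probe_status = FM_COMPLETE_FAILURE;

    h->cumulative += probe_status;
    if (h->cumulative < FM_COMPLETE_FAILURE)
        return FM_NONE;            /* below complete failure: do nothing */

    /* Complete failure: start accumulating afresh and count a restart. */
    h->cumulative = 0;
    h->restarts++;
    if (h->restarts > h->retry_count)
        return FM_FAILOVER;        /* too many restarts in the window */
    return FM_RESTART;
}
```

In this model, with a Retry_count of 2, two partial failures of 50 inside the window add up to a complete failure and yield FM_RESTART. A third complete failure inside the same window yields FM_FAILOVER, because the number of restarts then exceeds Retry_count.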