Oracle Solaris Cluster Data Services Developer's Guide     Oracle Solaris Cluster
Designing the Fault Monitor Daemon

Resource type implementations that use the DSDL typically have a fault monitor daemon that periodically probes the health of the application, keeps a history of the probe failures, and takes action when that history calls for it.

The DSDL utilities are designed so that the main loop of the fault monitor daemon can be represented by the pseudocode at the end of this section.

Keep the following factors in mind when you implement a fault monitor with the DSDL:

In most cases, you can implement the application-specific health check action in a separate stand-alone utility (svc_probe(), for example). You can integrate it with the following generic main loop.

for (;;) {
   /*
    * Sleep for a duration of thorough_probe_interval between
    * successive probes.
    */
   (void) scds_fm_sleep(scds_handle,
      scds_get_rs_thorough_probe_interval(scds_handle));

   /*
    * Now probe all IP addresses we use. Loop over
    * 1. All net resources we use.
    * 2. All IP addresses in a given resource.
    * For each IP address that is probed,
    * compute the failure history.
    */
   probe_result = 0;

   /*
    * Iterate through all resources to get each
    * IP address to use for calling svc_probe().
    */
   for (ip = 0; ip < netaddr->num_netaddrs; ip++) {
      /*
       * Grab the hostname and port on which the
       * health has to be monitored.
       * HA-XFS supports only one port and
       * hence obtains the port value from the
       * first entry in the array of ports.
       */
      hostname = netaddr->netaddrs[ip].hostname;
      port = netaddr->netaddrs[ip].port_proto.port;

      /* Latch probe start time */
      ht1 = gethrtime();

      probe_result = svc_probe(scds_handle, hostname, port, timeout);

      /*
       * Update service probe history,
       * take action if necessary.
       * Latch probe end time.
       */
      ht2 = gethrtime();

      /* Convert nanoseconds to milliseconds */
      dt = (ulong_t)((ht2 - ht1) / 1e6);

      /*
       * Compute failure history and take
       * action if needed.
       */
      (void) scds_fm_action(scds_handle,
         probe_result, (long)dt);
   }       /* Each net resource */
}       /* Keep probing forever */
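
The loop above leaves the failure-history bookkeeping to scds_fm_action(). To make that step less opaque, here is a minimal, self-contained sketch of one way such a history could work. It is not the DSDL implementation. It assumes that a probe result is a failure weight where 100 means complete failure, that weights accumulate inside a sliding window, and that the window length and restart limit stand in for retry-interval and retry-count settings. Every name in it (fm_history, fm_record, fm_elapsed_ms, and so on) is hypothetical. The helper fm_elapsed_ms shows the same nanoseconds-to-milliseconds conversion as the loop, using the portable struct timespec rather than gethrtime().

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define FM_COMPLETE_FAILURE 100   /* assumed weight of a complete failure */
#define FM_MAX_EVENTS 32          /* hypothetical cap on remembered events */

typedef enum { FM_OK, FM_RESTART, FM_FAILOVER } fm_decision;

typedef struct { long when; int weight; } fm_event;

typedef struct {
    long retry_interval;               /* sliding window, in seconds */
    int retry_count;                   /* restarts tolerated per window */
    fm_event failures[FM_MAX_EVENTS];  /* recent probe failures */
    size_t nfailures;
    fm_event restarts[FM_MAX_EVENTS];  /* recent restarts */
    size_t nrestarts;
} fm_history;

/* Drop events that fell out of the window; return the new count. */
static size_t fm_prune(fm_event *ev, size_t n, long now, long window)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (now - ev[i].when <= window)
            ev[kept++] = ev[i];
    return kept;
}

/* Append an event, discarding the oldest one if the array is full. */
static size_t fm_push(fm_event *ev, size_t n, long when, int weight)
{
    if (n == FM_MAX_EVENTS) {
        for (size_t i = 1; i < n; i++)
            ev[i - 1] = ev[i];
        n--;
    }
    ev[n].when = when;
    ev[n].weight = weight;
    return n + 1;
}

/* Record one probe result taken at time `now` (seconds) and decide. */
fm_decision fm_record(fm_history *h, long now, int probe_result)
{
    assert(h != NULL && h->retry_interval > 0);

    h->nfailures = fm_prune(h->failures, h->nfailures, now, h->retry_interval);
    h->nrestarts = fm_prune(h->restarts, h->nrestarts, now, h->retry_interval);

    if (probe_result <= 0)
        return FM_OK;                  /* healthy probe, nothing to add */

    h->nfailures = fm_push(h->failures, h->nfailures, now, probe_result);

    int total = 0;
    for (size_t i = 0; i < h->nfailures; i++)
        total += h->failures[i].weight;
    if (total < FM_COMPLETE_FAILURE)
        return FM_OK;                  /* only partial failures so far */

    h->nfailures = 0;                  /* the action consumes the history */
    if ((int)h->nrestarts >= h->retry_count)
        return FM_FAILOVER;            /* too many restarts in the window */

    h->nrestarts = fm_push(h->restarts, h->nrestarts, now, 0);
    return FM_RESTART;
}

/* Elapsed milliseconds between two timestamps. */
long fm_elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    long long ns = (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
                 + (end->tv_nsec - start->tv_nsec);
    return (long)(ns / 1000000LL);
}
```

In a loop like the one above, a daemon built on this sketch would call fm_record() once per probe and switch on the returned decision. Expiring old entries on every call means that isolated partial failures age out on their own instead of eventually adding up to a restart.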