Sun Cluster Data Services Developer's Guide for Solaris OS

Chapter 7 Designing Resource Types

This chapter explains the typical usage of the DSDL in designing and implementing resource types. This chapter also focuses on designing the resource type to validate the resource configuration, and to start, stop, and monitor the resource. This chapter finally describes how to use the DSDL to implement the resource type callback methods.

Refer to the rt_callbacks(1HA) man page for additional information.

You need access to the resource's property settings to complete these tasks. The DSDL utility scds_initialize() gives you a uniform way to access the resource properties. This function is designed to be called at the beginning of each callback method. It retrieves all the properties for a resource from the cluster framework and makes them available to the scds_get_name() family of functions.

The RTR File

The Resource Type Registration (RTR) file is an important component of a resource type. This file specifies the details about the resource type to Sun Cluster. These details include information such as the properties that are needed by the implementation, the data types of those properties, the default values of those properties, the file system path for the callback methods for the resource type implementation, and various settings for the system-defined properties.

The sample RTR file that is shipped with the DSDL should suffice for most resource type implementations. All you need to do is edit some basic elements such as the resource type name and the pathname of the resource type callback methods. If a new property is needed to implement the resource type, you can declare it as an extension property in the Resource Type Registration (RTR) file of the resource type implementation, and then access the new property using the DSDL scds_get_ext_property() utility.
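
For illustration, an extension property declaration in an RTR file might look like the following fragment. The property name App_conf_file, its default, and its description are hypothetical, and the attribute set is only a sketch; rt_reg(4) remains the authoritative reference for the syntax.

```
# Hypothetical extension property; name, default, and description
# are illustrative only.
{
        PROPERTY = App_conf_file;
        EXTENSION;
        STRING;
        DEFAULT = "app.conf";
        TUNABLE = AT_CREATION;
        DESCRIPTION = "Name of the application configuration file";
}
```

A callback method could then read the value by passing the property name to scds_get_ext_property().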

The `Validate` Method

The Validate method of a resource type implementation is called by the RGM under one of the following two conditions:

A new resource of the resource type is being created
A property of the resource or resource group is being updated

These two scenarios can be distinguished by the presence of the command line option -c (creation) or -u (update) passed to the Validate method of the resource.
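
As a rough illustration of how the two cases can be told apart, the following sketch scans the method's arguments with getopt(). The function validate_mode() is hypothetical and is not part of the DSDL; when you call scds_initialize(), the library does this parsing for you. The option string assumes the Validate argument set described in rt_callbacks(1HA): -R, -T, and -G, then -c or -u, then -r, -x, and -g for proposed property values.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/*
 * validate_mode() is a hypothetical helper, not a DSDL function.
 * Returns 'c' if -c (creation) was passed, 'u' if -u (update) was
 * passed, and 0 if neither was passed, both were passed, or an
 * option could not be parsed.
 */
int
validate_mode(int argc, char *const argv[])
{
   int opt;
   int mode = 0;

   assert(argv != NULL);
   optind = 1;   /* restart scanning at the first argument */
   opterr = 0;   /* do not print messages for unknown options */
   while ((opt = getopt(argc, argv, "R:T:G:r:x:g:cu")) != -1) {
      switch (opt) {
      case 'c':
      case 'u':
         if (mode != 0 && mode != opt)
            return (0);   /* -c and -u are mutually exclusive */
         mode = opt;
         break;
      case '?':
         return (0);      /* unknown option or missing argument */
      default:
         break;           /* -R, -T, -G, -r, -x, -g: ignored here */
      }
   }
   return (mode);
}
```

A Validate method that needs different checks at creation time and at update time could branch on the returned value.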

The Validate method is called on each node of a set of nodes, where the set of nodes is defined by the value of the resource type property INIT_NODES. If INIT_NODES is set to RG_PRIMARIES, Validate is called on each node that can host (be a primary of) the resource group containing the resource. If INIT_NODES is set to RT_INSTALLED_NODES, Validate is called on each node where the resource type software is installed, typically all nodes in the cluster. The default value of INIT_NODES is RG_PRIMARIES (see rt_reg(4)).

At the point the Validate method is called, the RGM has not yet created the resource (in the case of a creation callback) or has not yet applied the updated values of the properties being updated (in the case of an update callback). The purpose of the Validate callback method of a resource type implementation is to check that the proposed resource settings (as specified by the proposed property settings on the resource) are acceptable to the resource type.

Note –

If you are using local file systems managed by HAStoragePlus, use scds_hasp_check() to check the state of the HAStoragePlus resources. This information is obtained from the state (online or otherwise) of all SUNW.HAStoragePlus(5) resources that the resource depends upon through the Resource_dependencies or Resource_dependencies_weak system properties defined for the resource. See scds_hasp_check(3HA) for a complete list of status codes returned from the scds_hasp_check() call.

The DSDL function scds_initialize() takes care of these situations in the following manner:

In the case of resource creation, it parses the proposed resource properties, as passed on the command line. The proposed values of resource properties are thus available to the resource type developer as if the resource were already created in the system.
In the case of resource or resource group update, the proposed values of the properties being updated by the administrator are read in from the command line, and the remaining properties (whose values are not being updated) are read in from Sun Cluster using the Resource Management API. A resource type developer using the DSDL need not be concerned with these housekeeping tasks. The validation of a resource can be done as if all the properties of the resource were available to the developer.

Suppose the function that implements the validation of a resource's properties is called svc_validate(), and that it uses the scds_get_name() family of functions to look at the properties it needs to validate. Assuming that this function returns 0 for an acceptable resource setting, the Validate method of the resource type can be represented by the following code fragment:

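#include <rgm/libdsdev.h>

int svc_validate(scds_handle_t handle);   /* Application-specific checks */

int
main(int argc, char *argv[])
{
   scds_handle_t handle;
   int rc;

   /* Retrieve the resource properties, including any proposed values */
   if (scds_initialize(&handle, argc, argv) != SCHA_ERR_NOERR) {
      return (1);   /* Initialization Error */
   }
   rc = svc_validate(handle);   /* 0 means the settings are acceptable */
   scds_close(&handle);
   return (rc);
}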

The validation function should also log the reason for a validation failure. Leaving out that detail (see the next chapter for a more realistic treatment of a validation function), a simple example svc_validate() function can be implemented as follows:

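#include <sys/types.h>
#include <sys/stat.h>
#include <rgm/libdsdev.h>

int
svc_validate(scds_handle_t handle)
{
   scha_str_array_t *confdirs;
   struct stat statbuf;

   /* Confdir_list is an extension property, hence the _ext_ accessor */
   confdirs = scds_get_ext_confdir_list(handle);
   if (confdirs == NULL || confdirs->array_cnt < 1) {
      return (1);   /* Confdir_list is not set */
   }
   if (stat(confdirs->str_array[0], &statbuf) == -1) {
      return (1);   /* Invalid resource property setting */
   }
   return (0);   /* Acceptable setting */
}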

The resource type developer thus needs to be concerned only with the implementation of the svc_validate() function. A typical check for a resource type implementation is to ensure that an application configuration file named app.conf exists under a directory listed in the Confdir_list property. That check can be conveniently implemented by a stat() system call on the appropriate pathname derived from the Confdir_list property.
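
The app.conf check described above can be sketched without any cluster calls. The helper conf_file_missing() below is hypothetical and is not a DSDL function; it assumes that the directory list has already been extracted from Confdir_list into a plain array of strings, and that an empty list should be treated as an invalid setting.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * conf_file_missing() is a hypothetical helper, not a DSDL function.
 * It checks that a regular file called name exists in every directory
 * of dirs[0..ndirs-1].  Returns 0 if the file is present everywhere,
 * 1 otherwise (including when the list is empty).
 */
int
conf_file_missing(const char *const dirs[], size_t ndirs, const char *name)
{
   char path[1024];
   struct stat statbuf;
   size_t i;
   int len;

   assert(dirs != NULL && name != NULL);
   if (ndirs == 0)
      return (1);      /* nothing configured: treat as invalid */
   for (i = 0; i < ndirs; i++) {
      len = snprintf(path, sizeof (path), "%s/%s", dirs[i], name);
      if (len < 0 || (size_t)len >= sizeof (path))
         return (1);   /* pathname does not fit in the buffer */
      if (stat(path, &statbuf) == -1 || !S_ISREG(statbuf.st_mode))
         return (1);   /* missing, inaccessible, or not a regular file */
   }
   return (0);
}
```

Checking every directory rather than only the first is a design choice of this sketch; whether that is appropriate depends on how the application uses Confdir_list.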

The `Start` Method

The Start callback method of a resource type implementation is called by the RGM on a chosen cluster node to start the resource. The resource group name, the resource name, and resource type name are passed on the command line. The Start method is expected to perform the actions needed to start up a data service resource on the cluster node. Typically this involves retrieving the resource properties, locating the application specific executables and/or configuration files, and launching the application with appropriate command line arguments.

With the DSDL, the resource configuration is already retrieved by the scds_initialize() utility. The startup action for the application can be contained in a function svc_start(). Another function, svc_wait(), can be called to verify that the application actually starts. The simplified code for the Start method becomes:

int
main(int argc, char *argv[])
{
   scds_handle_t handle;

   if (scds_initialize(&handle, argc, argv) != SCHA_ERR_NOERR) {
      return (1);   /* Initialization Error */
   }
   if (svc_validate(handle) != 0) {
      return (1);   /* Invalid settings */
   }
   if (svc_start(handle) != 0) {
      return (1);   /* Start failed */
   }
   return (svc_wait(handle));
}

This Start method implementation calls svc_validate() to validate the resource configuration. If validation fails, either the resource configuration and the application configuration do not match, or there is currently a system problem on this cluster node. For example, a global file system needed by the resource might not currently be available on this cluster node. In this case, it is futile to even attempt to start the resource on this cluster node; it is better to let the RGM attempt to start the resource on a different node. Note, however, that this approach assumes svc_validate() is sufficiently conservative, checking only for those resources on the cluster node that the application absolutely needs. Otherwise, the resource might fail to start on all cluster nodes and land in the START_FAILED state. See scswitch(1M) and the Sun Cluster Data Services Planning and Administration Guide for Solaris OS for an explanation of this state.

The svc_start() function must return 0 for a successful startup of the resource on the node. If the startup function encountered a problem, it must return non-zero. Upon failure of this function, the RGM attempts to start the resource on a different cluster node.
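
One way a function such as svc_wait() might be structured is as a bounded polling loop around a health probe. The sketch below is only an assumption about its shape: probe_fn_t, wait_until_up(), and countdown_probe() are hypothetical names, the stand-in probe exists purely for demonstration, and a real implementation would probe the application itself and keep the total wait within the method's timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/* Hypothetical probe callback: returns 0 when the application responds. */
typedef int (*probe_fn_t)(void *arg);

/*
 * wait_until_up() is a hypothetical helper, not a DSDL function.
 * Calls probe until it returns 0 or until max_tries attempts have
 * been made, sleeping interval_secs between attempts.  Returns 0 if
 * the application came up and 1 otherwise, matching the return
 * convention used for svc_start().
 */
int
wait_until_up(probe_fn_t probe, void *arg, unsigned int max_tries,
    unsigned int interval_secs)
{
   unsigned int attempt;

   assert(probe != NULL);
   for (attempt = 0; attempt < max_tries; attempt++) {
      if (probe(arg) == 0)
         return (0);                    /* application is responding */
      if (attempt + 1 < max_tries)
         (void) sleep(interval_secs);   /* pause before the next try */
   }
   return (1);                          /* gave up: report start failure */
}

/* Stand-in probe for demonstration: fails until *arg counts down to 0. */
int
countdown_probe(void *arg)
{
   int *remaining = arg;

   if (*remaining > 0) {
      (*remaining)--;
      return (1);
   }
   return (0);
}
```

Returning non-zero after the last attempt lets the RGM treat the start as failed and try another node.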

To leverage the DSDL as much as possible, the svc_start() function can use the scds_pmf_start() utility to start the application under the Process Management Facility (PMF). This utility also leverages the failure callback action feature of PMF (see the -a action flag in pmfadm(1M)) to implement process failure detection.

The `Stop` Method

The Stop callback method of a resource type implementation is called by the RGM on a cluster node to stop the application. The callback semantics for the Stop method demand that:

The Stop method must be idempotent because the Stop method can be called by the RGM even if the Start method did not complete successfully on the node. Thus the Stop method must succeed (exit zero) even if the application is not currently running on the cluster node and there is no work for it to do.
If the Stop method of the resource type fails (exits non-zero) on a cluster node, the resource being stopped would end up in the STOP_FAILED state. Depending upon the Failover_mode setting on the resource, this may lead to a hard rebooting of the cluster node by the RGM. Thus it is important to design the Stop method so that it tries very hard to really stop the application, even by a hard and abrupt killing of the application (for example, using SIGKILL) if the application otherwise fails to terminate. It should also make sure that it does so in a timely fashion, because the framework treats expiry of Stop_timeout as a stop failure, and puts the resource in STOP_FAILED state.

The DSDL utility scds_pmf_stop() should suffice for most applications. It assumes that the application was started under PMF via scds_pmf_start(). It first attempts to stop the application softly (via SIGTERM) and, if the application does not terminate, delivers a SIGKILL to the process. See PMF Functions for details about this utility.
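
For an application that is not running under PMF, the same soft-then-hard escalation can be sketched with plain POSIX calls. The helper stop_process() below is hypothetical and is not how scds_pmf_stop() is implemented; the grace periods are parameters chosen by the caller, and the sketch assumes that the process ID of the application is already known.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Returns 1 if pid no longer exists (reaping it if it is our child). */
static int
process_gone(pid_t pid)
{
   if (waitpid(pid, NULL, WNOHANG) == pid)
      return (1);
   return (kill(pid, 0) == -1 && errno == ESRCH);
}

/* Polls for up to secs seconds, in 100 ms steps, for pid to disappear. */
static int
wait_gone(pid_t pid, int secs)
{
   struct timespec step = { 0, 100 * 1000 * 1000 };
   int i;

   for (i = 0; i < secs * 10; i++) {
      if (process_gone(pid))
         return (1);
      (void) nanosleep(&step, NULL);
   }
   return (process_gone(pid));
}

/*
 * stop_process() is a hypothetical helper, not a DSDL function.
 * Sends SIGTERM and waits soft_secs; if the process survives, sends
 * SIGKILL and waits hard_secs.  Returns 0 if the process is gone,
 * including the case where it was not running at all, and 1 if it
 * survived both signals.
 */
int
stop_process(pid_t pid, int soft_secs, int hard_secs)
{
   assert(pid > 0);
   if (process_gone(pid))
      return (0);               /* nothing to do is still a success */
   (void) kill(pid, SIGTERM);   /* soft stop */
   if (wait_gone(pid, soft_secs))
      return (0);
   (void) kill(pid, SIGKILL);   /* hard stop */
   return (wait_gone(pid, hard_secs) ? 0 : 1);
}
```

Returning 0 when the process is already gone keeps the helper idempotent, which the Stop method requires. The sum of the two grace periods should stay below Stop_timeout.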

Following the model of the code used so far, assume that the application-specific function that stops the application is called svc_stop(). Whether svc_stop() uses scds_pmf_stop() is beside the point here and depends on whether the application was started under PMF by the Start method. The Stop method can then be implemented as follows:

int
main(int argc, char *argv[])
{
   scds_handle_t handle;

   if (scds_initialize(&handle, argc, argv) != SCHA_ERR_NOERR) {
      return (1);   /* Initialization Error */
   }
   return (svc_stop(handle));
}

The svc_validate() method is not used in the implementation of the Stop method, because even if the system currently has a problem, the Stop method should attempt to stop the application on this node.

The `Monitor_start` Method

The RGM calls the Monitor_start method to start a fault monitor for the resource. Fault monitors monitor the health of the application being managed by the resource. Resource type implementations typically implement a fault monitor as a separate daemon which runs in the background. The Monitor_start callback method is used to launch this daemon with the appropriate arguments.

Because the monitor daemon itself is prone to failures (for example, it could die, leaving the application unmonitored), you should use the PMF to start the monitor daemon. The DSDL utility scds_pmf_start() has built-in support for starting fault monitors:

It uses the pathname of the monitor daemon program relative to RT_basedir, the location of the resource type callback method implementations.
It uses the Monitor_retry_interval and Monitor_retry_count extension properties managed by the DSDL to prevent unlimited restarts of the daemon.
It imposes the same command line syntax as defined for all callback methods (that is, -R resource -G resource_group -T resource_type) onto the monitor daemon, although the monitor daemon is never called directly by the RGM.
It allows the monitor daemon implementation itself to leverage the scds_initialize() utility to set up its own environment.

The main effort is in designing the monitor daemon itself.

The `Monitor_stop` Method

The RGM calls the Monitor_stop method to stop the fault monitor daemon that was started via the Monitor_start method. Failure of this callback method is treated in exactly the same fashion as failure of the Stop method; therefore the Monitor_stop method must be idempotent and robust like the Stop method.

If you use the scds_pmf_start() utility to start the fault monitor daemon, use the scds_pmf_stop() utility to stop it.

The `Monitor_check` Method

The Monitor_check callback method on a resource is invoked on a node for the specified resource to ascertain whether the cluster node is capable of mastering the resource (that is, can the application(s) being managed by the resource be run successfully on the node?). Typically this situation involves making sure that all the system resources needed by the application are indeed available on the cluster node. As discussed in The Validate Method, the function svc_validate() implemented by the developer is intended to ascertain at least that.

Depending upon the specific application being managed by the resource type implementation, the Monitor_check method can be written to do some additional tasks. The Monitor_check method must be implemented so that it does not conflict with other methods running concurrently. For developers using the DSDL it is recommended that the Monitor_check method leverage the svc_validate() function written for the purpose of implementing application specific validation of resource properties.

The `Update` Method

The RGM calls the Update method of a resource type implementation to apply any changes that were made by the system administrator to the configuration of the active resource. The Update method is only called on nodes (if any) where the resource is currently online.

The changes that have just been made to the resource configuration are guaranteed to be acceptable to the resource type implementation because the RGM runs the Validate method of the resource type before it runs the Update method. The Validate method is called before the resource or resource group properties are changed and the Validate method can veto the proposed changes. The Update method is called after the changes have been applied to give the active (online) resource the opportunity to take notice of the new settings.

As a resource type developer, you need to cautiously decide the properties that you want to be able to update dynamically and mark those with the TUNABLE = ANYTIME setting in the RTR file. Typically, you can specify that you want to be able to dynamically update any property of a resource type implementation that the fault monitor daemon uses, provided that the Update method implementation at least restarts the monitor daemon.

Possible candidates are as follows:

Thorough_Probe_Interval
Retry_Count
Retry_Interval
Monitor_retry_count
Monitor_retry_interval
Probe_timeout

These properties affect the way a fault monitor daemon checks the health of the service, how often the daemon performs checks, the history interval that the daemon uses to keep track of the errors, and the restart thresholds that are set by PMF. To implement updates of these properties the utility scds_pmf_restart() is provided in the DSDL.

If you need to be able to dynamically update a resource property, but the modification of that property might affect the running application, you need to implement the appropriate actions so that the updates to that property are correctly applied to any running instances of the application. Currently there is no way to facilitate this via the DSDL. Update is not passed the modified properties on the command line (as is Validate).

The `Init`, `Fini`, and `Boot` Methods

These methods are one-time action methods as defined by the Resource Management API specifications. The sample implementation that is included with the DSDL does not illustrate the use of these methods. However, all the facilities in the DSDL are available to these methods as well, should a resource type developer need them. Typically, the Init and the Boot methods would be exactly the same for a resource type implementation that implements a one-time action. The Fini method typically would perform an action that undoes the action of the Init or Boot methods.

Designing the Fault Monitor Daemon

Resource type implementations using the DSDL typically have a fault monitor daemon with the following responsibilities.

Periodically monitoring the health of the application being managed. This particular aspect of a monitor daemon is heavily application dependent and could vary widely from resource type to resource type. The DSDL has some built-in utility functions to perform health checks for simple TCP-based services. Monitors for applications that use ASCII-based protocols such as HTTP, NNTP, IMAP, and POP3 can be implemented using these utilities.

Keeping track of the problems encountered by the application using the resource properties Retry_interval and Retry_count. When the application fails completely, the monitor decides whether the PMF action script should restart the service or whether the application failures have accumulated so rapidly that a failover should be considered. The DSDL utilities scds_fm_action() and scds_fm_sleep() are intended to aid you in implementing this mechanism.

Taking appropriate actions (typically either restarting the application or attempting a failover of the containing resource group). The DSDL utility scds_fm_action() implements such an algorithm. It computes the current accumulation of probe failures in the past Retry_interval seconds for this purpose.

Updating the resource state so that application health state is available to the scstat command as well as to the cluster management GUI.
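
The bookkeeping behind the restart-or-failover decision can be approximated with a sliding window over failure timestamps. The sketch below is a deliberate simplification and not the algorithm that scds_fm_action() uses: the names fm_history_t, fm_decision_t, and fm_record_failure() are hypothetical, every failure is counted with equal weight, and the history capacity is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define HISTORY_MAX 32   /* arbitrary capacity chosen for this sketch */

typedef enum { FM_RESTART, FM_FAILOVER } fm_decision_t;

typedef struct {
   time_t when[HISTORY_MAX];   /* timestamps of recent failures */
   size_t count;               /* number of valid entries */
} fm_history_t;

/*
 * fm_record_failure() is hypothetical and only approximates the idea
 * behind scds_fm_action().  It drops failures older than
 * retry_interval seconds, records a failure at time now, and asks for
 * a failover once more than retry_count failures lie in the window.
 */
fm_decision_t
fm_record_failure(fm_history_t *h, time_t now, int retry_count,
    int retry_interval)
{
   size_t i, kept = 0;

   assert(h != NULL && retry_count >= 0 && retry_interval > 0);
   for (i = 0; i < h->count; i++) {
      if ((long)(now - h->when[i]) < (long)retry_interval)
         h->when[kept++] = h->when[i];   /* still inside the window */
   }
   if (kept == HISTORY_MAX) {            /* full: discard the oldest */
      for (i = 1; i < kept; i++)
         h->when[i - 1] = h->when[i];
      kept--;
   }
   h->when[kept++] = now;
   h->count = kept;
   return ((int)kept > retry_count ? FM_FAILOVER : FM_RESTART);
}
```

With Retry_count set to 2 and Retry_interval set to 300, two failures within five minutes each lead to a restart and a third leads to a failover request; once the window has passed, the count starts again from one.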

The DSDL utilities are designed so that the main loop of the fault monitor daemon can be represented by the pseudo code at the end of this section.

For fault monitors implemented using the DSDL:

The detection of application process death by scds_fm_sleep() is fairly rapid because the process death notification via PMF is asynchronous. Contrast that with a case where a fault monitor wakes up every so often to check on service health and finds the application dead. The fault detection time is reduced significantly, thereby increasing the availability of the service.
If the RGM rejects the attempt to fail over the service via the scha_control(3HA) API, scds_fm_action() resets (forgets) its current failure history. The reason is that the failure history is already above Retry_count, and if the monitor daemon wakes up in the next iteration and is unable to successfully complete its health check of the daemon, it would again attempt to invoke the scha_control() call, which would probably still be rejected, as the situation which led to its rejection in the last iteration is still valid. Resetting the history ensures that the fault monitor at least attempts to correct the situation locally (for example, via application restart) in the next iteration.
scds_fm_action() does not reset application failure history in case of restart failures, as one would typically like to try scha_control() soon if the situation doesn't correct itself.
The utility scds_fm_action() updates the resource status to SCHA_RSSTATUS_OK, SCHA_RSSTATUS_DEGRADED or SCHA_RSSTATUS_FAULTED depending upon the failure history. This status is thus available to cluster system management.

In most cases, the application specific health check action can be implemented in a separate stand-alone utility (svc_probe(), for example) and integrated with this generic main loop.

for (;;) {
   /*
    * Sleep for a duration of Thorough_probe_interval between
    * successive probes.
    */
   (void) scds_fm_sleep(scds_handle,
       scds_get_rs_thorough_probe_interval(scds_handle));

   /*
    * Now probe all IP addresses we use. Loop over
    * 1. All net resources we use.
    * 2. All IP addresses in a given resource.
    * For each IP address that is probed, compute the failure history.
    */
   probe_result = 0;
   /*
    * Iterate through all the resources to get each
    * IP address to use for calling svc_probe().
    */
   for (ip = 0; ip < netaddr->num_netaddrs; ip++) {
      /*
       * Grab the hostname and port on which the health has to be
       * monitored. HA-XFS supports only one port, so the port value
       * is obtained from the first entry in the array of ports.
       */
      hostname = netaddr->netaddrs[ip].hostname;
      port = netaddr->netaddrs[ip].port_proto.port;

      ht1 = gethrtime();   /* Latch probe start time */
      probe_result = svc_probe(scds_handle, hostname, port, timeout);
      ht2 = gethrtime();   /* Latch probe end time */

      /* Convert elapsed nanoseconds to milliseconds */
      dt = (ulong_t)((ht2 - ht1) / 1e6);

      /* Compute failure history and take action if needed */
      (void) scds_fm_action(scds_handle, probe_result, (long)dt);
   }   /* Each net resource */
}   /* Keep probing forever */
