Sun Cluster Data Services Developer's Guide for Solaris OS

Chapter 7 Designing Resource Types

This chapter explains the typical use of the Data Service Development Library (DSDL) in designing and implementing resource types. This chapter also focuses on designing the resource type to validate the resource configuration, and to start, stop, and monitor the resource. In addition, this chapter describes how to use the DSDL to implement the resource type callback methods.

See the rt_callbacks(1HA) man page for additional information.

You need access to the resource's property settings to complete these tasks. The DSDL utility scds_initialize() provides a uniform way to access these resource properties. This function is designed to be called at the beginning of each callback method. It retrieves all the properties for a resource from the cluster framework and makes them available to the scds_get_name() family of functions.

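This chapter covers the following topics:

Resource Type Registration File
Validate Method
Start Method
Stop Method
Monitor_start Method
Monitor_stop Method
Monitor_check Method
Update Method
Description of Init, Fini, and Boot Methods
Designing the Fault Monitor Daemon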

Resource Type Registration File

The Resource Type Registration (RTR) file specifies the details about the resource type to the Sun Cluster software.

Details include information as follows:

Properties that are needed by the implementation
The data types and default values of those properties
The file system path for the callback methods for the resource type implementation
Various settings for the system-defined properties

The sample RTR file that is shipped with the DSDL is sufficient for most resource type implementations. You need only edit some basic elements, such as the resource type name and the path name of the resource type callback methods. If a new property is needed to implement the resource type, you can declare it as an extension property in the RTR file of the resource type implementation, and access the new property by using the DSDL scds_get_ext_property() utility.
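As a rough illustration, an extension property declaration in an RTR file might look like the following fragment. The default value and description text are illustrative assumptions, not values taken from the shipped sample file. Check the rt_reg(4) man page for the authoritative syntax.

```
# Extension property read by the fault monitor (values are illustrative)
{
        PROPERTY = Probe_timeout;
        EXTENSION;
        INT;
        DEFAULT = 30;
        TUNABLE = ANYTIME;
        DESCRIPTION = "Time out value for the probe (seconds)";
}
```

A property that is declared this way is the kind of value that a callback method would then read through scds_get_ext_property().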

`Validate` Method

The purpose of the Validate callback method of a resource type implementation is to check that the proposed resource settings (as specified by the proposed property settings on the resource) are acceptable to the resource type.

The Validate method of a resource type implementation is called by the Resource Group Manager (RGM) under one of the following two conditions:

A new resource of the resource type is being created
A property of the resource or resource group is being updated

These two scenarios can be distinguished by the presence of the command-line option -c (create) or -u (update) that is passed to the Validate method of the resource.

The Validate method is called on each node of a set of nodes or in each zone, where the set of nodes or zones is defined by the value of the resource type property Init_nodes. If Init_nodes is set to RG_PRIMARIES, Validate is called on each node or zone that can host (be a primary of) the resource group that contains the resource. If Init_nodes is set to RT_INSTALLED_NODES, Validate is called on each node or zone where the resource type software is installed, typically all nodes or zones in the cluster.

The default value of Init_nodes is RG_PRIMARIES (see the rt_reg(4) man page). At the point the Validate method is called, the RGM has not yet created the resource (in the case of creation callback) or has not yet applied the updated values of the properties that are being updated (in the case of update callback).

Note –

If you are using local file systems that are managed by the HAStoragePlus resource type, use the scds_hasp_check() function to check the state of the HAStoragePlus resources. This information is obtained from the state (online or otherwise) of all SUNW.HAStoragePlus resources on which the resource depends through the Resource_dependencies or Resource_dependencies_weak system properties that are defined for the resource. See the scds_hasp_check(3HA) man page for a complete list of status codes that are returned by the scds_hasp_check() function.

The DSDL function scds_initialize() handles these situations in the following manner:

If the resource is being created, scds_initialize() parses the proposed resource properties as they are passed on the command line. The proposed values of resource properties are therefore available to you as though the resource were already created in the system.
If the resource or resource group is being updated, the proposed values of the properties that are being updated by the cluster administrator are read in from the command line. The remaining properties (whose values are not being updated) are read in from Sun Cluster by using the Resource Management API. If you are using the DSDL, you do not need to concern yourself with these tasks. You can validate a resource as if all the properties of the resource were available.

Suppose the function that implements the validation of a resource's properties is called svc_validate(), which uses the scds_get_name() family of functions to look at the property to be validated. Assuming that an acceptable resource setting is represented by a 0 return code from this function, the Validate method of the resource type can thus be represented by the following code fragment:

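#include <rgm/libdsdev.h>

int svc_validate(scds_handle_t handle);

int
main(int argc, char *argv[])
{
   scds_handle_t handle;
   int rc;

   /* Retrieve the resource properties, including any proposed values */
   if (scds_initialize(&handle, argc, argv) != SCHA_ERR_NOERR) {
      return (1);   /* Initialization Error */
   }
   rc = svc_validate(handle);   /* 0 means the settings are acceptable */
   scds_close(&handle);
   return (rc);
}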

The validation function should also log the reason why the validation of the resource failed. However, by leaving out that detail (Chapter 8, Sample DSDL Resource Type Implementation contains a more realistic treatment of a validation function), you can implement a simpler example svc_validate() function, as follows:

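#include <sys/stat.h>
#include <rgm/libdsdev.h>

int
svc_validate(scds_handle_t handle)
{
   scha_str_array_t *confdirs;
   struct stat    statbuf;

   /* The Confdir_list extension property must name at least one path */
   confdirs = scds_get_ext_confdir_list(handle);
   if (confdirs == NULL || confdirs->array_cnt == 0) {
      return (1);   /* Property is not set */
   }
   /* The first configured directory must exist on this node or zone */
   if (stat(confdirs->str_array[0], &statbuf) == -1) {
      return (1);   /* Invalid resource property setting */
   }
   return (0);   /* Acceptable setting */
}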

Thus, you must concern yourself with only the implementation of the svc_validate() function.
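To show what logging the reason for a validation failure could look like, here is a minimal sketch of a single check. The helper name check_confdir() is hypothetical. Standard syslog() is used only to keep the example self-contained. A DSDL implementation would more likely log through scds_syslog().

```c
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <syslog.h>

/*
 * Hypothetical helper: returns 0 if path names an existing
 * directory, 1 otherwise. Each failure is logged with its reason.
 */
int
check_confdir(const char *path)
{
    struct stat statbuf;

    if (path == NULL || path[0] == '\0') {
        syslog(LOG_ERR, "Configuration directory is not set");
        return (1);
    }
    if (stat(path, &statbuf) == -1) {
        syslog(LOG_ERR, "Cannot access %s: %s", path, strerror(errno));
        return (1);
    }
    if (!S_ISDIR(statbuf.st_mode)) {
        syslog(LOG_ERR, "%s is not a directory", path);
        return (1);
    }
    return (0);
}
```

A validation function could call a helper like this once for each entry in the configuration directory list and return nonzero as soon as one check fails.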

`Start` Method

The Start callback method of a resource type implementation is called by the RGM on a chosen cluster node or zone to start the resource. The resource group name, the resource name, and resource type name are passed on the command line. The Start method performs the actions that are needed to start a data service resource in the cluster node or zone. Typically this involves retrieving the resource properties, locating the application specific executable file, configuration files, or both, and starting the application with the correct command-line arguments.

With the DSDL, the resource configuration is already retrieved by the scds_initialize() utility. The startup action for the application can be contained in a function svc_start(). Another function, svc_wait(), can be called to verify that the application actually starts. The simplified code for the Start method is as follows:

int
main(int argc, char *argv[])
{
   scds_handle_t handle;
   int rc;

   if (scds_initialize(&handle, argc, argv) != SCHA_ERR_NOERR) {
      return (1);   /* Initialization Error */
   }
   if (svc_validate(handle) != 0) {
      rc = 1;       /* Invalid settings */
   } else if (svc_start(handle) != 0) {
      rc = 1;       /* Start failed */
   } else {
      rc = svc_wait(handle);   /* Verify that the application started */
   }
   scds_close(&handle);   /* Release what scds_initialize() allocated */
   return (rc);
}

This Start method implementation calls svc_validate() to validate the resource configuration. If validation fails, either the resource configuration and application configuration do not match or there is currently a problem with the system on this cluster node or zone. For example, a cluster file system that is needed by the resource might currently not be available on this cluster node or zone. In this case, it is futile to attempt to start the resource on this cluster node or zone. It is better to let the RGM attempt to start the resource on a different node or zone.

Note, however, that the preceding statement assumes that svc_validate() is sufficiently conservative, checking only for resources on the cluster node or zone that are required by the application. Otherwise, the resource might fail to start on all cluster nodes or zones and thus enter a START_FAILED state. See the Sun Cluster Data Services Planning and Administration Guide for Solaris OS for an explanation of this state.

The svc_start() function must return 0 for a successful startup of the resource on the node or zone. If the startup function encounters a problem, it must return nonzero. Upon failure of this function, the RGM attempts to start the resource on a different cluster node or zone.

To take advantage of the DSDL as much as possible, the svc_start() function can call the scds_pmf_start() utility to start the application under the Process Monitor Facility (PMF). This utility also uses the failure callback action feature of the PMF to detect process failure. See the description of the -a action argument in the pmfadm(1M) man page for more information.
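The svc_wait() function mentioned earlier in this section is left to the implementer. One possible shape, sketched here under stated assumptions, is a bounded polling loop around a probe. The names wait_for_service(), probe_fn_t, and countdown_probe() are hypothetical. In a real method, the total waiting time would also need to fit within the method's timeout.

```c
#include <unistd.h>

/* Hypothetical probe signature: returns 0 when the service answers */
typedef int (*probe_fn_t)(void *arg);

/*
 * Call probe up to max_attempts times, sleeping delay_seconds
 * between attempts. Returns 0 as soon as a probe succeeds, or 1
 * if every attempt fails.
 */
int
wait_for_service(probe_fn_t probe, void *arg, int max_attempts,
    unsigned int delay_seconds)
{
    int attempt;

    for (attempt = 0; attempt < max_attempts; attempt++) {
        if (probe(arg) == 0)
            return (0);
        if (attempt + 1 < max_attempts)
            (void) sleep(delay_seconds);
    }
    return (1);
}

/* Demo probe for illustration: fails until *arg counts down to zero */
int
countdown_probe(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0) {
        (*remaining)--;
        return (1);
    }
    return (0);
}
```

Returning nonzero when the attempts are exhausted lets the Start method report the failure so that the RGM can try another node or zone.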

`Stop` Method

The Stop callback method of a resource type implementation is called by the RGM on a cluster node or zone to stop the application.

The callback semantics for the Stop method demand the following factors:

The Stop method must be idempotent because the Stop method can be called by the RGM even if the Start method did not complete successfully on the node or zone. Thus, the Stop method must succeed (exit zero) even if the application is not currently running on the cluster node or zone and there is no work for it to do.
If the Stop method of the resource type fails (exits nonzero) on a cluster node or zone, the resource that is being stopped enters the STOP_FAILED state. Depending on the Failover_mode setting on the resource, this condition might lead the RGM to perform a hard reboot of the cluster node.

Thus, you must design the Stop method so that this method definitely stops the application. You might even need to resort to using SIGKILL to kill the application abruptly if the application otherwise fails to terminate.

You must also ensure that this method stops the application in a timely fashion because the framework treats expiry of the Stop_timeout property as a stop failure, and consequently puts the resource in a STOP_FAILED state.

The DSDL utility scds_pmf_stop() should suffice for most applications. This function first attempts to stop the application softly with SIGTERM. If the application does not terminate, the function then delivers a SIGKILL to the process. This function assumes that the application was started under the PMF with scds_pmf_start(). See PMF Functions for details about this utility.
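For an application that is not run under the PMF, the same escalation can be approximated by hand. The following sketch is not how scds_pmf_stop() is implemented. The function names and the two waiting periods are assumptions for illustration. It also shows the idempotent behavior that the Stop semantics require: if the process is already gone, the function reports success.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/* Returns 1 once pid no longer exists. Reaps pid if it is our child. */
static int
process_gone(pid_t pid)
{
    if (waitpid(pid, NULL, WNOHANG) == pid)
        return (1);
    return (kill(pid, 0) == -1 && errno == ESRCH);
}

/* Polls every 100 ms, for up to the given seconds, until pid is gone */
static int
wait_gone(pid_t pid, int seconds)
{
    struct timespec tick = { 0, 100 * 1000 * 1000 };
    int i;

    for (i = 0; i < seconds * 10; i++) {
        if (process_gone(pid))
            return (1);
        (void) nanosleep(&tick, NULL);
    }
    return (process_gone(pid));
}

/*
 * Hypothetical stop helper: SIGTERM first, SIGKILL if the process
 * is still present after term_seconds. Returns 0 on success.
 */
int
stop_process(pid_t pid, int term_seconds, int kill_seconds)
{
    if (pid <= 0)
        return (1);     /* never signal a process group by accident */
    if (kill(pid, SIGTERM) == -1 && errno == ESRCH)
        return (0);     /* not running: nothing to do */
    if (wait_gone(pid, term_seconds))
        return (0);
    (void) kill(pid, SIGKILL);
    return (wait_gone(pid, kill_seconds) ? 0 : 1);
}
```

In a real Stop method, the two waiting periods would be chosen so that their sum stays safely below Stop_timeout.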

Assuming that the application-specific function that stops the application is called svc_stop(), implement the Stop method as follows:

int
main(int argc, char *argv[])
{
   scds_handle_t handle;

   if (scds_initialize(&handle, argc, argv) != SCHA_ERR_NOERR) {
      return (1);   /* Initialization Error */
   }
   return (svc_stop(handle));   /* Must return 0 if already stopped */
}

The preceding code is the same whether or not the implementation of svc_stop() calls the scds_pmf_stop() function. Call scds_pmf_stop() from svc_stop() only if the Start method started the application under the PMF.

The svc_validate() function is not used in the implementation of the Stop method because, even if the system is currently experiencing a problem, the Stop method should attempt to stop the application on this node or zone.

`Monitor_start` Method

The RGM calls the Monitor_start method to start a fault monitor for the resource. Fault monitors monitor the health of the application that is being managed by the resource. Resource type implementations typically implement a fault monitor as a separate daemon that runs in the background. The Monitor_start callback method is used to start this daemon with the correct arguments.

Because the monitor daemon itself is prone to failures (for example, it could die, leaving the application unmonitored), you should use the PMF to start the monitor daemon. The DSDL utility scds_pmf_start() has built-in support for starting fault monitors. This utility accepts a path name for the monitor daemon program that is relative to RT_basedir, the location of the resource type callback method implementations. This utility uses the Monitor_retry_interval and Monitor_retry_count extension properties that are managed by the DSDL to prevent unlimited restarts of the daemon.

This utility also imposes the same command-line syntax as defined for all callback methods (that is, -R resource -G resource-group -T resource-type) onto the monitor daemon, although the monitor daemon is never called directly by the RGM. As a result, the monitor daemon implementation itself can call the scds_initialize() utility to set up its own environment. The main effort is in designing the monitor daemon itself.

`Monitor_stop` Method

The RGM calls the Monitor_stop method to stop the fault monitor daemon that was started with the Monitor_start method. Failure of this callback method is treated in exactly the same fashion as failure of the Stop method. Therefore, the Monitor_stop method must be idempotent and just as robust as the Stop method.

If you use the scds_pmf_start() utility to start the fault monitor daemon, use the scds_pmf_stop() utility to stop it.

`Monitor_check` Method

The RGM runs the Monitor_check callback method for the specified resource on a node or zone to ascertain whether that cluster node or zone is capable of mastering the resource. In other words, the RGM runs this method to determine whether the application that is being managed by the resource can run successfully on the node or zone.

Typically, this situation involves ensuring that all the system resources that are required by the application are indeed available on the cluster node or zone. As discussed in Validate Method, the function svc_validate() that you implement is intended to ascertain at least that.

Depending on the specific application that is being managed by the resource type implementation, the Monitor_check method can be written to carry out additional tasks. The Monitor_check method must be implemented so that it does not conflict with other methods that are running concurrently. If you are using the DSDL, the Monitor_check method should call the svc_validate() function, which implements application-specific validation of resource properties.

`Update` Method

The RGM calls the Update method of a resource type implementation to apply any changes that were made by the cluster administrator to the configuration of the active resource. The Update method is only called on nodes or zones (if any) where the resource is currently online.

The changes that have just been made to the resource configuration are guaranteed to be acceptable to the resource type implementation because the RGM runs the Validate method of the resource type before it runs the Update method. The Validate method is called before the resource or resource group properties are changed, and the Validate method can veto the proposed changes. The Update method is called after the changes have been applied to give the active (online) resource the opportunity to take notice of the new settings.

You must carefully determine the properties that you want to be able to update dynamically, and mark those with the TUNABLE = ANYTIME setting in the RTR file. Typically, you can specify that you want to be able to dynamically update any property of a resource type implementation that the fault monitor daemon uses. In that case, the implementation of the Update method must at least restart the monitor daemon so that the daemon picks up the new values.

Possible properties that you can use are as follows:

Thorough_probe_interval
Retry_count
Retry_interval
Monitor_retry_count
Monitor_retry_interval
Probe_timeout

These properties affect the way a fault monitor daemon checks the health of the service, how often the daemon performs checks, the history interval that the daemon uses to keep track of the errors, and the restart thresholds that are set by the PMF. To implement updates of these properties, the utility scds_pmf_restart() is provided in the DSDL.

If you need to be able to dynamically update a resource property, but the modification of that property might affect the running application, you need to implement the correct actions. You must ensure that the updates to that property are correctly applied to any running instances of the application. Currently, you cannot use the DSDL to dynamically update a resource property in this way. You cannot pass the modified properties to Update on the command line (as you can with Validate).

Description of `Init`, `Fini`, and `Boot` Methods

These methods are one-time action methods as defined by the Resource Management API specifications. The sample implementation that is included with the DSDL does not illustrate the use of these methods. However, all the facilities in the DSDL are available to these methods as well, should you need these methods. Typically, a resource type implementation that needs a one-time action uses exactly the same code for the Init and the Boot methods. The Fini method typically would perform an action that undoes the action of the Init or Boot methods.

Designing the Fault Monitor Daemon

Resource type implementations that use the DSDL typically have a fault monitor daemon that carries out the following responsibilities:

Periodically monitors the health of the application that is being managed. This particular responsibility of a monitor daemon largely depends on the particular application and can vary widely from resource type to resource type. The DSDL contains some built-in utility functions that perform health checks for simple TCP-based services. You can use these utilities to implement health checks for applications that use ASCII-based protocols, such as HTTP, NNTP, IMAP, and POP3.
Keeps track of the problems that are encountered by the application by using the resource properties Retry_interval and Retry_count. When the application fails completely, the fault monitor needs to determine whether the PMF action script should restart the service or whether the application failures have accumulated so rapidly that a failover needs to be carried out. The DSDL utilities scds_fm_action() and scds_fm_sleep() are intended to aid you in implementing this mechanism.
Takes action, typically either restarting the application or attempting a failover of the containing resource group. The DSDL utility scds_fm_action() implements this algorithm. This utility computes the current accumulation of probe failures in the past number of Retry_interval seconds for this purpose.
Updates the resource state so that the state of the application's health is available to the Sun Cluster administrative commands, as well as to the cluster management GUI.
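The bookkeeping behind the second and third responsibilities can be pictured as a sliding window of failure times. The following sketch is a deliberately simplified model and is not the algorithm that scds_fm_action() actually uses. All names, the fixed history size, and the rule "more than Retry_count failures within Retry_interval seconds means failover" are assumptions chosen to illustrate the idea.

```c
#include <stddef.h>
#include <time.h>

#define MAX_HISTORY 32   /* assumed upper bound on remembered failures */

typedef enum { ACTION_NONE, ACTION_RESTART, ACTION_FAILOVER } fm_action_t;

typedef struct {
    time_t stamps[MAX_HISTORY];   /* times of recent probe failures */
    size_t count;
} fail_history_t;

/*
 * Record the outcome of one probe taken at time now and suggest
 * an action. Failures older than retry_interval seconds are forgotten.
 */
fm_action_t
record_probe(fail_history_t *h, int probe_failed, time_t now,
    int retry_count, int retry_interval)
{
    size_t i, kept = 0;

    /* Keep only the failures that fall inside the window */
    for (i = 0; i < h->count; i++) {
        if ((long)(now - h->stamps[i]) < (long)retry_interval)
            h->stamps[kept++] = h->stamps[i];
    }
    h->count = kept;

    if (!probe_failed)
        return (ACTION_NONE);

    if (h->count < MAX_HISTORY)
        h->stamps[h->count++] = now;

    /* Too many failures in the window: a local restart is not enough */
    if (h->count > (size_t)retry_count)
        return (ACTION_FAILOVER);
    return (ACTION_RESTART);
}
```

A caller that wants to mimic the reset behavior described for a rejected failover request could set the count field back to zero after the rejection.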

The DSDL utilities are designed so that the main loop of the fault monitor daemon can be represented by the pseudo code at the end of this section.

Keep the following factors in mind when you implement a fault monitor with the DSDL:

scds_fm_sleep() detects the death of an application process rapidly because notification of the application process's death through the PMF is asynchronous. Thus, the fault detection time is reduced significantly, thereby increasing the availability of the service. A fault monitor might otherwise wake up every so often to check on a service's health and find that the application process has died.
If the RGM rejects the attempt to fail over the service with the scha_control API, scds_fm_action() resets, or forgets, its current failure history. This function resets its current failure history because its history already exceeds Retry_count. If the monitor daemon wakes up in the next iteration and is unable to successfully complete its health check of the application, the monitor daemon again attempts to call the scha_control() function. That call is probably rejected once again, as the situation that led to its rejection in the last iteration is still valid. Resetting the history ensures that the fault monitor at least attempts to correct the situation locally (for example, through restarting the application) in the next iteration.
scds_fm_action() does not reset application failure history in case of restart failures, as you would typically like to issue scha_control() quickly thereafter if the situation does not correct itself.
The utility scds_fm_action() updates the resource status to SCHA_RSSTATUS_OK, SCHA_RSSTATUS_DEGRADED, or SCHA_RSSTATUS_FAULTED depending on the failure history. This status is consequently available to cluster system management.

In most cases, you can implement the application-specific health check action in a separate stand-alone utility (svc_probe(), for example). You can integrate it with the following generic main loop.

for (;;) {
   /*
    * Sleep for a duration of thorough_probe_interval between
    * successive probes.
    */
   (void) scds_fm_sleep(scds_handle,
       scds_get_rs_thorough_probe_interval(scds_handle));

   /*
    * Now probe all the IP addresses that we use. Loop over
    * 1. All net resources that we use.
    * 2. All IP addresses in a given resource.
    * For each IP address that is probed,
    * compute the failure history.
    */
   probe_result = 0;

   /*
    * Iterate through all the resources to get each
    * IP address to use for calling svc_probe().
    */
   for (ip = 0; ip < netaddr->num_netaddrs; ip++) {
      /*
       * Grab the hostname and port on which the
       * health has to be monitored.
       * HA-XFS supports only one port and
       * hence obtains the port value from the
       * first entry in the array of ports.
       */
      hostname = netaddr->netaddrs[ip].hostname;
      port = netaddr->netaddrs[ip].port_proto.port;

      /* Latch probe start time */
      ht1 = gethrtime();
      probe_result = svc_probe(scds_handle, hostname, port, timeout);
      /* Latch probe end time */
      ht2 = gethrtime();

      /* Convert nanoseconds to milliseconds */
      dt = (ulong_t)((ht2 - ht1) / 1e6);

      /*
       * Update service probe history and
       * take action if necessary.
       */
      (void) scds_fm_action(scds_handle,
          probe_result, (long)dt);
   }       /* Each net resource */
}       /* Keep probing forever */
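The svc_probe() call in the preceding loop is application specific. For a simple TCP-based service, a connection check could be sketched as follows. This is not the DSDL's built-in TCP utility. The function name tcp_probe(), the DEMO_ constants, and the timeout in milliseconds are assumptions for this example, and only the first address that the resolver returns is tried.

```c
#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_PROBE_OK       0     /* hypothetical success value */
#define DEMO_PROBE_FAILED   100   /* hypothetical complete-failure value */

/* Try to open a TCP connection to hostname:port within timeout_ms */
int
tcp_probe(const char *hostname, int port, int timeout_ms)
{
    struct addrinfo hints, *res = NULL;
    struct pollfd pfd;
    char service[16];
    int fd, soerr = 0;
    socklen_t len = sizeof (soerr);
    int result = DEMO_PROBE_FAILED;

    memset(&hints, 0, sizeof (hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    (void) snprintf(service, sizeof (service), "%d", port);
    if (getaddrinfo(hostname, service, &hints, &res) != 0)
        return (DEMO_PROBE_FAILED);

    fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd == -1) {
        freeaddrinfo(res);
        return (DEMO_PROBE_FAILED);
    }
    /* Non-blocking connect so that the timeout can be enforced */
    (void) fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    if (connect(fd, res->ai_addr, res->ai_addrlen) == 0) {
        result = DEMO_PROBE_OK;
    } else if (errno == EINPROGRESS) {
        pfd.fd = fd;
        pfd.events = POLLOUT;
        /* Wait for completion, then read the final connect status */
        if (poll(&pfd, 1, timeout_ms) == 1 &&
            getsockopt(fd, SOL_SOCKET, SO_ERROR, &soerr, &len) == 0 &&
            soerr == 0)
            result = DEMO_PROBE_OK;
    }
    (void) close(fd);
    freeaddrinfo(res);
    return (result);
}
```

A probe for an ASCII-based protocol would typically go one step further and exchange a request and a response over the connection before reporting success.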
