Oracle Solaris Cluster Data Services Developer's Guide     Oracle Solaris Cluster 4.0

RMAPI Callback Methods

Callback methods are the key elements that are provided by the API for implementing a resource type. Callback methods enable the RGM to control resources in the cluster in the event of a change in cluster membership, such as the failure of a node.


Note - The RGM executes the callback methods with superuser or equivalent RBAC role permissions because the client programs control HA services in the cluster system. Install and administer these methods with restrictive file ownership and permissions. Specifically, give these methods a privileged owner, such as bin or root, and do not make them writable.


This section describes callback method arguments and exit codes.

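Callback methods in the following categories are described:

  • Control and initialization callback methods

  • Administrative support methods

  • Net-relative callback methods

  • Monitor control callback methods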


Note - This section provides brief descriptions of the callback methods, including the point at which the method is run and the expected effect on the resource. However, the rt_callbacks(1HA) man page is the definitive reference for the callback methods.


Arguments That You Can Provide to Callback Methods

The RGM runs callback methods, as follows:

method -R resource-name -T type-name -G group-name

The method is the path name of the program that is registered as the Start, Stop, or other callback. The callback methods of a resource type are declared in its registration file.

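All callback method arguments are passed as flagged values, as follows:

  • -R indicates the name of the resource instance.

  • -T indicates the type of the resource.

  • -G indicates the group into which the resource is configured.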

Use the arguments with access functions to retrieve information about the resource.
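As a minimal sketch, a shell-based callback method might collect these flagged values with getopts before it calls any access functions. The function name parse_args and the variable names are illustrative assumptions, not names that the API requires.

```shell
#!/bin/sh
# Sketch: collect the -R, -T, and -G arguments that the RGM passes
# to a callback method. Function and variable names are illustrative.

parse_args() {
    RESOURCE_NAME=""
    RESOURCETYPE_NAME=""
    RESOURCEGROUP_NAME=""
    OPTIND=1    # reset so the function can be called more than once

    while getopts 'R:T:G:' opt "$@"; do
        case "$opt" in
            R) RESOURCE_NAME=$OPTARG ;;
            T) RESOURCETYPE_NAME=$OPTARG ;;
            G) RESOURCEGROUP_NAME=$OPTARG ;;
            *) return 1 ;;
        esac
    done

    # All three values are needed to identify the resource.
    [ -n "$RESOURCE_NAME" ] && [ -n "$RESOURCETYPE_NAME" ] &&
        [ -n "$RESOURCEGROUP_NAME" ]
}
```

A method would typically call parse_args "$@" as its first step and exit with a nonzero code if the call fails.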

The Validate method is called with additional arguments that include the property values of the resource and resource group on which it is called.

The scha_calls(3HA) man page contains more information.

Callback Method Exit Codes

All callback methods have the same exit codes. These exit codes are defined to specify the effect of the method invocation on the resource state. The scha_calls(3HA) man page describes these exit codes in more detail.

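The two major categories of exit codes are as follows:

  • An exit code of zero indicates that the method succeeded in its effect on the resource.

  • Any nonzero exit code indicates that a failure occurred.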

The RGM also handles abnormal failures of callback method execution, such as timeouts and core dumps.

Method implementations must output failure information by using syslog() on each node. Output written to stdout or stderr is not guaranteed to be delivered to the user, although it is currently displayed on the console of the local node.
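The following sketch shows one way a shell method might report a failure through syslog and return a nonzero status. The helper names log_error and run_step, the tag value, and the daemon.err priority are assumptions made for illustration. A real method would choose the facility and tag that suit its cluster.

```shell
#!/bin/sh
# Sketch: report a failure through syslog and return a nonzero status.
# The tag value and the daemon facility are assumptions for illustration.

SYSLOG_TAG="${SYSLOG_TAG:-sample_method}"

log_error() {
    # logger(1) submits the message to syslog on the local node.
    logger -p daemon.err -t "$SYSLOG_TAG" "$*"
}

# run_step COMMAND [ARG...]
# Runs one step of a method; logs to syslog and returns 1 on failure.
run_step() {
    if "$@" >/dev/null 2>&1; then
        return 0    # zero: the step succeeded
    fi
    log_error "step failed: $*"
    return 1        # nonzero: the caller should exit with a failure code
}
```

Because output on stdout and stderr is not guaranteed to reach the user, the sketch discards it and relies on the syslog message alone.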

Control and Initialization Callback Methods

The primary control and initialization callback methods start and stop a resource. Other methods execute initialization and termination code on a resource.

Start

The RGM runs this method on a cluster node when the resource group that contains the resource is brought online on that node. This method activates the resource on that node.

A Start method should not exit until the resource that it activates has been started and is available on the local node. Therefore, before exiting, the Start method should poll the resource to determine that it has started. In addition, you should set a sufficiently long timeout value for this method. For example, particular resources, such as database daemons, take more time to start, and thus require that the method have a longer timeout value.
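The polling step described above can be sketched as a small shell loop. The function name wait_until_up, its arguments, and the probe command are hypothetical. The probe stands in for whatever check shows that your application is available.

```shell
#!/bin/sh
# Sketch: poll until a probe command succeeds or the attempts run out.
# wait_until_up, its arguments, and the probe are illustrative.

# wait_until_up MAX_TRIES INTERVAL_SECONDS PROBE [ARG...]
wait_until_up() {
    max_tries=$1
    interval=$2
    shift 2

    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0            # resource answers: safe to exit Start
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    return 1                    # never came up: report failure
}
```

Keeping MAX_TRIES multiplied by INTERVAL_SECONDS below the method's timeout value lets the method report the failure itself instead of being terminated by the RGM.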

The way in which the RGM responds to failure of the Start method depends on the setting of the Failover_mode property.

The Start_timeout property in the resource type registration (RTR) file sets the timeout value for a resource's Start method.

Stop

The RGM runs this required method on a cluster node when the resource group that contains the resource is brought offline on that node. This method deactivates the resource if it is active.

A Stop method should not exit until the resource that it controls has completely stopped all its activity on the local node and has closed all file descriptors. Otherwise, because the RGM assumes the resource has stopped when, in fact, it is still active, data corruption can result. The safest way to avoid data corruption is to terminate all processes on the local node that are related to the resource.

Before exiting, the Stop method should poll the resource to determine that it has stopped. In addition, you should set a sufficiently long timeout value for this method. For example, particular resources, such as database daemons, take more time to stop, and thus require that the method have a longer timeout value.
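A Stop method might structure this as a graceful request, a polling loop, and a forced stop as the last resort. In the sketch below, svc_is_running, svc_stop, and svc_kill are hypothetical hooks that a real method would implement for its own application. They are not part of the API.

```shell
#!/bin/sh
# Sketch: stop a service, poll until it is gone, then escalate.
# svc_is_running, svc_stop, and svc_kill are hypothetical hooks that a
# real Stop method would implement for its own application.

# stop_and_wait MAX_TRIES INTERVAL_SECONDS
stop_and_wait() {
    max_tries=$1
    interval=$2

    svc_is_running || return 0      # already stopped: nothing to do

    svc_stop                        # ask for a graceful shutdown
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        svc_is_running || return 0
        tries=$((tries + 1))
        sleep "$interval"
    done

    svc_kill                        # graceful stop did not finish
    if svc_is_running; then
        return 1                    # still active: report failure
    fi
    return 0
}
```

Returning success when the service is already stopped matches the statement that Stop deactivates the resource only if it is active.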

If an RGM method callback times out, the method's process tree is killed by a SIGABRT signal (not a SIGTERM signal). As a result, all members of the process group generate a core dump file in the /var/cluster/core directory or in a subdirectory of the /var/cluster/core directory on the node on which the method exceeded its timeout. This core dump file is generated to enable you to determine why your method exceeded its timeout.


Note - Avoid writing data service methods that create a new process group. If your data service method must create a new process group, write a signal handler for the SIGTERM and SIGABRT signals. Also, ensure that your signal handler forwards the SIGTERM or SIGABRT signal to the child process group or groups before the signal handler terminates the process. Writing a signal handler for these signals increases the likelihood that all processes that are spawned by your method are correctly terminated.


The way in which the RGM responds to failure of the Stop method depends on the setting of the Failover_mode property. See Resource Properties.

The Stop_timeout property in the RTR file sets the timeout value for a resource's Stop method.

Init

The RGM runs this optional method to perform a one-time initialization of the resource when the resource becomes managed. The RGM runs this method when its resource group is switched from an unmanaged to a managed state or when the resource is created in a resource group that is already managed. The method is called on nodes that are identified by the Init_nodes resource type property.

Fini

The RGM executes the Fini method to clean up after a resource when that resource is no longer managed by the RGM. The Fini method usually undoes any initializations that were performed by the Init method.

The RGM executes Fini on each node on which the resource becomes unmanaged when the following situations arise:

  • The resource group that contains the resource is switched to an unmanaged state. In this case, the RGM executes the Fini method on all nodes in the node list.

  • The resource is deleted from a managed resource group. In this case, the RGM executes the Fini method on all nodes in the node list.

  • A node is deleted from the node list of the resource group that contains the resource. In this case, the RGM executes the Fini method on only the deleted node.

A “node list” is either the resource group's Nodelist or the resource type's Installed_nodes list, depending on the setting of the resource type's Init_nodes property. The Init_nodes property can be set to RG_PRIMARIES or RT_INSTALLED_NODES. For most resource types, Init_nodes is set to RG_PRIMARIES, the default. In this case, both the Init and Fini methods are executed on the nodes that are specified in the resource group's Nodelist.

The type of initialization that the Init method performs defines the type of cleanup that the Fini method that you implement needs to perform, as follows:

  • Cleanup of node-specific configuration.

  • Cleanup of cluster-wide configuration.

The Fini method that you implement needs to determine whether to perform only cleanup of node-specific configuration or cleanup of both node-specific and cluster-wide configuration.

When a resource becomes unmanaged on only a particular node, the Fini method can clean up local, node-specific configuration. However, the Fini method must not clean up global, cluster-wide configuration, because the resource remains managed on other nodes. If the resource becomes unmanaged cluster-wide, the Fini method can perform cleanup of both node-specific and global configuration. Your Fini method code can distinguish these two cases by determining whether the resource group's node list contains the local node on which your Fini method is executing.

If the local node appears in the resource group's node list, the resource is being deleted or is moving to an unmanaged state. The resource is no longer active on any node. In this case, your Fini method needs to clean up any node-specific configuration on the local node as well as cluster-wide configuration.

If the local node does not appear in the resource group's node list, your Fini method can clean up node-specific configuration on the local node. However, your Fini method must not clean up cluster-wide configuration. In this case, the resource remains active on other nodes.
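The decision between these two cases can be sketched as a membership test. The function name fini_scope and the strings that it prints are illustrative assumptions. The sketch takes the local node name and the node list as plain arguments so that it stays independent of how a method obtains them.

```shell
#!/bin/sh
# Sketch: decide how much a Fini method should clean up.
# fini_scope and its arguments are illustrative names.

# fini_scope LOCAL_NODE NODE [NODE...]
# Prints "cluster-wide" when LOCAL_NODE is still in the node list
# (the resource is being deleted or unmanaged everywhere), otherwise
# prints "node-only" (only this node was removed from the list).
fini_scope() {
    local_node=$1
    shift
    for node in "$@"; do
        if [ "$node" = "$local_node" ]; then
            echo "cluster-wide"
            return 0
        fi
    done
    echo "node-only"
}
```

In a real method, the local node name and the node list would come from the cluster access commands, such as scha_cluster_get and scha_resourcegroup_get. Check the corresponding man pages for the exact query tags, and query the resource type's Installed_nodes list instead if Init_nodes is set to RT_INSTALLED_NODES.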

You must also code the Fini method so that it is idempotent. In other words, even if the Fini method has cleaned up a resource during a previous execution, subsequent calls to the Fini method exit successfully.
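One simple way to get idempotent behavior in a shell method is to use operations that succeed whether or not the state still exists. The sketch below assumes, purely for illustration, that Init created a node-local directory. The function name and the directory are hypothetical.

```shell
#!/bin/sh
# Sketch: idempotent removal of node-local state created by Init.
# The directory argument is a hypothetical location.

# cleanup_local_config DIR
cleanup_local_config() {
    dir=$1
    [ -n "$dir" ] || return 1       # refuse to run without a target

    # rm -rf succeeds whether or not the directory still exists,
    # so a repeated call is harmless.
    rm -rf -- "$dir"
}
```

Calling cleanup_local_config twice with the same directory returns success both times, which is the behavior that the RGM expects from repeated Fini executions.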

Boot

The RGM runs this optional method, which is similar to Init, to initialize the resource on nodes that join the cluster after the resource group that contains the resource has already been put under the management of the RGM. The RGM runs this method on nodes that are identified by the Init_nodes resource type property. The Boot method is called when the node joins or rejoins the cluster as a result of being booted or rebooted.

If the Global_zone resource type property equals TRUE, methods execute in the global-cluster voting node even if the resource group that contains the resource is configured to run in a global-cluster non-voting node.


Note - Failure of the Init, Fini, or Boot methods causes an error message to be written to the system log. However, management of the resource by the RGM is not otherwise affected.


Administrative Support Methods

Administrative actions on resources include setting and changing resource properties. The Validate and Update callback methods enable a resource type implementation to carry out these administrative actions.

Validate

The RGM calls this optional method when a resource is created and when the cluster administrator updates the properties of the resource or its containing resource group. This method is called on the set of cluster nodes that are identified by the Init_nodes property of the resource's type. The Validate method is called before the creation or the update is applied. A failure exit code from the method on any node causes the creation or the update to be canceled.

Validate is called only when resource or resource group properties are changed by the cluster administrator. This method is not called when the RGM sets properties, nor when a monitor sets the Status and Status_msg resource properties.
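The following sketch shows how a shell Validate method might read the additional arguments. It assumes the flag conventions of rt_callbacks(1HA): -c for creation, -u for update, and -r, -x, and -g for name=value pairs of system-defined, extension, and resource group properties. Confirm these flags against the man page. The extension property Confdir and both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: parse the arguments of a Validate call and check one property.
# The extension property name Confdir is a hypothetical example.

parse_validate_args() {
    OPERATION=""
    CONFDIR=""
    OPTIND=1

    while getopts 'cur:x:g:R:T:G:' opt "$@"; do
        case "$opt" in
            c) OPERATION=create ;;
            u) OPERATION=update ;;
            x) # extension property passed as name=value
               case "$OPTARG" in
                   Confdir=*) CONFDIR=${OPTARG#Confdir=} ;;
               esac ;;
            r|g|R|T|G) ;;   # not needed by this sketch
            *) return 1 ;;
        esac
    done
}

validate_confdir() {
    parse_validate_args "$@" || return 1
    # Assumption: on update, Confdir is passed only if it is changing.
    if [ "$OPERATION" = update ] && [ -z "$CONFDIR" ]; then
        return 0
    fi
    [ -d "$CONFDIR" ]       # nonzero status cancels the operation
}
```

A Validate method built this way would end with validate_confdir "$@" followed by an exit with the resulting status.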

Update

The RGM runs this optional method to notify a running resource that properties have been changed. The RGM runs Update after an administrative action succeeds in setting properties of a resource or its group. This method is called on nodes where the resource is online. The method uses the API access functions to read property values that might affect an active resource and to adjust the running resource accordingly.


Note - Failure of the Update method causes an error message to be written to the system log. However, management of the resource by the RGM is not otherwise affected.


Net-Relative Callback Methods

Services that use network address resources might require that start or stop steps be carried out in a particular order relative to the network address configuration. The following optional callback methods, Prenet_start and Postnet_stop, enable a resource type implementation to carry out special startup and shutdown actions before and after a related network address is configured or unconfigured.

Prenet_start

This optional method is called to carry out special startup actions before network addresses in the same resource group are configured.

Postnet_stop

This optional method is called to carry out special shutdown actions after network addresses in the same resource group are configured down.

Monitor Control Callback Methods

A resource type implementation optionally can include a program to monitor the performance of a resource, report on its status, or take action when a resource fails. The Monitor_start, Monitor_stop, and Monitor_check methods support the implementation of a resource monitor in a resource type implementation.

Monitor_start

This optional method is called to start a monitor for the resource after the resource is started.

Monitor_stop

This optional method is called to stop a resource's monitor before the resource is stopped.

Monitor_check

This optional method is called to assess the reliability of a node before a resource group is relocated to that node. You must implement the Monitor_check method so that it does not conflict with the concurrent running of another method.