scha_control - Oracle® Solaris Cluster Reference Manual

scha_control(3HA)

Name

scha_control, scha_control_zone - resource and resource group control request functions

Synopsis

cc [flags…] -I/usr/cluster/include file -L/usr/cluster/lib -l scha

#include <scha.h>

scha_err_t scha_control(const char *tag, const char *rgname,
     const char *rname);

scha_err_t scha_control_zone(const char *tag, const char *rgname,
     const char *rname, const char *zonename);

Description

The scha_control() and scha_control_zone() functions each provide an interface to request the restart or relocation of a resource or a resource group that is under the control of the Resource Group Manager (RGM). Use these functions in resource monitors.

Use the scha_control_zone() function only for resource types whose Global_zone property is set to TRUE. This function is not needed if the Global_zone property is set to FALSE. For more information, see the rt_properties(5) man page. The scha_control_zone() function is called in the global zone. The zonename argument specifies the name of the zone cluster in which the resource group is configured.

The setting of the Failover_mode property of the indicated resource might suppress the requested scha_control() or scha_control_zone() action. If Failover_mode is RESTART_ONLY, only SCHA_RESOURCE_RESTART is permitted. Other requests, including SCHA_GIVEOVER, SCHA_CHECK_GIVEOVER, SCHA_RESTART, and SCHA_CHECK_RESTART, return the SCHA_ERR_CHECKS error code; the requested giveover or restart action is not executed, and only a syslog message is produced. If the Retry_count and Retry_interval properties are set on the resource, the number of resource restarts is limited to Retry_count attempts within the Retry_interval. If Failover_mode is LOG_ONLY, any scha_control() or scha_control_zone() giveover, restart, or disable request returns the SCHA_ERR_CHECKS error code; the requested action is not executed, and only a syslog message is produced.
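The Failover_mode rules in the preceding paragraph can be read as a small lookup. The following is a minimal sketch of that reading, not the RGM's implementation. The names verdict, in_list, and mode_verdict are hypothetical and are not part of libscha, and tags are passed by macro name as plain strings purely for illustration. Combinations that the paragraph does not name are reported as not covered rather than guessed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum verdict { VERDICT_ALLOWED, VERDICT_REFUSED, VERDICT_NOT_COVERED };

/* Returns 1 if name equals one of list[0..n-1], else 0. */
static int in_list(const char *name, const char *const *list, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(name, list[i]) == 0)
            return 1;
    }
    return 0;
}

/*
 * Reports what the Failover_mode paragraph says about a request.
 * VERDICT_REFUSED corresponds to the request returning SCHA_ERR_CHECKS.
 */
static enum verdict mode_verdict(const char *failover_mode,
    const char *tag_name)
{
    /* Requests named as refused when Failover_mode is RESTART_ONLY. */
    static const char *const refused_restart_only[] = {
        "SCHA_GIVEOVER", "SCHA_CHECK_GIVEOVER",
        "SCHA_RESTART", "SCHA_CHECK_RESTART"
    };
    /* Giveover, restart, and disable requests, refused under LOG_ONLY. */
    static const char *const refused_log_only[] = {
        "SCHA_GIVEOVER", "SCHA_RESTART",
        "SCHA_RESOURCE_RESTART", "SCHA_RESOURCE_DISABLE"
    };

    if (strcmp(failover_mode, "RESTART_ONLY") == 0) {
        if (strcmp(tag_name, "SCHA_RESOURCE_RESTART") == 0)
            return VERDICT_ALLOWED;
        if (in_list(tag_name, refused_restart_only,
            sizeof refused_restart_only / sizeof refused_restart_only[0]))
            return VERDICT_REFUSED;
    } else if (strcmp(failover_mode, "LOG_ONLY") == 0) {
        if (in_list(tag_name, refused_log_only,
            sizeof refused_log_only / sizeof refused_log_only[0]))
            return VERDICT_REFUSED;
    }
    /* Other modes and unnamed tags: the paragraph does not say. */
    return VERDICT_NOT_COVERED;
}
```

A monitor could consult such a table before issuing a request, although calling the function and handling SCHA_ERR_CHECKS remains the authoritative test.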

`tag` Arguments

The tag argument indicates whether the request is to restart or relocate the resource or resource group. This argument should be a string value that is defined by one of the following macros, which are defined in scha_tags.h:

SCHA_CHANGE_STATE_OFFLINE

Requests that the proxy resource that is named by the rname argument be brought offline on the local node. A proxy resource is an Oracle Solaris Cluster resource that imports the state of a resource from another cluster such as Oracle Clusterware. Oracle Clusterware is a platform-independent set of system services for cluster environments. This change in state reflects, in the context of the Oracle Solaris Cluster software, the change in state of the external resource.

When you change the state of a proxy resource with this tag argument, methods of the proxy resource are not executed.

If a fault occurs on a “depended-on” resource on a node, and the resource cannot recover, the monitor brings that resource on that node offline. The monitor brings the resource offline by calling the scha_control() or scha_control_zone() function with the SCHA_RESOURCE_DISABLE request. The monitor also brings all of the depended-on resource's offline-restart dependents offline by triggering a restart on them. When the cluster administrator resolves the fault and re-enables the depended-on resource, the monitor brings the depended-on resource's offline-restart dependents back online as well.

SCHA_CHANGE_STATE_ONLINE

Requests that the proxy resource that is named by the rname argument be brought online on the local node. A proxy resource is an Oracle Solaris Cluster resource that imports the state of a resource from another cluster such as Oracle Clusterware. This change in state reflects, in the context of the Oracle Solaris Cluster software, the change in state of the external resource.

When you change the state of a proxy resource with this tag argument, methods of the proxy resource are not executed.

SCHA_CHECK_GIVEOVER

Performs all the same validity checks that would be done for a SCHA_GIVEOVER of the resource group named by the rgname argument, but does not actually relocate the resource group.

SCHA_CHECK_RESTART

Performs all the same validity checks that would be done for a SCHA_RESTART request of the resource group named by the rgname argument, but does not actually restart the resource group.

The SCHA_CHECK_GIVEOVER and SCHA_CHECK_RESTART requests are intended to be used by resource monitors that take direct action upon resources, for example, killing and restarting processes, rather than invoking the scha_control() or scha_control_zone() function to perform a giveover or restart. If the check fails, the monitor should sleep and restart its probes rather than invoke its failover actions. See ERRORS.
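The check-before-acting pattern described above reduces to one decision on the value returned by the check request. The sketch below shows only that decision; the enum and function names are hypothetical, and the call to scha_control() appears only in a comment so that the fragment builds without the cluster headers.

```c
/* Hypothetical monitor steps; names are for illustration only. */
enum monitor_step { STEP_RECOVER_DIRECTLY, STEP_SLEEP_AND_REPROBE };

/*
 * check_rc is the value a monitor obtained from, for example,
 *     scha_control(SCHA_CHECK_GIVEOVER, rgname, rname)
 * or the SCHA_CHECK_RESTART equivalent. Zero means the checks
 * passed; any nonzero value means the request would be rejected.
 */
static enum monitor_step step_after_check(int check_rc)
{
    if (check_rc == 0)
        return STEP_RECOVER_DIRECTLY;   /* safe to kill and restart processes */
    return STEP_SLEEP_AND_REPROBE;      /* do not invoke failover actions */
}
```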

The rgname argument is the name of the resource group that is to be restarted or relocated. If the group is not online on the node where the request is made, the request is rejected.

The rname argument is the name of a resource in the resource group. Presumably this is the resource whose monitor is making the scha_control() or scha_control_zone() request. If the named resource is not in the resource group, the request is rejected.

The return value of the function indicates whether the requested action was rejected. If the request is accepted, the function does not return until the resource group or resource has completed going offline and back online. The fault monitor that called the scha_control() or scha_control_zone() function might be stopped as a result of the resource group's going offline and so might never receive the return status of a successful request.

SCHA_GIVEOVER

Requests that the resource group named by the rgname argument be brought offline on the local node, and online again on a different node of the RGM's choosing. Note that, if the resource group is currently online on two or more nodes and there are no additional available nodes on which to bring the resource group online, it can be taken offline on the local node without being brought online elsewhere. The request might be rejected depending on the result of various checks. For example, a node might be rejected as a host because the group was brought offline due to a SCHA_GIVEOVER request on that node within the interval specified by the Pingpong_interval property.
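The Pingpong_interval example above amounts to a time-window test on each candidate node. The following is an illustrative model of that single check, not the RGM's actual code. The function name is hypothetical, the interval is assumed to be expressed in seconds, and the caller is assumed to invoke it only for nodes that have a recorded earlier giveover.

```c
#include <time.h>

/*
 * Returns 1 if a candidate node should be rejected as a host because
 * the resource group was given over from that node less than
 * pingpong_interval_secs ago, and 0 otherwise.
 */
static int rejected_by_pingpong(time_t now, time_t last_giveover,
    long pingpong_interval_secs)
{
    /* difftime() returns the elapsed time in seconds as a double. */
    return difftime(now, last_giveover) < (double)pingpong_interval_secs;
}
```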

If the cluster administrator configures the RG_affinities properties of one or more resource groups and if you issue a SCHA_GIVEOVER request on one resource group, more than one resource group might be relocated. The RG_affinities property is described in rg_properties(5).

The MONITOR_CHECK method is called before the resource group that contains the resource is relocated to a new node as the result of a call to the scha_control() or scha_control_zone() function, or of a fault monitor issuing the scha_control command. See the scha_control(1HA) man page.

The MONITOR_CHECK method may be called on any node that is a potential new master for the resource group. The MONITOR_CHECK method is intended to assess whether a node is running well enough to run a resource. The MONITOR_CHECK method must be implemented in such a way that it does not conflict with the running of another method concurrently.

Failure of the MONITOR_CHECK method vetoes the relocation of the resource group to the node where the callback was invoked.

SCHA_IGNORE_FAILED_START

Requests that failure of the currently executing Prenet_start or Start method should not cause a failover of the resource group, despite the setting of the Failover_mode property.

In other words, this request overrides the recovery action that is normally taken for a resource for which the Failover_mode property is set to SOFT or HARD when that resource fails to start. Normally, the resource group fails over to a different node. Instead, the resource behaves as if Failover_mode is set to NONE. The resource enters the START_FAILED state, and the resource group ends up in the ONLINE_FAULTED state, if no other errors occur.

This request is meaningful only when it is called from a Start or Prenet_start method that subsequently exits with a nonzero status or times out. This request is valid only for the current invocation of the Start or Prenet_start method. The scha_control() or scha_control_zone() function should be called with this request in a situation in which the Start method has determined that the resource cannot start successfully on another node. If this request is called by any other method, the error SCHA_ERR_INVAL is returned. This request prevents the “ping pong” failover of the resource group that would otherwise occur. See the scha_calls(3HA) man page for a description of the SCHA_ERR_INVAL error code.
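How a Start or Prenet_start method might decide whether to issue this request can be sketched as a two-input decision. The names below are hypothetical, and how a method determines that a start would also fail on other nodes is application specific and outside this sketch.

```c
/* Hypothetical outcomes for a Start or Prenet_start method. */
enum start_plan {
    PLAN_EXIT_ZERO,                 /* resource started; report success */
    PLAN_EXIT_NONZERO,              /* failed; let Failover_mode decide */
    PLAN_IGNORE_THEN_EXIT_NONZERO   /* request SCHA_IGNORE_FAILED_START,
                                       then exit with a nonzero status */
};

/*
 * started_ok: nonzero if the application came up on this node.
 * would_fail_elsewhere: nonzero if the method has determined that the
 * resource cannot start successfully on another node either.
 */
static enum start_plan plan_start_exit(int started_ok,
    int would_fail_elsewhere)
{
    if (started_ok)
        return PLAN_EXIT_ZERO;
    if (would_fail_elsewhere)
        return PLAN_IGNORE_THEN_EXIT_NONZERO;
    return PLAN_EXIT_NONZERO;
}
```

In the PLAN_IGNORE_THEN_EXIT_NONZERO case, the method would call scha_control() with the SCHA_IGNORE_FAILED_START tag before exiting, because the request applies only to the current invocation.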

SCHA_RESOURCE_DISABLE

Disables the resource that is named by the rname argument on the node on which the scha_control() or scha_control_zone() function is called.

If a fault occurs on a “depended-on” resource on a node and if the resource cannot recover, the monitor brings that resource on that node offline. The monitor brings the resource offline by calling the scha_control() or scha_control_zone() function with the SCHA_RESOURCE_DISABLE request. The monitor also brings all of the depended-on resource's offline-restart dependents offline by triggering a restart on them. When the cluster administrator resolves the fault and re-enables the depended-on resource, the monitor brings the depended-on resource's offline-restart dependents back online as well.

SCHA_RESOURCE_IS_RESTARTED

Requests that the resource restart counter for the resource named by the rname argument be incremented on the local node, without actually restarting the resource.

A resource monitor that restarts a resource directly without calling the scha_control() or scha_control_zone() function with the SCHA_RESOURCE_RESTART request (for example, using the pmfadm(1M) command) can use this request to notify the RGM that the resource has been restarted. This fact is reflected in subsequent calls to the scha_resource_get() function with NUM_RESOURCE_RESTARTS queries.

If the resource's type fails to declare the Retry_interval standard property, the SCHA_RESOURCE_IS_RESTARTED request of the scha_control() or scha_control_zone() function is not permitted and the scha_control() or scha_control_zone() function returns error code 13 (SCHA_ERR_RT).

SCHA_RESOURCE_RESTART

Requests that the resource named by the rname argument be brought offline and online again on the local node, without stopping any other resources in the resource group. The resource is stopped and started by applying the following sequence of methods to it on the local node:

MONITOR_STOP
STOP
START
MONITOR_START

If the resource type does not declare a STOP and START method, the resource is restarted using POSTNET_STOP and PRENET_START instead:

MONITOR_STOP
POSTNET_STOP
PRENET_START
MONITOR_START

If the resource's type does not declare a MONITOR_STOP and MONITOR_START method, only the STOP and START methods or the POSTNET_STOP and PRENET_START methods are invoked to perform the restart. The resource's type must declare a START and STOP method. See the scha_calls(3HA) man page for a description of the SCHA_ERR_RT error code.
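The method selection described above can be expressed as a short routine that builds the restart sequence from the methods a resource type declares. This is a sketch of the documented ordering only; the type and function names are hypothetical and are not part of the cluster API.

```c
#include <stddef.h>

enum method {
    M_MONITOR_STOP, M_STOP, M_START, M_MONITOR_START,
    M_POSTNET_STOP, M_PRENET_START
};

/* Which method pairs the resource type declares (nonzero = declared). */
struct declared_methods {
    int stop_and_start;
    int postnet_stop_and_prenet_start;
    int monitor_stop_and_start;
};

/*
 * Writes the restart sequence into out[] (capacity 4) and returns its
 * length. Returns 0 if neither stop/start pair is declared, in which
 * case no sequence can be built.
 */
static size_t restart_sequence(const struct declared_methods *d,
    enum method out[4])
{
    size_t n = 0;
    enum method stop, start;

    if (d->stop_and_start) {
        stop = M_STOP;
        start = M_START;
    } else if (d->postnet_stop_and_prenet_start) {
        stop = M_POSTNET_STOP;
        start = M_PRENET_START;
    } else {
        return 0;
    }

    /* Monitor methods wrap the sequence only when they are declared. */
    if (d->monitor_stop_and_start)
        out[n++] = M_MONITOR_STOP;
    out[n++] = stop;
    out[n++] = start;
    if (d->monitor_stop_and_start)
        out[n++] = M_MONITOR_START;
    return n;
}
```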

If a method invocation fails while restarting the resource, the RGM might set an error state, or relocate the resource group, or reboot the node, depending on the setting of the Failover_mode property of the resource. For additional information, see the Failover_mode property in r_properties(5).

A resource monitor using this request to restart a resource can use the NUM_RESOURCE_RESTARTS query of scha_resource_get() to keep count of recent restart attempts.
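One way a monitor might use that count is to compare it with Retry_count and escalate once the budget is spent. The sketch below assumes such a policy; escalating to a giveover is a common monitor design but is an assumption here, not something this page prescribes, and the names are hypothetical.

```c
/* Hypothetical recovery choices for a fault monitor. */
enum recovery { RECOVER_RESTART_RESOURCE, RECOVER_GIVEOVER };

/*
 * num_resource_restarts: value of the NUM_RESOURCE_RESTARTS query,
 *     that is, restarts counted within the current Retry_interval.
 * retry_count: value of the resource's Retry_count property.
 */
static enum recovery choose_recovery(int num_resource_restarts,
    int retry_count)
{
    if (num_resource_restarts < retry_count)
        return RECOVER_RESTART_RESOURCE;   /* SCHA_RESOURCE_RESTART */
    return RECOVER_GIVEOVER;               /* SCHA_GIVEOVER */
}
```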

Resource types that have PRENET_START or POSTNET_STOP methods need to use the SCHA_RESOURCE_RESTART request with care. Only the MONITOR_STOP, STOP, START, and MONITOR_START methods are applied to the resource. Network address resources on which this resource depends are not restarted and remain online.

SCHA_RESTART

Requests that the resource group named by the rgname argument be brought offline, then online again, without forcing relocation to a different node. The request may ultimately result in relocating the resource group if a resource in the group fails to restart. A resource monitor using this request to restart a resource group can use the NUM_RG_RESTARTS query of scha_resource_get() to keep count of recent restart attempts.

Return Values

These functions return the following values:

0: The function succeeded.
nonzero: The function failed.

Errors

SCHA_ERR_NOERR: The function succeeded.
SCHA_ERR_CHECKS: The request was rejected. The checks on relocation failed.

See the scha_calls(3HA) man page for a description of other error codes.

Normally, a fault monitor that receives an error code from the scha_control() or scha_control_zone() function should sleep for a while and then restart its probes. The monitor should do so because some error conditions resolve themselves after a while. An example of such an error condition is the failover of a global device service, which causes disk resources to become temporarily unavailable. After the error condition has resolved, the resource itself might become healthy again. If not, a subsequent scha_control() or scha_control_zone() request might succeed.

Files

/usr/cluster/include/scha.h: Include file
/usr/cluster/lib/libscha.so: Library

Attributes

See attributes(5) for descriptions of the following attributes:

ATTRIBUTE TYPE	ATTRIBUTE VALUE
Availability	`ha-cluster/developer/api`
Interface Stability	Evolving