Oracle® Solaris Cluster Reference Manual

Updated: July 2014, E39662-01
 
 

scds_fm_action (3HA)

Name

scds_fm_action - take action after a probe completes

Synopsis

cc [flags…] -I /usr/cluster/include file -L /usr/cluster/lib -l dsdev

#include <rgm/libdsdev.h>

scha_err_t scds_fm_action(scds_handle_t handle, int probe_status,
    long elapsed_milliseconds);

Description

The scds_fm_action() function uses the probe_status of the data service in conjunction with the past history of failures to take one of the following actions:

  • Restart the application.

  • Fail over the resource group.

  • Do nothing.

Use the value of the input probe_status argument to indicate the severity of the failure. For example, you might consider a failure to connect to an application as a complete failure, but a failure to disconnect as a partial failure. In the latter case you would have to specify a value for probe_status between 0 and SCDS_PROBE_COMPLETE_FAILURE.

The DSDL defines SCDS_PROBE_COMPLETE_FAILURE as 100. For partial probe success or failure, use a value between 0 and SCDS_PROBE_COMPLETE_FAILURE.
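As a minimal sketch of how a probe might grade its result, the following C fragment maps the outcome of a connect/disconnect check to a probe_status value. The enum probe_outcome and the function status_for_outcome() are hypothetical names introduced for illustration, and treating a failed disconnect as half of a complete failure is an assumption, not a DSDL rule. The constant is defined locally only when <rgm/libdsdev.h> has not already supplied it.

```c
#include <assert.h>

/* Normally supplied by <rgm/libdsdev.h>; defined here only so the
 * sketch compiles on its own. The manual gives the value as 100. */
#ifndef SCDS_PROBE_COMPLETE_FAILURE
#define SCDS_PROBE_COMPLETE_FAILURE 100
#endif

/* Hypothetical outcomes of a simple connect/disconnect probe. */
enum probe_outcome {
    PROBE_OK,                /* connected and disconnected cleanly */
    PROBE_DISCONNECT_FAILED, /* connected, but the disconnect failed */
    PROBE_CONNECT_FAILED     /* could not connect at all */
};

/* Translate a probe outcome into a probe_status value. */
static int
status_for_outcome(enum probe_outcome outcome)
{
    switch (outcome) {
    case PROBE_OK:
        return 0;
    case PROBE_DISCONNECT_FAILED:
        /* Assumption: grade this as a half failure. */
        return SCDS_PROBE_COMPLETE_FAILURE / 2;
    case PROBE_CONNECT_FAILED:
        return SCDS_PROBE_COMPLETE_FAILURE;
    }
    /* Not reached for valid enum values. */
    assert(!"unknown probe outcome");
    return SCDS_PROBE_COMPLETE_FAILURE;
}
```

With this grading, two failed disconnects reported inside one Retry_interval would add up to SCDS_PROBE_COMPLETE_FAILURE.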

The DSDL defines SCDS_PROBE_IMMEDIATE_FAILOVER as 201. To force an immediate failover attempt without first attempting a restart, pass this special value as probe_status. Unless the Failover_mode property is set to RESTART_ONLY or LOG_ONLY, this probe status triggers an immediate failover of the resource group. For more information about the Failover_mode property, see the r_properties(5) man page.

Successive calls to scds_fm_action() compute a failure history by summing the value of the probe_status input parameter over the time interval defined by the Retry_interval property of the resource. Any failure history older than Retry_interval is purged from memory and is not used towards making the restart or failover decision.
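The windowed summation described above can be modeled as follows. This is an illustrative sketch, not the library's internal implementation: the types failure_entry and failure_history, the functions history_purge() and history_add(), and the fixed capacity HISTORY_MAX are all hypothetical, and the sketch assumes that an entry exactly Retry_interval seconds old is still counted.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define HISTORY_MAX 64  /* hypothetical capacity for this sketch */

struct failure_entry {
    time_t when;          /* time the probe result was reported */
    int    probe_status;  /* nonzero status reported at that time */
};

struct failure_history {
    struct failure_entry entries[HISTORY_MAX];
    size_t count;
};

/* Drop entries older than retry_interval seconds relative to now. */
static void
history_purge(struct failure_history *h, time_t now, long retry_interval)
{
    size_t kept = 0;

    for (size_t i = 0; i < h->count; i++) {
        if (now - h->entries[i].when <= retry_interval)
            h->entries[kept++] = h->entries[i];
    }
    h->count = kept;
}

/* Record a nonzero probe_status and return the sum over the window.
 * If the table is full, the new entry is silently dropped. */
static int
history_add(struct failure_history *h, time_t now, long retry_interval,
    int probe_status)
{
    int sum = 0;

    assert(retry_interval > 0);
    history_purge(h, now, retry_interval);
    if (probe_status > 0 && h->count < HISTORY_MAX) {
        h->entries[h->count].when = now;
        h->entries[h->count].probe_status = probe_status;
        h->count++;
    }
    for (size_t i = 0; i < h->count; i++)
        sum += h->entries[i].probe_status;
    return sum;
}
```

For example, with a Retry_interval of 300 seconds, two statuses of 50 reported 100 seconds apart sum to 100, while a status reported more than 300 seconds earlier no longer contributes.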

The scds_fm_action() function uses the following algorithm to choose which action to take:

Restart

If the accumulated history of failures reaches SCDS_PROBE_COMPLETE_FAILURE, scds_fm_action() restarts the resource by calling the STOP method of the resource followed by the START method. It ignores any PRENET_START or POSTNET_STOP methods defined for the resource type.

The status of the resource is set to SCHA_RSSTATUS_DEGRADED by making a scha_resource_setstatus() call, unless the resource already has that status.

If the restart attempt fails because the START or STOP methods of the resource fail, scha_control() is called with the GIVEOVER option to fail the resource group over to another node. If the scha_control() call succeeds, the resource group is failed over to another cluster node, and the call to scds_fm_action() never returns.

Upon a successful restart, failure history is purged. Another restart is attempted only if the failure history again accumulates to SCDS_PROBE_COMPLETE_FAILURE.

Failover

If the number of restarts attempted by successive calls to scds_fm_action() reaches the Retry_count value defined for the resource, a failover is attempted by making a call to scha_control() with the GIVEOVER option.

The status of the resource is set to SCHA_RSSTATUS_FAULTED by making a scha_resource_setstatus() call, unless the resource already has that status.

If the scha_control() call fails, the entire failure history maintained by scds_fm_action() is purged.

If the scha_control() call succeeds, the resource group is failed over to another cluster node, and the call to scds_fm_action() never returns.

A probe can trigger an immediate failover attempt without any restarts, by specifying a probe_status value of SCDS_PROBE_IMMEDIATE_FAILOVER.

No Action

If the accumulated history of failures remains below SCDS_PROBE_COMPLETE_FAILURE, no action is taken. In addition, if the probe_status value is 0, which indicates a successful check of the service, no action is taken, irrespective of the failure history.

The status of the resource is set to SCHA_RSSTATUS_OK by making a scha_resource_setstatus() call, unless the resource already has that status.
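The three cases above can be condensed into a small decision function. This is a sketch of one reading of the algorithm, not the DSDL source: the enum fm_action and the function choose_action() are hypothetical, the caller is assumed to pass in the failure sum for the current Retry_interval and the number of restarts already attempted, and the Failover_mode exceptions, the status updates, and the actual STOP/START and scha_control() calls are left out.

```c
#include <assert.h>

/* Normally supplied by <rgm/libdsdev.h>; defined here only so the
 * sketch compiles on its own. Values are those given in the manual. */
#ifndef SCDS_PROBE_COMPLETE_FAILURE
#define SCDS_PROBE_COMPLETE_FAILURE 100
#endif
#ifndef SCDS_PROBE_IMMEDIATE_FAILOVER
#define SCDS_PROBE_IMMEDIATE_FAILOVER 201
#endif

/* Hypothetical labels for the three possible outcomes. */
enum fm_action { FM_NO_ACTION, FM_RESTART, FM_FAILOVER };

/*
 * probe_status:    value reported by the current probe
 * accumulated:     failure sum over Retry_interval, including this probe
 * restarts_so_far: restarts already attempted
 * retry_count:     the Retry_count property of the resource
 */
static enum fm_action
choose_action(int probe_status, int accumulated, int restarts_so_far,
    int retry_count)
{
    assert(probe_status >= 0 && accumulated >= 0);

    /* A successful check never triggers an action. */
    if (probe_status == 0)
        return FM_NO_ACTION;

    /* The special value skips restarts entirely. */
    if (probe_status == SCDS_PROBE_IMMEDIATE_FAILOVER)
        return FM_FAILOVER;

    /* Partial failures below the threshold are only recorded. */
    if (accumulated < SCDS_PROBE_COMPLETE_FAILURE)
        return FM_NO_ACTION;

    /* Threshold reached: restart until Retry_count is used up. */
    if (restarts_so_far >= retry_count)
        return FM_FAILOVER;
    return FM_RESTART;
}
```

Under these assumptions, choose_action(50, 100, 0, 2) selects a restart, whereas the same failure sum with two restarts already attempted selects a failover.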

Parameters

The following parameters are supported:

handle

The handle that is returned from scds_initialize(3HA).

probe_status

A number that indicates the status of the data service: either a value between 0 and SCDS_PROBE_COMPLETE_FAILURE, or the special value SCDS_PROBE_IMMEDIATE_FAILOVER.

  • A value of 0 implies that the most recent data service check was successful.

  • A value of SCDS_PROBE_COMPLETE_FAILURE implies that the service has completely failed. A value between 0 and SCDS_PROBE_COMPLETE_FAILURE implies a partial failure of the service.

  • A value of SCDS_PROBE_IMMEDIATE_FAILOVER triggers a failover of the resource group without any restarts, unless the Failover_mode property is set to RESTART_ONLY or LOG_ONLY. For more information about the Failover_mode property, see the r_properties(5) man page.

elapsed_milliseconds

The time, in milliseconds, to complete the data service check. This value is reserved for future use.
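Although the value is currently reserved, a probe can still measure it. The following sketch times a probe with the POSIX monotonic clock; elapsed_ms() and timed_probe() are hypothetical helpers, and the probe callback signature is an assumption made for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() */
#endif

#include <assert.h>
#include <time.h>

/* Whole milliseconds between two CLOCK_MONOTONIC readings. */
static long
elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    long long ns;

    ns = (long long)(end->tv_sec - start->tv_sec) * 1000000000LL;
    ns += (long long)(end->tv_nsec - start->tv_nsec);
    assert(ns >= 0);  /* a monotonic clock never runs backward */
    return (long)(ns / 1000000LL);
}

/*
 * Run a probe callback, store its result in *probe_status, and
 * return how long the probe took in milliseconds.
 */
static long
timed_probe(int (*probe)(void), int *probe_status)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    *probe_status = probe();
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_ms(&start, &end);
}
```

The two results can then be passed to scds_fm_action() as the probe_status and elapsed_milliseconds arguments.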

Return Values

The scds_fm_action() function returns the following values:

0

The function succeeded.

nonzero

The function failed.

Errors

SCHA_ERR_NOERR

No action was taken, or a restart was successfully attempted.

SCHA_ERR_FAIL

A failover attempt was made but it did not succeed.

SCHA_ERR_NOMEM

System is out of memory.

Files

/usr/cluster/include/rgm/libdsdev.h

Include file

/usr/cluster/lib/libdsdev.so

Library

Attributes

See attributes(5) for descriptions of the following attributes:

ATTRIBUTE TYPE         ATTRIBUTE VALUE
Availability           ha-cluster/developer/api
Interface Stability    Evolving

See also

scds_fm_sleep(3HA), scds_initialize(3HA), scha_calls(3HA), scha_control(3HA), scds_fm_print_probes(3HA), scha_resource_setstatus(3HA), attributes(5), r_properties(5)