Sun Cluster 3.0 Data Services Installation and Configuration Guide

Clearing the STOP_FAILED Error Flag on Resources

When the Failover_mode resource property is NONE or SOFT and the STOP method of a resource fails, the individual resource goes into the STOP_FAILED state and its resource group goes into the ERROR_STOP_FAILED state. You cannot bring a resource group that is in this state online on any node, nor can you edit it (create or delete resources, or change resource group or resource properties).

How to Clear the STOP_FAILED Error Flag on Resources

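To complete this procedure, you must supply the following information: the names of the nodes on which the resource is in the STOP_FAILED state, the name of the resource, and the name of the resource group that contains it.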

For additional information, see scswitch(1M).

Perform this procedure from any cluster node.

  1. Become superuser on a node in the cluster.

  2. Identify which resources have gone into the STOP_FAILED state and on which nodes.

    # scstat -g

  3. Manually stop the resources and their monitors on the nodes on which they are in the STOP_FAILED state.

    This step might require killing processes or running resource type-specific commands or other commands.

  4. Manually set the state of these resources to OFFLINE on all the nodes on which they were manually stopped.

    # scswitch -c -h nodelist -j resource-name -f STOP_FAILED

    -c

    Clears the flag.

    -h nodelist

    Specifies the node names on which the resource was running.

    -j resource-name

    Specifies the name of the resource to take offline.

    -f STOP_FAILED

    Specifies the flag name.

  5. Check the resource group state on the nodes where the STOP_FAILED flag was cleared in Step 4. It should now be OFFLINE or ONLINE.

    # scstat -g

    If scstat -g shows that the resource group remains in the ERROR_STOP_FAILED state, take the resource group offline by using the following scswitch command.

    # scswitch -F -g resource-group-name

    -F

    Takes the resource group offline on all nodes that can master the group.

    -g resource-group-name

    Specifies the name of the resource group to take offline.

    This situation can occur if the resource group was being switched offline when the STOP method failure occurred and the resource that failed to stop had a dependency on other resources in the resource group. Otherwise, the resource group reverts to the ONLINE or OFFLINE state automatically after you have run the command in Step 4 on all STOP_FAILED resources.

    Now you can switch the resource group to the ONLINE state.
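
The scswitch commands from Step 4 and Step 5 can be collected into a small wrapper, sketched below. This is an illustration only, not part of the documented procedure. The node, resource, and resource group names (phys-node-1, phys-node-2, web-rs, web-rg) are hypothetical. The final step assumes that scswitch -z -g resource-group-name -h nodelist is the command you use to switch the group online; this procedure does not name that command, so confirm it against scswitch(1M). The script defaults to a dry run that only prints each command, so that you can review the commands before you run them as superuser.

```shell
#!/bin/sh
# Sketch only: wraps the scswitch commands from Steps 4 and 5.
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# clear_stop_failed NODELIST RESOURCE
# NODELIST is a comma-separated list of node names, as passed to -h.
clear_stop_failed() {
    run scswitch -c -h "$1" -j "$2" -f STOP_FAILED
}

# offline_group RESOURCE_GROUP
# Needed only if the group stays in ERROR_STOP_FAILED after the flag is cleared.
offline_group() {
    run scswitch -F -g "$1"
}

# online_group RESOURCE_GROUP NODELIST
# Assumption: -z -g ... -h ... is not named in this procedure; see scswitch(1M).
online_group() {
    run scswitch -z -g "$1" -h "$2"
}

# Example with hypothetical names.
clear_stop_failed phys-node-1,phys-node-2 web-rs
offline_group web-rg
online_group web-rg phys-node-1
```

Run the script once with the default DRY_RUN=1 and check the printed commands against the scstat -g output from Step 2. Set DRY_RUN=0 only after you have manually stopped the resources as described in Step 3, because clearing the flag does not stop any processes.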