Sun Cluster 3.1 Data Service Planning and Administration Guide

How to Clear the STOP_FAILED Error Flag on Resources

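To complete this procedure, you must supply the names of the nodes on which the resource is in the STOP_FAILED state, the name of each affected resource, and the name of its resource group.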

See the scswitch(1M) man page for additional information.


Note – Perform this procedure from any cluster node.


  1. Become superuser on a cluster member.

  2. Identify which resources have gone into the STOP_FAILED state and on which nodes.


    # scstat -g
    
  3. Manually stop the resources and their monitors on the nodes on which they are in the STOP_FAILED state.

    This step might require that you kill processes, run commands that are specific to the resource type, or run other commands.

  4. Manually set the state of these resources to OFFLINE on all of the nodes on which you manually stopped the resources.


    # scswitch -c -h nodelist -j resource -f STOP_FAILED
    
    -c

    Clears the flag.

    -h nodelist

    Specifies the node names on which the resource was running.

    -j resource

    Specifies the name of the resource to switch offline.

    -f STOP_FAILED

    Specifies the flag name.

  5. Check the resource group state on the nodes where you cleared the STOP_FAILED flag in Step 4.

    The resource group state should now be OFFLINE or ONLINE.


    # scstat -g
    

    The output of scstat -g indicates whether the resource group remains in the ERROR_STOP_FAILED state. If it does, run the following scswitch command to switch the resource group offline on the appropriate nodes.


    # scswitch -F -g resource-group
    

    -F

    Switches the resource group offline on all of the nodes that can master the group.

    -g resource-group

    Specifies the name of the resource group to switch offline.

    A resource group can remain in the ERROR_STOP_FAILED state if it was being switched offline when the STOP method failure occurred and the resource that failed to stop had a dependency on other resources in the resource group. Otherwise, the resource group reverts to the ONLINE or OFFLINE state automatically after you run the command in Step 4 on all of the STOP_FAILED resources.

    Now you can switch the resource group to the ONLINE state.
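
The steps above can be sketched as a few small shell helpers. This is an illustration only, not part of Sun Cluster: the function names are hypothetical, and the filter assumes that scstat -g reports the state as text that matches "stop failed" or "STOP_FAILED" in any letter case, which you should verify against the output on your own cluster. The two print_* helpers only echo the scswitch commands from Step 4 and Step 5 so that you can review them before running anything.

```shell
#!/bin/sh
# Sketch only. All function names here are hypothetical helpers,
# not commands shipped with Sun Cluster.

# Step 2: read `scstat -g` output on stdin and keep only the lines that
# mention a stop failure.
# ASSUMPTION: the state appears as "stop failed" or "STOP_FAILED"
# (any case); adjust the pattern to what your release prints.
find_stop_failed() {
    grep -i 'stop[ _]failed'
}

# Step 4: print (do not run) the command that clears the STOP_FAILED flag.
#   $1 = node list, in the form that scswitch -h accepts
#   $2 = resource name
print_clear_flag_cmd() {
    if [ $# -ne 2 ]; then
        echo "usage: print_clear_flag_cmd nodelist resource" >&2
        return 1
    fi
    echo "scswitch -c -h $1 -j $2 -f STOP_FAILED"
}

# Step 5: print (do not run) the command that switches a resource group
# offline if it is still in the ERROR_STOP_FAILED state.
#   $1 = resource group name
print_offline_group_cmd() {
    if [ $# -ne 1 ]; then
        echo "usage: print_offline_group_cmd resource-group" >&2
        return 1
    fi
    echo "scswitch -F -g $1"
}

# Example use on a cluster node (node and resource names are made up):
#   scstat -g | find_stop_failed
#   print_clear_flag_cmd phys-schost-1 nfs-res
#   print_offline_group_cmd nfs-rg
```

Printing the commands instead of executing them is a deliberate choice here: Step 3 requires manual, resource-specific cleanup, so clearing the flag should remain a reviewed action.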