Sun Cluster 3.0 12/01 Hardware Guide

Testing Cluster Interconnect and Network Adapter Failover Group Redundancy

This section provides procedures for testing cluster interconnect and Network Adapter Failover (NAFO) group redundancy.

How to Test Cluster Interconnects

  1. Disconnect one of the cluster transport cables from a primary node that masters a device group.

    Messages appear on the console of each node, and error messages are logged in the /var/adm/messages file. The scstat(1M) command shows that the Sun Cluster software assigned a faulted status to the cluster transport path you disconnected. This fault does not result in a failover.

  2. Disconnect the remaining cluster transport cable from the primary node you identified in Step 1.

    Messages appear on the console of each node, and error messages are logged in the /var/adm/messages file. The scstat command shows that the Sun Cluster software assigned a faulted status to the cluster transport path you disconnected. This action causes the primary node to go down, resulting in a partitioned cluster.

    For conceptual information on failure fencing or split brain, see the Sun Cluster 3.0 12/01 Concepts document.

  3. On another node, run the scstat command to verify that the secondary node took ownership of the device group that the primary node mastered.


    # scstat
    
  4. Reconnect all cluster transport cables.

  5. Boot the initial primary, which you identified in Step 1, into cluster mode.


    {0} ok boot
    
  6. Verify that the Sun Cluster software assigned a path online status to each cluster transport path you reconnected in Step 4.


    # scstat
    

    Use the scconf -p command to determine whether your device group has the device group failback option enabled. If the option is enabled, skip Step 7 because the system boot process moves ownership of the device group back to the initial primary. Otherwise, go to Step 7 to move ownership of the device group back to the initial primary.

  7. If you do not have the device group failback option enabled, move ownership of the device group back to the initial primary.


    # scswitch -S -h nodename
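
The status checks in Step 1, Step 2, and Step 6 can be partly scripted. The following is a minimal sketch, not part of the Sun Cluster software. It assumes that scstat prints one line per transport path that begins with "Transport path:" and ends with a status such as "Path online" or "faulted"; that layout is an assumption and is not stated in this guide. The helper name offline_paths is hypothetical.

```shell
#!/bin/sh
# offline_paths (hypothetical helper): reads scstat output on standard input
# and prints every transport path line whose status is not "Path online".
# ASSUMPTION: each path appears on one line that contains "Transport path:"
# and, when healthy, the text "Path online".
offline_paths() {
    grep 'Transport path:' | grep -v 'Path online'
}

# Intended use on a cluster node (not run here):
#   scstat | offline_paths
# No output means that every listed transport path reported "Path online".
```

After Step 1 or Step 2, the helper would be expected to print the path you disconnected. After Step 6, it would be expected to print nothing.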
    

How to Test Network Adapter Failover Groups

Perform this procedure on each node.

  1. Identify the current active network adapter.


    # pnmstat -l
    
  2. Disconnect one public network cable from the current active network adapter.

    Error messages appear on the node's console. This action causes a NAFO failover to a backup network adapter.

  3. From the master console, verify that the Sun Cluster software failed over to the backup NAFO adapter.

    A NAFO failover occurred if the backup NAFO adapter displays an active status.


    # pnmstat -l
    
  4. Reconnect the public network cable, and wait for the initial network adapter to come online.

  5. Switch over all IP addresses that are hosted by the active network adapter to the initial network adapter, and make the initial network adapter the active network adapter.


    # pnmset switch adapter
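
To compare the active adapter before and after the failover in Step 1 and Step 3, the pnmstat -l output can be reduced to one line per NAFO group. The following is a minimal sketch under stated assumptions: it assumes that pnmstat -l prints a header row followed by one row per group, with the group name in the first column and the active adapter in the last column. That column layout is an assumption and is not shown in this guide. The helper name active_adapters is hypothetical.

```shell
#!/bin/sh
# active_adapters (hypothetical helper): reads `pnmstat -l` output on
# standard input and prints "group adapter" for each NAFO group.
# ASSUMPTION: line 1 is a header; on every later line the first field is
# the NAFO group name and the last field is the currently active adapter.
active_adapters() {
    awk 'NR > 1 && NF >= 2 { print $1, $NF }'
}

# Intended use on a cluster node (not run here):
#   pnmstat -l | active_adapters > /tmp/nafo.before    # Step 1
#   pnmstat -l | active_adapters > /tmp/nafo.after     # Step 3
#   diff /tmp/nafo.before /tmp/nafo.after
# A difference for a group suggests that its active adapter changed.
```

The file names under /tmp are placeholders. Confirm the result against the full pnmstat -l output before you rely on the reduced form.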