Sun Cluster 3.0-3.1 Hardware Administration Manual for Solaris OS

Chapter 9 Recabling and Replacing Disk Devices

This chapter contains the procedures for recabling and replacing disk devices. This chapter provides the following procedures.

  • How to Replace a Failed Boot Disk in a Running Cluster

  • How to Move a Disk Cable to a New Host Adapter

  • How to Move a Disk Cable From One Node to Another Node

  • How to Update Sun Cluster Software to Reflect Proper Device Configuration

Replacing a Failed Boot Disk

This section contains the procedure for replacing a failed boot disk.

How to Replace a Failed Boot Disk in a Running Cluster

Use this procedure to replace a failed, mirrored boot disk on a node in a running cluster. This procedure applies to both Solstice DiskSuite/Solaris Volume Manager and VERITAS Volume Manager and assumes that at least one mirror of the boot disk is still available. This procedure defines Node N as the node on which you are replacing the failed boot disk.

If no mirror is available, see your Sun Cluster system administration documentation for the procedure to restore data on the boot disk.

  1. Determine whether Node N is up and running.

  2. Determine whether your disk drive is hot-pluggable.

  3. Determine the resource groups and device groups running on Node N.

    Record this information; you will use it in Step 6 of this procedure to return the resource groups and device groups to Node N.


    # scstat
    

    For more information, see your Sun Cluster system administration documentation.
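
    If you want only the sections of the status report that this step uses, the scstat command also accepts section flags: -g reports resource group status and -D reports device group status.


    # scstat -g
    # scstat -D
    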

  4. Move all resource groups and device groups off Node N.


    # scswitch -S -h from-node
    

    For more information, see your Sun Cluster system administration documentation.
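
    For example, if Node N is the hypothetical node phys-schost-1, you would evacuate it as follows.


    # scswitch -S -h phys-schost-1
    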

  5. Replace the failed boot disk by using the procedure that is outlined in your volume manager documentation.

    For more information, see your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation. For the procedure for replacing a boot disk, see Recovering From Boot Problems in the Solaris Volume Manager Administration Guide.
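
    As an illustration only, the following Solstice DiskSuite/Solaris Volume Manager sketch copies the partition table from the surviving submirror, re-enables the failed submirror components, and, on SPARC based systems, reinstalls the boot block. The disk, slice, and mirror names are hypothetical; your volume manager documentation remains the authoritative source.


    # prtvtoc /dev/rdsk/c0t1d0s2 | fmthard -s - /dev/rdsk/c0t0d0s2
    # metareplace -e d10 c0t0d0s0
    # installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0
    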

  6. If you moved all resource groups and device groups off Node N in Step 4, return the resource groups and device groups that you identified in Step 3 to Node N.


    # scswitch -z -g resource-group -h nodename
    # scswitch -z -D device-group-name -h nodename
    

    For more information, see your Sun Cluster system administration documentation.
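
    For example, with a hypothetical resource group nfs-rg, a hypothetical device group dg-schost-1, and phys-schost-1 as Node N, you would run the following commands.


    # scswitch -z -g nfs-rg -h phys-schost-1
    # scswitch -z -D dg-schost-1 -h phys-schost-1
    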

Moving a Disk Cable

You can move a disk cable to a different host adapter on the same bus, for example, when a host adapter fails. However, replacing the failed host adapter is usually a better solution than recabling the disk to a different host adapter. You might also want to move a disk cable to a different host adapter on the same bus to improve performance.

This section provides the following two procedures for moving a disk cable.

Use one of these two procedures to prevent interference with normal operation of your cluster when you want to move a disk cable to a different host adapter on the same bus. If you do not follow these procedures correctly, you might see an error the next time you run the scdidadm -r command or the scgdevs command. If you see an error message that says did reconfiguration discovered invalid diskpath, go to How to Update Sun Cluster Software to Reflect Proper Device Configuration.

How to Move a Disk Cable to a New Host Adapter

Use this procedure to move a disk cable to a new host adapter within a node.


Caution –

Failure to follow this cabling procedure might introduce invalid device IDs and render the devices inaccessible.


  1. Stop all I/O to the affected disks.

    For more information, see your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.

  2. Unplug the cable from the old host adapter.

  3. From the local node, unconfigure all drives that are affected by the recabling.


    # cfgadm
    

    Or reboot the local node.


    # reboot -- -r
    
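
    When run without options, the cfgadm command only lists attachment points. To unconfigure a specific drive, pass its attachment point to the -c unconfigure option. The controller and target names below are hypothetical.


    # cfgadm -al
    # cfgadm -c unconfigure c1::dsk/c1t3d0
    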
  4. From the local node, update the Solaris device link.


    # devfsadm -C
    
  5. From the local node, update the device ID path.


    # scdidadm -C
    
  6. Connect the cable to the new host adapter.

  7. From the local node, configure the drives in the new location.


    # cfgadm
    

    Or reboot the local node.


    # reboot -- -r
    
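
    For example, assuming the drive now appears on the hypothetical controller c2, you would configure it as follows.


    # cfgadm -c configure c2::dsk/c2t3d0
    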
  8. From the local node, add the new device ID path.


    # scgdevs
    
Troubleshooting

If you did not follow this procedure correctly, you might see an error the next time you run the scdidadm -r command or the scgdevs command. If you see an error message that says did reconfiguration discovered invalid diskpath, go to How to Update Sun Cluster Software to Reflect Proper Device Configuration.

How to Move a Disk Cable From One Node to Another Node

Use this procedure to move a disk cable from one node to another node.


Caution –

Failure to follow this cabling procedure might introduce invalid device IDs and render the devices inaccessible.


  1. Delete all references to the device ID path that you are removing from all volume manager and data service configurations.

    For more information, see your Sun Cluster data services collection and your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.
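
    As an illustration only, with Solstice DiskSuite/Solaris Volume Manager you would remove a shared drive from a diskset with the metaset command. The diskset name and DID device name below are hypothetical.


    # metaset -s dg-schost-1 -d /dev/did/rdsk/d4
    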

  2. Stop all I/O to the affected disks.

    For more information, see your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.

  3. Unplug the cable from the old node.

  4. From the old node, unconfigure all drives that are affected by the recabling.


    # cfgadm
    

    Or reboot the old node.


    # reboot -- -r
    
  5. From the old node, update the Solaris device link.


    # devfsadm -C
    
  6. From the old node, update the device ID path.


    # scdidadm -C
    
  7. Connect the cable to the new node.

  8. From the new node, configure the drives in the new location.


    # cfgadm
    

    Or reboot the new node.


    # reboot -- -r
    
  9. From the new node, create the new Solaris device links.


    # devfsadm
    
  10. From the new node, add the new device ID path.


    # scgdevs
    
  11. Add the device ID path on the new node to any volume manager and data service configurations that are required.

    When you configure data services, check that your node failover preferences are set to reflect the new configuration.

    For more information, see your Sun Cluster data services collection and your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.
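
    As an illustration only, one way to adjust failover preferences is to reset the resource group's Nodelist property so that the preferred node is listed first. The resource group and node names below are hypothetical.


    # scrgadm -c -g nfs-rg -y Nodelist=phys-schost-2,phys-schost-1
    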

Troubleshooting

If you did not follow this procedure correctly, you might see an error the next time you run the scdidadm -r command or the scgdevs command. If you see an error message that says did reconfiguration discovered invalid diskpath, go to How to Update Sun Cluster Software to Reflect Proper Device Configuration.

How to Update Sun Cluster Software to Reflect Proper Device Configuration

If you see the following error when you run the scdidadm -r command or the scgdevs command, the Sun Cluster software does not reflect the proper device configuration because of improper device recabling.


did reconfiguration discovered invalid diskpath.
This path must be removed before a new path
can be added. Please run did cleanup (-C)
then re-run did reconfiguration (-r).

Use this procedure to ensure that the Sun Cluster software becomes aware of the new configuration and to guarantee device availability.

  1. Ensure that your cluster meets the following conditions.

    • The cable configuration is correct.

    • The cable that you are removing is detached from the old node.

    • The old node has been removed from any volume manager and data service configurations, as required.

    For more information, see your Sun Cluster data services collection and your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.

  2. From all nodes, one node at a time, unconfigure all drives.


    # cfgadm
    

    Or reboot all nodes, one node at a time.


    # reboot -- -r
    
  3. From all nodes, one node at a time, update the Solaris device link.


    # devfsadm -C
    
  4. From all nodes, one node at a time, update the device ID path.


    # scdidadm -C
    
  5. From all nodes, one node at a time, configure all drives.


    # cfgadm
    

    Or reboot all nodes, one node at a time.


    # reboot -- -r
    
  6. From any node, add the new device ID path.


    # scgdevs
    
  7. From all nodes that are affected by the recabling, verify that SCSI reservations are in the correct state.


    # scdidadm -R device
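    

    For example, assuming the recabled disk is DID instance 3 (a hypothetical value), you might run the following command.


    # scdidadm -R 3
    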