This chapter describes how to maintain Sun StorEdge 3310 and 3320 SCSI RAID arrays in a Sun Cluster environment.
Read an entire procedure before you perform any of its steps. If you are not reading an online version of this document, ensure that you have the books listed in the Preface available.
This chapter contains the following procedures.
This section contains the procedures for maintaining a RAID storage array in a Sun Cluster environment. Table 2–1 lists the maintenance tasks that are cluster-specific. Tasks that are not cluster-specific are listed after the table.
When you upgrade firmware on a storage device or on an enclosure, redefine the stripe size of a LUN, or perform other LUN operations, a device ID might change unexpectedly. If a device ID changed unexpectedly, the following error message appears on your console when you check the device ID configuration by running the scdidadm -c command.
device id for nodename:/dev/rdsk/cXtYdZsN does not match physical device's id for ddecimalnumber, device may have been replaced.
To fix device IDs that report this error, run the scdidadm -R command for each affected device.
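For example, if scdidadm -c reports a mismatch for DID instance d20 (a hypothetical instance name; confirm the affected instance with scdidadm -L), you might run the following commands from a node that reported the error.

# scdidadm -c
# scdidadm -R d20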
Task | Information
---|---
Remove a RAID storage array. |
Replace a RAID storage array. To replace a RAID storage array, remove the RAID storage array, then add a new RAID storage array to the configuration. |
Add a JBOD as an expansion unit. Follow the same procedure that you use in a noncluster environment. If you plan to use the disk drives in the expansion unit to create a new LUN, see How to Create and Map a LUN. | Sun StorEdge 3000 Family Installation, Operation, and Service Manual
Remove a JBOD expansion unit. Follow the same procedure that you use in a noncluster environment. If you plan to remove a LUN in conjunction with the disk drive, see How to Unmap and Delete a LUN. | Sun StorEdge 3000 Family Installation, Operation, and Service Manual
Add a disk drive. Follow the same procedure that you use in a noncluster environment. If you plan to create a new LUN with the disk drive, see How to Create and Map a LUN. | Sun StorEdge 3000 Family FRU Installation Guide
Remove a disk drive. Follow the same procedure that you use in a noncluster environment. If you plan to remove a LUN in conjunction with the disk drive, see How to Unmap and Delete a LUN. If your configuration is running in RAID level 0, take appropriate action to prepare the volume manager for the impacted disk to be inaccessible. | Sun StorEdge 3000 Family FRU Installation Guide
Replace a failed controller or restore an offline controller. |
Replace a host-to-storage array SCSI cable. Switch the device group over to the other node before performing this procedure. Then follow the same procedure that you use in a noncluster environment. | Sun StorEdge 3000 Family Installation, Operation, and Service Manual
Replace the I/O module. |
Replace the terminator module. |
Replace a failed host adapter. |
The following administrative tasks require no cluster-specific procedures. See the Sun StorEdge 3000 Family FRU Installation Guide for instructions on replacing these FRUs.
Use this procedure to remove a RAID storage array from a running cluster.
This procedure removes all data on the RAID storage array that you remove.
This procedure assumes that your nodes are not configured with dynamic reconfiguration functionality. If your nodes are configured for dynamic reconfiguration, see your Sun Cluster Hardware Administration Manual for Solaris OS.
Determine whether one of the disks in the RAID storage array that you are removing is configured as a quorum device.
# scstat -q
If no, proceed to Step 2.
If yes, relocate that quorum device to another suitable RAID storage array.
For procedures about how to add and remove quorum devices, see your Sun Cluster system administration documentation.
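As a sketch, assuming the quorum device currently resides on DID instance d4 and a suitable disk in another storage array is d12 (both names hypothetical), you might relocate the quorum device with the scconf command as follows. Adding the new quorum device before removing the old one keeps the expected vote count intact.

# scconf -a -q globaldev=d12
# scconf -r -q globaldev=d4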
If necessary, back up the metadevice or volume.
For more information, see your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.
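As a minimal sketch, assuming a Solstice DiskSuite/Solaris Volume Manager metadevice named d10 that contains a UFS file system and a local tape drive at /dev/rmt/0 (both hypothetical), a backup might look like the following.

# ufsdump 0ucf /dev/rmt/0 /dev/md/rdsk/d10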
Perform volume management administration to remove the RAID storage array from the configuration.
If a volume manager manages the LUN, run the appropriate Solstice DiskSuite/Solaris Volume Manager commands or VERITAS Volume Manager commands to remove the LUN from any diskset or disk group. For more information, see your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation. See the following paragraph for additional VERITAS Volume Manager commands that are required.
LUNs that were managed by VERITAS Volume Manager must be completely removed from VERITAS Volume Manager control before you can delete the LUNs from the Sun Cluster environment. After you delete the LUN from any disk group, use the following commands on both nodes to remove the LUN from VERITAS Volume Manager control.
# vxdisk offline cNtXdY
# vxdisk rm cNtXdY
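For LUNs under Solstice DiskSuite/Solaris Volume Manager control, the equivalent cleanup might look like the following sketch; the diskset name datads, the metadevice d10, and the DID name d4 are hypothetical.

# metaclear -s datads d10
# metaset -s datads -d /dev/did/rdsk/d4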
Identify the LUNs that you need to remove.
# cfgadm -al
On all nodes, remove references to the LUNs in the RAID storage array that you removed.
# cfgadm -c unconfigure cN::dsk/cNtXdY
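For example, to remove the reference to the LUN at target 3, LUN 0 on controller c1 (hypothetical values), you might run the following command.

# cfgadm -c unconfigure c1::dsk/c1t3d0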
Disconnect the SCSI cables from the RAID storage array.
On both nodes, remove the paths to the LUN that you are deleting.
# devfsadm -C
On both nodes, remove all obsolete device IDs (DIDs).
# scdidadm -C
If no other LUN is assigned to the target and LUN ID, remove the LUN entries from the /kernel/drv/sd.conf file.
Perform this step on both nodes to prevent extended boot time caused by unassigned LUN entries.
Do not remove the default t0d0 entries.
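Entries in /kernel/drv/sd.conf for nondefault targets and LUNs have the following form. This sketch assumes a LUN at target 2, LUN 1 (hypothetical values) that is no longer assigned; remove only the lines for target and LUN IDs that are no longer in use.

name="sd" class="scsi" target=2 lun=1;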
Power off the RAID storage array. Disconnect the RAID storage array from the AC power source.
For the procedure about how to power off a storage array, see the Sun StorEdge 3000 Family Installation, Operation, and Service Manual.
Remove the RAID storage array.
For the procedure about how to remove a storage array, see the Sun StorEdge 3000 Family Installation, Operation, and Service Manual.
If necessary, remove any unused host adapters from the nodes.
For the procedure about how to remove a host adapter, see your Sun Cluster system administration documentation and the documentation that shipped with your host adapter and node.
From any node, verify that the configuration is correct.
# scdidadm -L
If the RAID storage array is configured with dual controllers, see the Sun StorEdge 3000 Family FRU Installation Guide for controller replacement procedures. If the RAID storage array is configured with a single controller, perform the following procedure to ensure high availability.
Detach the submirrors on the RAID storage array that are connected to the controller that you are replacing. Detaching the submirrors stops all I/O activity to the RAID storage array.
For more information, see your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.
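For example, with Solstice DiskSuite/Solaris Volume Manager, detaching a submirror might look like the following; the mirror name d10 and submirror name d11 are hypothetical.

# metadetach d10 d11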
Replace the controller.
For the procedure about how to replace a controller, see the Sun StorEdge 3000 Family Installation, Operation, and Service Manual.
Reattach the submirrors to resynchronize them.
For more information, see your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.
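Continuing the hypothetical example from the detach step, you might reattach the submirror as follows; the resynchronization then starts automatically.

# metattach d10 d11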
Use this procedure to replace a RAID storage array I/O module.
If your configuration is running in RAID level 0, take appropriate action to prepare the volume manager for the impacted disk to be inaccessible.
Detach the submirrors on the RAID storage array that are connected to the I/O module that you are replacing. Detaching the submirrors stops all I/O activity to the RAID storage array.
For more information, see your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.
Disconnect the SCSI cables from the hosts.
Disconnect the SCSI cables from the I/O module.
Replace the I/O module.
For the procedure about how to replace the I/O module, see the Sun StorEdge 3000 Family FRU Installation Guide.
Reconnect the SCSI cables to the I/O module.
Reconnect the SCSI cables to the host.
Reattach the submirrors to resynchronize them.
For more information, see your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.
Use this procedure to replace a RAID storage array terminator module.
If your configuration is running in RAID level 0, take appropriate action to prepare the volume manager for the impacted disk to be inaccessible.
Detach the submirrors on the RAID storage array that are connected to the terminator module that you are replacing. Detaching the submirrors stops all I/O activity to the RAID storage array.
For more information, see your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.
Replace the terminator module.
For the procedure about how to replace the terminator module, see the Sun StorEdge 3000 Family FRU Installation Guide.
Reattach the submirrors to resynchronize them.
For more information, see your Solstice DiskSuite/Solaris Volume Manager or VERITAS Volume Manager documentation.
Use this procedure to replace a failed host adapter in a running cluster. This procedure defines Node A as the node with the failed host adapter that you are replacing.
This procedure relies on the following prerequisites and assumptions.
Except for the failed host adapter, your cluster is operational and all nodes are powered on.
Your nodes are not configured with dynamic reconfiguration functionality.
If your nodes are configured for dynamic reconfiguration and you are using two entirely separate hardware paths to your shared data, see the Sun Cluster Hardware Administration Manual for Solaris OS and skip steps that instruct you to shut down the cluster.
If you are using a single, dual-port HBA to provide the connections to your shared data, you cannot use dynamic reconfiguration for this procedure. Follow all steps in the procedure. For the details on the risks and limitations of this configuration, see Configuring Cluster Nodes With a Single, Dual-Port HBA in Sun Cluster 3.0-3.1 Hardware Administration Manual for Solaris OS.
Determine the resource groups and device groups that are running on Node A.
Record this information because you use it in Step 11 of this procedure to return resource groups and device groups to Node A.
# scstat
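The -g and -D options narrow the output to resource group status and device group status, respectively.

# scstat -g
# scstat -D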
Record the details of any metadevices that are affected by the failed host adapter.
Record this information because you use it in Step 10 of this procedure to repair any affected metadevices.
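With Solstice DiskSuite/Solaris Volume Manager, you can capture this information with the metastat command; the diskset name datads is hypothetical, and you can omit the -s option for local metadevices.

# metastat -s datads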
Move all resource groups and device groups off Node A.
# scswitch -S -h nodename
Shut down Node A.
For the full procedure about how to shut down and power off a node, see your Sun Cluster system administration documentation.
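On a Solaris node, a typical sequence to bring the node down to the ok prompt looks like the following sketch.

# shutdown -y -g0 -i0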
Power off Node A.
Replace the failed host adapter.
For the procedure about how to remove and add host adapters, see the documentation that shipped with your nodes.
Determine whether you need to upgrade the node's host adapter firmware.
If you need to upgrade the firmware, upgrade the host adapter firmware on Node A.
PatchPro is a patch-management tool that simplifies the selection and download of patches required for installation or maintenance of Sun Cluster software. PatchPro provides an Interactive Mode tool especially for Sun Cluster, which makes the installation of patches easier, and an Expert Mode tool that helps you maintain your configuration with the latest set of patches. Expert Mode is especially useful for obtaining all of the latest patches, not just the high availability and security patches.
To access the PatchPro tool for Sun Cluster software, go to http://www.sun.com/PatchPro/, click Sun Cluster, then choose either Interactive Mode or Expert Mode. Follow the instructions in the PatchPro tool to describe your cluster configuration and download the patches.
For third-party firmware patches, see the SunSolve Online site at http://sunsolve.ebay.sun.com.
Boot Node A into cluster mode.
For more information about how to boot nodes, see your Sun Cluster system administration documentation.
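From the ok prompt, boot the node. A Sun Cluster node boots into cluster mode by default; the -x option would instead boot it into noncluster mode.

ok boot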
Perform any volume management procedures that are necessary to fix any metadevices affected by this procedure, as you identified in Step 2.
For more information, see your volume manager software documentation.
Return the resource groups and device groups you identified in Step 1 to Node A.
# scswitch -z -g resource-group -h nodename
# scswitch -z -D device-group-name -h nodename
For more information, see your Sun Cluster system administration documentation.