1. Installing and Maintaining a SCSI RAID Storage Device
How to Install a Storage Array in a New Cluster
How to Add a Storage Array to an Existing Cluster
How to Reset the LUN Configuration
How to Correct Mismatched Device ID Numbers
FRUs That Do Not Require Oracle Solaris Cluster Maintenance Procedures
Sun StorEdge A1000 Array and Netra st A1000 Array FRUs
Sun StorEdge A3500 System FRUs
How to Replace a Failed Controller or Restore an Offline Controller
How to Upgrade Controller Module Firmware
The maintenance procedures in FRUs That Do Not Require Oracle Solaris Cluster Maintenance Procedures are performed in the same way as in a noncluster environment. Table 1-3 lists the procedures that require cluster-specific steps.
Note - When you upgrade firmware on a storage device or on an enclosure, redefine the stripe size of a LUN, or perform other LUN operations, a device ID might change unexpectedly. When you perform a check of the device ID configuration by running the cldevice check command, the following error message appears on your console if the device ID changed unexpectedly.
device id for nodename:/dev/rdsk/cXtYdZsN does not match physical device's id for ddecimalnumber, device may have been replaced.
To fix device IDs that report this error, run the cldevice repair command for each affected device.
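When several devices report this error, the affected path can be extracted from each message and passed to cldevice repair. The sketch below only parses the message text; the node and device names in the sample line are illustrative, and cldevice itself exists only on a cluster node.

```shell
# Parse the /dev/rdsk path out of a 'cldevice check' error line so it can be
# passed to 'cldevice repair'. The sample message follows the format quoted
# above; phys-schost-1 and c1t3d0s2 are illustrative names.
err="device id for phys-schost-1:/dev/rdsk/c1t3d0s2 does not match physical device's id for d4, device may have been replaced."
dev=$(printf '%s\n' "$err" | sed -n "s|^device id for [^:]*:\(/dev/rdsk/[^ ]*\) .*|\1|p")
echo "$dev"
# On a cluster node you would then run:  cldevice repair "$dev"
```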
Table 1-3 Task Map: Maintaining a Storage Array or Storage System
Each storage device has a different set of FRUs that do not require cluster-specific procedures. Choose among the following storage devices:
The following is a list of administrative tasks that require no cluster-specific procedures. See the Sun StorEdge A1000 and D1000 Installation, Operations, and Service Manual and the Netra st A1000/D1000 Installation and Maintenance Manual for these procedures.
Replacing a storage array-to-host SCSI cable requires no cluster-specific procedures. See the Sun StorEdge RAID Manager User’s Guide and the Sun StorEdge RAID Manager Release Notes for these procedures.
With the exception of one item, the following is a list of administrative tasks that require no cluster-specific procedures. Shut down the cluster, and then see the Sun StorEdge A3500/A3500FC Controller Module Guide, the Sun StorEdge A1000 and D1000 Installation, Operations, and Service Manual, and the Sun StorEdge Expansion Cabinet Installation and Service Manual for the following procedures. See the Oracle Solaris Cluster system administration documentation for procedures about how to shut down a cluster. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
Replacing a power cord that connects to the cabinet power distribution unit (see the Sun StorEdge Expansion Cabinet Installation and Service Manual).
Replacing a power cord to a storage array (see the Sun StorEdge A1000 and D1000 Installation, Operations, and Service Manual).
The following is a list of administrative tasks that require no cluster-specific procedures. See the Sun StorEdge A3500/A3500FC Controller Module Guide, the Sun StorEdge RAID Manager User’s Guide, the Sun StorEdge RAID Manager Release Notes, the Sun StorEdge FC-100 Hub Installation and Service Manual, and the documentation that shipped with your FC hub or FC switch for the following procedures.
Replacing a SCSI cable from the controller module to the storage array.
Replacing a storage array-to-host or storage array–to-hub fiber-optic cable.
Replacing an FC hub (see the Sun StorEdge FC-100 Hub Installation and Service Manual).
Replacing an FC hub gigabit interface converter (GBIC) or Small Form-Factor Pluggable (SFP) that connects cables to the host or hub.
Caution - This procedure removes all data that is on the storage array or storage system you are removing.
Before You Begin
This procedure relies on the following prerequisites and assumptions.
Your cluster is operational.
You no longer need the data that is stored on the storage array or storage system you are removing.
This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical.
To perform this procedure, become superuser or assume a role that provides solaris.cluster.read and solaris.cluster.modify RBAC authorization.
Note - Your storage array or storage system might not support LUNs as quorum devices. To determine if this restriction applies to your storage array or storage system, see Restrictions and Requirements.
To determine whether any LUNs in the storage array are quorum devices, use the following command.
# clquorum show
For procedures about how to add and remove quorum devices, see your Oracle Solaris Cluster system administration documentation. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
For instructions, see your storage device documentation and your operating system documentation.
For more information, see your Solaris Volume Manager or Veritas Volume Manager documentation.
You must completely remove LUNs that were managed by Veritas Volume Manager from Veritas Volume Manager control before you can delete the LUNs.
# vxdisk offline cNtXdY
# vxdisk rm cNtXdY
For the procedure about how to delete a LUN, see your storage device's documentation.
# rm /dev/rdsk/cNtXdY*
# rm /dev/dsk/cNtXdY*
# rm /dev/osa/dev/dsk/cNtXdY*
# rm /dev/osa/dev/rdsk/cNtXdY*
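A loop over the four directories listed above can generate the removal commands. This is a dry-run sketch: it only prints the commands, and cNtXdY is the same placeholder device name used throughout this section.

```shell
# Dry run: print the rm command for each device-node directory of a deleted
# LUN. Substitute the real cNtXdY name, then run the printed commands (or
# pipe the list to sh) to actually remove the nodes.
dev=cNtXdY
cmds=$(for dir in /dev/rdsk /dev/dsk /dev/osa/dev/dsk /dev/osa/dev/rdsk; do
    printf 'rm %s/%s*\n' "$dir" "$dev"
done)
echo "$cmds"
```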
# cldevice clear
The RAID Manager software creates two paths to the LUN in the /dev/osa/dev/rdsk directory. Substitute the cNtXdY number from the other controller module in the storage array to determine the alternate path.
For example:
# lad
c0t5d0 1T93600714 LUNS: 0 1
c1t4d0 1T93500595 LUNS: 2
Therefore, the alternate paths are as follows:
/dev/osa/dev/dsk/c1t4d1*
/dev/osa/dev/rdsk/c1t4d1*
# rm /dev/osa/dev/dsk/cNtXdY*
# rm /dev/osa/dev/rdsk/cNtXdY*
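The alternate path can be derived mechanically: keep the dY LUN number and swap in the other controller's cNtX prefix. A sketch using the lad example above (the c0t5d1 and c1t4 names come from that example output, not from a live system):

```shell
# Derive the alternate-controller path names for a LUN. The device and
# controller names are taken from the illustrative 'lad' output above.
lun_dev=c0t5d1            # path on the first controller
alt_ctrl=c1t4             # cNtX prefix of the other controller, from 'lad'
lun=${lun_dev##*d}        # keep the dY LUN number
alt_dev="${alt_ctrl}d${lun}"
echo "/dev/osa/dev/dsk/${alt_dev}*"
echo "/dev/osa/dev/rdsk/${alt_dev}*"
```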
Note - If no other parallel SCSI devices are connected to the nodes, you can delete the contents of the nvramrc script. At the OpenBoot PROM, disable the script by entering setenv use-nvramrc? false.
For the procedure about how to shut down and power off a node, see your Oracle Solaris Cluster system administration documentation. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
For the procedure about how to remove a host adapter, see the documentation that shipped with your node hardware.
Use the following command for each resource group you want to return to the original node.
# clresourcegroup switch -n nodename resourcegroup1[ resourcegroup2 …]
nodename – For failover resource groups, the node to which the groups are returned. For scalable resource groups, the node list to which the groups are returned.
resourcegroup1[ resourcegroup2 …] – The resource group or groups that you are returning to the node or nodes.
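A hypothetical invocation with the operands filled in (the node and resource-group names are invented for illustration; on a cluster node you would run the command itself rather than echo it):

```shell
# Hypothetical names: return two resource groups to node phys-schost-1.
cmd="clresourcegroup switch -n phys-schost-1 rg-oracle rg-nfs"
echo "$cmd"
```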
Caution - If you improperly remove RAID Manager packages, the next reboot of the node fails. Before you remove RAID Manager software packages, see the Sun StorEdge RAID Manager Release Notes.
For the procedure about how to remove software packages, see the documentation that shipped with your storage array or storage system.
This procedure assumes that your cluster is operational. For conceptual information about SCSI reservations and failure fencing, see your Oracle Solaris Cluster concepts documentation. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
Before You Begin
This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical.
To perform this procedure, become superuser or assume a role that provides solaris.cluster.read and solaris.cluster.modify RBAC authorization.
Note - Your storage array or storage system might not support LUNs as quorum devices. To determine if this restriction applies to your storage array or storage system, see Restrictions and Requirements.
To determine whether any LUNs in the storage array are quorum devices, use the following command.
# clquorum show
For procedures about how to add and remove quorum devices, see your Oracle Solaris Cluster system administration documentation. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
Caution - You must set the System_LunReDistribution parameter in the /etc/raid/rmparams file to false so that no LUNs are assigned to the controller being brought online. After you verify in Step 8 that the controller has the correct SCSI reservation state, you can balance LUNs between both controllers.
For the procedure about how to modify the rmparams file, see the Sun StorEdge RAID Manager Installation and Support Guide.
# /etc/init.d/amdemon stop
# /etc/init.d/amdemon start
For the procedure about how to replace controllers, see the Sun StorEdge A3500/A3500FC Controller Module Guide and the Sun StorEdge RAID Manager Installation and Support Guide for additional considerations.
Note - You must use the RAID Manager GUI's Recovery application to bring the controller online. Do not use the Redundant Disk Array Controller Utility (rdacutil) because this utility ignores the value of the System_LunReDistribution parameter in the /etc/raid/rmparams file.
For information about the Recovery application, see the Sun StorEdge RAID Manager User’s Guide. If you have problems with bringing the controller online, see the Sun StorEdge RAID Manager Installation and Support Guide.
Use the following command on LUN 0 of the controller you want to bring online.
In the following command, devicename is the full UNIX path name of the device, for example, /dev/dsk/c1tXdY.
# cldevice repair devicename
For more information about controller modes, see the Sun StorEdge RAID Manager Installation and Support Guide and the Sun StorEdge RAID Manager User’s Guide.
For the procedure about how to change the rmparams file, see the Sun StorEdge RAID Manager Installation and Support Guide.
# /etc/init.d/amdemon stop
# /etc/init.d/amdemon start
Use either the online or the offline method to upgrade your NVSRAM firmware. The method that you choose depends on whether you are also upgrading the NVSRAM file.
Before You Begin
This procedure assumes that your cluster is operational.
If you are not upgrading the NVSRAM file, you can use the online method.
Upgrade the firmware by using the online method, as described in the Sun StorEdge RAID Manager User’s Guide. No special steps are required for a cluster environment.
If you are upgrading the NVSRAM file, you must use an offline method. Use one of the following procedures.
For more information, see your Solaris Volume Manager or Veritas Volume Manager documentation.
For more information, see your Solaris Volume Manager or Veritas Volume Manager documentation.
This step completes the firmware upgrade.
For the procedure about how to shut down a cluster, see your Oracle Solaris Cluster system administration documentation.
For the procedure about how to boot a node in noncluster mode, see your Oracle Solaris Cluster system administration documentation. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
For more information about how to boot nodes, see your Oracle Solaris Cluster system administration documentation. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
This step completes the firmware upgrade.
Adding a disk drive enables you to increase your storage space after a storage array has been added to your cluster.
Caution - If the disk drive that you are adding was previously owned by another controller module, reformat the disk drive to wipe clean the old DacStore information before adding the disk drive to this storage array. If any old DacStore information remains, it can cause aberrant behavior including the appearance of ghost disks or LUNs in the RAID Manager interfaces.
Before You Begin
This procedure relies on the following prerequisites and assumptions.
Your cluster is operational.
Your storage array contains an empty disk slot.
Your nodes are not configured with dynamic reconfiguration functionality.
If your nodes are configured for dynamic reconfiguration, see the Oracle Solaris Cluster system administration documentation and skip steps that instruct you to shut down the node. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
For information about how to move drives between storage arrays, see the Sun StorEdge RAID Manager Release Notes.
For the procedure about how to install a disk drive, see your storage documentation. For a list of storage documentation, see Related Documentation.
For instructions about how to run Recovery Guru and Health Check, see the Sun StorEdge RAID Manager User’s Guide.
For the procedure about how to fail and revive drives, see the Sun StorEdge RAID Manager User’s Guide.
See Also
To create LUNs for the new drives, see How to Create a LUN.
Removing a disk drive enables you to reduce or reallocate your existing storage pool. You might want to perform this procedure if a disk has failed or is behaving in an unreliable manner.
For conceptual information about quorum, quorum devices, global devices, and device IDs, see your Oracle Solaris Cluster concepts documentation. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
Before You Begin
This procedure relies on the following prerequisites and assumptions.
Your cluster is operational.
Your nodes are not configured with dynamic reconfiguration functionality.
If your nodes are configured for dynamic reconfiguration, see the Oracle Solaris Cluster system administration documentation and skip steps that instruct you to shut down the node. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
For the procedure about how to replace a disk drive, see your storage documentation. For a list of storage documentation, see Related Documentation.
For the procedure about how to run Recovery Guru and Health Check, see the Sun StorEdge RAID Manager User’s Guide.
For the procedure about how to fail and revive drives, see the Sun StorEdge RAID Manager User’s Guide.
For more information, see your Solaris Volume Manager or Veritas Volume Manager documentation.
Removing a disk drive enables you to reduce or reallocate your existing storage pool. You might want to perform this procedure in the following scenarios.
You no longer need to make data accessible to a particular node.
You want to migrate a portion of your storage to another storage array.
For conceptual information about quorum, quorum devices, global devices, and device IDs, see your Oracle Solaris Cluster concepts documentation.
Before You Begin
This procedure relies on the following prerequisites and assumptions.
Your cluster is operational.
You do not need to remove the entire storage array.
If you need to remove the storage array, see How to Remove a Storage Array.
You do not need to replace the storage array's chassis.
If you need to replace your storage array's chassis, see FRUs That Do Not Require Oracle Solaris Cluster Maintenance Procedures.
Your nodes are not configured with dynamic reconfiguration functionality.
If your nodes are configured for dynamic reconfiguration, see the Oracle Solaris Cluster system administration documentation and skip steps that instruct you to shut down the node. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical.
To perform this procedure, become superuser or assume a role that provides solaris.cluster.read and solaris.cluster.modify RBAC authorization.
Note - Your storage array or storage system might not support LUNs as quorum devices. To determine if this restriction applies to your storage array or storage system, see Restrictions and Requirements.
To determine whether any LUNs in the storage array are quorum devices, use the following command.
# clquorum show
For procedures about how to add and remove quorum devices, see your Oracle Solaris Cluster system administration documentation.
For the procedure about how to remove a LUN, see How to Delete a LUN.
For the procedure about how to remove a disk drive, see your storage documentation. For a list of storage documentation, see Related Documentation.
Caution - After you remove the disk drive, install a dummy drive to maintain proper cooling.
Caution - You must be an Oracle service provider to perform disk drive firmware updates. If you need to upgrade drive firmware, contact your Oracle service provider.
Note - Several steps in this procedure require you to halt I/O activity. To halt I/O activity, take the controller module offline by using the RAID Manager GUI's manual recovery procedure in the Sun StorEdge RAID Manager User’s Guide.
Before You Begin
This procedure relies on the following prerequisites and assumptions.
Your cluster is operational.
The node on which the host adapter resides is attached to a SCSI-based storage array or storage system.
This procedure defines Node A as the node with the host adapter on SCSI bus A. This host adapter is the host adapter that you are replacing. Node B is the node that remains in service.
This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical.
To perform this procedure, become superuser or assume a role that provides solaris.cluster.read and solaris.cluster.modify RBAC authorization.
# clresourcegroup status -n nodename
# cldevicegroup status -n nodename
Note the device groups, the resource groups, and the node list for the resource groups. You will need this information to restore the cluster to its original configuration in Step 25 of this procedure.
# clnode evacuate fromnode
For the procedure about how to shut down and power off a node, see your Oracle Solaris Cluster system administration documentation. For a list of Oracle Solaris Cluster documentation, see Related Documentation.
For instructions, see the Sun StorEdge RAID Manager User’s Guide.
For instructions, see the Sun StorEdge RAID Manager User’s Guide.
For instructions, see the Sun StorEdge RAID Manager User’s Guide.
For instructions, see the Sun StorEdge RAID Manager User’s Guide.
For the procedure about how to replace a host adapter, see the documentation that shipped with your node hardware.
For instructions, see the Sun StorEdge RAID Manager User’s Guide.
For instructions, see the Sun StorEdge RAID Manager User’s Guide.
For instructions, see the Sun StorEdge RAID Manager User’s Guide.
For instructions, see the Sun StorEdge RAID Manager User’s Guide.
For instructions, see the Sun StorEdge RAID Manager User’s Guide.
Perform the following step for each device group you want to return to the original node.
# cldevicegroup switch -n nodename devicegroup1[ devicegroup2 …]
nodename – The node to which you are restoring device groups.
devicegroup1[ devicegroup2 …] – The device group or groups that you are restoring to the node.
Perform the following step for each resource group you want to return to the original node.
# clresourcegroup switch -n nodename resourcegroup1[ resourcegroup2 …]
nodename – For failover resource groups, the node to which the groups are returned. For scalable resource groups, the node list to which the groups are returned.
resourcegroup1[ resourcegroup2 …] – The resource group or groups that you are returning to the node or nodes.