CHAPTER 7
Maintaining Your Array
This chapter covers the following maintenance and troubleshooting topics:
This section introduces the initial RAID controller firmware screen and Main Menu.
You see the following initial controller screen when you first access the RAID controller firmware via the controller COM port or Ethernet port.
To complete the connection to your management console, select the VT100 terminal mode or the appropriate mode for your communications software, and press Return.
The progress indicator is displayed when necessary to indicate the percentage of completion of a particular task or event. Sometimes the event is represented by a descriptive title, such as "Drive Copying:"
Events showing full descriptive titles for the progress indicator include:
For other events, the progress indicator merely shows a two-letter code in front of the percentage completed. These codes and their meanings are shown in the following table:
Note - For information about the battery status indicator, see Battery Status.
After you have selected the VT100 terminal emulation display mode and pressed Return on the initial screen, the Main Menu is displayed.
To return to the previous menu without performing the selected menu option
Press a letter as a keyboard shortcut for commands which have a boldface capital letter
This menu option is not used in normal operation. It is reserved for special use in special situations, and only when directed by technical support.
Caution - Do not use this menu option unless directed by technical support. Using it will result in the loss of your existing configuration and all data you have on the devices.
An audible alarm indicates that either a component in the array has failed or a specific controller event has occurred. Error conditions and controller events are reported by event messages and event logs. Component failures are also indicated by LED activity on the array.
Note - It is important to know the cause of the error condition, because the way you silence the alarm depends on what caused it.
To silence the alarm, perform the following steps:
1. Check the error messages, event logs, and LED activity to determine the cause of the alarm.
Component event messages include but are not limited to the following terms:
See Failed Component Alarm Codes for more information about component alarms.
Controller event messages include but are not limited to the following terms:
Refer to the "Event Messages" appendix in the Sun StorEdge 3000 Family RAID Firmware User's Guide for more information about controller events.
2. Depending on whether the cause of the alarm is a failed component or a controller event and which application you are using, silence the alarm as specified in the following table.
Note - Pushing the Reset button has no effect on controller event alarms, and muting the beeper has no effect on failed component alarms.
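The rule in the preceding note can be summarized as a small decision function. The following Python sketch is an illustration only: the function name and the string labels are hypothetical, and the pairing of each alarm type with its silencing action is inferred from the note rather than taken from a firmware interface.

```python
def silencing_action(cause: str) -> str:
    """Return the action that can silence an alarm with the given cause.

    cause is "component" for a failed component alarm or "controller"
    for a controller event alarm (labels invented for this sketch).
    """
    if cause == "component":
        # Muting the beeper has no effect on failed component alarms.
        return "push Reset button"
    if cause == "controller":
        # The Reset button has no effect on controller event alarms.
        return "mute beeper"
    raise ValueError(f"unknown alarm cause: {cause!r}")
```

The function raises an error for any other label so that an undetermined cause is not silently mapped to the wrong action.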
To assign a different RAID level or a different set of drives to a logical drive, you must unmap and delete the logical drive, and then create a new logical drive.
Caution - This operation erases all data on the logical drive. Therefore, if any data exists on the logical drive, it must be copied to another location before deleting the current logical drive.
Note - You can delete a logical drive only if it has first been unmapped.
To first unmap and then delete a logical drive, perform the following steps:
1. Choose "view and edit Host luns" from the Main Menu.
Existing logical drive mappings are displayed.
2. Select the existing logical drive you want to unmap and press Return.
A menu of host LUNs is displayed.
3. Select the host LUN that you want to unmap and press Return.
A confirmation message asks if you want to unmap the host LUN you have selected.
4. Choose Yes to unmap the logical drive.
5. Repeat Step 3 and Step 4 to unmap all LUNs that are mapped to the logical drive you want to delete.
6. From the Main Menu, choose "view and edit Logical drives."
7. Select the logical drive you unmapped and want to delete, and press Return.
8. Choose "Delete logical drive."
A warning notice is displayed asking if you are certain you want to delete the logical drive and its data.
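The ordering constraint in this procedure (every host LUN must be unmapped before the logical drive can be deleted) can be modeled as follows. This is a minimal Python sketch for illustration; the class and function names are hypothetical and do not correspond to any firmware or CLI interface.

```python
class LogicalDrive:
    """Toy model of a logical drive and the host LUNs mapped to it."""

    def __init__(self, name):
        self.name = name
        self.mapped_luns = set()

    def map_lun(self, lun):
        self.mapped_luns.add(lun)

    def unmap_lun(self, lun):
        self.mapped_luns.discard(lun)


def delete_logical_drive(drives, name):
    """Delete a logical drive from the dict, but only if it is fully unmapped."""
    drive = drives[name]
    if drive.mapped_luns:
        # Mirrors the rule that a drive can be deleted only after unmapping.
        raise RuntimeError(
            f"{name} still has mapped LUNs: {sorted(drive.mapped_luns)}"
        )
    del drives[name]
```

In this model, a caller would unmap each LUN in turn (Steps 3 through 5) and only then call delete_logical_drive (Steps 6 through 8).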
The status windows used to monitor and manage the array are described in the following sections:
To check and configure logical drives, from the Main Menu choose "view and edit Logical drives" and press Return.
The status of all logical drives is displayed.
TABLE 7-4 shows definitions and values for logical drive parameters.
To handle failed, incomplete, or fatal failure status, see Identifying a Failed Drive for Replacement and Recovering From Fatal Drive Failure.
To check and configure physical drives, from the Main Menu choose "view and edit scsi Drives" and press Return.
The screen displays the status of all physical drives.
A physical drive has a USED status when it was once part of a logical drive but no longer is. This can happen, for instance, when a drive in a RAID 5 array is replaced by a spare drive and the logical drive is rebuilt with the new drive. If the removed drive is later replaced in the array and scanned, the drive status is identified as USED since the drive still has data on it from a logical drive.
When the RAID set is deleted properly, this information is erased and the drive status is shown as FRMT rather than USED. A drive with FRMT status has been formatted with either 64 KB or 256 MB of reserved space for storing controller-specific information, but has no user data on it.
If you remove the reserved space using the View and Edit SCSI drives menu, the drive status changes to NEW.
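The relationship among the USED, FRMT, and NEW statuses described above can be expressed as a short function. This Python sketch is illustrative only; the function and parameter names are hypothetical, and other statuses (such as BAD or MISSING) are outside its scope.

```python
def drive_status(has_reserved_space: bool, has_logical_drive_data: bool) -> str:
    """Classify a drive as USED, FRMT, or NEW per the descriptions above."""
    if has_logical_drive_data:
        # The drive still holds data from a logical drive it once belonged to.
        return "USED"
    if has_reserved_space:
        # Reserved space for controller-specific information, no user data.
        return "FRMT"
    # Reserved space has been removed.
    return "NEW"
```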
To handle BAD drives, refer to Identifying a Failed Drive for Replacement.
If two drives show BAD and MISSING status, see Recovering From Fatal Drive Failure.
Note - If a drive is installed but not listed, the drive might be defective or installed incorrectly.
To check and configure channels, from the Main Menu choose "view and edit Scsi channels," and press Return.
The screen displays the status of all channels for this controller.
Caution - Do not change the PID and SID values of drive channels.
To check the status of controller voltage and temperature, perform the following steps:
1. Choose "view and edit Peripheral devices → Controller Peripheral Device Configuration → View Peripheral Device Status."
The voltage and temperature status of the RAID unit is displayed.
The components checked for voltage and temperature are displayed on the screen and are defined as normal or out-of-order.
2. Choose "Voltage and Temperature Parameters" and press Return to view or edit the trigger thresholds that determine voltage and temperature status.
3. Select a threshold you want to view or edit and press Return.
4. Repeat Step 3 as many times as necessary to "drill down" to the threshold ranges and triggering events.
5. To edit a trigger or other editable value, backspace over the existing information and replace it.
The array's SCSI Enclosure Services (SES) processor, located on the I/O module, monitors environmental conditions and is supported by Sun StorEdge Configuration Service and the command-line interface.
For Sun StorEdge 3510 FC JBOD arrays only, both Sun StorEdge Configuration Service and the CLI access the SES processor using device files in /dev/es, such as /dev/es/ses0, as shown in the following example:
1. /dev/rdsk/c4t0d0s2 [SUN StorEdge 3310 SN#000280] (Primary)
2. /dev/es/ses0 [SUN StorEdge 3510F D SN#00227B] (Enclosure)
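To see which SES device files are present on a host, you can list the entries under /dev/es. The following Python sketch is one possible way to do so; the function name is hypothetical, and the default pattern assumes the /dev/es/sesN naming shown in the example above.

```python
import glob


def list_ses_devices(pattern="/dev/es/ses*"):
    """Return the sorted list of paths that match the SES device pattern."""
    # glob.glob returns an empty list when nothing matches, for example
    # on a host with no SES devices or no /dev/es directory.
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    for path in list_ses_devices():
        print(path)
```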
To check the status of SES components (temperature sensors, cooling fans, the beeper speaker, power supplies, and slot status), perform the following steps:
1. Choose "view and edit Peripheral devices → View Peripheral Device Status → SES device."
A list is displayed of environmental sensors and other hardware components of that SES device.
2. Select an item from the list and press Return to display information about it or see a submenu of its component attributes.
Choosing Overall Status displays the status of the SES device and its operating temperature:
3. Select other attributes in which you are interested and press Return to learn more about the SES device.
Monitoring temperature at different points within the array is one of the most important SES functions. High temperatures can cause significant damage if they go unnoticed. There are a number of different sensors at key points within the enclosure. The following table shows the location of each of those sensors. The Element ID corresponds to the identifier shown when you choose "view and edit Peripheral devices → View Peripheral Device Status → SES Device → Temperature Sensors."
You can view the status of SES components, including the pair of fans located in each fan and power supply module. A fan is identified by the SES Device menus as a cooling element.
Follow these steps to view the status of each fan:
1. Choose "view and edit Peripheral devices → View Peripheral Device Status → SES Device → Cooling element."
2. Select one of the elements (element 0, 1, 2, or 3).
Standard fan speeds are indicated by numbers 1 through 7, indicating speeds in the normal range of 4000 to 6000 RPM. The number 0 indicates that the fan has stopped.
If a fan fails and the Status field does not display the OK value, you must replace the fan and power supply module.
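The fan speed numbers described above can be interpreted with a helper like the following. This Python sketch is illustrative only; the function name and the returned strings are hypothetical. It does not assign an RPM value to an individual number, because the text gives only the overall normal range.

```python
def describe_fan_speed(code: int) -> str:
    """Interpret the fan speed number shown for a cooling element."""
    if code == 0:
        # The number 0 indicates that the fan has stopped.
        return "stopped"
    if 1 <= code <= 7:
        # Numbers 1 through 7 indicate speeds in the normal range.
        return "normal range (4000 to 6000 RPM)"
    raise ValueError(f"unexpected fan speed code: {code}")
```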
Cooling elements in the status table can be identified for replacement as shown in TABLE 7-8.
A controller event log records an event or alarm that occurs after the system is powered on. The controller can store up to 1000 event logs. An event log records a configuration or operation event as well as an error message or alarm event.
Note - The SES logic in each array sends messages to the event log, which reports problems and the status of the fans, temperature, and voltage.
Caution - Powering off or resetting the controller automatically deletes all recorded event logs.
1. To view the event logs on screen, choose "view and edit Event logs" on the Main Menu.
A log of recent events is displayed.
Note - The controller can store up to 1000 event logs. An event log can record a configuration or operation event as well as an error message or alarm event.
2. Use your arrow keys to move up and down through the list.
3. To clear the events from the log once you've read them, use your arrow keys to move down to the last event you want to clear and press Return.
A "Clear Above xx Event Logs?" confirmation message is displayed.
4. Choose Yes to clear the recorded event logs.
Note - Resetting the controller clears the recorded event logs. To retain event logs after controller resets, you can install and use the Sun StorEdge Configuration Service program.
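The event log behavior described in this section (a 1000-entry limit, clearing entries up to a selected event, and loss of all entries on reset) can be modeled as follows. This Python sketch is a simplified illustration, not the controller's implementation. The class and method names are hypothetical, and two details are assumptions that this chapter does not state: that the list is ordered oldest first, and that the oldest entry is dropped when the log is full.

```python
from collections import deque


class EventLog:
    """Toy model of the controller event log."""

    CAPACITY = 1000  # The controller can store up to 1000 event logs.

    def __init__(self):
        # Assumption: when full, the oldest entry is discarded.
        self._events = deque(maxlen=self.CAPACITY)

    def record(self, message):
        self._events.append(message)

    def clear_through(self, index):
        """Clear the selected event (0-based) and every event above it."""
        for _ in range(min(index + 1, len(self._events))):
            self._events.popleft()

    def reset_controller(self):
        # Powering off or resetting the controller deletes all event logs.
        self._events.clear()

    def __len__(self):
        return len(self._events)
```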
If you have saved a configuration file and want to apply the same configuration to another array (or reapply it to the array that had the configuration originally), you must be certain that the channels and IDs in the configuration file are correct for the array where you are restoring the configuration.
The NVRAM configuration file restores all configuration settings (channel settings and host IDs) but does not rebuild logical drives. See Saving Configuration (NVRAM) to a Disk for information about how to save a configuration file, including advice on saving controller-dependent configuration whenever a configuration change is made.
See Record of Settings for advice about keeping a written record of your configuration before saving or restoring configuration files. See Save NVRAM to Disk and Restore From Disk for a convenient place to keep records whenever you save or restore configuration files.
Caution - Before restoring a configuration file, be certain that the configuration file you apply matches the array to which you apply it. If host IDs, logical drive controller assignments, or other controller-dependent configuration information described in Chapter 5 has changed since the configuration file was saved, you might lose access to mismatched channels or drives. To correct the mismatch and restore the access you have lost, you must change cabling or the host or drive channel IDs. On Solaris host workstations, the address of the RAID controller channel must also match what is described in /etc/vfstab.
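Before restoring a configuration, it can help to list the disk devices that /etc/vfstab references so that you can compare their controller and target addresses with the array's channel settings. The following Python sketch is one way to extract them; the function name is hypothetical, and it assumes the standard whitespace-separated vfstab layout in which the first field is the device to mount and the third field is the mount point.

```python
def vfstab_disk_devices(vfstab_text):
    """Return (device, mount_point) pairs for /dev/dsk entries in vfstab text."""
    devices = []
    for line in vfstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # Skip blank lines and comments.
        fields = line.split()
        if len(fields) < 3:
            continue  # Not a complete vfstab entry.
        device, mount_point = fields[0], fields[2]
        if device.startswith("/dev/dsk/"):
            devices.append((device, mount_point))
    return devices


if __name__ == "__main__":
    with open("/etc/vfstab") as f:
        for device, mount_point in vfstab_disk_devices(f.read()):
            print(device, mount_point)
```

The sketch only lists entries. Deciding whether an address such as c4t0d0 still matches the restored channel and ID settings remains a manual comparison.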
To restore configuration settings from a saved NVRAM file, perform the following steps:
1. Choose "system Functions → Controller maintenance → Restore nvram from disks."
A prompt notifies you that the controller NVRAM data has been successfully restored from disks.
From time to time, firmware upgrades are made available as patches that you can download from SunSolve Online, located at:
Each patch applies to one or more particular pieces of firmware, including:
SunSolve has extensive search capabilities that can help you find these patches, as well as regular patch reports and alerts to let you know when firmware upgrades and other patches become available. In addition, SunSolve provides reports about bugs that have been fixed in patch updates.
Each patch includes an associated README text file that provides detailed instructions about how to download and install that patch. But, generally speaking, all firmware downloads follow the same steps:
Note - For instructions on how to download firmware to disk drives in a JBOD directly attached to a host, refer to the README file in the patch that contains the firmware.
1. Once you have determined that a patch is available to update firmware on your array, make note of the patch number or use SunSolve Online's search capabilities to locate and navigate to the patch.
2. Read the README text file associated with that patch for detailed instructions on downloading and installing the firmware upgrade.
3. Follow those instructions to download and install the patch.
The following firmware upgrade features apply to the controller firmware:
When downloading is performed on a dual-controller system, firmware is flashed onto both controllers without interrupting host I/O. When the download process is complete, the primary controller resets and lets the secondary controller take over the service temporarily. When the primary controller comes back online, the secondary controller hands over the workload and then resets itself for the new firmware to take effect. The rolling upgrade is automatically performed by controller firmware, and the user's intervention is not necessary.
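The sequence of the rolling upgrade can be traced with a small simulation. This Python sketch only illustrates the order of events described above. The function name, the event labels, and the version strings are hypothetical, and the real process is performed automatically by the controller firmware.

```python
def rolling_upgrade(versions, new_version):
    """Simulate the rolling upgrade on a dual-controller system.

    versions maps "primary" and "secondary" to the firmware version each
    controller is running, and is updated in place. Returns a list of
    (event, serving) tuples, where serving names the controller that
    handles host I/O during that event.
    """
    timeline = []
    # Firmware is flashed onto both controllers. Neither has reset yet,
    # so both still run the old version and host I/O is uninterrupted.
    timeline.append(("flash both controllers", "primary"))
    # The primary resets to activate the new firmware while the
    # secondary temporarily takes over the service.
    versions["primary"] = new_version
    timeline.append(("primary resets", "secondary"))
    # The primary is back online and takes the workload; the secondary
    # then resets so that the new firmware takes effect.
    versions["secondary"] = new_version
    timeline.append(("secondary resets", "primary"))
    return timeline
```

At every step in the returned timeline one controller is serving host I/O, which is why the upgrade does not require user intervention or downtime.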
A controller that replaces a failed unit in a dual-controller system is often running a newer firmware release. To maintain compatibility, the surviving primary controller automatically updates the firmware running on the replacement secondary controller to the firmware version of the primary controller.
The firmware can be downloaded to the RAID controller by using an ANSI/VT100-compatible emulation program. The emulation program must support the ZMODEM file transfer protocol. Emulation programs such as HyperTerminal, Telix, and PROCOMM Plus can perform the firmware upgrade.
It is important that you run a version of firmware that is supported for your array.
If you are downloading a Sun patch that includes a firmware upgrade, the README file associated with that patch tells you which Sun StorEdge 3000 family arrays support this firmware release.
To download new versions of controller firmware, disk drive firmware, or SES and PLD firmware, use one of the following tools:
Caution - You should not use both in-band and out-of-band connections at the same time to manage the array. You might cause conflicts between multiple operations.
You can use a Microsoft Windows terminal emulation session with ZMODEM capabilities to access the firmware application. To upgrade the RAID controller firmware through the serial port and the firmware application, perform the following steps:
1. Establish the serial port connection.
2. Upgrade both boot record and firmware binaries with the following steps:
a. Choose "system Functions → Controller Maintenance → Advanced Maintenance Functions → Download Boot Record and Firmware."
b. Set ZMODEM as the file transfer protocol of your emulation software.
c. Send the Boot Record Binary to the controller. In HyperTerminal, go to the "Transfer" menu and choose "Send file."
If you are not using HyperTerminal, choose "Upload" or "Send" (depending on the software).
d. After the Boot Record Binary has been downloaded, send the Firmware Binary to the controller. In HyperTerminal, go to the "Transfer" menu and choose "Send file."
If you are not using HyperTerminal, choose "Upload" or "Send" (depending on the software).
When the firmware update is complete, the controller automatically resets itself.
3. To upgrade only the firmware binary, without the boot record, use the following steps instead of Step 2:
a. Choose "System Functions → Controller maintenance → Download Firmware."
b. Set ZMODEM as the file transfer protocol of your emulation software.
c. Send the firmware binary to the controller. In HyperTerminal, choose "Send file."
If you are not using HyperTerminal, choose "Upload" or "Send" (depending on the software).
When the firmware update is complete, the controller automatically resets itself.
Some procedures require that you remove the front bezel and the small vertical plastic caps on either side of the bezel that cover the rackmount tabs. These rackmount tabs are often referred to as "ears."
1. Use the provided key to unlock both bezel locks.
2. Grasp the front bezel cover on both sides and pull it forward and then down.
Note - For many operations, including replacing disk drives, it is not necessary to further detach the bezel, since dropping it down moves it sufficiently out of the way.
3. Press the right bezel arm (hinge) toward the left side to release it from the chassis hole.
The left hinge also disengages.
4. Note the location of the chassis bezel holes on each ear.
5. Remove the plastic caps from the front left and right ears of the array.
Both plastic caps are removed in the same way.
a. Squeeze both sides of the cap at the top and the bottom.
b. Turn the cap toward the center of the array until it disengages and pull it free.
Each plastic cap is replaced in the same way, but be sure to place the cap with LED labels on the right ear.
1. Align the inside round notches of the cap with the round cylindrical posts (ball studs) on the ear.
2. Push the top and bottom of the ear cap onto the ear, pressing in on the top side toward the center of the array first.
3. Continue pushing the top and bottom of the ear cap onto the ear, pressing on the side toward the outside of the array.
Do not use force when placing a cap on an ear.
Caution - Be careful to avoid "wedging" the Reset button below the LEDs on the right ear when you replace the plastic cap over it.
4. Insert the bezel arms into the chassis holes.
5. Lift the bezel into position and press it onto the front of the chassis until it is flush with the front.
6. Use the key to lock both bezel locks.
Copyright © 2004, Sun Microsystems, Inc. All rights reserved.