The following sections describe how to troubleshoot problems you may experience with the J4500 array.
If you are unable to see the array drives after powering on the array, check the following:
Ensure all cables are properly connected (power and SAS).
Be sure you are using SAS cables supported for use with the array. Longer or non-certified cables are not supported. For a list of supported cables, see 3.1 Options and Replaceable Components.
You should carefully follow the configuration rules listed in 2.1 Configuration and Cabling. Not following these rules could result in an unsupported configuration.
Check the array indicator LEDs to make sure all components are operating normally and the link LEDs are green.
Follow the proper startup sequence: power on the enclosure first, wait one minute, then power on the server.
The operating system event log is a good first place to start in identifying problems or potential issues with the enclosure or its disks. If you experience disk problems, such as disk errors or invalid read/writes, the system event log can help identify the problem disk.
Note - By default, enclosure errors (temperature, voltage, device status) may not be logged in the system event log, but only in the array management software event log. If you want errors to be forwarded to the system event log, refer to the HBA documentation to see if it supports this feature.
Problems with the array may be recorded in multiple log files (system and HBA). If this is the case, concentrate on recent errors that best relate to the problem, and try to pinpoint the time when problems began to appear. Search the log files for the first occurrence as soon as possible, because log files can fill up quickly with errors and older information may be lost.
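As an illustration of pinpointing when problems began, the following minimal Python sketch picks the earliest entry that mentions a given keyword from entries already collected from several logs. The `LogEntry` structure, the `source` labels, and the keyword matching are assumptions made for this example; real system and HBA logs have their own formats and must be parsed first.

```python
from datetime import datetime
from typing import Iterable, NamedTuple, Optional, Sequence


class LogEntry(NamedTuple):
    """Hypothetical, already-parsed log record (not a real log format)."""
    timestamp: datetime
    source: str   # for example "system" or "hba"
    message: str


def first_matching_entry(
    entries: Iterable[LogEntry], keywords: Sequence[str]
) -> Optional[LogEntry]:
    """Return the earliest entry whose message contains any keyword."""
    wanted = [k.lower() for k in keywords]
    matches = [
        e for e in entries
        if any(k in e.message.lower() for k in wanted)
    ]
    # min() with default=None returns None when nothing matched.
    return min(matches, key=lambda e: e.timestamp, default=None)
```

Entries from the system log and the HBA log can be passed in together; the earliest match gives a starting point for reading the surrounding log lines.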
Disks in the array enclosure are typically identified by the operating system in sequential order in a list of 52 devices: the first 4 addresses (0-3) represent the array's four SAS expanders, and the other 48 addresses (4-51) represent the 48 hard disks. Drives are mapped in numerical order as shown on the drive map label on the top of the array enclosure. Device names and address information depend on other mass storage devices attached to the server and on where the array's HBA is located in the PCI bus boot order.
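The address layout described above can be sketched in Python as follows. This is only an illustration under the assumption that the operating system presents the enclosure in the typical sequential order (addresses 0-3 for expanders, 4-51 for disks); the function name and the zero-based index it returns are choices made for this example, so confirm slot numbers against the drive map label.

```python
EXPANDER_COUNT = 4   # addresses 0-3
DISK_COUNT = 48      # addresses 4-51


def classify_address(address: int) -> tuple[str, int]:
    """Return ("expander", n) or ("disk", n) for an enclosure address.

    n is zero-based within its group (an assumption of this sketch).
    """
    if not 0 <= address < EXPANDER_COUNT + DISK_COUNT:
        raise ValueError(f"address {address} is outside 0-51")
    if address < EXPANDER_COUNT:
        return ("expander", address)
    return ("disk", address - EXPANDER_COUNT)
```

For example, `classify_address(4)` yields `("disk", 0)`, the first disk after the four expanders.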
Your J4500 array supports a powerful set of SMP (Serial Management Protocol) and SES-2 (SCSI Enclosure Services) enclosure management features. Some or all of these features are available through supported management software (for example, the Sun Common Array Manager, or the Sun StorageTek RAID Manager software) to provide a system administrator at the array-connected server or network-connected management console the following capabilities:
Monitor the enclosure status (on/off line status, component health)
Monitor the enclosure environment (voltage and temperature)
Remotely identify and locate enclosure components
Obtain FRU identification and status (expanders, hard disks, fans, power supplies)
Remove and install FRU components
Remotely reset the enclosure hardware
Remotely upgrade the enclosure's firmware (expanders and hard disks; CAM is required)
View the enclosure event log to aid in troubleshooting
Refer to the Sun Storage J4500 Array System Overview (820-3163) for more information on array management software.
You may encounter a problem where the server is unable to communicate with the array. Complete the following troubleshooting tasks to reestablish communications with the array.
Check the SAS link LEDs at the rear of the enclosure (see 4.1 External Status LEDs) to ensure the ports are properly communicating with the HBA. Each SAS port has a SAS Link Activity LED. The following LED states will be viewable:
On – 1 to 4 links are ready.
Blink – Read/Write port activity.
Off – Link is lost.
If the link LED is off, check the SAS cables for proper connection. Ensure that the cables are supported for the enclosure (refer to 3.1 Options and Replaceable Components).
If you cannot reestablish communication with the server, try resetting the enclosure hardware. The enclosure hardware may be reset with the power on. See 4.5 Resetting the Enclosure Hardware. You may also remotely reset the enclosure through the Sun Common Array Manager.
There may be a problem with the SAS fabric you are using. Try using the redundant fabric. If you have daisy-chained J4500 arrays, be sure to move all cable connections to the redundant fabric, because only one SAS fabric (SAS A or SAS B) may be used per HBA port connection. Cross-fabric connections on an array enclosure (SAS A to SAS B) are not supported.
There may be a problem with the SAS cable. A damaged cable might prevent communication entirely, or it might allow only degraded communication (which can show up as poor array performance). The array comes with two cables; try attaching a different SAS cable.
Review the Sun and server operating system vendor knowledge bases to see if the problem is a known issue with a solution. Also see the Sun support site, http://www.sun.com/support. The J4500 array SAS expanders have firmware that may be upgraded as fixes and new features become available from Sun. For more information on upgrading enclosure firmware, see 3.7 Upgrading Enclosure Firmware.
In a single path environment: If your J4500 array is connected to the StorageTek SAS RAID External HBA, and you switch a cable from one port of the HBA to the other port on the HBA, you should wait long enough after the initial cable pull for all the physical hard drives shown in the GUI or through the CLI to be removed from the display. This prevents the problem of the controller attempting to remove drives at the same time it is reading the same drives on another port. If no display is available, you should wait at least 2 minutes between pull and reconnect.
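The waiting rule above can be sketched in Python as follows. This is a hedged illustration, not a Sun-provided tool: `list_drives` is a hypothetical callable standing in for whatever GUI or CLI query reports the physical drives the HBA still lists, and the polling interval and timeout are assumptions for this example. Only the 2-minute fallback comes from the text.

```python
import time
from typing import Callable, Optional, Sequence

FALLBACK_WAIT_SECONDS = 120  # "at least 2 minutes" when no display is available


def wait_before_reconnect(
    list_drives: Optional[Callable[[], Sequence[str]]] = None,
    poll_seconds: float = 5.0,       # assumed polling interval
    timeout_seconds: float = 600.0,  # assumed upper bound, not from the manual
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Wait after a cable pull; return the number of seconds waited."""
    if list_drives is None:
        # No display available: use the fixed fallback wait.
        sleep(FALLBACK_WAIT_SECONDS)
        return float(FALLBACK_WAIT_SECONDS)
    waited = 0.0
    # Poll until the HBA no longer lists any physical drives.
    while list_drives():
        if waited >= timeout_seconds:
            raise TimeoutError("drives are still listed after the timeout")
        sleep(poll_seconds)
        waited += poll_seconds
    return waited
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real delays; in practice the default `time.sleep` would be used.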
In a multipath environment: Since the J4500 array uses SATA drives, the potential for SATA affiliation conflicts exists. Conflict can occur when more than one initiator tries to access the drive via the same path (for example, two hosts attached to SAS A on a J4500 array), or if you move an established connection from one domain port to another (for example, from port 0 to port 1). Possible symptoms of SATA affiliation conflicts are: operating system hangs, zoning operations take longer than 10 minutes to complete, and/or disk utilities like “format” will not return device lists in a timely fashion. For more about SATA affiliations, see the Sun Storage J4500 Array System Overview (820-3163).
Issues with array disks might be identified by viewing the system event log, being alerted by your array management software, or by viewing the J4500 array's LEDs. In the event of a disk failure, the disk may be replaced with the array online.
If the disk must be replaced, complete the following tasks:
When removing and replacing RAID disks in the J4500 array, use the following guidelines:
Perform RAID disk removal and replacement procedures with the system powered on. That way, the HBA can update its RAID configuration information.
When removing and replacing disks, allow enough time between each operation for the HBA to update the RAID configuration information. When hot-plugging non-failed drives for test purposes, you should wait a full minute after removal before reinserting the drive.
When connecting your array to an HBA, it is possible that the SAS “affiliation” feature may cause problems if the array was previously connected to another HBA. An affiliation is used by the SAS protocol to prevent multiple SAS initiators (HBAs) from interfering with each other when communicating with SATA drives. If you encounter such a problem, affiliations may be removed by power cycling the array enclosure prior to connecting it to a different HBA.
If you see only some of the available disks (for example, if you see only 20 or 28 of the total 48 disks), try the following:
Look through the vents at the back of the System Controller module to see if the 4 green expander heartbeat LEDs are blinking. If not, try power cycling the array.
If the problem occurs repeatedly, there might be a problem with the System Controller module. Check in the Sun Common Array Manager (CAM) to see if the array is at the firmware baseline. If it is not, upgrade the array firmware.
If updating the array firmware does not solve the problem, the System Controller module may need to be replaced. For step-by-step procedures for replacing the System Controller module, refer to To Replace the System Controller Module.
If you have moved SAS cables from one port to another, you may have SATA affiliation conflicts. Conflict can occur when more than one initiator tries to access the drive via the same path (for example, two hosts attached to SAS A on a J4500 array), or if you move an established connection from one domain port to another (for example, from port 0 to port 1). Possible symptoms of SATA affiliation conflicts are: operating system hangs, zoning operations take longer than 10 minutes to complete, and/or disk utilities like “format” will not return device lists in a timely fashion. Refer to the chapters on zoning and multipathing in the Sun Storage J4500 Array System Overview (820-3163) for proper initiator-to-disk access configuration. Also refer to the Sun StorageTek Common Array Manager Release Notes for the version of CAM being used.
Only SATA hard disk drives supported for use with the J4500 array may be used for multipathing. If you install an unsupported drive, you might get the following error in the System Event Log and you will be unable to configure the drive for multipath:
Target:2, lun:0 doesn't have a valid GUID, multi pathing for this drive is not enabled
This error means that the drive does not have a SAS WWN (World Wide Name). All drives supported for use with the J4500 array have a unique WWN. The WWN does not change even if the drive firmware is upgraded.
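To find the affected drives in a large log, the message above can be matched with a short Python sketch such as the following. The pattern is derived only from the single example message shown; the exact wording, spacing, and any prefix (timestamp, driver name) in a real log are assumptions and may differ by operating system and driver version.

```python
import re
from typing import Iterable, List, Tuple

# Matches the target and LUN numbers in the example message above.
GUID_ERROR = re.compile(
    r"Target:(?P<target>\d+),\s*lun:(?P<lun>\d+) doesn't have a valid GUID"
)


def find_drives_without_guid(log_lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Return sorted, unique (target, lun) pairs reported without a GUID."""
    found = set()
    for line in log_lines:
        match = GUID_ERROR.search(line)
        if match:
            found.add((int(match["target"]), int(match["lun"])))
    return sorted(found)
```

Each returned pair identifies a drive to check against the supported drive list.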
At the release of this document, the following Sun hard disk drives are supported for use in the J4500 array (check the label on the drive to verify that it is supported):
HUA7250SBSUN500G A90A Hitachi 500 GB SATA 390-0384-02
HUA7275SASUN750G A90A Hitachi 750 GB SATA 390-0379-02
HUA7210SASUN1.0T A90A Hitachi 1.0 TB SATA 390-0381-012
ST35002NSSUN500G SU0B Seagate 500 GB SATA 390-0412-02
ST37502NSSUN750G SU0B Seagate 750 GB SATA 390-0413-02
ST31000NSSUN1.0T SU0B Seagate 1.0 TB SATA 390-0414-02
Note - The J4500 array is shipped from the factory with drives of the same capacity. Mixing drives of different capacities in the array is unsupported. Refer to the Sun Storage J4500 Array Product Notes (820-3162) for updated information.
The array enclosure must operate below 35 °C (95 °F). When the internal temperature reaches a thermal threshold, the fans automatically increase in speed. This can be a reaction to a higher ambient temperature in the local environment. If the fan noise level and pitch seem high, check that no airflow restriction is raising the enclosure's internal temperature.
If the enclosure reaches an excessive temperature threshold that could damage its components, the J4500 array Over Temperature LED lights. If this happens, do the following:
Use your array management software to check for a faulty fan. Enclosure fans are hot-swappable and may be replaced with the power on. The fans include status LEDs to identify a faulty fan. For step-by-step procedures for replacing enclosure fans, refer to To Replace a Fan Module.
Check that there is clear, uninterrupted airflow at the front and rear of the storage system.
Check for enclosure intake restrictions due to dust buildup and clear them.
Check for excessive recirculation of heated air from the rear of the array enclosure to the front.
Reduce the ambient temperature in the room where the array enclosure is racked.
The J4500 array includes redundant, hot-swappable power supplies. If a power supply fails, you may be alerted by the array management software or the enclosure power supply status LEDs (an alert LED will light amber both at the front and rear of the enclosure when service is required). For step-by-step procedures for replacing an array power supply, see To Replace a Power Supply.