This document contains installation and operational information for the Sun(TM) Enterprise Network Array(TM) disk array.
Read the information in this supplement before installing or operating the disk array.
Check this document periodically for updates to the issues listed. This document was last updated December 15, 1997.
Depending upon your operating environment or software, you may need to install one or more of the patches listed in the following tables.
Table 1-1 Operating Environment Patches
Operating Environment | Patch Number(s) |
---|---|
Solaris(TM) 2.5.1 Hardware:8/97 | 105310-xx, 105324-xx |
Solaris 2.6 | 105356-xx, 105357-xx, 105375-xx |
Table 1-2 Software Patches
Software | Patch Number(s) |
---|---|
Sun(TM) Enterprise Volume Manager(TM) 2.4 | 105208-xx |
Sun Enterprise Volume Manager 2.5 | 105463-xx |
Solstice(TM) DiskSuite(TM) 4.0 | 102580-xx |
Solstice DiskSuite 4.1 | 104172-xx |
You can download these patches from the SunService Public Patch Page web site:
http://sunsolve.sun.com/sunsolve/pubpatches/patches.html
The following restrictions are temporary and should be relaxed by January, 1998.
For disk arrays not configured as split-loops, only single and dual initiator configurations are supported. In the Sun Enterprise Network Array Hardware Configuration Guide, part number 805-0264, the supported configurations are listed in sections 2.1.1, 2.1.2, 2.1.3, and 2.2.1.
Rackmounted disk arrays are only supported in Enterprise(TM) Expansion Cabinets.
The Sun Enterprise 10000 server is supported with up to 1 Tbyte (eight disk arrays).
The following are considered bugs and are being addressed. Released fixes for these are targeted for January, 1998.
Downloading FCode to an FC-100 host adapter that is in the boot path. The workaround is to move the boot disk before performing the FCode update.
Downloading firmware to the disk array that contains the boot device. The workaround is to move the boot disk before performing the firmware download.
Released fixes for these bugs are expected by January, 1998.
Downloading firmware simultaneously to multiple disk arrays on the same loop sometimes fails. This normally works when the commands are issued from a script (very close together). However, if a download is interrupted during an OFFLINE/ONLINE sequence triggered by the reset at the end of the download to one of the other disk arrays on the loop, the interrupted download may fail. The workaround is to download to one disk array at a time.
A luxadm power_off command issued to multiple disk arrays on the same loop may not succeed on all disk arrays because of the loop disruption caused by another disk array powering down. The workaround is to issue the command to one disk array at a time.
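As a minimal sketch of the one-array-at-a-time workaround, the following Python helper issues a luxadm subcommand to each disk array in sequence and pauses between arrays. The helper name, the enclosure names, and the 60-second settle delay are hypothetical; only the `luxadm power_off` command itself comes from this document, and the exact argument form should be checked against the luxadm man page on your system.

```python
import subprocess
import time


def run_one_at_a_time(enclosures, subcommand="power_off", settle_seconds=60,
                      runner=subprocess.run, sleep=time.sleep):
    """Issue `luxadm <subcommand> <enclosure>` to each array in turn.

    The settle delay is an assumed value, not a documented requirement; it
    gives the loop time to re-initialize before the next array is addressed.
    `runner` and `sleep` are injectable so the logic can be tried without
    real hardware.
    """
    results = {}
    for index, name in enumerate(enclosures):
        if index > 0:
            sleep(settle_seconds)  # let the loop settle after the previous array
        completed = runner(["luxadm", subcommand, name], check=False)
        results[name] = completed.returncode
    return results


# Example call (enclosure names are hypothetical):
#   run_one_at_a_time(["array_a", "array_b"])
```

The returned dictionary maps each enclosure name to the exit status of its command, so a failed array can be retried on its own.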
Messages and warnings are not automatically signs of problems. The Fibre Channel protocol and the host drivers are designed to be robust. Occasionally, warnings or messages are generated to the console that do not indicate failures but tend to cause alarm for users.
Most peripherals perform internal retries, often without generating any output. Disk drive firmware has fairly complex retry algorithms that retry failed operations and report an actual failure only when the retry counts are exhausted. Sun's driver philosophy is to generate these messages and warnings so that real problems are easier to diagnose. The bottom line is that messages and warnings are not always cause for alarm. The following are some common messages and warnings, with some insight into what lies behind them.
Messages are informational only and do not imply a failure condition. Messages are sent to the console without any preface (such as WARNING or FATAL ERROR).
Nov 12 14:46:53 kapila unix: ID[SUNWssa.socal.link.5010] socal1: port 1: Fibre Channel is OFFLINE
(Other messages or warnings)
Nov 12 14:48:53 kapila unix: ID[SUNWssa.socal.link.5010] socal1: port 1: Fibre Channel is ONLINE
Fibre Channel loops are occasionally re-initialized, and service to the loop is momentarily suspended while the initialization takes place.
Common causes of OFFLINE/ONLINE (loop re-initialization):
Soft or hard addition or removal of a device on the loop
Power cycle of device on the loop
Forced loop-init by driver recovery algorithms
Disk array reset following a download
Temporary loss of sync on the loop
All outstanding commands on the affected loop are automatically retried as soon as the loop's initialization is complete, and normal operation then resumes.
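To see how long each loop re-initialization lasted, the OFFLINE and ONLINE messages can be paired per controller and port. The following is a minimal sketch that assumes the console messages have the same layout as the example above; the function name is hypothetical, and because syslog timestamps carry no year, the year is supplied as an assumed parameter.

```python
import re
from datetime import datetime

# Matches link-state messages laid out like the example in this document.
LINK_RE = re.compile(
    r"(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+unix:.*?"
    r"(?P<ctrl>socal\d+): port (?P<port>\d+): "
    r"Fibre Channel is (?P<state>OFFLINE|ONLINE)"
)


def loop_outages(lines, year=1997):
    """Return (controller/port, offline time, seconds offline) per outage."""
    pending = {}   # (controller, port) -> time the loop went OFFLINE
    outages = []
    for line in lines:
        match = LINK_RE.search(line)
        if match is None:
            continue  # other messages or warnings
        stamp = " ".join(match.group("ts").split())
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        key = (match.group("ctrl"), int(match.group("port")))
        if match.group("state") == "OFFLINE":
            pending.setdefault(key, when)
        elif key in pending:
            start = pending.pop(key)
            outages.append((key, start, (when - start).total_seconds()))
    return outages
```

An ONLINE message with no preceding OFFLINE for the same port is ignored, and an OFFLINE that never comes back ONLINE stays unreported, which is the case worth investigating by hand.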
Warnings are an indication of a non-fatal error. Typically retry logic takes care of the problem. Warning messages are prefaced at the console with the keyword WARNING.
14:43:01 kapila unix: WARNING: /io-unit@f,e0200000/sbi@0,0/SUNW,socal@2,0/sf@1,0/ssd@0,0 (ssd10):
Nov 12 14:43:01 kapila unix: SCSI transport failed: reason 'timeout': retrying command
This command is retried and normal operations continue. Sometimes the timeout may be accompanied by a loop reset (see OFFLINE/ONLINE sequences). These events are normal and are no cause for alarm unless they occur more than five times per 24 hours. No data is lost or corrupted, and the commands complete on a subsequent retry.
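The five-per-24-hours guideline can be checked against a saved copy of the console log. The sketch below is one possible reading of that guideline: it treats "per 24 hours" as any sliding 24-hour window, which is an assumption, and the function names and the default year are hypothetical. It matches only the 'timeout' warning text shown in the example above.

```python
import re
from datetime import datetime, timedelta

TIMEOUT_RE = re.compile(
    r"(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s.*"
    r"SCSI transport failed: reason 'timeout'"
)


def timeout_times(lines, year=1997):
    """Extract the timestamp of every command-timeout warning."""
    times = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            stamp = " ".join(match.group("ts").split())
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times


def exceeds_guideline(times, limit=5, window=timedelta(hours=24)):
    """True if more than `limit` timeouts fall inside any one window."""
    times = sorted(times)
    start = 0
    for end, current in enumerate(times):
        # Drop events that are a full window (or more) older than this one.
        while current - times[start] >= window:
            start += 1
        if end - start + 1 > limit:
            return True
    return False
```

For example, `exceeds_guideline(timeout_times(open("messages")))` would flag a log file (the file name here is only a placeholder) in which six or more timeouts occurred within the same 24 hours.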
Fibre Channel loops are specified to have a bit error rate (BER) less than 10E-12. Actual BER is better than 10E-13 and may be as clean as 10E-15. However, you can occasionally experience a bit error that results in a corrupted frame. As corrupted frames are discarded, the end result will be a command that fails to complete and which eventually gets timed out by the ssd driver. A warning indicating a command timeout is generated to the console.
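To put these error rates in perspective, the short calculation below estimates the average time between bit errors on one link. It rests on two assumptions that are not stated in this document: the figures above are read as 10^-12, 10^-13, and 10^-15 errors per bit, and the link is taken to signal at 1.0625 Gbaud, the full-speed Fibre Channel rate. Because only bit errors that land inside a frame can lead to a command timeout, the results are an upper bound on how often timeouts from this cause would appear.

```python
LINE_RATE_BAUD = 1.0625e9  # assumed full-speed Fibre Channel signalling rate


def seconds_between_errors(ber, line_rate=LINE_RATE_BAUD):
    """Average seconds between bit errors at a given bit error rate."""
    return 1.0 / (ber * line_rate)


for ber in (1e-12, 1e-13, 1e-15):
    hours = seconds_between_errors(ber) / 3600
    print(f"BER {ber:.0e}: about one bit error every {hours:.2f} hours")
```

Under these assumptions, a link at the specified limit sees a bit error roughly every 16 minutes, while a link at 10^-15 sees one roughly every 11 days.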
Nov 12 14:45:09 kapila unix: WARNING: /io-unit@f,e0200000/sbi@0,0/SUNW,socal@2,0/sf@0,0/ssd@1,0 (ssd33):
Nov 12 14:45:09 kapila unix: SCSI transport failed: reason 'tran_err': retrying command
Some warnings that indicate transport errors due to the link being temporarily unavailable during a loop re-initialization can be expected. For example, there may be several of these accompanying an OFFLINE/ONLINE sequence. These commands are retried after the loop is re-initialized.