This section contains the release notes for Alternate Pathing (AP) 2.2 on all Sun Enterprise servers, except the Sun Enterprise 10000. See "AP on the Sun Enterprise 10000" for information about the Enterprise 10000 server.
AP 2.2 is the first release of AP that supports Solaris 7 11/99.
AP enables you to define and control alternate physical paths to peripheral devices, adding increased availability and a level of fault recovery to your server. If a physical path to a device becomes unavailable, an alternate path can be used. For more information, see the Sun Enterprise Server AP User's Guide in the Solaris 7 11/99 on Sun Hardware Collection AnswerBook2(TM).
At the time of this printing, AP 2.2 Beta is not compatible with any version of Sun Enterprise Volume Manager(TM) (SEVM).
If you are upgrading from Solaris 2.6 to Solaris 7 11/99 and have AP 2.1 and Solstice(TM) DiskSuite(TM) 4.0 or 4.1 (SDS) on your system, you must upgrade to AP 2.2 and SDS 4.2. This section contains an overview of the entire process, which requires you to use several sections from different publications. You should ensure that you have the following publications before you start the upgrade:
Solaris 7 11/99 Release Notes Supplement for Sun Hardware (available in printed form in your Solaris 7 11/99 Media Kit)
Solaris 7 11/99 Sun Hardware Platform Guide (available in printed form in your Solaris 7 11/99 Media Kit or in AnswerBook2 format on the Sun Hardware Supplements CD)
Sun Enterprise Server Alternate Pathing User's Guide (available in AnswerBook2 format on the Sun Hardware Supplements CD in your Solaris 7 11/99 Media Kit)
Solstice DiskSuite 4.2 User's Guide (available in AnswerBook2 format from http://docs.sun.com or from your SDS Media Kit)
Solstice DiskSuite 4.2 Installation and Product Notes (available in AnswerBook2 format from http://docs.sun.com or from your SDS Media Kit)
You must follow the sequence given here to successfully complete the upgrade.
In general, you will perform the following tasks:
Unconfigure SDS 4.0 or 4.1.
Remove AP 2.1.
Upgrade to Solaris 7 11/99.
Install AP 2.2.
Install and reconfigure SDS 4.2.
Specifically, you must perform the following tasks:
Read "Performing an Upgrade of AP" in the Solaris 7 11/99 Sun Hardware Platform Guide.
Commit any uncommitted AP metadevices (see Step 1 in "To Upgrade AP" in the Solaris 7 11/99 Sun Hardware Platform Guide).
Deconfigure SDS (see steps 1 through 8 in "How to Convert to DiskSuite 4.2 on SPARC Systems Running DiskSuite 4.0 or 4.1" in the Solstice DiskSuite 4.2 Installation and Product Notes).
Do not install Solaris 7 11/99 at this time.
Remove the current AP configuration (see Step 3 in "To Upgrade AP" in the Solaris 7 11/99 Sun Hardware Platform Guide).
Upgrade to Solaris 7 11/99 (see Step 4 in "To Upgrade AP" in the Solaris 7 11/99 Sun Hardware Platform Guide).
Upgrade to AP 2.2 (see Step 5 in "To Upgrade AP" in the Solaris 7 11/99 Sun Hardware Platform Guide).
Install SDS 4.2, then restore it (see Step 6 in "To Upgrade AP" in the Solaris 7 11/99 Sun Hardware Platform Guide and steps 10 through 16 in "How to Convert to DiskSuite 4.2 on SPARC Systems Running DiskSuite 4.0 or 4.1" in Chapter 1 of the Solstice DiskSuite 4.2 Installation and Product Notes).
This section contains general issues that involve AP on Sun Enterprise servers. Read this section before you attempt to install or configure AP.
The following devices are supported by the AP software on Sun Enterprise servers:
SPARCstorage(TM) Arrays recognized by AP using the pln, soc, and ssd ports
Sun(TM) StorEdge(TM) A5000 recognized by AP using sf, socal, and ssd ports
SunFastEthernet(TM) 2.0 (hme)
SunFDDI(TM) 5.0 (nf) SAS (Single-Attach Station) and DAS (Dual-Attach Station)
SCSI-2/Buffered Ethernet FSBE/S and DSBE/S (le)
Quad Ethernet (qe)
Sun(TM) Quad FastEthernet(TM) (qfe)
Sun GigabitEthernet 2.0 (ge)
The following table lists which devices are supported in which releases:
Table 5-1 Supported Network Devices
| AP 2.0 | AP 2.1 | AP 2.2 |
---|---|---|---|
Solaris 2.5.1 | hme, le, nf, bf, hi, qe, qfe | N/A | N/A |
Solaris 2.6 (5/98) | N/A | ge, hme, le, nf, qe, qfe, vge | N/A |
Solaris 7 11/99 | N/A | N/A | ge, hme, le, nf, qe, qfe |
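The table can also be expressed as a small lookup, which may be handy in a pre-installation check. This is only a sketch: the function names `ap_network_drivers` and `ap_supports_driver` are hypothetical helpers, not part of the AP software, and the driver lists are copied from Table 5-1.

```shell
#!/bin/sh
# Hypothetical helpers; the driver lists are taken from Table 5-1.

# ap_network_drivers RELEASE: print the network drivers listed for an AP release.
ap_network_drivers() {
    case "$1" in
        2.0) echo "hme le nf bf hi qe qfe" ;;    # Solaris 2.5.1
        2.1) echo "ge hme le nf qe qfe vge" ;;   # Solaris 2.6 (5/98)
        2.2) echo "ge hme le nf qe qfe" ;;       # Solaris 7 11/99
        *)   return 1 ;;                         # release not in the table
    esac
}

# ap_supports_driver RELEASE DRIVER: succeed when DRIVER is listed for RELEASE.
ap_supports_driver() {
    for drv in $(ap_network_drivers "$1"); do
        [ "$drv" = "$2" ] && return 0
    done
    return 1
}
```

For example, `ap_supports_driver 2.2 ge && echo "listed for AP 2.2"` reports whether the ge driver appears in the AP 2.2 column.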
AP 2.2 validation tests were performed on SunFDDI (revision 6.0) and GigabitEthernet (revision 2.0). If you install either of these devices, you must use the revision level that was tested, unless a higher revision level exists. In addition, you must install all of the available patches for these devices. Refer to http://www.sunsolve.sun.com for more information about the patches.
The Sun StorEdge A3000 supports failover capabilities that are similar to those provided by AP. Because of this, AP does not support the Sun StorEdge A3000. See that product's documentation for more information about its failover support.
AP supports the Sun StorEdge A5000 for this release.
AP 2.2 does not support the Sun StorEdge A7000 for this release.
The following list includes the possible combinations of AP and Solaris software you can install on a Sun Enterprise server.
Solaris 2.6 5/98 with AP 2.1 and DR
Solaris 7 with AP 2.2 and DR
Solaris 7 11/99 with AP 2.2 and DR
There are no known bugs in this version of AP 2.2.
This section contains the synopses and Sun BugID number of the more important bugs that have been fixed since the AP 2.1 release (Solaris 2.6 5/98). This list does not include all of the fixed bugs.
4126743 - AP disk autofailover hangs on simultaneous multi-pathgroup failures (this bug was fixed by BugID 4136249).
4126897 - Domain panics when there are no AP database and metadevice entry in /etc/vfstab.
4136249 - I/O's to SEVM RAID volumes hang after AP autofailover.
4141438 - mhme interface hangs under heavy use of network.
4143514 - FDDI with AP hangs with heavy use of network.
4147674 - AP causes ifconfig to hang on a mutex.
4153152 - apconfig works as an ordinary user.
4161396 - AP 2.x needs capability to work with GEM.
4163270 - netstat of network ap network meta interface shows no tallies.
4166620 - snoop of AP meta network interface stops snooping after switch.
4170818 - If you run the $<callouts nadb macro, or fm2's "callout ts" command, you see thousands of qenable timeouts in the timeshare callout table.
4180055 - Accessing an AP'd metadisk with a failed active alternate panics.
4180702 - Messages from swap.c are not internationalized.
4183581 - apboot disk causes coredump when disk is the same as the current boot disk.
4185154 - AP GigabitEthernet stress test hangs.
4188418 - It is possible for a hard disk error to go undetected by Veritas [SEVM].
4195441 - AP2.0 ap_daemon doesn't communicate with AP2.2.
4228731 - Non-existent network interfaces not marked as detached after reboot.
These release notes provide the latest information on Dynamic Reconfiguration (DR) functionality for Sun Enterprise(TM) 6x00, 5x00, 4x00, and 3x00 systems running the Solaris 7 11/99 update. For more information on Sun Enterprise Server Dynamic Reconfiguration, refer to the Dynamic Reconfiguration User's Guide for Sun Enterprise 3x00/4x00/5x00/6x00 Systems.
The 11/99 update includes support for CPU/Memory boards on Sun Enterprise 6x00, 5x00, 4x00, and 3x00 systems.
Before proceeding, ensure the system supports dynamic reconfiguration. If you see the following message on your console or in your console logs, the hardware is of an older design and not suitable for dynamic reconfiguration.
Hot Plug not supported in this system
Supported I/O boards are listed in the "Solaris 7 11/99" section on the web site
http://sunsolve5.sun.com/sunsolve/Enterprise-dr/
I/O board type 2 (graphics), type 3 (PCI), and type 5 (graphics and SOC+) are not currently supported.
For Sun StorEdge(TM) A5000 disk arrays or for internal FC-AL disks in the Sun Enterprise 3500 system, the firmware version must be ST19171FC 0413 or later. For more information, refer to the "Solaris 7 11/99" section at the web site:
http://sunsolve5.sun.com/sunsolve/Enterprise-dr/
Users of Solaris 7 11/99 software who wish to use dynamic reconfiguration must be running CPU PROM version 3.2.22 (firmware patch ID 103346-xx) or later. This firmware is available from the web site. See "How To Obtain Firmware".
Older versions of the CPU PROM may display the following message during boot:
Firmware does not support Dynamic Reconfiguration
CPU PROM 3.2.16 and earlier versions do not display this message, although they do not support dynamic reconfiguration of CPU/memory boards.
To see your current PROM revision, enter .version and banner at the ok prompt. Your display will be similar to the following:
ok .version
Slot 0 - I/O Type 1 FCODE 1.8.22 1999/xx/xx 19:26 iPOST 3.4.22 1999/xx/xx 19:31
Slot 1 - I/O Type 1 FCODE 1.8.22 1999/xx/xx 19:26 iPOST 3.4.22 1999/xx/xx 19:31
Slot 2 - CPU/Memory OBP 3.2.22 1999/xx/xx 19:27 POST 3.9.22 1999/xx/xx 19:31
Slot 3 - I/O Type 4 FCODE 1.8.22 1999/xx/xx 19:27 iPOST 3.4.22 1999/xx/xx 19:31
Slot 4 - CPU/Memory OBP 3.2.22 1999/xx/xx 19:27 POST 3.9.22 1999/xx/xx 19:31
Slot 5 - CPU/Memory OBP 3.2.22 1999/xx/xx 19:27 POST 3.9.22 1999/xx/xx 19:31
Slot 6 - CPU/Memory OBP 3.2.22 1999/xx/xx 19:27 POST 3.9.22 1999/xx/xx 19:31
Slot 7 - CPU/Memory OBP 3.2.22 1999/xx/xx 19:27 POST 3.9.22 1999/xx/xx 19:31
Slot 9 - CPU/Memory OBP 3.2.22 1999/xx/xx 19:27 POST 3.9.22 1999/xx/xx 19:31
Slot 11 - CPU/Memory OBP 3.2.22 1999/xx/xx 19:27 POST 3.9.22 1999/xx/xx 19:31
Slot 12 - CPU/Memory OBP 3.2.22 1999/xx/xx 19:27 POST 3.9.22 1999/xx/xx 19:31
Slot 14 - CPU/Memory OBP 3.2.22 1999/xx/xx 19:27 POST 3.9.22 1999/xx/xx 19:31
ok banner
16-slot Sun Enterprise E6500
OpenBoot 3.2.22, 4672 MB memory installed, Serial #xxxxxxxx.
Ethernet address 8:0:xx:xx:xx:xx, Host ID: xxxxxxxx.
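If the .version display has been captured to a file (for example, from a console log), the CPU PROM requirement of 3.2.22 or later can be checked mechanically. The following is a minimal sketch, not a Sun-supplied tool: `version_ge` and `old_obp_slots` are hypothetical helper names, and the parsing assumes the line layout shown in the display above, with three numeric fields in each version string.

```shell
#!/bin/sh
# Hypothetical helpers; the input layout is assumed to match the
# ".version" display shown above.

# version_ge A B: succeed when dotted version A is at least B.
# Assumes three numeric fields, as in 3.2.22.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }' < /dev/null
}

# old_obp_slots MIN: read ".version" output on standard input and print
# the slot number of each CPU/Memory board whose OBP is older than MIN.
old_obp_slots() {
    min="$1"
    awk '/CPU\/Memory OBP/ { print $2, $6 }' | while read -r slot ver; do
        version_ge "$ver" "$min" || echo "$slot"
    done
}
```

With the display saved in a file named, say, version.txt, running `old_obp_slots 3.2.22 < version.txt` prints nothing when every CPU/Memory board meets the requirement.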
For information about updating your firmware, refer to the "Solaris 7 11/99" section at the following web site:
http://sunsolve5.sun.com/sunsolve/Enterprise-dr/
At this site, you will find information on how to:
Download the DR-capable PROM firmware
Upgrade the PROM
If you cannot use the web site, contact your Sun support service provider for assistance.
In the /etc/system file, two variables must be set to enable dynamic reconfiguration and an additional variable must be set to enable the removal of CPU/memory boards.
Log in as root.
To enable dynamic reconfiguration, add the following lines to the /etc/system file:
set pln:pln_enable_detach_suspend=1
set soc:soc_enable_detach_suspend=1
To enable the removal of a CPU/memory board, add this line to the /etc/system file:
set kernel_cage_enable=1
Setting this variable enables the memory unconfiguration operation.
Reboot the system to put the changes into effect.
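Before rebooting, it can be useful to confirm that all three settings are actually present. The sketch below assumes each setting appears on a line of its own, spelled exactly as shown above; `missing_dr_settings` is a hypothetical helper name.

```shell
#!/bin/sh
# missing_dr_settings FILE: print each required "set" line that FILE lacks.
# Hypothetical helper; the three settings are the ones named in this section.
missing_dr_settings() {
    file="$1"
    for setting in \
        "set pln:pln_enable_detach_suspend=1" \
        "set soc:soc_enable_detach_suspend=1" \
        "set kernel_cage_enable=1"
    do
        # -F: fixed string, -x: match the whole line, -q: no output
        grep -Fxq "$setting" "$file" || echo "$setting"
    done
}
```

Running `missing_dr_settings /etc/system` prints nothing when the file is complete. Lines that differ only in spacing are reported as missing, so treat any output as a prompt to inspect the file by hand.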
On a large system, the quiesce-test command (cfgadm -x quiesce-test sysctrl0:slotnumber) may run as long as a minute or so. During this time no messages are displayed if cfgadm does not find incompatible drivers. This is normal behavior.
If a board is on the disabled board list, an attempt to connect the board may produce an error message:
# cfgadm -c connect sysctrl0:slotnumber
cfgadm: Hardware specific failure: connect failed: board is disabled: must override with [-f][-o enable-at-boot]
To override the disabled condition, use the force flag (-f) or the enable option (-o enable-at-boot) with the cfgadm command:
# cfgadm -f -c connect sysctrl0:slotnumber
# cfgadm -o enable-at-boot -c connect sysctrl0:slotnumber
To remove all boards from the disabled board list, set the disabled-board-list variable to a null set with the system command:
# eeprom disabled-board-list=
If you are at the OpenBoot(TM) prompt, use this OBP command instead to remove all boards from the disabled board list:
ok set-default disabled-board-list
For further information about the disabled-board-list setting, refer to the section "Specific NVRAM Variables" in the Platform Notes: Sun Enterprise 3x00, 4x00, 5x00, and 6x00 Systems, part number 805-4454.
For information about the OBP disabled-memory-list setting, refer to the section "Specific NVRAM Variables" in the Platform Notes: Sun Enterprise 3x00, 4x00, 5x00, and 6x00 Systems.
If it is necessary to unload detach-unsafe drivers, use the modinfo(1M) line command to find the module IDs of the drivers. You can then use the module IDs in the modunload(1M) command to unload detach-unsafe drivers.
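As an illustration, the module ID can be pulled out of the modinfo listing with a short filter. This sketch assumes the usual modinfo column layout (Id, Loadaddr, Size, Info, Rev, Module Name), in which the module name is the sixth field; `module_id` is a hypothetical helper name.

```shell
#!/bin/sh
# module_id NAME: read modinfo output on standard input and print the Id
# of every line whose module name (sixth field) equals NAME.
# Hypothetical helper; the column layout is an assumption.
module_id() {
    awk -v name="$1" '$6 == name { print $1 }'
}

# Possible use on a live system (not run here):
#   modinfo | module_id pln        # prints the module ID(s) for pln
#   modunload -i <id>              # unload the module with that ID
```

A driver can appear more than once in the listing, so check the output before passing an ID to modunload.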
A memory board or CPU/memory board that contains interleaved memory cannot be dynamically unconfigured.
To determine if memory is interleaved, use the prtdiag command or the cfgadm command.
To permit DR operations on CPU/memory boards, set the NVRAM memory-interleave property to min.
For related information about interleaved memory, see "Memory Interleaving Set Incorrectly After a Fatal Reset, Bug-ID 4156075" and "DR: Cannot Unconfigure a CPU/Memory Board That Has Interleaved Memory, Bug ID 4210234".
If the error "cfgadm: Hardware specific failure: connect failed: firmware operation error" is displayed during a DR connect sequence, remove the board from the system as soon as possible. The board has failed self-test, and removing the board avoids possible reconfiguration errors that can occur during the next reboot.
If you want to immediately retry the failed operation, you must first remove and reinsert the board, because the board status does not allow further operations.
As noted in the Dynamic Reconfiguration User's Guide for Sun Enterprise 3x00/4x00/5x00/6x00 Systems, Sun Enterprise SyMON(TM) system monitoring and management software supports dynamic reconfiguration. However, the user's guide listed the wrong reference. The correct reference is Sun Enterprise SyMON 2.0.1 Supplement for Sun Enterprise Midrange Servers.
For the latest bug and patch information, refer to: http://sunsolve5.sun.com/sunsolve/Enterprise-dr.
Category: RFE
The memory test should give occasional indications that it is still running. During a long test, the user cannot easily determine that the system is not hanging.
Workaround: Monitor system progress in another shell or window, using vmstat(1M), ps(1), or similar shell commands.
Category: Bug
Memory interleaving is left in an incorrect state after a Sun Enterprise x500 server encounters a Fatal Reset. Subsequent DR operations fail. The problem only occurs on systems with memory interleaving set to min.
Workarounds: Two choices are listed below.
To clear the problem after it occurs, manually reset the system at the ok prompt.
To avoid the problem before it occurs, set the NVRAM memory-interleave property to max. This causes memory to be interleaved whenever the system is booted. However, you may find this option to be unacceptable, as a memory board containing interleaved memory cannot be dynamically unconfigured. See "DR: Cannot Unconfigure a CPU/Memory Board That Has Interleaved Memory, Bug ID 4210234".
Category: Bug
vmstat shows an unusually high number of interrupts after configuring CPUs. With vmstat in the background, the interrupt field becomes abnormally large (but this does not indicate a problem exists). In the last row in the example below, the interrupts (in) column has a value of 4294967216:
# procs memory page disk faults cpu
r b w swap free re mf pi po fr de sr s6 s9 s1 -- in sy cs us sy id
0 0 0 437208 146424 0 1 4 0 0 0 0 0 1 0 0 50 65 79 0 1 99
0 0 0 413864 111056 0 0 0 0 0 0 0 0 0 0 0 198 137 214 0 3 97
0 0 0 413864 111056 0 0 0 0 0 0 0 0 0 0 0 286 101 200 0 3 97
0 0 0 413864 111072 0 11 0 0 0 0 0 0 1 0 0 4294967216 43 68 0 0 100
Workaround: Restart vmstat.
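The value in the example is just below 2^32 (4294967296), which is what a small negative number looks like when printed as an unsigned 32-bit quantity. That reading is an assumption and is not stated in the bug report, but the arithmetic is easy to check. `as_signed32` is a hypothetical helper name.

```shell
#!/bin/sh
# as_signed32 N: reinterpret an unsigned 32-bit value as a signed one.
# Hypothetical helper, used only to illustrate the arithmetic.
as_signed32() {
    awk -v n="$1" 'BEGIN {
        if (n >= 2147483648) n -= 4294967296   # subtract 2^32
        printf "%d\n", n
    }' < /dev/null
}

as_signed32 4294967216   # prints -80
```

Ordinary values such as 286 pass through unchanged, so the filter can be applied to the whole column.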
Category: Bug
If two CPUs on a single board fail before reporting to the master CPU, the POST system status display lists one CPU as failing, but the second CPU may not be listed at all.
Workaround: None.
Category: RFE
Cannot unconfigure a CPU/Memory board that has interleaved memory.
To unconfigure and subsequently disconnect a CPU board with memory or a memory-only board, it is necessary to first unconfigure the memory. However, if the memory on the board is interleaved with memory on other boards, the memory cannot currently be unconfigured dynamically.
Memory interleaving can be displayed using the prtdiag or the cfgadm commands.
Workaround: Shut down the system before servicing the board, then reboot afterward. To permit future DR operations on the CPU/memory board, set the NVRAM memory-interleave property to min. See also "Memory Interleaving Set Incorrectly After a Fatal Reset, Bug-ID 4156075" for a related discussion on interleaved memory.
Category: RFE
To unconfigure and subsequently disconnect a CPU board with memory or a memory-only board, it is necessary to first unconfigure the memory. However, some memory is not currently relocatable. This memory is considered permanent.
Permanent memory on a board is marked "permanent" in the cfgadm status display:
# cfgadm -s cols=ap_id:type:info
Ap_Id Type Information
ac0:bank0 memory slot3 64Mb base 0x0 permanent
ac0:bank1 memory slot3 empty
ac1:bank0 memory slot5 empty
ac1:bank1 memory slot5 64Mb base 0x40000000
In this example, the board in slot3 has permanent memory and so cannot be removed.
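On a system with many boards, the same check can be done with a filter over the status display. This is a sketch that assumes the three-column layout shown in the example, where the Information column begins with the slot name and ends with the word permanent; `permanent_slots` is a hypothetical helper name.

```shell
#!/bin/sh
# permanent_slots: read "cfgadm -s cols=ap_id:type:info" output on standard
# input and print each slot that holds a bank marked permanent.
# Hypothetical helper; the layout is assumed to match the example above.
permanent_slots() {
    awk '$2 == "memory" && $NF == "permanent" { print $3 }' | sort -u
}

# Possible use on a live system (not run here):
#   cfgadm -s cols=ap_id:type:info | permanent_slots
```

Applied to the example display, the filter prints slot3.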
Workaround: Shut down the system before servicing the board, then reboot afterward.
Category: Bug
If a cfgadm process is running on one board, an attempt to simultaneously disconnect a second board fails.
A cfgadm disconnect operation fails if another cfgadm process is already running on a different board. The message is:
cfgadm: Hardware specific failure: disconnect failed: nexus error during detach: address
Workaround: Do only one cfgadm operation at a time. If a cfgadm operation is running on one board, wait for it to finish before you start a cfgadm disconnect operation on a second board.
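Where several administrators or scripts may issue cfgadm commands, one way to follow this workaround is to funnel the commands through a wrapper that holds a lock. The sketch below is an illustration only: `run_serialized` and the lock path are assumptions, and the lock protects only commands that are started through the wrapper.

```shell
#!/bin/sh
# run_serialized CMD [ARGS...]: run CMD only when no other wrapped command
# is in progress. Hypothetical helper; the lock path is an arbitrary choice.
LOCKDIR="${LOCKDIR:-/tmp/cfgadm.lock}"

run_serialized() {
    # mkdir is atomic, so only one caller at a time can create the directory.
    until mkdir "$LOCKDIR" 2>/dev/null; do
        sleep 5
    done
    status=0
    "$@" || status=$?
    rmdir "$LOCKDIR"     # release the lock even if the command failed
    return "$status"
}

# Possible use (not run here):
#   run_serialized cfgadm -c disconnect sysctrl0:slotnumber
```

If a wrapped command is killed before it releases the lock, the directory must be removed by hand before the next operation can start.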
Category: Bug
After DR operations have been run, an attempt to power down a system with the init 5 command may cause a fatal reset.
Workaround: Reset the system, then power it off by entering the command power-off at the ok prompt.
When a server is configured as a boot server for Solaris 2.5.1-based x86 clients, it has several rpld jobs running, whether or not such devices are in use. These active references prevent DR operations from detaching these devices.
Workaround: To perform a DR detach operation: