This chapter introduces you to the concepts of Dynamic Reconfiguration and hot-plug operations. It also explains the requirements and limitations of Dynamic Reconfiguration.
Sun Fire 880 systems support hot-pluggable Peripheral Component Interconnect (PCI) cards. This hot-plug capability enables you to add, remove, or replace a PCI card on a powered-on system, while the rest of the system's capabilities remain unaffected.
Hot-plugging significantly reduces system downtime associated with PCI card replacement. However, the hot-plug procedure involves software commands for preparing the system prior to removal of a PCI card and for reconfiguring the operating environment after installation of a new card.
In contrast, Sun Fire 880 fan trays and power supplies are hot-swappable. You can remove or insert these components at any time without any prior software preparation. For more information about hot-swappable system components, refer to the Sun Fire 880 Server Owner's Guide.
The Sun Fire 880 Remote System Control (RSC) card is not a hot-pluggable component. Before installing or removing an RSC card, you must power off the system and disconnect all system power cords.
Hot-plug operations for PCI cards involve Dynamic Reconfiguration (DR). Dynamic Reconfiguration is an operating environment feature that enables you to reconfigure system hardware while the system is running. Using DR, you can add or replace hardware resources with little or no interruption of normal system operations.
PCI hot-plug procedures may involve software commands for preparing the system prior to removing a device, and for reconfiguring the operating environment after installing a new device. In addition, certain other system requirements must be met in order for hot-plug operations to succeed. For details, see "About Dynamic Reconfiguration Requirements".
For detailed PCI hot-plug procedures, see Chapter 2, Using Dynamic Reconfiguration.
You can hot-plug any off-the-shelf PCI card that is compliant with the PCI 2.2 specification, provided a suitable software driver exists for the Solaris Operating Environment, and the driver supports hot-plugging.
There are three different methods for performing PCI hot-plug operations on Sun Fire 880 systems:
Push-button method
Command-line method
Graphical user interface (GUI) method
The push-button method relies on push buttons and status LEDs located near each PCI card slot. You can initiate a hot-plug operation by pressing the push button for the corresponding slot. Three status LEDs located near each slot indicate successful results or failure conditions.
The command-line method lets you perform hot-plug operations via a remote login session, a locally attached console, or an RSC console. This method involves the Solaris cfgadm(1M) command, and uses the LEDs near each slot to indicate where to insert or remove the affected card.
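As a rough illustration of the command-line method, the following Python sketch only builds cfgadm(1M) command lines; it does not run them. The -c state-change functions (connect, disconnect, configure, unconfigure) come from cfgadm(1M). The helper names, the attachment point ID, and the order of the steps are assumptions made for this example; follow the procedures in Chapter 2 for the actual sequence on a Sun Fire 880 system.

```python
from typing import List

# State-change functions accepted by `cfgadm -c` (see cfgadm(1M)).
STATE_CHANGE_FUNCTIONS = ("connect", "disconnect", "configure", "unconfigure")


def cfgadm_command(function: str, ap_id: str) -> List[str]:
    """Return the argv for `cfgadm -c <function> <ap_id>` without executing it."""
    if function not in STATE_CHANGE_FUNCTIONS:
        raise ValueError(f"not a cfgadm state-change function: {function!r}")
    return ["cfgadm", "-c", function, ap_id]


def removal_commands(ap_id: str) -> List[List[str]]:
    """Assumed order for removal: unconfigure the card, then disconnect the slot."""
    return [cfgadm_command("unconfigure", ap_id), cfgadm_command("disconnect", ap_id)]


def insertion_commands(ap_id: str) -> List[List[str]]:
    """Assumed order for insertion: connect the slot, then configure the card."""
    return [cfgadm_command("connect", ap_id), cfgadm_command("configure", ap_id)]


if __name__ == "__main__":
    # "example_slot" is a hypothetical attachment point ID; running `cfgadm`
    # with no arguments lists the real attachment points on a system.
    for argv in removal_commands("example_slot"):
        print(" ".join(argv))
```

Building the argument lists separately from running them makes it easy to review the planned commands before anything touches the hardware.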
A graphical user interface for performing DR operations is provided through the Sun™ Management Center system monitoring and management software (formerly known as Sun Enterprise SyMON™ software). For more information, refer to the Sun Management Center Software User's Guide and the Sun Management Center Software Supplement for Workgroup Servers.
All three hot-plug methods use the status LEDs located near each PCI slot. These LEDs indicate when it is safe to insert or remove a card from its slot, and show whether the operation has succeeded or failed. For additional details on Sun Fire 880 hot-plug status LEDs, see "About Slot LEDs".
Regardless of the method you use, it is often necessary to perform additional administrative steps to prepare for a hot-plug removal operation. Prior to performing a removal operation, you must ensure that the devices residing on the card are not currently in use. To identify and manually terminate usage of such devices, you can use standard Solaris Operating Environment commands such as mount(1M), umount(1M), swap(1M), ifconfig(1M), and ps(1).
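The in-use check described above can be pictured as a set comparison. The Python sketch below assumes you have already collected, by hand or by script, the device names reported by mount(1M), swap(1M), and ifconfig(1M). The function name and all device names are hypothetical, and the sketch does not cover processes that hold a device open, which ps(1) may help you find.

```python
from typing import Dict, Iterable, List


def find_busy_devices(
    card_devices: Iterable[str],
    mounted: Iterable[str],
    swap_devices: Iterable[str],
    network_interfaces: Iterable[str],
) -> Dict[str, List[str]]:
    """Map each device on the card to the reasons it is still in use."""
    checks = (
        ("mounted file system", set(mounted)),
        ("active swap device", set(swap_devices)),
        ("configured network interface", set(network_interfaces)),
    )
    busy: Dict[str, List[str]] = {}
    for device in card_devices:
        reasons = [label for label, in_use in checks if device in in_use]
        if reasons:
            busy[device] = reasons
    return busy


if __name__ == "__main__":
    # Hypothetical inventory for one PCI card and the running system.
    report = find_busy_devices(
        card_devices=["c2t0d0s0", "c2t0d0s1", "qfe0"],
        mounted=["c0t0d0s0", "c2t0d0s0"],
        swap_devices=["c0t0d0s1", "c2t0d0s1"],
        network_interfaces=["eri0"],
    )
    for device, reasons in report.items():
        print(device, "->", ", ".join(reasons))
```

An empty result means only that none of the listed uses applies; it is not proof that the card is idle.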
For detailed PCI hot-plug procedures, see Chapter 2, Using Dynamic Reconfiguration.
DR works in conjunction with (but does not require) multipathing software. You can use multipathing software to switch I/O operations from one I/O controller to another to prepare for DR operations. With a combination of DR and multipathing software, you can remove, replace, or deactivate a PCI controller card with no interruption to system operation. Note that this requires redundant hardware; that is, the system must contain an alternate I/O controller that is connected to the same device(s) as the card being removed or replaced. The alternate controller must reside on a different PCI card or be integrated into the Sun Fire 880 system motherboard or I/O board. For more information about multipathing software, refer to the Sun Fire 880 Server Owner's Guide.
Certain system requirements must be met in order for DR operations to succeed. These requirements are summarized below and covered in more detail in the sections that follow.
For a PCI card to be successfully detached from a running operating environment:
All devices on the card must use detach-safe device drivers.
If the card controls any vital system resources, alternate paths to those resources must be available through some other card or on-board controller integrated into the system motherboard or system I/O board.
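The two requirements above amount to a simple predicate over the devices on a card. The following Python sketch is one way to express it; the CardDevice type, its field names, and the device names are hypothetical, and whether a driver is detach-safe or an alternate path exists must still be determined from your own system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CardDevice:
    name: str
    detach_safe_driver: bool
    controls_vital_resource: bool = False
    alternate_path_available: bool = False


def detach_blockers(devices: List[CardDevice]) -> List[str]:
    """Return the reasons a card cannot be detached; an empty list means none found."""
    blockers: List[str] = []
    for dev in devices:
        if not dev.detach_safe_driver:
            blockers.append(f"{dev.name}: driver is detach-unsafe")
        if dev.controls_vital_resource and not dev.alternate_path_available:
            blockers.append(f"{dev.name}: vital resource has no alternate path")
    return blockers


if __name__ == "__main__":
    # Hypothetical card: a boot-disk controller without a redundant path.
    card = [
        CardDevice("disk_ctrl", detach_safe_driver=True,
                   controls_vital_resource=True, alternate_path_available=False),
        CardDevice("net_port", detach_safe_driver=True),
    ]
    for reason in detach_blockers(card):
        print(reason)
```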
For a PCI card to be successfully detached from a running operating environment, each device on the card must have a detach-safe driver. A detach-safe driver enables a single instance of a driver to be closed while other instances are allowed to remain open to service similar devices used elsewhere in the system. To be considered detach-safe, a driver must be able to perform a basic Device Driver Interface/Device Kernel Interface (DDI/DKI) function called DDI_DETACH (the detach command handled by the driver's detach(9E) entry point). Any driver that does not support the DDI_DETACH function is called detach-unsafe.
Sun Microsystems offers a variety of hot-pluggable PCI cards that use detach-safe device drivers. For an up-to-date list of Sun PCI cards that use detach-safe drivers, please see the Sun Fire 880 Server Product Notes or contact your local Sun sales representative.
Many third-party drivers (those purchased from vendors other than Sun Microsystems) do not support the DDI_DETACH function. Sun Microsystems suggests that you test these driver functions during the qualification and installation phases of any third-party PCI card, prior to use in a production environment.
Detaching a PCI card that uses detach-unsafe drivers is possible, but the procedure is fairly complex. To do so, you must:
Stop all usage of the detach-unsafe drivers on the card.
Stop all usage of other devices in the system that share the same detach-unsafe drivers.
Manually close all instances and unload all of the affected drivers.
For more information, see "How to Remove PCI Cards That Use Detach-Unsafe Drivers".
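To see why the detach-unsafe case is more involved, the Python sketch below works out which devices and drivers the three steps above would touch. It assumes you already have a mapping from each device to its driver and know which drivers are detach-unsafe; the function name and every device and driver name are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def plan_unsafe_detach(
    card_devices: Set[str],
    driver_of: Dict[str, str],
    detach_unsafe: Set[str],
) -> Tuple[List[str], List[str], List[str]]:
    """Return (devices to stop on the card, devices to stop elsewhere, drivers to unload)."""
    # Detach-unsafe drivers that are actually used by devices on this card.
    affected = {driver_of[d] for d in card_devices if driver_of[d] in detach_unsafe}
    on_card = sorted(d for d in card_devices if driver_of[d] in affected)
    # Other devices in the system that share one of those drivers.
    elsewhere = sorted(
        d for d, drv in driver_of.items() if drv in affected and d not in card_devices
    )
    return on_card, elsewhere, sorted(affected)


if __name__ == "__main__":
    driver_of = {"dev_a": "drvx", "dev_b": "drvy", "dev_c": "drvx", "dev_d": "drvy"}
    on_card, elsewhere, drivers = plan_unsafe_detach({"dev_a"}, driver_of, {"drvx"})
    print("stop on card:", on_card)
    print("stop elsewhere:", elsewhere)
    print("unload drivers:", drivers)
```

The second list is what makes the procedure disruptive: devices on other cards that share the driver must also be taken out of use.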
You cannot dynamically detach a PCI card that controls vital system resources unless alternate paths to those resources are available. The alternate paths must be available through a different PCI card or an on-board controller integrated into the system motherboard or system I/O board. Before detaching the card, you must switch control of the vital resources over to the alternate path (note that some multipathing software may handle this automatically). Examples of vital system resources include the system's boot disk, swap space, and primary network interface.
Some cards cannot be detached. A PCI card is not detachable if it controls a boot drive for which no alternate path is available.
If possible, the system's swap space should reside on two or more disks attached to controllers on separate boards. For example, some of the swap space might be controlled by a PCI host adapter card, while the rest could be controlled by the system's on-board controller. With this kind of configuration, a particular swap partition is not a vital system resource, because swap space is accessible through multiple controllers, and additional swap space can be dynamically configured via the swap(1M) command.
Before detaching a PCI card that controls disk swap space, you must ensure that the system's remaining memory and disk swap space will be large enough to accommodate currently running programs.
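The sizing check above is simple arithmetic. The Python sketch below assumes you have read the relevant figures from swap(1M) output (for example, swap -s and swap -l) and converted them to a common unit. The function name, the optional headroom parameter, and the numbers in the example are hypothetical.

```python
def swap_removal_is_safe(
    total_virtual_kb: int,
    in_use_kb: int,
    swap_on_card_kb: int,
    headroom_kb: int = 0,
) -> bool:
    """Check that memory plus remaining disk swap still covers current usage.

    total_virtual_kb: memory plus all disk swap currently available to the system
    in_use_kb:        virtual memory used by currently running programs
    swap_on_card_kb:  disk swap that disappears when the card is detached
    headroom_kb:      optional safety margin (an assumption, not a documented rule)
    """
    remaining_kb = total_virtual_kb - swap_on_card_kb
    return remaining_kb >= in_use_kb + headroom_kb


if __name__ == "__main__":
    # Hypothetical figures in kilobytes.
    print(swap_removal_is_safe(4_000_000, 2_500_000, 1_000_000))
    print(swap_removal_is_safe(4_000_000, 2_500_000, 2_000_000))
```

In the first call 3,000,000 KB would remain against 2,500,000 KB in use, so the check passes; in the second only 2,000,000 KB would remain, so it fails.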
Inserting a faulty card may cause a system crash. Use only cards that are known to be functional.
When hot-plugging a PCI card, be aware that a newly inserted card with a serious fault can, when powered on, introduce failures in the bus segment to which it is connected.
For Sun StorEdge™ A5000 disk arrays, the firmware version must be ST19171FC 0413 or later.