Sun Enterprise 6x00, 5x00, 4x00, and 3x00 Systems Dynamic Reconfiguration User's Guide

Chapter 3 Procedures



Note -

The screen, mouse, and keyboard will not be operational at times when DR momentarily suspends the system, but you will regain control of these devices after the suspension.


General Preparations

  1. Look for the latest service information on the web at http://sunsolve2.Sun.COM/Enterprise-dr

    The web site is updated periodically. If you do not have direct access to this web site, ask your Sun service provider for assistance.

  2. Determine that the board is compatible with DR.

    Some drivers do not yet support DR operations. A driver must be suspendable. The quiesce-test option tests for suspendable drivers:


    # cfgadm -x quiesce-test sysctrl#:slot#
    

  3. Be sure that the board to be replaced is inactive.

    DR does not automatically stop activity on system boards.

    For example, before replacing an I/O board that controls a tape drive, wait for any read/write operations to finish.
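
The quiesce test in Step 2 can be wrapped in a small helper so that a mistyped attachment point ID is rejected before cfgadm runs. The following is a minimal sketch, assuming a POSIX-style shell. The function names valid_ap_id and quiesce_test are hypothetical, and the sysctrlN:slotN pattern is inferred from the sysctrl#:slot# notation used in this chapter.

```shell
# Succeed only if the argument has the form sysctrl<digits>:slot<digits>.
# The leading "x" keeps expr from treating the value as an operator.
valid_ap_id() {
    expr "x$1" : 'xsysctrl[0-9][0-9]*:slot[0-9][0-9]*$' >/dev/null
}

# Run the quiesce test for one attachment point (hypothetical wrapper).
quiesce_test() {
    if valid_ap_id "$1"; then
        cfgadm -x quiesce-test "$1"
    else
        echo "usage: quiesce_test sysctrlN:slotN" >&2
        return 2
    fi
}

# Example (run as root on the target system):
#   quiesce_test sysctrl0:slot3
```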

Removing a Board

If you have not already done so, read "General Preparations".

Terminating I/O Devices

Terminate the use of all devices on the board. All I/O devices must be closed before they are unconfigured.

  1. Terminate all usage of devices on the board.

    1. To identify the components that are on the board to be unconfigured, use the ifconfig, mount, pf, or swap commands.

    2. To see which processes have these devices open, use the fuser(1M) command.

    3. Ensure that any networking interfaces on the board are not in use. All storage devices attached to the board should be unmounted and closed. See "I/O Board Unconfiguration".


      Note -

      DR does not automatically terminate network use or close devices. There currently is no way to ensure that the use of the network remains terminated or that all devices remain closed. Other clients may remount them between the time of the unmount and the unconfigure operations, so be careful.


    4. If AP (alternate pathing) is in use on the system, switch all board functions to the alternate board. Wait until all of the alternate paths are functioning before proceeding to Step 2.

    5. If AP is not available, warn all users to stop using the functions that the board provides.

  2. If the redundancy features of Alternate Pathing or Solstice DiskSuite mirroring are used to access a device connected to the board, reconfigure these subsystems so that the device or network is accessible by way of controllers on other system boards.

  3. Unmount file systems, including Solstice DiskSuite meta-devices that have a board-resident partition. (Example: umount /partit)

  4. Remove Solstice DiskSuite or Alternate Pathing databases from board-resident partitions. The location of Solstice DiskSuite or Alternate Pathing databases is chosen by the user and can be changed.

  5. Remove any private regions used by Sun(TM) Enterprise Volume Manager(TM). Volume Manager by default uses a private region on each device that it controls, so such devices must be removed from Volume Manager control before they can be detached.

  6. Any Sun(TM) RSM Array(TM) 2000 controllers on the board that is being detached should be taken offline, using the rm6 or rdacutil commands.

  7. Remove disk partitions from the swap configuration.

  8. Either kill any process that directly opens a device or raw partition, or direct it to close the open device on the board.

  9. If a detach-unsafe device is present on the board, close all instances of the device and use modunload(1M) to unload the driver.


    Caution -

    Unmounting file systems may affect NFS(TM) client systems.
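
As an illustration of the identification steps above, the following sketch filters the output of swap -l and df -k for devices on one disk controller. It is not part of the DR software. The function names are hypothetical, the controller name (c2) is only an example, and the sketch assumes the usual cNtNdNsN device naming, the default column layout of both commands, and a POSIX awk (nawk on older Solaris releases). Long device names that wrap onto a second line in df output are not handled.

```shell
# Read "swap -l" output on stdin; print swap devices on the given controller.
swap_on_controller() {
    awk -v ctlr="$1" 'NR > 1 && $1 ~ ("/" ctlr "[td][0-9]") { print $1 }'
}

# Read "df -k" output on stdin; print mount points whose device is on
# the given controller (column 6 is "Mounted on").
mounts_on_controller() {
    awk -v ctlr="$1" 'NR > 1 && $1 ~ ("/" ctlr "[td][0-9]") { print $6 }'
}

# Example (c2 is a placeholder for a controller on the board):
#   swap -l | swap_on_controller c2      # candidates for "swap -d"
#   df -k  | mounts_on_controller c2     # candidates for "fuser -c" and umount
```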


Removal Procedure

  1. Terminate all usage of devices on the board.

    See "Terminating I/O Devices".

  2. Check the status of the board:

    • For a simple list containing board names, states, and conditions, enter:


      # cfgadm
      

    • For a more detailed list, enter:


      # cfgadm -v
      

    For a board removal or replacement, the states and conditions must be one of the following sets:

    • The board is ok:

      • Receptacle state--Connected

      • Occupant state--Configured

      • Condition--OK

    • The board is failing:

      • Receptacle state--Connected

      • Occupant state--Configured

      • Condition--Failing

  3. Unconfigure the board:


    # cfgadm -c unconfigure sysctrl#:slot#
    

    For sysctrl#:slot# (the attachment point ID) use the board name that was listed in the status report of the previous step.

  4. Use the cfgadm command to see if the board is unconfigured.

    If the unconfigure operation failed:

    1. See "Removing Boards that Use Detach-Unsafe Drivers".

    2. See "Quiescence".

    3. Resolve the problem.

    4. Unconfigure the board again (Step 3).


    Note -

    A failure of the unconfigure step results in a partially unconfigured condition. If this happens, attempt to unconfigure again. A configuration operation is not permitted at this point.


  5. When the board is successfully unconfigured, you can do one of the following:

    • Leave the board in the system unconfigured

    • Configure the board

    • Logically disconnect the board, in preparation for removal:


      # cfgadm -v -c disconnect sysctrl#:slot#
      

  6. The disconnection takes a few moments, so if you wish to remove the board from the card cage at this time, first verify the board status.

    1. Use cfgadm to verify that the board is logically disconnected.

    2. Check the LEDs on the board to verify that the board is electrically disconnected.

      The two outer LEDs must be off and the middle LED may be either lighted or off.

  7. After you have verified that the board is disconnected, you may physically remove or replace the board (see "Installing a Replacement Board").

If you wish, you can leave the board in place until a later time.


Caution -

If no replacement is available, leave the board in the slot, or fill the empty slot with a dummy board or a load board to maintain the proper flow of cooling air in the card cage. For Enterprise 3000, 3500, 4000, 4500, 5000, and 5500 systems, use a dummy board. For Enterprise 6000 or 6500 systems, use a load board.
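
The status check in Step 2 of the removal procedure can be sketched as a small filter over the cfgadm listing. This is an illustration only: the function names are hypothetical, and the sketch assumes the default five-column listing (Ap_Id, Type, Receptacle, Occupant, Condition) with no spaces inside the Type column, and a POSIX awk.

```shell
# Read a cfgadm listing on stdin and print "receptacle occupant condition"
# (lowercased) for the given attachment point ID.
board_state() {
    awk -v ap="$1" '$1 == ap { print tolower($3), tolower($4), tolower($5) }'
}

# Succeed if the state string matches one of the two sets that this
# procedure accepts for a board removal or replacement.
ready_to_unconfigure() {
    case "$1" in
        "connected configured ok"|"connected configured failing") return 0 ;;
        *) return 1 ;;
    esac
}

# Example (sysctrl0:slot3 is a placeholder attachment point ID):
#   state=$(cfgadm | board_state sysctrl0:slot3)
#   ready_to_unconfigure "$state" && cfgadm -c unconfigure sysctrl0:slot3
```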


Removing Boards that Use Detach-Unsafe Drivers

Some drivers do not yet support DR on Sun Enterprise 3x00, 4x00, 5x00, and 6x00 systems.

DR cannot detach these drivers, but you can remove some undetachable drivers manually.

  1. Halt all use of the device controller.

  2. Halt the use of all other controllers of the same type on all boards in the machine.

    The remaining controllers can be used again after the DR unconfigure operation is complete.

  3. Use UNIX commands to manually close all such drivers on the board and use the modunload command to unload them.

  4. Disconnect the board with this command:


    # cfgadm -c disconnect sysctrl#:slot#
    

The disconnected board can be physically removed now or at a later time.


Caution -

If no replacement is available, leave the board in the slot, or fill the empty slot with a dummy board or a load board to maintain the proper flow of cooling air in the card cage. Use a dummy board for Enterprise 3000, 3500, 4000, 4500, 5000, and 5500 systems. Use a load board for Enterprise 6000 or 6500 systems.



Note -

If you cannot execute the above steps, recover the system configuration by adding the board to the disabled board list using the NVRAM setting disabled-board-list (see Platform Notes), then reboot the system. Remove the board at a later time.



Tip -

Many third-party drivers (those purchased from vendors other than Sun Microsystems) do not yet properly support the standard Solaris software modunload interface. Test these driver functions during the qualification and installation phases of any third-party device.
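
The manual unload in Step 3 above needs a module ID, which modinfo reports. The following sketch looks up the ID for a named driver so that it can be passed to modunload -i. The function name module_id is hypothetical, the driver name st is only an example, and the sketch assumes the usual modinfo layout in which the module name is the sixth column.

```shell
# Read modinfo output on stdin; print the ID of the first module whose
# name (column 6) equals the argument. Prints nothing if it is not loaded.
module_id() {
    awk -v name="$1" '$6 == name { print $1; exit }'
}

# Example (st is a placeholder driver name):
#   id=$(modinfo | module_id st)
#   [ -n "$id" ] && modunload -i "$id"
```

Testing this sequence during qualification, as the Tip suggests, shows whether a third-party driver unloads cleanly before a DR operation depends on it.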


Installing a Replacement Board

If you have not already done so, read "General Preparations".

  1. If you are not continuing from "Removing a Board" above, select a slot as follows; otherwise, go to Step 2.

    1. Use the cfgadm command to display the current system configuration.

    2. Select a card cage slot to use, but do not insert the board yet.

  2. View the configuration list and verify that the slot is unconfigured:


    # cfgadm
    

  3. Insert the board in the slot and look for an acknowledgment on the console, such as, "name board inserted into slot3."

  4. Use the cfgadm command again to look for the system name assigned to the new board.

  5. Configure the board using the system name for the board:


    # cfgadm -c configure sysctrl#:slot#
    

  6. Configure any I/O devices on the board using commands such as drvconfig and devlinks, as appropriate.

  7. Activate the devices on the board using commands such as mount and ifconfig, as appropriate.
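
When choosing a slot in Steps 1 and 2, it can help to list only the attachment points whose occupant is unconfigured. A minimal sketch follows. The function name unconfigured_slots is hypothetical, and the default five-column cfgadm listing is assumed.

```shell
# Read a cfgadm listing on stdin; print the IDs of attachment points
# whose Occupant column (column 4) is "unconfigured".
unconfigured_slots() {
    awk 'NR > 1 && tolower($4) == "unconfigured" { print $1 }'
}

# Example:
#   cfgadm | unconfigured_slots
```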

Installing a New Board

If you have not already done so, read "General Preparations".

The process of adding and configuring a board involves (1) connecting the attachment point and (2) configuring its occupant. In most cases the cfgadm(1M) command can perform both steps at once.

  1. Verify that the selected slot is ready for a board.


    # cfgadm
    

    The states and conditions should be:

    • Receptacle state--Empty

    • Occupant state--Unconfigured

    • Condition--Unknown

    or

    • Receptacle state--Disconnected

    • Occupant state--Unconfigured

    • Condition--Unknown

  2. If the status of the slot is not "empty" or "disconnected", enter:


    # cfgadm -c disconnect sysctrl#:slot#
    

  3. Physically insert the board into the slot and look for an acknowledgment on the console, such as, "name board inserted into slot3."

    After an I/O board is inserted, the states and conditions should become:

    • Receptacle state--Disconnected

    • Occupant state--Unconfigured

    • Condition--Unknown

    Any other states or conditions should be considered an error.

  4. Connect any peripheral cables and interface modules to the board.

  5. Configure the board with the command:


    # cfgadm -v -c configure sysctrl#:slot#
    

    This command should both connect and configure the receptacle. Verify with the cfgadm command.

    The states and conditions for a connected and configured attachment point should be:

    • Receptacle state--Connected

    • Occupant state--Configured

    • Condition--OK

    Now the system is also aware of the usable devices which reside on the board and all devices may be mounted or configured to be used.

    If the command fails to connect and configure the board and slot (after a successful operation the status is shown as "configured" and "ok"), perform the connection and configuration as separate steps:

  6. Connect the board and slot by entering:


    # cfgadm -v -c connect sysctrl#:slot#
    

    The states and conditions for a connected attachment point should be:

    • Receptacle state--Connected

    • Occupant state--Unconfigured

    • Condition--OK

    Now the system is aware of the board, but not the usable devices which reside on the board. Temperature is monitored and power and cooling affect the attachment point condition.

  7. Configure the board and slot by entering:


    # cfgadm -v -c configure sysctrl#:slot#
    

    The states and conditions for a configured attachment point should be:

    • Receptacle state--Connected

    • Occupant state--Configured

    • Condition--OK

    Now the system is also aware of the usable devices which reside on the board and all devices may be mounted or configured to be used.

  8. Reconfigure the devices on the board by entering:


    # drvconfig; devlinks; disks; ports; tapes;
    

    The console should display a list of devices and their addresses.

  9. Activate the devices on the board using commands such as mount and ifconfig, as appropriate.
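
Steps 5 through 7 describe a combined configure operation with a two-step fallback. That logic can be sketched as follows. The function name configure_board is hypothetical, and the sketch assumes that cfgadm returns a non-zero exit status when an operation fails. Verify the result with cfgadm afterward, as the procedure describes.

```shell
# Try the combined connect-and-configure first; if it fails, connect
# and configure as separate operations.
configure_board() {
    ap="$1"
    if cfgadm -v -c configure "$ap"; then
        return 0
    fi
    echo "combined configure failed; trying connect, then configure" >&2
    cfgadm -v -c connect "$ap" && cfgadm -v -c configure "$ap"
}

# Example (sysctrl0:slot3 is a placeholder attachment point ID):
#   configure_board sysctrl0:slot3 && cfgadm
```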

Adding Storage Devices

To add storage devices to an existing I/O board:

  1. Terminate all active use of the devices on the I/O board.

    See "Terminating I/O Devices".

  2. Unconfigure the board.


    # cfgadm -c unconfigure sysctrl#:slot#
    

  3. Add the storage device controller:

    • For an optical controller, attach the I/O module and interface cable.

    • For an SBus or PCI controller card, disconnect the board (cfgadm -c disconnect) before removing it from the card cage. Add the controller card and place the I/O board back in the card cage.

  4. Reconfigure the board.


    # cfgadm -c configure sysctrl#:slot#
    

    Only the Occupant state should change. The Receptacle state and condition should remain the same.

  5. If you installed the board in a different slot, reconfigure the devices on the board by entering:


    # drvconfig; devlinks; disks; ports; tapes;
    

    The console should display a list of devices and their addresses.

  6. Activate the devices on the board using commands such as mount and ifconfig, as appropriate.
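
The check in Step 4 (only the Occupant state should change) can be expressed as a comparison of the board status before and after the configure operation. The sketch below is illustrative. The function name only_occupant_changed is hypothetical, and each argument is assumed to be a string of the form "receptacle occupant condition", for example columns 3 through 5 of the board's line in the cfgadm listing.

```shell
# Compare two "receptacle occupant condition" strings. Succeed only if
# the receptacle state and condition are unchanged and the occupant
# state differs.
only_occupant_changed() {
    set -- $1 $2    # split both strings into six positional words
    [ "$1" = "$4" ] && [ "$3" = "$6" ] && [ "$2" != "$5" ]
}

# Example (sysctrl0:slot3 is a placeholder attachment point ID):
#   before=$(cfgadm | awk '$1 == "sysctrl0:slot3" { print $3, $4, $5 }')
#   cfgadm -c configure sysctrl0:slot3
#   after=$(cfgadm | awk '$1 == "sysctrl0:slot3" { print $3, $4, $5 }')
#   only_occupant_changed "$before" "$after" || echo "unexpected state change" >&2
```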