Platform Notes: Sun Enterprise 250 Server

Removing a Hot-Pluggable Disk Drive

This section contains information on how to configure your system to remove a disk drive while the power is on and the operating system is running. Use the procedures in this chapter if you do not intend to replace the disk drive.

The way in which you remove a disk drive depends on the application you are using. Each application is different, but each requires that you:

  1. Select the disk drive

  2. Remove the disk

  3. Reconfigure the operating environment

In all cases you must select the disk and stop any activity or applications on it, unmount it, physically remove the drive, and configure the Solaris environment to recognize that the drive is no longer there. Then you must configure your application to operate without this device in place.

Identifying the Faulty Disk Drive

Disk errors may be reported in a number of different ways. Often you can find messages about failing or failed disks on your system console. This information is also logged in the /usr/adm/messages file(s). These error messages typically refer to a failed disk drive by its UNIX physical device name (such as /devices/pci@1f,4000/scsi@3/sd@b,0) and its UNIX device instance name (such as sd11). In some cases, a faulty disk may be identified by its UNIX logical device name, such as c0t11d0. In addition, some applications may report a disk slot number (0 through 5) or activate an LED located next to the disk drive itself (see the following figure).

Figure 2-3 Disk Slot Numbers and LED Locations

To perform a disk hot-plug procedure, you need to know the slot number of the faulty disk (0 through 5) and its logical device name (for example, c0t11d0). If you know the disk slot number, it is possible to determine the logical device name, and vice versa. It is also possible to determine both the disk slot number and the logical device name from a physical device name (such as /devices/pci@1f,4000/scsi@3/sd@b,0).

To make the necessary translation from one form of disk identifier to another, see the chapter, "Mapping From UNIX Logical Name to Disk Slot Number". Once you have determined both the disk slot number and logical device name, you are ready to continue with this procedure.
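
As a rough illustration of how these identifiers relate, the following shell sketch splits a logical device name into its parts and extracts the target ID from a physical device name. The function names split_logical and phys_target are hypothetical. The sketch assumes that the last component of the physical name has the form sd@target,lun with the target written in hexadecimal, which is consistent with the sd@b,0 and c0t11d0 pair used above. It does not compute the disk slot number; use the mapping chapter for that.

```shell
# split_logical NAME (hypothetical helper)
# Break a logical device name such as c0t11d0 or /dev/rdsk/c0t11d0s2
# into its controller, target, and disk numbers.
split_logical() {
    name=${1##*/}    # drop any /dev/dsk/ or /dev/rdsk/ prefix
    echo "$name" | sed -n \
        's/^c\([0-9][0-9]*\)t\([0-9][0-9]*\)d\([0-9][0-9]*\).*/controller=\1 target=\2 disk=\3/p'
}

# phys_target PATH (hypothetical helper)
# Print the target ID in decimal from a physical device name.
# Assumption: the last path component looks like sd@<target-hex>,<lun>.
phys_target() {
    unit=${1##*/sd@}     # e.g. "b,0"
    hex=${unit%%,*}      # e.g. "b"
    echo $((0x$hex))     # hexadecimal to decimal
}
```

For example, split_logical /dev/rdsk/c0t11d0s2 reports controller 0, target 11, and disk 0, and phys_target applied to /devices/pci@1f,4000/scsi@3/sd@b,0 prints the same target ID, 11.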

Removing a Disk Drive From Your Application

Continue the hot disk removal by following the instructions for your specific application:

UNIX File System (UFS)

The following procedure describes how to remove a disk being used by one or more UFS file systems.

  1. Type su and your superuser password.

  2. Identify activities or applications attached to the device you plan to remove.

    Commands to use are mount, showmount -a, and ps -ef. See the mount(1M), showmount(1M), and ps(1) man pages for more details.

    For example, where the controller number is 0 and the target ID is 11:


    # mount | grep c0t11
    /export/home1 on /dev/dsk/c0t11d0s2 setuid/read/write on
    # showmount -a | grep /export/home1
    cinnamon:/export/home1/archive
    austin:/export/home1
    swlab1:/export/home1/doc
    # ps -ef | grep c0t11
    root  1225   450   4 13:09:58  pts/2   0:00 grep c0t11

    In this example, the file system /export/home1 on the faulty disk is being remotely mounted by three different systems: cinnamon, austin, and swlab1. The only process that refers to the disk is the grep command itself, which has already finished.

  3. Stop any activity or application processes on the file systems to be deconfigured.

  4. Back up your system.

  5. Determine what file system(s) are on the disk:


    # mount | grep cwtx

    where w is the controller number and x is the target ID of the disk (for example, c0t11).
    

  6. Unmount any file systems on the disk.


    Note -

    If the file system(s) are on a disk that is failing or has failed, the umount operation may not complete. A large number of error messages may be displayed in the system console and in the /var directory during the umount operation. If the umount operation does not complete, you may have to restart the system.


    For each file system returned, type:


    # umount file_system
    

    Here, file_system is the first field (the mount point) of each line returned in Step 5.

    For example:


    # umount /export/home
    # umount /export/home1
    

  7. Use the ssaadm remove_device command to take the device offline:


    # ssaadm remove_device logical_device_name
    ssaadm: warning: can't quiesce "/dev/rdsk/c0t11d0s2": I/O error
    Bus is ready for the removal of device
    Remove device and reconfigure bus as needed
    Press RETURN when ready to continue

    Here, logical_device_name is the full logical device name for the drive to be removed (/dev/rdsk/c0t11d0s2, for example). You must specify slice 2, which represents the entire disk. Note that this command also accepts a physical device name as an alternative.

    You can safely ignore the warning message since the Enterprise 250 SCSI bus does not require quiescing.

  8. Remove the disk drive from its slot.

    Refer to the Sun Enterprise 250 Server Owner's Guide for drive removal instructions.

  9. Press Return to complete the hot-plug operation.

The ssaadm command deletes the symbolic links for the device in the /dev/dsk and /dev/rdsk hierarchies.
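
Steps 5 through 7 of the UFS procedure can be partly scripted. The following is a minimal shell sketch, not part of the documented procedure. The function names mounts_on_disk and whole_disk_rdsk are hypothetical, and the sketch assumes that mount prints lines in the form shown in Step 2, with the mount point in the first field and the /dev/dsk device in the third. It only prints information; it does not unmount anything or call ssaadm.

```shell
# mounts_on_disk DISK (hypothetical helper)
# Read `mount` output on standard input and print the mount point
# (first field) of every file system whose device is a slice of DISK.
# Matching on "/dev/dsk/<disk>s" keeps c0t1d0 from matching c0t11d0.
mounts_on_disk() {
    awk -v disk="$1" 'index($3, "/dev/dsk/" disk "s") == 1 { print $1 }'
}

# whole_disk_rdsk DISK (hypothetical helper)
# Build the slice 2 raw device name that represents the entire disk.
whole_disk_rdsk() {
    printf '/dev/rdsk/%ss2\n' "$1"
}
```

Typical use would be mount | mounts_on_disk c0t11d0 to list the file systems to unmount, followed by whole_disk_rdsk c0t11d0 to obtain the argument for ssaadm remove_device. If the system's awk does not accept the -v option, nawk may be needed instead.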

Solstice DiskSuite

The following procedure describes how to deconfigure a disk in use by Solstice DiskSuite software. For more information, refer to the Solstice DiskSuite documentation.

  1. Back up your system.

  2. Type su and your superuser password.

  3. Identify metadevices or applications using the device you plan to remove.

    For example:


    # metadb | grep c0t11d0
    # metastat | grep c0t11d0
    # mount | grep c0t11d0
    

  4. Delete database replicas.

    If there are database replicas on the disk, these must be deleted. For example:


    # metadb -d c0t11d0s0
    

  5. Replace slices or clear metadevices.

    If any slices of the disk are in use by submirrors or within RAID metadevices, they can be replaced by other available slices. For example:


    # metareplace d20 c0t11d0s1 c0t8d0s1
    

    If there are no replacement slices available, the metadevices must be cleared. For example:


    # metaclear d21
    

  6. Replace slices or clear hotspares.

    If any slices of the disk are used by hotspare pools, they can be replaced by other available slices. For example:


    # metahs -r all c0t11d0s1 c0t8d0s1

  7. Unmount any file systems on the disk.


    Note -

    If the file system(s) are on a disk that is failing or has failed, the umount operation may not complete. A large number of error messages may be displayed in the system console and in the /var directory during the umount operation. If the umount operation does not complete, you may have to restart the system.


    For each file system, type:


    # umount file_system
    

    For example:


    # umount /export/home
    # umount /export/home1
    

    Refer to the Solstice DiskSuite documentation for more information.

  8. Use the ssaadm remove_device command to take the device offline:


    # ssaadm remove_device logical_device_name
    ssaadm: warning: can't quiesce "/dev/rdsk/c0t11d0s2": I/O error
    Bus is ready for the removal of device
    Remove device and reconfigure bus as needed
    Press RETURN when ready to continue

    Here, logical_device_name is the full logical device name for the drive to be removed (/dev/rdsk/c0t11d0s2, for example). You must specify slice 2, which represents the entire disk. Note that this command also accepts a physical device name as an alternative.

    You can safely ignore the warning message since the Enterprise 250 SCSI bus does not require quiescing.

  9. Remove the disk drive from its slot.

    Refer to the Sun Enterprise 250 Server Owner's Guide for drive removal instructions.

  10. Press Return to complete the hot-plug operation.

The ssaadm command deletes the symbolic links for the device in the /dev/dsk and /dev/rdsk hierarchies.
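
Steps 3 and 4 of the DiskSuite procedure can be combined into a small dry-run helper. The following shell sketch is an illustration only. The function name replica_delete_cmds is hypothetical, and the sketch assumes that each replica line printed by metadb ends with the block device name of the slice (for example, /dev/dsk/c0t11d0s0); check the actual output on your system before relying on it. The function prints the metadb -d commands that would be needed but does not run them.

```shell
# replica_delete_cmds DISK (hypothetical helper)
# Read `metadb` output on standard input and print one "metadb -d <slice>"
# line for each slice of DISK that holds at least one database replica.
# Assumption: the last field of each replica line is /dev/dsk/<slice>.
replica_delete_cmds() {
    awk -v disk="$1" '
        index($NF, "/dev/dsk/" disk "s") == 1 {
            slice = substr($NF, length("/dev/dsk/") + 1)
            if (!(slice in seen)) {      # report each slice only once
                seen[slice] = 1
                print "metadb -d " slice
            }
        }'
}
```

Typical use would be metadb | replica_delete_cmds c0t11d0, after which you can review the printed commands and type them yourself.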