Solaris 10 What's New

Faulty Device Retirement Feature

Starting with the Solaris 10 10/08 release, the Solaris OS includes a new device retirement mechanism to isolate a device as faulty by the fault management framework (FMA). This feature allows faulty devices to be safely and automatically inactivated to avoid data loss, data corruption, panics, and system down time. The retirement process is done safely, taking into account the stability of the system after the device has been retired.

Critical devices are never retired. If you need to manually replace a retired device, use the fmadm repair command after the device replacement so that system knows that the device is replaced, in addition to the manual replacement steps.

The fmadm repair process is as follows:

For more information, see fmadm(1M).

A general message regarding device retirement is displayed on the console and written to the /var/adm/messages file so that you aware of a retired device. For example:

Aug 9 18:14 starbug genunix: [ID 751201 kern.notice] 
NOTICE: One or more I/O devices have been retired

You can use the prtconf command to identify specific retired devices. For example:

# prtconf
pci, instance #2
        scsi, instance #0
            disk (driver not attached)
            tape (driver not attached)
            sd, instance #3
            sd, instance #0 (retired)
        scsi, instance #1 (retired)
            disk (retired)
            tape (retired)
    pci, instance #3
        network, instance #2 (driver not attached)
        network, instance #3 (driver not attached)
    os-io (driver not attached)
    iscsi, instance #0
    pseudo, instance #0