
Managing ZFS File Systems in Oracle® Solaris 11.4


Updated: July 2019

Working With Hot Spares in Storage Pools

The hot spares feature enables you to identify disks that could be used to replace a failed or faulted device in a storage pool. A hot spare device is inactive in a pool until the spare replaces the failed device.

Designating Hot Spares in a Storage Pool

Devices can be designated as hot spares in the following ways:

  • When the pool is created.

    $ zpool create pool keyword devices spare devices
  • After the pool is created.

    $ zpool add pool spare devices
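For example, using hypothetical disk names (c1t1d0 through c1t4d0), a mirrored pool could be created with a hot spare, or a spare could be designated afterward, as follows:

```shell
# Create a mirrored pool with one hot spare at creation time
# (pool and disk names here are hypothetical).
$ zpool create mypool mirror c1t1d0 c1t2d0 spare c1t3d0

# Designate an additional spare after the pool exists.
$ zpool add mypool spare c1t4d0

# Verify: the spares are listed in a "spares" section with the AVAIL state.
$ zpool status mypool
```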

To remove a hot spare, use the following command:

$ zpool remove pool spare-device

Note -  You cannot remove a hot spare that is currently being used by the pool.

A hot spare device should be equal to or larger than the largest disk in the pool. A smaller device can still be designated as a hot spare. However, when that device is activated to replace a failed device, the operation fails with the following error message:

cannot replace disk3 with disk4: device is too small

Do not share a spare across multiple pools or multiple systems even if the device is visible for access by these systems. You can configure a disk to be shared among several pools provided that only a single system must control all of these pools. However, this practice is risky. For example, if pool A that is using the shared spare is exported, pool B could unknowingly use the spare while pool A is exported. When pool A is imported, data corruption could occur because both pools are using the same disk.

Activating and Deactivating Hot Spares in Your Storage Pool

You activate hot spares in the following ways:

  • Manual replacement – Run the zpool replace command to replace a failed device. When a new device is inserted to replace the failed disk, you activate the new device by detaching the spare.

  • Automatic replacement – An FMA agent detects a fault, determines spare availability, and automatically replaces the faulted device. A hot spare also replaces a device in the UNAVAIL state.

    If you set the autoreplace pool property to on, the spare is automatically detached and returned to the spare pool when the new device is inserted and the online operation completes.
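As a sketch, assuming a pool named system1, the autoreplace behavior could be enabled and confirmed as follows:

```shell
# Enable automatic detachment of the spare after the replacement
# device is inserted and brought online.
$ zpool set autoreplace=on system1

# Confirm the property value.
$ zpool get autoreplace system1
```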

To deactivate a hot spare, use one of the approaches shown in the following examples:

Example 14  Detaching a Hot Spare After the Failed Disk Is Replaced

This example assumes the following configuration:

  • In system1's mirror-1 configuration, disk c0t5000C500335BA8C3d0 has failed. The following partial output shows the status of mirror-1:

    $ zpool status system1
    .
      mirror-1                 DEGRADED     0     0     0
        c0t5000C500335BD117d0  ONLINE       0     0     0
        c0t5000C500335BA8C3d0  UNAVAIL      0     0     0   Failed disk
    
  • The pool's spare c0t5000C500335E106Bd0 is automatically activated to replace the failed disk.

  • You physically replace the failed disk with a new device c0t5000C500335DC60Fd0.

The example begins with reconfiguring the pool with the new device. First, you run zpool replace to inform ZFS about the removed device. Then, if necessary, you run zpool detach to deactivate the spare and return it to the spare pool. The example ends with displaying the status of the new configuration and performing the appropriate FMA steps for faulted devices, as shown in Step 6 of How to Replace a Device in a Storage Pool.

$ zpool replace system1 c0t5000C500335BA8C3d0
$ zpool detach system1 c0t5000C500335E106Bd0
$ zpool status system1
.
.
  mirror-1                 ONLINE       0     0     0
    c0t5000C500335BD117d0  ONLINE       0     0     0
    c0t5000C500335DC60Fd0  ONLINE       0     0     0   Replacement device
  spares
    c0t5000C500335E106Bd0    AVAIL                   Deactivated spare

$ fmadm faulty
$ fmadm repaired zfs://pool=name/vdev=guid
Example 15  Detaching a Failed Disk and Using the Hot Spare

Instead of using a new replacement device, you can use the spare device as a permanent replacement. In this case, you simply detach the failed disk. If the failed disk is subsequently repaired, you can add it to the pool as a newly designated spare.

This example uses the same assumptions as Example 14, Detaching a Hot Spare After the Failed Disk Is Replaced.

  • The mirror-1 configuration of the pool system1 is in a degraded state.

    $ zpool status system1
    .
      mirror-1                 DEGRADED     0     0     0
        c0t5000C500335BD117d0  ONLINE       0     0     0
        c0t5000C500335BA8C3d0  UNAVAIL      0     0     0   Failed disk
    
  • The pool's spare c0t5000C500335E106Bd0 is automatically activated to replace the failed disk.

The example begins with detaching the failed disk that has been replaced by the spare.

$ zpool detach system1 c0t5000C500335BA8C3d0
$ zpool status system1
.
.
  mirror-1                 ONLINE       0     0     0
    c0t5000C500335BD117d0  ONLINE       0     0     0
    c0t5000C500335E106Bd0  ONLINE       0     0     0   Spare replaces failed disk

errors: No known data errors

Subsequently, you add the repaired disk back to the pool as the spare device. You complete the procedure by performing the appropriate FMA steps for faulted devices.

$ zpool add system1 spare c0t5000C500335BA8C3d0
$ zpool status system1
.
.
  mirror-1                 ONLINE       0     0     0
    c0t5000C500335BD117d0  ONLINE       0     0     0
    c0t5000C500335E106Bd0  ONLINE       0     0     0   Former spare
  spares
    c0t5000C500335BA8C3d0    AVAIL                   Repaired disk as spare

errors: No known data errors

$ fmadm faulty
$ fmadm repaired zfs://pool=name/vdev=guid