The hot spares feature enables you to identify disks that could be used to replace a failed or faulted device in a storage pool. A hot spare device is inactive in a pool until the spare replaces the failed device.
Devices can be designated as hot spares in the following ways:
When the pool is created.
# zpool create pool keyword devices spare devices
After the pool is created.
# zpool add pool spare devices
To remove a hot spare, use the following command:
# zpool remove pool spare-device
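As a concrete illustration of the three commands above, the following sketch uses a hypothetical pool named tank and hypothetical disks c1t0d0 through c1t3d0; none of these names come from this guide. The zp wrapper and the DRY_RUN variable are also assumptions made for illustration. By default the wrapper only prints each command, so the sketch can be run without touching a real pool.

```shell
#!/bin/sh
# Hypothetical names: pool "tank", disks c1t0d0..c1t3d0.
# zp is an illustrative wrapper, not part of ZFS: with DRY_RUN=1 (the default)
# it prints the command instead of running it.
zp() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "zpool $*"
    else
        zpool "$@"
    fi
}

# Designate a spare when the pool is created.
zp create tank mirror c1t0d0 c1t1d0 spare c1t2d0
# Designate a second spare after the pool is created.
zp add tank spare c1t3d0
# Remove the second spare again.
zp remove tank c1t3d0
```

Setting DRY_RUN=0 would run the real zpool commands, which should be done only on a test system with disks you can afford to overwrite.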
A hot spare device must be at least as large as the largest disk in the pool. You can still designate a smaller device as a hot spare. However, when that device is activated to replace a failed device that is larger than the spare, the operation fails with the following error message:
cannot replace disk3 with disk4: device is too small
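The size rule above can be checked before a device is designated as a spare. The following is a minimal sketch, assuming you already know the sizes in bytes. The spare_fits helper and the sample sizes are hypothetical, and how you obtain the byte counts on your system is not shown.

```shell
#!/bin/sh
# spare_fits SPARE_BYTES DISK_BYTES...
# Hypothetical helper: succeeds (exit status 0) if the spare is at least as
# large as the largest of the listed disks. All sizes are plain byte counts.
spare_fits() {
    spare=$1
    shift
    largest=0
    for size in "$@"; do
        if [ "$size" -gt "$largest" ]; then
            largest=$size
        fi
    done
    [ "$spare" -ge "$largest" ]
}

# The sizes below are made-up values for illustration.
if spare_fits 500107862016 500107862016 400088457216; then
    echo "spare is large enough"
else
    echo "spare is too small"
fi
```

The helper compares against the largest disk rather than each disk in turn because a spare might be called on to replace any device in the pool.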
Do not share a spare across multiple pools or multiple systems, even if the device is visible to all of those systems. You can configure a disk to be shared among several pools only if a single system controls all of these pools. However, this practice is risky. For example, if pool A that is using the shared spare is exported, pool B could unknowingly use the spare while pool A is exported. When pool A is imported, data corruption could occur because both pools are using the same disk.
You activate hot spares in the following ways:
Manual replacement – Run the zpool replace command to replace a failed device with a hot spare. When a new device is later inserted to replace the failed disk, you activate the new device by detaching the spare.
Automatic replacement – An FMA agent detects a fault, determines spare availability, and automatically replaces the faulted device. A hot spare also replaces a device in the UNAVAIL state.
If you set the autoreplace pool property to on, the spare is automatically detached and returned to the spare pool when the new device is inserted and the online operation completes.
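The manual replacement and the autoreplace setting described above might look as follows. This is a sketch only: it borrows the pool and device names from the examples later in this section, and the zp wrapper with its DRY_RUN variable is an assumption made for illustration so that the commands are printed rather than run by default.

```shell
#!/bin/sh
# zp is an illustrative dry-run wrapper, not part of ZFS: with DRY_RUN=1
# (the default) it prints the command instead of running it.
zp() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "zpool $*"
    else
        zpool "$@"
    fi
}

pool=system1
failed=c0t5000C500335BA8C3d0
spare=c0t5000C500335E106Bd0

# Manual replacement: put the spare in place of the failed device.
zp replace "$pool" "$failed" "$spare"

# Have the spare detached automatically once a new device is inserted.
zp set autoreplace=on "$pool"
# Confirm the current value of the property.
zp get autoreplace "$pool"
```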
To deactivate a hot spare, perform one of the following actions:
Remove the hot spare from the storage pool.
Detach the hot spare after physically replacing a failed disk. See Example 13, Detaching a Hot Spare After the Failed Disk Is Replaced.
Swap in another hot spare either temporarily or permanently. See Example 14, Detaching a Failed Disk and Using the Hot Spare.
Example 13 Detaching a Hot Spare After the Failed Disk Is Replaced
This example assumes the following configuration:
In system1's mirror-1 configuration, disk c0t5000C500335BA8C3d0 has failed. The following partial output shows the status of mirror-1:
# zpool status system1
.
mirror-1 DEGRADED 0 0 0
c0t5000C500335BD117d0 ONLINE 0 0 0
c0t5000C500335BA8C3d0 UNAVAIL 0 0 0  Failed disk
The pool's spare c0t5000C500335E106Bd0 is automatically activated to replace the failed disk.
You physically replace the failed disk with a new device c0t5000C500335DC60Fd0.
The example begins with reconfiguring the pool with the new device. First, you run zpool replace to inform ZFS about the removed device. Then, if necessary, you run zpool detach to deactivate the spare and return it to the spare pool. The example ends with displaying the status of the new configuration and performing the appropriate FMA steps for faulted devices, as shown in Step 6 of How to Replace a Device in a Storage Pool.
# zpool replace system1 c0t5000C500335BA8C3d0
# zpool detach system1 c0t5000C500335E106Bd0
# zpool status system1
.
.
mirror-1 ONLINE 0 0 0
c0t5000C500335BD117d0 ONLINE 0 0 0
c0t5000C500335DC60Fd0 ONLINE 0 0 0  Replacement device
spares
c0t5000C500335E106Bd0 AVAIL  Deactivated spare
# fmadm faulty
# fmadm repaired zfs://pool=name/vdev=guid
Example 14 Detaching a Failed Disk and Using the Hot Spare
Instead of installing a new replacement device, you can use the spare device as the permanent replacement. In this case, you simply detach the failed disk. If the failed disk is subsequently repaired, you can add it back to the pool as a newly designated spare.
This example uses the same assumptions as Example 13, Detaching a Hot Spare After the Failed Disk Is Replaced.
The mirror-1 configuration of the pool system1 is in a degraded state.
# zpool status system1
.
mirror-1 DEGRADED 0 0 0
c0t5000C500335BD117d0 ONLINE 0 0 0
c0t5000C500335BA8C3d0 UNAVAIL 0 0 0  Failed disk
The pool's spare c0t5000C500335E106Bd0 is automatically activated to replace the failed disk.
The example begins with detaching the failed disk that has been replaced by the spare.
# zpool detach system1 c0t5000C500335BA8C3d0
# zpool status system1
.
.
mirror-1 ONLINE 0 0 0
c0t5000C500335BD117d0 ONLINE 0 0 0
c0t5000C500335E106Bd0 ONLINE 0 0 0  Spare replaces failed disk
errors: No known data errors
Subsequently, you add the repaired disk back to the pool as the spare device. You complete the procedure by performing the appropriate FMA steps for faulted devices.
# zpool add system1 spare c0t5000C500335BA8C3d0
# zpool status system1
.
.
mirror-1 ONLINE 0 0 0
c0t5000C500335BD117d0 ONLINE 0 0 0
c0t5000C500335E106Bd0 ONLINE 0 0 0  Former spare
spares
c0t5000C500335BA8C3d0 AVAIL  Repaired disk as spare
errors: No known data errors
# fmadm faulty
# fmadm repaired zfs://pool=name/vdev=guid
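After either procedure, you might want to confirm which devices are listed as spares and in what state. The following is a minimal sketch, assuming the status text has a spares heading followed by one line per device, as in the output shown above. The list_spares helper is hypothetical, the sample input is abridged from that output, and the exact layout of zpool status output can differ between releases.

```shell
#!/bin/sh
# list_spares: hypothetical helper that reads `zpool status` text on standard
# input and prints "device state" for each entry under the "spares" heading.
list_spares() {
    awk '
        NF == 1 && $1 == "spares"  { in_spares = 1; next }   # section starts
        NF == 0 || $1 == "errors:" { in_spares = 0 }         # section ends
        in_spares                  { print $1, $2 }
    '
}

# Sample input abridged from the status output in Example 14.
list_spares <<'EOF'
        mirror-1                   ONLINE       0     0     0
          c0t5000C500335BD117d0    ONLINE       0     0     0
          c0t5000C500335E106Bd0    ONLINE       0     0     0
        spares
          c0t5000C500335BA8C3d0    AVAIL

errors: No known data errors
EOF
```

For the sample input, the helper prints c0t5000C500335BA8C3d0 AVAIL. On a live system you would instead pipe the output of zpool status system1 into list_spares.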