Hot spares are activated in the following ways:
Manual replacement – Replace a failed device in a storage pool with a hot spare by using the zpool replace command.
Automatic replacement – When a fault is received, an FMA agent examines the pool to see if it has any available hot spares. If so, it replaces the faulted device with an available spare.
If a hot spare that is currently in use fails, the agent detaches the spare and thereby cancels the replacement. The agent then attempts to replace the device with another hot spare, if one is available. This feature is currently limited by the fact that the ZFS diagnosis engine only emits faults when a device disappears from the system.
If you physically replace a failed device with an active spare, you can reactivate the original device by using the zpool detach command to detach the spare. If you set the autoreplace pool property to on, the spare is automatically detached back to the spare pool when the new device is inserted and the online operation completes.
Manually replace a device with a hot spare by using the zpool replace command. See Example 4–7.
A faulted device is automatically replaced if a hot spare is available. For example:
# zpool status -x pool: zeepool state: DEGRADED status: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state. action: Attach the missing device and online it using 'zpool online'. see: http://www.sun.com/msg/ZFS-8000-2Q scrub: resilver completed after 0h0m with 0 errors on Mon Jan 11 10:20:35 2010 config: NAME STATE READ WRITE CKSUM zeepool DEGRADED 0 0 0 mirror-0 DEGRADED 0 0 0 c1t2d0 ONLINE 0 0 0 spare-1 DEGRADED 0 0 0 c2t1d0 UNAVAIL 0 0 0 cannot open c2t3d0 ONLINE 0 0 0 88.5K resilvered spares c2t3d0 INUSE currently in use errors: No known data errors |
Currently, three ways to deactivate hot spares are available:
Removing the hot spare from the storage pool
Detaching a hot spare after a failed disk is physically replaced. See Example 4–8.
Temporarily or permanently swapping in the hot spare. See Example 4–9.
In this example, the zpool replace command is used to replace disk c2t1d0 with the spare disk c2t3d0.
# zpool replace zeepool c2t1d0 c2t3d0 # zpool status zeepool pool: zeepool state: ONLINE scrub: resilver completed after 0h0m with 0 errors on Wed Oct 21 12:47:29 2009 config: NAME STATE READ WRITE CKSUM zeepool ONLINE 0 0 0 mirror ONLINE 0 0 0 c1t2d0 ONLINE 0 0 0 spare ONLINE 0 0 0 c2t1d0 ONLINE 0 0 0 c2t3d0 ONLINE 0 0 0 76.5K resilvered spares c2t3d0 INUSE currently in use errors: No known data errors |
After the faulted device is replaced, use the zpool detach command to return the hot spare back to the spare set. For example:
# zpool detach zeepool c2t3d0 # zpool status zeepool pool: zeepool state: ONLINE scrub: resilver completed with 0 errors on Fri Aug 28 14:21:02 2009 config: NAME STATE READ WRITE CKSUM zeepool ONLINE 0 0 0 mirror ONLINE 0 0 0 c1t2d0 ONLINE 0 0 0 c2t1d0 ONLINE 0 0 0 spares c2t3d0 AVAIL errors: No known data errors |
If you want to replace a failed disk with a hot spare that is currently replacing it, then detach the original (failed) disk. If the failed disk is eventually replaced, then you can add it back to the storage pool as a spare. For example:
# zpool status zeepool pool: zeepool state: DEGRADED status: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state. action: Attach the missing device and online it using 'zpool online'. see: http://www.sun.com/msg/ZFS-8000-2Q scrub: resilver in progress for 0h0m, 70.47% done, 0h0m to go config: NAME STATE READ WRITE CKSUM zeepool DEGRADED 0 0 0 mirror-0 DEGRADED 0 0 0 c1t2d0 ONLINE 0 0 0 spare-1 DEGRADED 0 0 0 c2t1d0 UNAVAIL 0 0 0 cannot open c2t3d0 ONLINE 0 0 0 51.5M resilvered spares c2t3d0 INUSE currently in use errors: No known data errors # zpool detach zeepool c2t1d0 # zpool status zeepool pool: zeepool state: ONLINE scrub: resilver completed after 0h0m with 0 errors on Tue Oct 20 11:58:43 2009 config: NAME STATE READ WRITE CKSUM zeepool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 c1t2d0 ONLINE 0 0 0 c2t3d0 ONLINE 0 0 0 70.5M resilvered errors: No known data errors (Original failed disk c2t1d0 is physically replaced) # zpool add zeepool spare c2t1d0 # zpool status zeepool pool: zeepool state: ONLINE scrub: resilver completed after 0h0m with 0 errors on Tue Oct 20 11:58:43 2009 config: NAME STATE READ WRITE CKSUM zeepool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 c1t2d0 ONLINE 0 0 0 c2t3d0 ONLINE 0 0 0 70.5M resilvered spares c2t1d0 AVAIL errors: No known data errors |