The process of replacing a device can take an extended period of time, depending on the size of the device and the amount of data in the pool. The process of moving data from one device to another device is known as resilvering and can be monitored by using the zpool status command.
Traditional file systems resilver data at the block level. Because ZFS eliminates the artificial layering of the volume manager, it can perform resilvering in a much more powerful and controlled manner. The two main advantages of this feature are as follows:
ZFS only resilvers the minimum amount of necessary data. In the case of a short outage (as opposed to a complete device replacement), the entire disk can be resilvered in a matter of minutes or seconds. When an entire disk is replaced, the resilvering process takes time proportional to the amount of data used on disk. Replacing a 500-GB disk can take seconds if a pool has only a few gigabytes of used disk space.
Resilvering is interruptible and safe. If the system loses power or is rebooted, the resilvering process resumes exactly where it left off, without any need for manual intervention.
To view the resilvering process, use the zpool status command. For example:
# zpool status tank pool: tank state: DEGRADED status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state. action: Wait for the resilver to complete. scrub: resilver in progress for 0h0m, 22.60% done, 0h1m to go config: NAME STATE READ WRITE CKSUM tank DEGRADED 0 0 0 mirror-0 DEGRADED 0 0 0 replacing-0 DEGRADED 0 0 0 c1t0d0 UNAVAIL 0 0 0 cannot open c2t0d0 ONLINE 0 0 0 85.0M resilvered c1t1d0 ONLINE 0 0 0 errors: No known data errors |
In this example, the disk c1t0d0 is being replaced by c2t0d0. This event is observed in the status output by the presence of the replacing virtual device in the configuration. This device is not real, nor is it possible for you to create a pool by using it. The purpose of this device is solely to display the resilvering progress and to identify which device is being replaced.
Note that any pool currently undergoing resilvering is placed in the ONLINE or DEGRADED state because the pool cannot provide the desired level of redundancy until the resilvering process is completed. Resilvering proceeds as fast as possible, though the I/O is always scheduled with a lower priority than user-requested I/O, to minimize impact on the system. After the resilvering is completed, the configuration reverts to the new, complete, configuration. For example:
# zpool status tank pool: tank state: ONLINE scrub: resilver completed after 0h1m with 0 errors on Tue Feb 2 13:54:30 2010 config: NAME STATE READ WRITE CKSUM tank ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 c2t0d0 ONLINE 0 0 0 377M resilvered c1t1d0 ONLINE 0 0 0 errors: No known data errors |
The pool is once again ONLINE, and the original failed disk (c1t0d0) has been removed from the configuration.