Viewing Resilvering Status

The process of replacing a device can take an extended period of time, depending on the size of the device and the amount of data in the pool. The process of moving data from one device to another device is known as resilvering and can be monitored by using the zpool status command.

The following zpool status resilver status messages are provided:

  • Resilver in-progress report. For example:

    scan: resilver in progress since Mon Jun  7 09:17:27 2010
    13.3G scanned
    13.3G resilvered at 18.5M/s, 82.34% done, 0h2m to go
  • Resilver completion message. For example:

    resilvered 16.2G in 0h16m with 0 errors on Mon Jun  7 09:34:21 2010

Resilver completion messages persist across system reboots.

Traditional file systems resilver data at the block level. Because ZFS eliminates the artificial layering of the volume manager, it can perform resilvering in a much more powerful and controlled manner. The two main advantages of this feature are as follows:

  • ZFS only resilvers the minimum amount of necessary data. In the case of a short outage (as opposed to a complete device replacement), the entire disk can be resilvered in a matter of minutes or seconds. When an entire disk is replaced, the resilvering process takes time proportional to the amount of data used on disk. Replacing a 500GB disk can take seconds if a pool has only a few gigabytes of used disk space.

  • If the system loses power or is rebooted, the resilvering process resumes exactly where it left off, without any need for manual intervention.

To view the resilvering process, use the zpool status command. For example:

$ zpool status system1
pool: system1
state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Mon Jun  7 10:49:20 2010
54.6M scanned54.5M resilvered at 5.46M/s, 24.64% done, 0h0m to go

config:

NAME             STATE     READ WRITE CKSUM
system1          ONLINE       0     0     0
   mirror-0      ONLINE       0     0     0
   replacing-0   ONLINE       0     0     0
      c1t0d0     ONLINE       0     0     0
      c2t0d0     ONLINE       0     0     0  (resilvering)
      c1t1d0     ONLINE       0     0     0

In this example, the disk c1t0d0 is being replaced by c2t0d0. This event is observed in the status output by the presence of the replacing virtual device in the configuration. This device is not real, nor is it possible for you to create a pool by using it. The purpose of this device is solely to display the resilvering progress and to identify which device is being replaced.

Note that any pool currently undergoing resilvering is placed in the ONLINE or DEGRADED state because the pool cannot provide the desired level of redundancy until the resilvering process is completed. Resilvering proceeds as fast as possible, though the I/O is always scheduled with a lower priority than user-requested I/O, to minimize impact on the system. After the resilvering is completed, the configuration reverts to the new, complete, configuration. For example:

$ zpool status system1
pool: system1
state: ONLINE
scrub: resilver completed after 0h1m with 0 errors on Tue Feb  2 13:54:30 2010
config:

NAME          STATE     READ WRITE CKSUM
system1       ONLINE       0     0     0
   mirror-0   ONLINE       0     0     0
      c2t0d0  ONLINE       0     0     0  377M resilvered
      c1t1d0  ONLINE       0     0     0

errors: No known data errors

The pool is once again ONLINE, and the original failed disk (c1t0d0) has been removed from the configuration.