Identifying Problems With ZFS Storage Pools
You can use the following features to identify problems with your ZFS configuration:
- Detailed ZFS storage pool information can be displayed by using the zpool status command.
- Pool and device failures are reported through ZFS/FMA diagnostic messages.
- Previous ZFS commands that modified pool state information can be displayed by using the zpool history command.
- A ZFS storage pool that is accidentally destroyed can be recovered by using the zpool import -D command, but it is important to recover the pool quickly so that the devices are not reused or accidentally overwritten. For more information, see Recovering Destroyed ZFS Storage Pools. No similar feature exists to recover ZFS file systems or data. Always have good backups.
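The commands in this list can be collected in a small shell helper. The following is a minimal sketch and not part of ZFS: the function names collect_pool_diagnostics and list_destroyed_pools, and the ZPOOL override variable, are hypothetical and are introduced here only for illustration.

```shell
# ZPOOL is an assumed override so that the sketch can be tried against
# a stand-in command; by default the real zpool binary is used.
ZPOOL=${ZPOOL:-zpool}

# Print the status and the command history of one pool.
collect_pool_diagnostics() {
    pool=$1
    echo "== status: $pool =="
    "$ZPOOL" status "$pool"
    echo "== history: $pool =="
    "$ZPOOL" history "$pool"
}

# List destroyed pools that can still be recovered. Without a pool
# name, zpool import -D only lists the pools and imports nothing.
list_destroyed_pools() {
    "$ZPOOL" import -D
}
```

For example, collect_pool_diagnostics system1 prints the status and history of a pool named system1.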
Most ZFS troubleshooting involves the zpool status command. This command analyzes the various failures in a system and identifies the most severe problem, presenting you with a suggested action and a link to a knowledge article for more information. Note that the command only identifies a single problem with a pool, though multiple problems can exist. For example, data corruption errors generally imply that one of the devices has failed, but replacing the failed device might not resolve all of the data corruption problems.
In addition, a ZFS diagnostic engine diagnoses and reports pool failures and device failures. Checksum, I/O, device, and pool errors associated with these failures are also reported. ZFS failures as reported by fmd are displayed on the console and written to the system messages file. In most cases, the fmd message directs you to the zpool status command for further recovery instructions.
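When the messages file is long, it can help to reduce it to the distinct ZFS message IDs before reading individual entries. The following sketch assumes that fmd entries carry a field of the form SUNW-MSG-ID: ZFS-..., and the function name zfs_fma_message_ids is hypothetical.

```shell
# Read log text on standard input and print each distinct ZFS message
# ID once. Assumes the ID follows the literal text "SUNW-MSG-ID: ".
zfs_fma_message_ids() {
    sed -n 's/.*SUNW-MSG-ID: \(ZFS-[0-9A-Za-z-]*\).*/\1/p' | sort -u
}

# Typical use (reading the file might require elevated privileges):
#   zfs_fma_message_ids < /var/adm/messages
```

Lines that do not contain a ZFS message ID are ignored, so the function can be run over the whole file.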
The basic recovery process is as follows:
- If appropriate, use the zpool history command to identify the ZFS commands that preceded the error scenario. For example:

    $ zpool history system1
    History for 'system1':
    2012-11-12.13:01:31 zpool create system1 mirror c0t1d0 c0t2d0 c0t3d0
    2012-11-12.13:28:10 zfs create system1/glori
    2012-11-12.13:37:48 zfs set checksum=off system1/glori

  In this output, note that checksums are disabled for the system1/glori file system. This configuration is not recommended.
- Identify the errors through the fmd messages that are displayed on the system console or in the /var/adm/messages file.
- Find further repair instructions by using the zpool status -x command.
- Repair the failures, which involves the following steps:
  - Replacing the unavailable or missing device and bringing it online.
  - Restoring the faulted configuration or corrupted data from a backup.
  - Verifying the recovery by using the zpool status -x command.
  - Backing up your restored configuration, if applicable.
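Two parts of the recovery process lend themselves to simple scripted checks: scanning the pool history for commands that disabled checksums, and confirming a repair with zpool status -x. The following is a minimal sketch with hypothetical function names. It assumes that zpool status -x, run without a pool name, prints the single line "all pools are healthy" when no pool has a problem.

```shell
# Read zpool history output on standard input and print only the
# entries that turned checksums off for a file system.
history_checksum_off() {
    grep 'zfs set checksum=off'
}

# Read zpool status -x output on standard input and succeed only if
# it is exactly the healthy-system message.
all_pools_healthy() {
    [ "$(cat)" = "all pools are healthy" ]
}

# Typical use:
#   zpool history system1 | history_checksum_off
#   zpool status -x | all_pools_healthy && echo "recovery verified"
```

Both functions read text from standard input rather than calling zpool themselves, so they can also be applied to saved command output.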
This section describes how to interpret zpool status output to diagnose the types of failures that can occur. Although most of the work is performed automatically by the command, it is important to understand exactly what problems are being identified in order to diagnose the failure. Subsequent sections describe how to repair the various problems that you might encounter.
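As a first step in reading the output, the pool name and overall state can be pulled out of each pool's report. The following sketch assumes that zpool status labels these fields pool: and state: at the start of a line. The function name pool_states is hypothetical.

```shell
# Read zpool status output on standard input and print one line per
# pool in the form "<pool> <state>". Device rows in the config section
# are skipped because their first field is not "state:".
pool_states() {
    awk '$1 == "pool:"  { pool = $2 }
         $1 == "state:" { print pool, $2 }'
}

# Typical use:
#   zpool status | pool_states
```

This reports only the overall state of each pool. The status, action, and config fields still have to be read to identify the specific problem.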