Resolving a Missing or Removed Device
If a device cannot be opened, it displays the UNAVAIL state in the zpool status output. This state means that ZFS was unable to open the device when the pool was first accessed, or the device has since become unavailable. If the device causes a top-level virtual device to be unavailable, then nothing in the pool can be accessed. Otherwise, the fault tolerance of the pool might be compromised. In either case, the device just needs to be reattached to the system to restore normal operations. If you need to replace a device that is UNAVAIL because it has failed, see Replacing a Device in a ZFS Storage Pool.
If a device is UNAVAIL in a root pool or a mirrored root pool, see the following references:
- Mirrored root pool disk failed – Booting From an Alternate Root Pool Disk
- Replacing a disk in a root pool – How to Replace a Disk in a ZFS Root Pool
- Full root pool disaster recovery – Using Unified Archives for System Recovery and Cloning in Oracle Solaris 11.4
For example, you might see a message similar to the following from fmd after a device failure:
SUNW-MSG-ID: ZFS-8000-QJ, TYPE: Fault, VER: 1, SEVERITY: Minor
EVENT-TIME: Wed Jun 20 13:09:55 MDT 2012
...
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: e13312e0-be0a-439b-d7d3-cddaefe717b0
DESC: Outstanding dtls on ZFS device 'id1,sd@n5000c500335dc60f/a' in pool 'pond'.
AUTO-RESPONSE: No automated response will occur.
IMPACT: None at this time.
REC-ACTION: Use 'fmadm faulty' to provide a more detailed view of this event. Run 'zpool status -lx' for more information. Please refer to the associated reference document at http://support.oracle.com/msg/ZFS-8000-QJ for the latest service procedures and policies regarding this diagnosis.
To view more detailed information about the device problem and the resolution, use the zpool status -v command. For example:
$ zpool status -v
  pool: pond
 state: DEGRADED
status: One or more devices are unavailable in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or 'fmadm repaired', or replace the device
        with 'zpool replace'.
  scan: scrub repaired 0 in 0h0m with 0 errors on Wed Jun 20 13:16:09 2012
config:

        NAME                       STATE     READ WRITE CKSUM
        pond                       DEGRADED     0     0     0
          mirror-0                 ONLINE       0     0     0
            c0t5000C500335F95E3d0  ONLINE       0     0     0
            c0t5000C500335F907Fd0  ONLINE       0     0     0
          mirror-1                 DEGRADED     0     0     0
            c0t5000C500335BD117d0  ONLINE       0     0     0
            c0t5000C500335DC60Fd0  UNAVAIL      0     0     0

device details:

        c0t5000C500335DC60Fd0    UNAVAIL   cannot open
        status: ZFS detected errors on this device.
                The device was missing.
           see: http://support.oracle.com/msg/ZFS-8000-LR for recovery
You can see from this output that the c0t5000C500335DC60Fd0 device is not functioning. If you determine that the device is faulty, replace it. If necessary, use the zpool online command to bring the replaced device online. For example:
$ zpool online pond c0t5000C500335DC60Fd0
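When a pool has many devices, the unavailable ones can also be picked out of the zpool status output with a small filter. The following is a minimal sketch rather than part of the ZFS tooling: list_unavail is a hypothetical helper name, and the filter assumes the config table layout shown in the earlier example (a NAME/STATE header row, with the table ended by a blank line or by the "device details:" heading).

```shell
# list_unavail (hypothetical helper): read `zpool status` output on stdin and
# print the NAME column of every config-table row whose STATE is UNAVAIL.
# Assumption: the table starts at the "NAME STATE ..." header row and ends at
# a blank line or at the "device details:" heading.
list_unavail() {
    awk '
        # Header row of the config table: start scanning.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # A blank line or the device details heading ends the table.
        NF == 0 || ($1 == "device" && $2 == "details:") { in_config = 0; next }
        # Inside the table, report rows in the UNAVAIL state. A row can be a
        # disk, a mirror, or the pool itself.
        in_config && $2 == "UNAVAIL" { print $1 }
    '
}

# Example use on a live system (not run here):
#   zpool status pond | list_unavail
```

Stopping at the "device details:" heading matters because that section repeats the device name with the UNAVAIL state, which would otherwise be reported twice.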
If the output of the fmadm faulty command identifies the device error, let FMA know that the device has been replaced. For example:
$ fmadm faulty
--------------- ------------------------------------ -------------- ---------
TIME EVENT-ID MSG-ID SEVERITY
--------------- ------------------------------------ -------------- ---------
Jun 20 13:15:41 3745f745-371c-c2d3-d940-93acbb881bd8 ZFS-8000-LR Major
Problem Status : solved
Diag Engine : zfs-diagnosis / 1.0
System
Manufacturer : unknown
Name : ORCL,SPARC-T3-4
Part_Number : unknown
Serial_Number : 1120BDRCCD
Host_ID : 84a02d28
----------------------------------------
Suspect 1 of 1 :
Fault class : fault.fs.zfs.open_failed
Certainty : 100%
Affects : zfs://pool=86124fa573cad84e/
vdev=25d36cd46e0a7f49/pool_name=pond/
vdev_name=id1,sd@n5000c500335dc60f/a
Status : faulted and taken out of service
FRU
Name : "zfs://pool=86124fa573cad84e/
vdev=25d36cd46e0a7f49/pool_name=pond/
vdev_name=id1,sd@n5000c500335dc60f/a"
Status : faulty
Description : ZFS device 'id1,sd@n5000c500335dc60f/a'
in pool 'pond' failed to open.
Response : An attempt will be made to activate a hot spare if available.
Impact : Fault tolerance of the pool may be compromised.
Action : Use 'fmadm faulty' to provide a more detailed view of this event.
Run 'zpool status -lx' for more information. Please refer to the
associated reference document at
http://support.oracle.com/msg/ZFS-8000-LR for the latest service
procedures and policies regarding this diagnosis.
Extract the string in the Affects: section of the fmadm faulty output and include it with the following command to let FMA know that the device is replaced:
$ fmadm repaired zfs://pool=86124fa573cad84e/\
vdev=25d36cd46e0a7f49/pool_name=pond/\
vdev_name=id1,sd@n5000c500335dc60f/a
fmadm: recorded repair to of zfs://pool=86124fa573cad84e/
vdev=25d36cd46e0a7f49/pool_name=pond/
vdev_name=id1,sd@n5000c500335dc60f/a
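Because fmadm faulty wraps the Affects: value across several lines, copying it by hand is error-prone. The following is a minimal sketch of one way to rejoin it into a single string. affects_fmri is a hypothetical helper name, not an fmadm feature, and the filter assumes the layout shown in the example output above: an "Affects :" label followed by continuation lines that each hold a single whitespace-free fragment.

```shell
# affects_fmri (hypothetical helper): read `fmadm faulty` output on stdin and
# print each "Affects" value as one line, with its wrapped fragments rejoined.
# Assumption: continuation lines contain exactly one field, and the value ends
# at the next line that has more than one field (for example "Status : ...").
affects_fmri() {
    awk '
        # Start of an Affects value: remember the first fragment.
        $1 == "Affects" && $2 == ":" {
            if (collecting) print fmri
            fmri = $3; collecting = 1; next
        }
        # Continuation line: append the fragment without a separator.
        collecting && NF == 1 { fmri = fmri $1; next }
        # Any other line ends the value.
        collecting { print fmri; collecting = 0 }
        END { if (collecting) print fmri }
    '
}

# Example use on a live system (not run here). Review the printed string
# before passing it to fmadm repaired:
#   fmadm faulty | affects_fmri
```

If more than one fault is outstanding, the helper prints one line per fault, so check which line belongs to the replaced device before running fmadm repaired.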
As a last step, confirm that the pool with the replaced device is healthy. For example:
$ zpool status -x pond
pool 'pond' is healthy
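The same confirmation can be scripted, for example at the end of a replacement procedure. The following is a minimal sketch under stated assumptions: pool_is_healthy is a hypothetical helper name, and it relies on zpool status -x printing exactly the one-line "pool '<name>' is healthy" message shown above when nothing is wrong. The pool name pond in the usage comment is taken from the earlier example.

```shell
# pool_is_healthy (hypothetical helper): succeed only if the text on stdin is
# exactly the one-line healthy message for the pool named in $1.
# Assumption: `zpool status -x <pool>` prints "pool '<pool>' is healthy" and
# nothing else for a healthy pool.
pool_is_healthy() {
    status_text=$(cat)    # command substitution drops the trailing newline
    [ "$status_text" = "pool '$1' is healthy" ]
}

# Example use on a live system (not run here):
#   if zpool status -x pond | pool_is_healthy pond; then
#       echo "pond: healthy"
#   else
#       echo "pond: needs attention" >&2
#   fi
```

Comparing the whole output, rather than searching for the word "healthy", means that any multi-line status report is treated as unhealthy.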