Oracle Solaris 11.1 Administration: ZFS File Systems

Resolving a Missing Device

If a device cannot be opened, it is displayed in the UNAVAIL state in the zpool status output. This state means that ZFS was unable to open the device when the pool was first accessed, or that the device has since become unavailable. If the device causes a top-level virtual device to be unavailable, then nothing in the pool can be accessed. Otherwise, the fault tolerance of the pool might be compromised. In either case, reattaching the device to the system restores normal operations. If you need to replace a device that is UNAVAIL because it has failed, see Replacing a Device in a ZFS Storage Pool.

If a device is UNAVAIL in a root pool or a mirrored root pool, see Managing ZFS Root Pool Components and Archiving Snapshots and Root Pool Recovery.

For example, you might see a message similar to the following from fmd after a device failure:

SUNW-MSG-ID: ZFS-8000-QJ, TYPE: Fault, VER: 1, SEVERITY: Minor
EVENT-TIME: Wed Jun 20 13:09:55 MDT 2012
PLATFORM: ORCL,SPARC-T3-4, CSN: 1120BDRCCD, HOSTNAME: tardis
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: e13312e0-be0a-439b-d7d3-cddaefe717b0
DESC: Outstanding dtls on ZFS device 'id1,sd@n5000c500335dc60f/a' in pool 'pond'.
AUTO-RESPONSE: No automated response will occur.
IMPACT: None at this time.
REC-ACTION: Use 'fmadm faulty' to provide a more detailed view of this event. 
Run 'zpool status -lx' for more information. Please refer to the associated 
reference document at http://support.oracle.com/msg/ZFS-8000-QJ for the latest 
service procedures and policies regarding this diagnosis.

To view more detailed information about the device problem and the resolution, use the zpool status -v command. For example:

# zpool status -v
  pool: pond
 state: DEGRADED
status: One or more devices are unavailable in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or 'fmadm repaired', or replace the device
        with 'zpool replace'.
  scan: scrub repaired 0 in 0h0m with 0 errors on Wed Jun 20 13:16:09 2012
config:

        NAME                       STATE     READ WRITE CKSUM
        pond                       DEGRADED     0     0     0
          mirror-0                 ONLINE       0     0     0
            c0t5000C500335F95E3d0  ONLINE       0     0     0
            c0t5000C500335F907Fd0  ONLINE       0     0     0
          mirror-1                 DEGRADED     0     0     0
            c0t5000C500335BD117d0  ONLINE       0     0     0
            c0t5000C500335DC60Fd0  UNAVAIL      0     0     0

device details:

        c0t5000C500335DC60Fd0    UNAVAIL          cannot open
        status: ZFS detected errors on this device.
                The device was missing.
           see: http://support.oracle.com/msg/ZFS-8000-LR for recovery

You can see from this output that the c0t5000C500335DC60Fd0 device is not functioning. If you determine that the device is faulty, replace it.
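
When many pools or devices are involved, the same check can be scripted. The following is a minimal sketch, not part of the product: unavail_devices and sample_status are hypothetical helper names, and the script assumes the column layout of the config table shown above. It reads zpool status output on standard input and prints each entry whose STATE column is UNAVAIL.

```shell
#!/bin/sh
# Hypothetical helper (not a ZFS command): read `zpool status` output on
# stdin and print every entry in the config table whose STATE is UNAVAIL.
# Assumes the table starts at the "NAME STATE ..." header row and ends at
# the first blank line, as in the listing above.
unavail_devices() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        in_config && NF == 0          { in_config = 0 }
        in_config && $2 == "UNAVAIL"  { print $1 }
    '
}

# Sample input abridged from the listing above, so the sketch runs
# without a real pool.
sample_status() {
    cat <<'EOF'
  pool: pond
 state: DEGRADED
config:

        NAME                       STATE     READ WRITE CKSUM
        pond                       DEGRADED     0     0     0
          mirror-1                 DEGRADED     0     0     0
            c0t5000C500335BD117d0  ONLINE       0     0     0
            c0t5000C500335DC60Fd0  UNAVAIL      0     0     0

device details:

        c0t5000C500335DC60Fd0    UNAVAIL          cannot open
EOF
}

sample_status | unavail_devices
```

On a live system, the equivalent call would be zpool status pond | unavail_devices. The device details section is skipped because it follows the blank line that ends the config table. If the pool itself is UNAVAIL, its name is printed as well.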

If necessary, use the zpool online command to bring the replaced device online, as described in Notifying ZFS of Device Availability.

Let FMA know that the device has been replaced if the output of the fmadm faulty command identifies the device error. For example:

# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Jun 20 13:15:41 3745f745-371c-c2d3-d940-93acbb881bd8  ZFS-8000-LR    Major    

Problem Status    : solved
Diag Engine       : zfs-diagnosis / 1.0
System
    Manufacturer  : unknown
    Name          : ORCL,SPARC-T3-4
    Part_Number   : unknown
    Serial_Number : 1120BDRCCD
    Host_ID       : 84a02d28

----------------------------------------
Suspect 1 of 1 :
   Fault class : fault.fs.zfs.open_failed
   Certainty   : 100%
   Affects     : zfs://pool=86124fa573cad84e/vdev=25d36cd46e0a7f49/pool_name=pond/vdev_
name=id1,sd@n5000c500335dc60f/a
   Status      : faulted and taken out of service

   FRU
     Name             : "zfs://pool=86124fa573cad84e/vdev=25d36cd46e0a7f49/pool_name=pond/vdev_
name=id1,sd@n5000c500335dc60f/a"
        Status        : faulty

Description : ZFS device 'id1,sd@n5000c500335dc60f/a' in pool 'pond' failed to
              open.

Response    : An attempt will be made to activate a hot spare if available.

Impact      : Fault tolerance of the pool may be compromised.

Action      : Use 'fmadm faulty' to provide a more detailed view of this event.
              Run 'zpool status -lx' for more information. Please refer to the
              associated reference document at
              http://support.oracle.com/msg/ZFS-8000-LR for the latest service
              procedures and policies regarding this diagnosis.

Extract the string in the Affects: section of the fmadm faulty output and include it with the following command to let FMA know that the device is replaced:

# fmadm repaired zfs://pool=86124fa573cad84e/vdev=25d36cd46e0a7f49/pool_name=pond/vdev_name=id1,sd@n5000c500335dc60f/a
fmadm: recorded repair to of zfs://pool=86124fa573cad84e/vdev=25d36cd46e0a7f49/pool_name=pond/vdev_name=id1,sd@n5000c500335dc60f/a
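
Copying the Affects string by hand is error-prone because the value can wrap onto a second line, as it does in the fmadm faulty listing above. The following is a minimal sketch of one way to extract it. The helper names affects_fmri and sample_faulty are hypothetical, and the script assumes the "Label : value" layout shown in that listing. If several suspects are reported, only the last Affects value is printed.

```shell
#!/bin/sh
# Hypothetical helper: read `fmadm faulty` output on stdin and print the
# value of the Affects field as a single string, rejoining a value that
# has wrapped onto the following line.
affects_fmri() {
    awk '
        /^ *Affects *:/ {
            sub(/^ *Affects *: */, "")   # drop the label, keep the value
            sub(/ *$/, "")               # drop trailing blanks
            fmri = $0
            grab = 1                     # next line may be a continuation
            next
        }
        grab {
            # A continuation line carries no "Label : value" separator.
            if ($0 !~ / : /) fmri = fmri $0
            grab = 0
        }
        END { if (fmri != "") print fmri }
    '
}

# Sample input copied from the fmadm faulty listing above.
sample_faulty() {
    cat <<'EOF'
   Fault class : fault.fs.zfs.open_failed
   Certainty   : 100%
   Affects     : zfs://pool=86124fa573cad84e/vdev=25d36cd46e0a7f49/pool_name=pond/vdev_
name=id1,sd@n5000c500335dc60f/a
   Status      : faulted and taken out of service
EOF
}

sample_faulty | affects_fmri

# On a live system (assumption: exactly one suspect is listed):
#   fmadm repaired "$(fmadm faulty | affects_fmri)"
```

Quoting the command substitution keeps the string intact as a single argument to fmadm repaired.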

As a last step, confirm that the pool with the replaced device is healthy. For example:

# zpool status -x pond
pool 'pond' is healthy

Physically Reattaching a Device

Exactly how a missing device is reattached depends on the device in question. If the device is a network-attached drive, connectivity to the network should be restored. If the device is a USB device or other removable media, it should be reattached to the system. If the device is a local disk, a controller might have failed such that the device is no longer visible to the system. In this case, the controller should be replaced, at which point the disks will again be available. Other problems can exist and depend on the type of hardware and its configuration. If a drive fails and it is no longer visible to the system, the device should be treated as a damaged device. Follow the procedures in Replacing or Repairing a Damaged Device.

A pool might be SUSPENDED if device connectivity is compromised. A SUSPENDED pool remains in the wait state until the device issue is resolved. For example:

# zpool status cybermen
  pool: cybermen
 state: SUSPENDED
status: One or more devices are unavailable in response to IO failures.
        The pool is suspended.
action: Make sure the affected devices are connected, then run 'zpool clear' or
        'fmadm repaired'.
        Run 'zpool status -v' to see device specific details.
   see: http://support.oracle.com/msg/ZFS-8000-HC
  scan: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        cybermen       UNAVAIL      0    16     0
            c8t3d0     UNAVAIL      0     0     0
            c8t1d0     UNAVAIL      0     0     0

After device connectivity is restored, clear the pool or device errors.

# zpool clear cybermen
# fmadm repaired zfs://pool=name/vdev=guid
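
A monitoring script can detect the SUSPENDED condition by reading the state: field of the zpool status output. The following is a minimal sketch under that assumption. The helper names pool_state and sample_status are hypothetical, and the sample input is abridged from the listing above.

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status` output on stdin and print the
# value of the first "state:" field.
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# Sample input abridged from the listing above.
sample_status() {
    cat <<'EOF'
  pool: cybermen
 state: SUSPENDED
status: One or more devices are unavailable in response to IO failures.
        The pool is suspended.
EOF
}

state=$(sample_status | pool_state)
if [ "$state" = "SUSPENDED" ]; then
    echo "pool is SUSPENDED: restore device connectivity, then run zpool clear"
else
    echo "pool state: $state"
fi
```

On a live system, the input would come from zpool status cybermen instead of the sample function.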

Notifying ZFS of Device Availability

After a device is reattached to the system, ZFS might or might not automatically detect its availability. If the pool was previously UNAVAIL or SUSPENDED, or the system was rebooted as part of the attach procedure, then ZFS automatically rescans all devices when it tries to open the pool. If the pool was degraded and the device was replaced while the system was running, you must notify ZFS that the device is now available and ready to be reopened by using the zpool online command. For example:

# zpool online tank c0t1d0

For more information about bringing devices online, see Bringing a Device Online.
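
The rule described in this section can be summarized as a small decision helper. This is an illustrative sketch only: needs_online is a hypothetical function name, its two arguments (the earlier pool state and whether the system was rebooted) are assumptions made for the example, and any state not covered by the text above is treated as not requiring action.

```shell
#!/bin/sh
# Hypothetical helper: needs_online STATE REBOOTED
#   STATE    - pool state before the device was reattached
#   REBOOTED - "yes" if the system was rebooted during the attach procedure
# Returns 0 if `zpool online` must be run manually, 1 otherwise.
needs_online() {
    state=$1
    rebooted=$2
    # After a reboot, ZFS rescans all devices when it opens the pool.
    if [ "$rebooted" = "yes" ]; then
        return 1
    fi
    case $state in
        UNAVAIL|SUSPENDED) return 1 ;;  # devices are rescanned when the pool is opened
        DEGRADED)          return 0 ;;  # device replaced while running: notify ZFS
        *)                 return 1 ;;  # assumption: other states need no action
    esac
}

if needs_online DEGRADED no; then
    echo "run: zpool online tank c0t1d0"
fi
```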