Oracle Solaris ZFS Administration Guide     Oracle Solaris 10 1/13 Information Library

Resolving General Hardware Problems

Review the following sections to determine whether pool problems or file system unavailability is related to a hardware problem, such as a faulty system board, memory, device, or HBA, or to a misconfiguration.

For example, a failing or faulty disk on a busy ZFS pool can greatly degrade overall system performance.

Start by diagnosing and identifying hardware problems, which can be easier to detect. If all your hardware checks out, you can then move on to diagnosing pool and file system problems as described in the rest of this chapter. If your hardware, pool, and file system configurations are healthy, consider diagnosing application problems, which are generally more complex to unravel and are not covered in this guide.

Identifying Hardware and Device Faults

The Solaris Fault Manager tracks software, hardware, and specific device problems. It identifies error telemetry information that indicates a specific symptom in an error log, and then reports a fault diagnosis when the error symptom results in an actual fault.

The following command identifies any software-related or hardware-related faults.

# fmadm faulty

Use the above command routinely to identify failed services or devices.
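One way to make this check routine is to wrap it in a small helper that can be run from a scheduled job. The following is a minimal sketch, not part of the product: the function name report_faults is hypothetical, and the sketch assumes that fmadm faulty prints nothing when no faults are outstanding. The function reads the command output on standard input so that it can be tried against saved output.

```shell
# Hypothetical helper: report whether `fmadm faulty` style output lists faults.
# Assumption: the command prints nothing when no faults are outstanding.
report_faults() {
    output=$(cat)                      # read everything from standard input
    if [ -n "$output" ]; then
        printf 'Faults reported:\n%s\n' "$output"
        return 1                       # non-zero status signals that faults exist
    fi
    echo "No faults reported."
}
```

For example, fmadm faulty | report_faults prints the fault listing and returns a non-zero status if anything is reported.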

Use the following command routinely to identify hardware-related or device-related errors.

# fmdump -eV | more

Error messages in this log file that describe vdev.open_failed, checksum, or io_failure issues need your attention or they might evolve into actual faults that are displayed with the fmadm faulty command.

If the above output indicates that a device is failing, then this is a good time to make sure you have a replacement device available.

You can also track additional device errors by using the iostat command. Use the following syntax to display a summary of error statistics.
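Because the verbose error log can be long, it can help to summarize it by error class before reading individual reports. The following is a rough sketch under stated assumptions: the function name count_ereport_classes is hypothetical, and the sketch assumes that each report in the fmdump -eV output contains a line of the form "class = ereport....". It reads the log text on standard input.

```shell
# Hypothetical helper: count error reports per class in `fmdump -eV` style output.
# Assumption: each report contains a line such as
#         class = ereport.fs.zfs.checksum
count_ereport_classes() {
    awk '$1 == "class" && $2 == "=" { count[$3]++ }
         END { for (c in count) print count[c], c }' |
        sort -k1,1nr -k2               # most frequent class first, then by name
}
```

For example, fmdump -eV | count_ereport_classes prints one line per class with its count, which makes repeated vdev.open_failed, checksum, or io_failure reports easy to spot.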

# iostat -en
  ---- errors --- 
  s/w h/w trn tot device
    0   0   0   0 c0t5000C500335F95E3d0
    0   0   0   0 c0t5000C500335FC3E7d0
    0   0   0   0 c0t5000C500335BA8C3d0
    0  12   0  12 c2t0d0
    0   0   0   0 c0t5000C500335E106Bd0
    0   0   0   0 c0t50015179594B6F11d0
    0   0   0   0 c0t5000C500335DC60Fd0
    0   0   0   0 c0t5000C500335F907Fd0
    0   0   0   0 c0t5000C500335BD117d0

In the above output, errors are reported on device c2t0d0. Use the following syntax to display more detailed device errors.

# iostat -En
c0t5000C500335F95E3d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: SEAGATE  Product: ST930003SSUN300G Revision: 0B70 Serial No: 110672QFSB 
Size: 300.00GB <300000000000 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 Predictive Failure Analysis: 0 
c0t5000C500335FC3E7d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: SEAGATE  Product: ST930003SSUN300G Revision: 0B70 Serial No: 110672TE67 
Size: 300.00GB <300000000000 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 Predictive Failure Analysis: 0 
c0t5000C500335BA8C3d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: SEAGATE  Product: ST930003SSUN300G Revision: 0B70 Serial No: 110672SDF4 
Size: 300.00GB <300000000000 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 Predictive Failure Analysis: 0 
c2t0d0           Soft Errors: 0 Hard Errors: 12 Transport Errors: 0 
Vendor: AMI      Product: Virtual CDROM    Revision: 1.00 Serial No:  
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 12 No Device: 0 Recoverable: 0 
Illegal Request: 2 Predictive Failure Analysis: 0 
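On systems with many devices, the iostat -en summary shown earlier can be filtered so that only devices with a non-zero error total are listed. The following is a minimal sketch: the function name devices_with_errors is hypothetical, and the sketch assumes the five-column layout shown in the summary output (s/w, h/w, trn, tot, device). It reads the summary on standard input.

```shell
# Hypothetical helper: list devices with a non-zero total error count.
# Assumption: input uses the `iostat -en` layout "s/w h/w trn tot device".
devices_with_errors() {
    # Skip the two header lines by requiring numeric first and fourth fields,
    # then print the device name and its total error count.
    awk '$1 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $4 + 0 > 0 { print $5, $4 }'
}
```

For example, iostat -en | devices_with_errors run against the summary shown earlier would list only c2t0d0 with its total of 12 errors. The detailed iostat -En output for that device can then be reviewed to see which error counters are involved.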

System Reporting of ZFS Error Messages

In addition to persistently tracking errors within the pool, ZFS also displays syslog messages when events of interest occur. The following scenarios generate notification events:

If ZFS detects a device error and automatically recovers from it, no notification occurs. Such errors do not constitute a failure in the pool redundancy or in data integrity. Moreover, such errors are typically the result of a driver problem accompanied by its own set of error messages.