Solstice DiskSuite 4.2.1 User's Guide

How to Recover a Trans Metadevice With Hard Errors (Command Line)

Use this procedure to transition a trans metadevice to the "Okay" state.

Refer to "How to Check the Status of Metadevices and Hot Spare Pools (Command Line)" to check the status of a trans metadevice.

If either the master device or the logging device encounters an error while processing logged data, the trans metadevice transitions from the "Okay" state to the "Hard Error" state. If the device is in the "Hard Error" or "Error" state, either a device error or a file system panic has occurred. Recovery from both scenarios is the same.


Note -

If a log (logging device) is shared, a failure in any slice of one trans metadevice causes every slice or metadevice associated with that trans metadevice, including the other trans metadevices that use the same log, to switch to an error state.
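
To see which trans metadevices are affected by a shared log, the metastat(1M) output can be filtered by logging device. The following is a minimal sketch, not part of DiskSuite: the function name trans_on_log is hypothetical, and it assumes that metastat prints each trans metadevice as a "dN: Trans" header followed by an indented "Logging Device:" line, as in the example later in this procedure.

```shell
# trans_on_log (hypothetical helper): read metastat output on stdin and
# print the name of every trans metadevice whose "Logging Device:" line
# names the device given as $1 (for example, c0t0d0s6).
# Assumes the "dN: Trans" / "    Logging Device: ..." layout shown in
# this guide; adjust the patterns if your output differs.
trans_on_log() {
    awk -v log_dev="$1" '
        # An unindented line with a colon starts a new device section.
        /^[A-Za-z0-9].*:/ { dev = $1; sub(/:$/, "", dev); kind = $2 }
        # Inside a Trans section, match the logging device by name.
        kind == "Trans" && $1 == "Logging" && $2 == "Device:" && $3 == log_dev {
            print dev
        }
    '
}

# Typical use on a live system (requires an awk that accepts -v,
# such as nawk on older Solaris releases):
#   metastat | trans_on_log c0t0d0s6
```

Every device that this lists must be repaired before any of them returns to the "Okay" state.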


The high-level steps in this procedure are:

  1. After checking the prerequisites ("Prerequisites for Maintaining DiskSuite Objects") and the preliminary information ("Repairing Trans Metadevice Problems"), run the lockfs(1M) command to determine which file systems are locked.


    # lockfs
    

    Affected file systems will be listed with a lock type of hard. Every file system sharing the same logging device will be hard locked.

  2. Unmount the affected file system(s).

    You can unmount locked file systems even if they were in use when the error occurred. If an affected process then tries to access an open file or directory on the hard-locked or unmounted file system, an EIO error is returned.

  3. [Optional] Back up any accessible data.

    Before attempting to fix the device error, you may want to recover as much data as possible. If your backup procedure requires a mounted file system (such as tar or cpio), you can mount the file system read-only. If your backup procedure does not require a mounted file system (such as dump or volcopy), you can access the trans metadevice directly.

  4. Fix the device error.

    At this point, any attempt to open or mount the trans metadevice for read/write access starts rolling all accessible data on the logging device to the appropriate master device(s). Any data that cannot be read or written is discarded. However, if you open or mount the trans metadevice for read-only access, the log is simply rescanned and not rolled forward to the master device(s), and the error is not fixed. In other words, all of the data on the master and logging devices remains unchanged until the first read/write open or mount.

  5. Run fsck(1M) to repair the file system, or newfs(1M) if you need to restore data.

    Run fsck on every trans metadevice that shares the logging device. The trans metadevices revert to the "Okay" state only after fsck(1M) or newfs(1M) has been run on all of them.

    The newfs(1M) command also returns the file system to the "Okay" state, but it destroys all of the data on the file system. newfs(1M) is generally used when you plan to restore file systems from backup.

  6. Run the metastat(1M) command to verify that the state of the affected devices has reverted to "Okay."
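
Step 1 above can be sketched as a small filter over the lockfs(1M) output. This is an illustration only: the function name hard_locked_filesystems is hypothetical, and the filter assumes that lockfs prints one file system per line with the mount point in the first column and the lock type in the second.

```shell
# hard_locked_filesystems (hypothetical helper): read lockfs output on
# stdin and print the mount point of every file system whose lock type
# is "hard".
# Assumes a "Filesystem  Locktype  Comment" column layout; the header
# line is skipped naturally because its second field is not "hard".
hard_locked_filesystems() {
    awk '$2 == "hard" { print $1 }'
}

# Typical use on a live system:
#   lockfs | hard_locked_filesystems
```

Each mount point that is printed is a candidate for the unmount in step 2. Review the list before acting on it, because every file system that shares the failed logging device appears in it.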

Example -- Logging Device Error


# metastat d5
d5: Trans
    State: Hard Error  
    Size: 10080 blocks
    Master Device: d4
    Logging Device: c0t0d0s6
 
d4: Mirror
    State: Okay
...
c0t0d0s6: Logging device for d5
    State: Hard Error
    Size: 5350 blocks
...
# fsck /dev/md/rdsk/d5
** /dev/md/rdsk/d5
** Last Mounted on /fs1
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
WARNING: md: logging device: /dev/dsk/c0t0d0s6 changed state to Okay
4 files, 11 used, 4452 free (20 frags, 554 blocks, 0.4% fragmentation)
# metastat d5
d5: Trans
    State: Okay
    Size: 10080 blocks
    Master Device: d4
    Logging Device: c0t0d0s6
 
d4: Mirror
    State: Okay
...
 
c0t0d0s6: Logging device for d5
    State: Okay
...

This example fixes a trans metadevice, d5, whose logging device is in the "Hard Error" state. You must run fsck on the trans metadevice itself, not on the logging device. Doing so returns both the trans metadevice and its logging device to the "Okay" state, which the final metastat command confirms.
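
The final verification can also be done by filtering the metastat output instead of reading it by eye. The sketch below is not a DiskSuite command: the function name not_okay_devices is hypothetical, and it assumes the layout shown in the example above, where each device starts on an unindented "name:" line and reports its condition on an indented "State:" line.

```shell
# not_okay_devices (hypothetical helper): read metastat output on stdin
# and print "device: state" for every State: line that is not "Okay".
# Assumes the layout shown in this guide's example output.
not_okay_devices() {
    awk '
        # An unindented line with a colon starts a new device section,
        # for example "d5: Trans" or "c0t0d0s6: Logging device for d5".
        /^[A-Za-z0-9].*:/ { dev = $1; sub(/:$/, "", dev) }
        $1 == "State:" {
            # Rebuild the state from the remaining fields so that
            # trailing blanks and multi-word states are handled.
            state = $2
            for (i = 3; i <= NF; i++) state = state " " $i
            if (state != "Okay") print dev ": " state
        }
    '
}

# Typical use on a live system:
#   metastat d5 | not_okay_devices
```

When the filter prints nothing, every State: line it examined reads "Okay". Fed the first metastat output in the example, it reports d5 and c0t0d0s6 as "Hard Error".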