Because a trans metadevice is a "layered" metadevice, consisting of a master device and logging device, and because the logging device can be shared among file systems, repairing an errored trans metadevice requires special recovery tasks.
Any device errors or file system panics must be dealt with using the command line utilities.
If a file system detects any internal inconsistencies while it is in use, it will panic the system. If the file system is set up for UFS logging, it notifies the trans metadevice that it needs to be checked at reboot. The trans metadevice transitions itself to the "Hard Error" state. All other trans metadevices sharing the same logging device also go into the "Hard Error" state.
At reboot, fsck checks and repairs the file system and transitions the file system back to the "Okay" state. fsck does this for all trans metadevices listed in the /etc/vfstab file for the affected logging device.
Device errors can cause data loss. Read errors occurring on a logging device can cause significant data loss. For this reason, it is strongly recommended that you mirror the logging device.
If a device error occurs on either the master device or the logging device while the trans metadevice is processing logged data, the device transitions from the "Okay" state to the "Hard Error" state. If the device is either in the "Hard Error" or "Error" state, either a device error has occurred, or a file system panic has occurred.
Any devices sharing the errored logging device also go into the "Error" state.
For file systems that fsck cannot repair, run fsck on each trans metadevice whose file systems share the affected logging device.
# fsck /dev/md/rdsk/trans
Only after all of the affected trans metadevices have been checked and successfully repaired will fsck reset the state of the errored trans metadevice to "Okay."
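Because every trans metadevice that shares the logging device has to be checked, it can help to loop over them. The following is a minimal sketch, not part of DiskSuite: the function name fsck_shared_trans and the DRY_RUN variable are hypothetical, and the metadevice names you pass in (d5 and d6 below) are assumptions that must match your own configuration.

```shell
#!/bin/sh
# Hypothetical helper: run fsck on each trans metadevice named on the
# command line. The caller supplies every trans metadevice that shares
# the errored logging device.
# Set DRY_RUN=1 to print the commands instead of running them.
fsck_shared_trans() {
    for dev in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            echo "fsck /dev/md/rdsk/$dev"
        else
            # Keep going on failure: every device must still be checked.
            fsck "/dev/md/rdsk/$dev" ||
                echo "fsck returned a nonzero status for $dev" >&2
        fi
    done
}

# Example (assumed device names):
#   DRY_RUN=1 fsck_shared_trans d5 d6
```

The loop deliberately continues after a failed check, since the state is reset only once all of the affected trans metadevices have been repaired.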
Use this procedure to transition a trans metadevice to the "Okay" state.
Refer to "How to Check the Status of Metadevices and Hot Spare Pools (Command Line)" to check the status of a trans metadevice.
If either the master or log devices encounter errors while processing logged data, the device transitions from the "Okay" state to the "Hard Error" state. If the device is in the "Hard Error" or "Error" state, either a device error or file system panic occurred. Recovery from both scenarios is the same.
If a log (logging device) is shared, a failure in any of the slices in a trans metadevice will result in all slices or metadevices associated with the trans metadevice switching to an errored state.
The high-level steps in this procedure are:
Unmounting the affected file system(s)
Backing up any accessible data
Fixing the device error
Repairing the file system (fsck(1M) or newfs(1M))
After checking the prerequisites ("Prerequisites for Maintaining DiskSuite Objects") and the preliminary information ("Repairing Trans Metadevice Problems"), run the lockfs(1M) command to determine which file systems are locked.
# lockfs
Affected file systems will be listed with a lock type of hard. Every file system sharing the same logging device will be hard locked.
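If many file systems are mounted, you can filter the lockfs output for the hard-locked entries. This sketch assumes that lockfs prints one header line followed by rows whose first column is the file system and whose second column is the lock type; the function name hard_locked is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: read lockfs output on stdin and print the
# file systems whose lock type is "hard".
# Assumes a header line, then "Filesystem Locktype Comment" columns.
hard_locked() {
    awk 'NR > 1 && $2 == "hard" { print $1 }'
}

# Example:
#   lockfs | hard_locked
```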
Unmount the affected file system(s).
You can unmount locked file systems even if they were in use when the error occurred. If the affected processes try to access an opened file or directory on the hard locked or unmounted file system, an EIO error is returned.
[Optional] Back up any accessible data.
Before attempting to fix the device error, you may want to recover as much data as possible. If your backup procedure requires a mounted file system (such as tar or cpio), you can mount the file system read-only. If your backup procedure does not require a mounted file system (such as dump or volcopy), you can access the trans metadevice directly.
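The two backup paths might look like the following sketch. The helper names (run, backup_mounted, backup_raw), the DRY_RUN variable, and the example arguments are all hypothetical; it assumes a UFS file system and uses ufsdump as the dump utility. Setting DRY_RUN=1 prints each command instead of running it, so you can review the commands first.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# backup_mounted TRANS MOUNTPOINT ARCHIVE
# Mount the trans metadevice read-only, then archive it with tar.
backup_mounted() {
    run mount -F ufs -o ro "/dev/md/dsk/$1" "$2" &&
        run tar cf "$3" -C "$2" .
}

# backup_raw TRANS ARCHIVE
# Level 0 dump taken directly from the raw trans metadevice (no mount).
backup_raw() {
    run ufsdump 0f "$2" "/dev/md/rdsk/$1"
}

# Examples (assumed names):
#   DRY_RUN=1 backup_mounted d5 /fs1 /dev/rmt/0
#   DRY_RUN=1 backup_raw d5 /dev/rmt/0
```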
Fix the device error.
At this point, any attempt to open or mount the trans metadevice for read/write access starts rolling all accessible data on the logging device to the appropriate master device(s). Any data that cannot be read or written is discarded. However, if you open or mount the trans metadevice for read-only access, the log is simply rescanned and not rolled forward to the master device(s), and the error is not fixed. In other words, all of the data on the master and logging devices remains unchanged until the first read/write open or mount.
Run fsck(1M) to repair the file system, or run newfs(1M) if you plan to restore the data from backup.
Run fsck on all of the trans metadevices sharing the same logging device. When all of these trans metadevices have been repaired by fsck, they then revert to the "Okay" state.
The newfs(1M) command will also transition the file system back to the "Okay" state, but will destroy all of the data on the file system. newfs(1M) is generally used when you plan to restore file systems from backup.
The fsck(1M) or newfs(1M) commands must be run on all of the trans metadevices sharing the same logging device before these devices revert back to the "Okay" state.
Run the metastat(1M) command to verify that the state of the affected devices has reverted to "Okay."
# metastat d5
d5: Trans
    State: Hard Error
    Size: 10080 blocks
    Master Device: d4
    Logging Device: c0t0d0s6
d4: Mirror
    State: Okay
...
c0t0d0s6: Logging device for d5
    State: Hard Error
    Size: 5350 blocks
...
# fsck /dev/md/rdsk/d5
** /dev/md/rdsk/d5
** Last Mounted on /fs1
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
WARNING: md: logging device: /dev/dsk/c0t0d0s6 changed state to Okay
4 files, 11 used, 4452 free (20 frags, 554 blocks, 0.4% fragmentation)
# metastat d5
d5: Trans
    State: Okay
    Size: 10080 blocks
    Master Device: d4
    Logging Device: c0t0d0s6
d4: Mirror
    State: Okay
...
c0t0d0s6: Logging device for d5
    State: Okay
...
This example fixes a trans metadevice, d5, whose logging device is in the "Hard Error" state. You must run fsck on the trans metadevice itself, which transitions its state to "Okay." The final metastat command confirms that the state is "Okay."
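To check the result without reading the full metastat listing, you can filter it for any state other than "Okay." This is a sketch under stated assumptions: the function name not_okay is hypothetical, and it assumes the usual multi-line metastat layout, in which each device header (for example "d5: Trans") starts in the first column and its "State:" line follows on an indented line.

```shell
#!/bin/sh
# Hypothetical filter: read metastat output on stdin and print
# "device: state" for every State: line that is not "Okay".
# Exits nonzero if any such line is found.
not_okay() {
    awk '
        $1 == "State:" {
            state = $0
            sub(/^[ \t]*State:[ \t]*/, "", state)
            sub(/[ \t]+$/, "", state)
            if (state != "Okay") { print dev ": " state; bad = 1 }
            next
        }
        # A device header starts in column one and ends its name with ":".
        /^[^ \t]+:/ { dev = $1; sub(/:$/, "", dev) }
        END { exit bad + 0 }
    '
}

# Example:
#   metastat d5 | not_okay && echo "all states Okay"
```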