Make sure you have met the prerequisites ("Prerequisites for Maintaining DiskSuite Objects"). Use the metastat(1M) command to view metadevice or hot spare pool status. Refer to the metastat(1M) man pages for more information.
The following sections explain the command-line output and the actions you can take.
Refer to Table 3-2 for an explanation of DiskSuite's general status keywords.
DiskSuite does not report a state change for a concatenation or a stripe, unless the concatenation or stripe is used as a submirror. Refer to "Stripe and Concatenation Status (DiskSuite Tool)" for more information.
Running metastat(1M) on a mirror displays the state of each submirror, the pass number, the read option, the write option, and the total size of the mirror in blocks. Refer to "How to Change a Mirror's Options (Command Line)" to change a mirror's pass number, read option, or write option.
Here is sample mirror output from metastat.

    # metastat
    d0: Mirror
        Submirror 0: d1
          State: Okay
        Submirror 1: d2
          State: Okay
        Pass: 1
        Read option: roundrobin (default)
        Write option: parallel (default)
        Size: 5600 blocks

    d1: Submirror of d0
        State: Okay
        Size: 5600 blocks
        Stripe 0:
            Device      Start Block  Dbase  State  Hot Spare
            c0t2d0s7              0  No     Okay
    ...

For each submirror in the mirror, metastat shows the state, an "invoke" line if there is an error, the assigned hot spare pool (if any), size in blocks, and information about each slice in the submirror.
Table 3-8 explains submirror states.
Table 3-8 Submirror States (Command Line)
| State | Meaning |
|---|---|
| Okay | The submirror has no errors and is functioning correctly. |
| Resyncing | The submirror is actively being resynced. An error has occurred and been corrected, the submirror has just been brought back online, or a new submirror has been added. |
| Needs Maintenance | A slice (or slices) in the submirror has encountered an I/O error or an open error. All reads and writes to and from this slice in the submirror have been discontinued. |
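To illustrate how the submirror states might be collected by a monitoring script, the following Python sketch reads captured metastat output and maps each submirror to its reported state. It assumes the layout shown in the sample output (a "Submirror N:" line followed by a "State:" line). The names `submirror_states` and `SAMPLE` are made up for this example and are not part of DiskSuite.

```python
import re

# Captured text in the layout of the sample mirror output (assumed format).
SAMPLE = """\
d0: Mirror
    Submirror 0: d1
      State: Okay
    Submirror 1: d2
      State: Okay
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 5600 blocks
"""


def submirror_states(text):
    """Map each submirror name to the state on the State: line that follows it."""
    states = {}
    current = None
    for line in text.splitlines():
        m = re.match(r"\s*Submirror \d+:\s*(\S+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"\s*State:\s*(.+?)\s*$", line)
        if m and current is not None:
            states[current] = m.group(1)
            current = None
    return states


print(submirror_states(SAMPLE))  # {'d1': 'Okay', 'd2': 'Okay'}
```

Any submirror whose value is "Needs Maintenance" is a cue to look at the individual slice states, as described next.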
Additionally, for each stripe in a submirror, metastat shows the "Device" (device name of the slice in the stripe); "Start Block" on which the slice begins; "Dbase" to show if the slice contains a state database replica; "State" of the slice; and "Hot Spare" to show the slice being used to hot spare a failed slice.
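The columns described above can be split apart with a few lines of Python. This is a minimal sketch, not a DiskSuite interface: it assumes whitespace-separated columns, a Dbase value of "Yes" or "No", and slice names of the form cNtNdNsN (or cNdNsN) so that a trailing hot spare name can be told apart from a two-word state such as "Last Erred". The function name `parse_slice_row` is hypothetical.

```python
import re

# Assumed slice naming pattern, used only to recognize a trailing hot spare.
SLICE_NAME = re.compile(r"^c\d+(t\d+)?d\d+s\d+$")


def parse_slice_row(line):
    """Split one slice row into Device, Start Block, Dbase, State, Hot Spare."""
    tokens = line.split()
    device = tokens[0]
    start_block = int(tokens[1])
    dbase = tokens[2] == "Yes"  # assumption: the column reads "Yes" or "No"
    rest = tokens[3:]
    hot_spare = None
    # A state may be two words; a hot spare, if present, is the last token.
    if len(rest) > 1 and SLICE_NAME.match(rest[-1]):
        hot_spare = rest.pop()
    return {
        "device": device,
        "start_block": start_block,
        "dbase": dbase,
        "state": " ".join(rest),
        "hot_spare": hot_spare,
    }


row = parse_slice_row("c0t2d0s7  0  No  Okay")
print(row["device"], row["state"], row["hot_spare"])  # c0t2d0s7 Okay None
```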
The slice state is perhaps the most important information when troubleshooting mirror errors. The submirror state only provides general status information, such as "Okay" or "Needs Maintenance." If the submirror reports a "Needs Maintenance" state, refer to the slice state. The recovery action differs depending on whether the slice is in the "Maintenance" or "Last Erred" state. If you only have slices in the "Maintenance" state, they can be repaired in any order. If you have slices in the "Maintenance" state and a slice in the "Last Erred" state, you must fix the slices in the "Maintenance" state first, then the "Last Erred" slice. Refer to "Overview of Replacing and Enabling Slices in Mirrors and RAID5 Metadevices".
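The ordering rule above ("Maintenance" slices first, then the "Last Erred" slice) can be expressed as a small sort. The sketch below is illustrative only: `repair_order` is a hypothetical helper, and the device names in the example are invented.

```python
def repair_order(slice_states):
    """Return errored slices in repair order: Maintenance first, then Last Erred.

    slice_states maps a slice name to its state string. Slices in any
    other state are left out because they need no repair.
    """
    priority = {"Maintenance": 0, "Last Erred": 1}
    errored = [(dev, state) for dev, state in slice_states.items() if state in priority]
    # sorted() is stable, so slices in the same state keep their input order.
    return [dev for dev, state in sorted(errored, key=lambda pair: priority[pair[1]])]


example = {
    "c0t2d0s7": "Last Erred",
    "c1t2d0s7": "Okay",
    "c2t2d0s7": "Maintenance",
}
print(repair_order(example))  # ['c2t2d0s7', 'c0t2d0s7']
```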
Table 3-9 explains the slice states for submirrors and possible actions to take.
Table 3-9 Submirror Slice States (Command Line)
| State | Meaning | Action |
|---|---|---|
| Okay | The slice has no errors and is functioning correctly. | None. |
| Resyncing | The slice is actively being resynced. An error has occurred and been corrected, the submirror has just been brought back online, or a new submirror has been added. | If desired, monitor the submirror status until the resync is done. |
| Maintenance | The slice has encountered an I/O error or an open error. All reads and writes to and from this slice have been discontinued. | Enable or replace the errored slice. See "How to Enable a Slice in a Submirror (Command Line)", or "How to Replace a Slice in a Submirror (Command Line)". Note: The metastat(1M) command will show an invoke recovery message with the appropriate action to take with the metareplace(1M) command. You can also use the metareplace -e command. |
| Last Erred | The slice has encountered an I/O error or an open error. However, the data is not replicated elsewhere due to another slice failure. I/O is still performed on the slice. If I/O errors result, the mirror I/O will fail. | First, enable or replace slices in the "Maintenance" state. See "How to Enable a Slice in a Submirror (Command Line)", or "How to Replace a Slice in a Submirror (Command Line)". Usually, this error results in some data loss, so validate the mirror after it is fixed. For a file system, use the fsck(1M) command to validate the "metadata" then check the user-data. An application or database must have its own method of validating the metadata. |
Running the metastat(1M) command on a RAID5 metadevice shows the status of the metadevice. Additionally, for each slice in the RAID5 metadevice, metastat shows the "Device" (device name of the slice in the stripe); "Start Block" on which the slice begins; "Dbase" to show if the slice contains a state database replica; "State" of the slice; and "Hot Spare" to show the slice being used to hot spare a failed slice.
Here is sample RAID5 metadevice output from metastat.

    # metastat
    d10: RAID
        State: Okay
        Interlace: 32 blocks
        Size: 10080 blocks
    Original device:
        Size: 10496 blocks
            Device      Start Block  Dbase  State  Hot Spare
            c0t0d0s1            330  No     Okay
            c1t2d0s1            330  No     Okay
            c2t3d0s1            330  No     Okay

Table 3-10 explains RAID5 metadevice states.
Table 3-10 RAID5 States (Command Line)
| State | Meaning |
|---|---|
| Initializing | Slices are in the process of having all disk blocks zeroed. This is necessary due to the nature of RAID5 metadevices with respect to data and parity interlace striping. Once the state changes to "Okay," the initialization process is complete and you are able to open the device. Up to this point, applications receive error messages. |
| Okay | The device is ready for use and is currently free from errors. |
| Maintenance | A single slice has been marked as errored due to I/O or open errors encountered during a read or write operation. |
The slice state is perhaps the most important information when troubleshooting RAID5 metadevice errors. The RAID5 state only provides general status information, such as "Okay" or "Needs Maintenance." If the RAID5 metadevice reports a "Needs Maintenance" state, refer to the slice state. The recovery action differs depending on whether the slice is in the "Maintenance" or "Last Erred" state. If the only errored slice is in the "Maintenance" state, it can be repaired without loss of data. If you have a slice in the "Maintenance" state and a slice in the "Last Erred" state, data has probably been corrupted. You must fix the slice in the "Maintenance" state first, then the "Last Erred" slice. Refer to "Overview of Replacing and Enabling Slices in Mirrors and RAID5 Metadevices".
Table 3-11 explains the slice states for a RAID5 metadevice and possible actions to take.
Table 3-11 RAID5 Slice States (Command Line)
| State | Meaning | Action |
|---|---|---|
| Initializing | Slices are in the process of having all disk blocks zeroed. This is necessary due to the nature of RAID5 metadevices with respect to data and parity interlace striping. | Normally none. If an I/O error occurs during this process, the device goes into the "Maintenance" state. If the initialization fails, the metadevice is in the "Initialization Failed" state and the slice is in the "Maintenance" state. If this happens, clear the metadevice and recreate it. |
| Okay | The device is ready for use and is currently free from errors. | None. Slices may be added or replaced, if necessary. |
| Resyncing | The slice is actively being resynced. An error has occurred and been corrected, a slice has been enabled, or a slice has been added. | If desired, monitor the RAID5 metadevice status until the resync is done. |
| Maintenance | A single slice has been marked as errored due to I/O or open errors encountered during a read or write operation. | Enable or replace the errored slice. See "How to Enable a Slice in a RAID5 Metadevice (Command Line)", or "How to Replace a RAID5 Slice (Command Line)". Note: The metastat(1M) command will show an invoke recovery message with the appropriate action to take with the metareplace(1M) command. |
| Maintenance/Last Erred | Multiple slices have encountered errors. The state of the errored slices is either "Maintenance" or "Last Erred." In this state, no I/O is attempted on the slice that is in the "Maintenance" state, but I/O is attempted to the slice marked "Last Erred" with the outcome being the overall status of the I/O request. | Enable or replace the errored slices. See "How to Enable a Slice in a RAID5 Metadevice (Command Line)", or "How to Replace a RAID5 Slice (Command Line)". Note: The metastat(1M) command will show an invoke recovery message with the appropriate action to take with the metareplace(1M) command, which must be run with the -f flag. This indicates that data might be fabricated due to multiple errored slices. |
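The difference between the "Maintenance" and "Maintenance/Last Erred" rows of Table 3-11 can be summarized in a short Python sketch. It is an illustration under stated assumptions, not DiskSuite behavior: `assess_raid5` is a hypothetical helper, and it treats the presence of any "Last Erred" slice as the multiple-error case in which the table says metareplace must be run with the -f flag.

```python
def assess_raid5(slice_states):
    """Classify a RAID5 metadevice from a mapping of slice name to slice state."""
    states = set(slice_states.values())
    if "Last Erred" in states:
        # Multiple errored slices: the table says -f is required and that
        # data might be fabricated.
        return {"condition": "Maintenance/Last Erred", "force_flag": True, "data_at_risk": True}
    if "Maintenance" in states:
        # A single errored slice can be repaired without loss of data.
        return {"condition": "Maintenance", "force_flag": False, "data_at_risk": False}
    return {"condition": "no errored slices", "force_flag": False, "data_at_risk": False}


single = {"c0t0d0s1": "Okay", "c1t2d0s1": "Maintenance", "c2t3d0s1": "Okay"}
print(assess_raid5(single)["condition"])  # Maintenance
```

In either errored case the slice in the "Maintenance" state is repaired first, as described before Table 3-11.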
Running the metastat(1M) command on a trans metadevice shows the status of the metadevice.
Here is sample trans metadevice output from metastat:

    # metastat
    d20: Trans
        State: Okay
        Size: 102816 blocks
        Master Device: c0t3d0s4
        Logging Device: c0t2d0s3

            Master Device       Start Block  Dbase
            c0t3d0s4                      0  No

    c0t2d0s3: Logging device for d0
        State: Okay
        Size: 5350 blocks

            Logging Device      Start Block  Dbase
            c0t2d0s3                    250  No

The metastat command also shows master and logging devices. For each device, the following information is displayed: the "Device" (device name of the slice or metadevice); "Start Block" on which the device begins; "Dbase" to show if the device contains a state database replica; and for the logging device, the "State."
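As a hedged illustration of reading these fields from a script, the following Python sketch extracts the trans state and the master and logging device names from captured output. It assumes the "Name: value" layout shown in the sample. The function name `trans_devices` is made up for this example.

```python
import re

# Captured text in the layout of the sample trans output (assumed format).
SAMPLE = """\
d20: Trans
    State: Okay
    Size: 102816 blocks
    Master Device: c0t3d0s4
    Logging Device: c0t2d0s3

        Master Device       Start Block  Dbase
        c0t3d0s4                      0  No

c0t2d0s3: Logging device for d0
    State: Okay
    Size: 5350 blocks
"""

FIELD = re.compile(r"\s*(State|Master Device|Logging Device):\s*(\S.*?)\s*$")


def trans_devices(text):
    """Return the first State, Master Device and Logging Device values found."""
    fields = {}
    for line in text.splitlines():
        m = FIELD.match(line)
        # Keep the first occurrence so the logging device's own State line
        # does not overwrite the trans metadevice state.
        if m and m.group(1) not in fields:
            fields[m.group(1)] = m.group(2)
    return fields


print(trans_devices(SAMPLE))
```

The column header line ("Master Device  Start Block  Dbase") is skipped because it has no colon after the field name.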
Table 3-12 explains trans metadevice states and possible actions to take.
Table 3-12 Trans Metadevice States (Command Line)
| State | Meaning | Action |
|---|---|---|
| Okay | The device is functioning properly. If mounted, the file system is logging and will not be checked at boot. | None. |
| Attaching | The logging device will be attached to the trans metadevice when the trans is closed or unmounted. When this occurs, the device transitions to the Okay state. | Refer to the metattach(1M) man page. |
| Detached | The trans metadevice does not have a logging device. All benefits from UFS logging are disabled. | fsck(1M) automatically checks the device at boot time. Refer to the metadetach(1M) man page. |
| Detaching | The logging device will be detached from the trans metadevice when the trans is closed or unmounted. When this occurs, the device transitions to the Detached state. | Refer to the metadetach(1M) man page. |
| Hard Error | A device error or file system panic has occurred while the device was in use. An I/O error is returned for every read or write until the device is closed or unmounted. The first open causes the device to transition to the Error state. | Fix the trans metadevice. See "How to Recover a Trans Metadevice With a File System Panic (Command Line)", or "How to Recover a Trans Metadevice With Hard Errors (Command Line)". |
| Error | The device can be read and written. The file system can be mounted read-only. However, an I/O error is returned for every read or write that actually gets a device error. The device does not transition back to the Hard Error state, even when a later device error or file system panic occurs. | Fix the trans metadevice. See "How to Recover a Trans Metadevice With a File System Panic (Command Line)", or "How to Recover a Trans Metadevice With Hard Errors (Command Line)". Successfully completing fsck(1M) or newfs(1M) transitions the device into the Okay state. When the device is in the Hard Error or Error state, fsck automatically checks and repairs the file system at boot time. newfs destroys whatever data may be on the device. |
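One practical consequence of Table 3-12 is whether fsck(1M) checks the file system at boot. The Python sketch below encodes only what the table states. The table says nothing about boot-time checking for the Attaching and Detaching states, so the hypothetical helper `fsck_runs_at_boot` returns `None` for them rather than guessing.

```python
# Boot-time fsck behavior as stated in Table 3-12.
FSCK_AT_BOOT = {
    "Okay": False,       # logging file system is not checked at boot
    "Detached": True,    # no logging device, so fsck checks the device
    "Hard Error": True,  # fsck checks and repairs the file system
    "Error": True,       # fsck checks and repairs the file system
}


def fsck_runs_at_boot(state):
    """Return True or False per the table, or None when the table does not say."""
    return FSCK_AT_BOOT.get(state)


for state in ("Okay", "Detached", "Attaching"):
    print(state, fsck_runs_at_boot(state))
```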
Running the metastat(1M) command on a hot spare pool shows the status of the hot spare pool and its hot spares.
Here is sample hot spare pool output from metastat.

    # metastat hsp001
    hsp001: 1 hot spare
            c1t3d0s2        Available       16800 blocks

Table 3-13 explains hot spare pool states and possible actions to take.
Table 3-13 Hot Spare Pool States (Command Line)
| State | Meaning | Action |
|---|---|---|
| Available | The hot spares are running and ready to accept data, but are not currently being written to or read from. | None. |
| In-use | Hot spares are currently being written to and read from. | Diagnose how the hot spares are being used. Then repair the slice in the metadevice for which the hot spare is being used. |
| Attention | There is a problem with a hot spare or hot spare pool, but there is no immediate danger of losing data. This status is also displayed if the hot spare pool contains no hot spares, if all the hot spares are in use, or if any hot spare is broken. | Diagnose how the hot spares are being used or why they are broken. You can add more hot spares to the hot spare pool if desired. |
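The conditions listed for the "Attention" state (no hot spares, all hot spares in use, or any hot spare broken) can be checked from captured output with a short Python sketch. It assumes each hot spare line has the form "device status size blocks" as in the sample, and that a broken hot spare is reported with the word "Broken", which the sample does not show. The names `parse_pool` and `needs_attention` are hypothetical.

```python
# Captured text in the layout of the sample hot spare pool output (assumed format).
SAMPLE = """\
hsp001: 1 hot spare
        c1t3d0s2        Available       16800 blocks
"""


def parse_pool(text):
    """Return (pool name, list of hot spares) from captured metastat output."""
    lines = [line for line in text.splitlines() if line.strip()]
    name = lines[0].split(":", 1)[0].strip()
    spares = []
    for line in lines[1:]:
        tokens = line.split()
        # Layout assumed: device, status (one or more words), size, "blocks".
        spares.append({
            "device": tokens[0],
            "status": " ".join(tokens[1:-2]),
            "blocks": int(tokens[-2]),
        })
    return name, spares


def needs_attention(spares):
    """Apply the three Attention conditions from Table 3-13."""
    if not spares:
        return True
    # Normalize so that "In-use" and "In use" compare equal.
    statuses = [s["status"].lower().replace("-", " ") for s in spares]
    return all(st == "in use" for st in statuses) or any(st == "broken" for st in statuses)


name, spares = parse_pool(SAMPLE)
print(name, needs_attention(spares))  # hsp001 False
```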