APPENDIX A

Troubleshooting Sun StorEdge SAM-FS

This appendix describes some tools and procedures that can be used to troubleshoot issues with the Sun StorEdge SAM-FS file system. Specifically, it contains the following topics:

For more complete Sun StorEdge SAM-FS troubleshooting information, see the Sun StorEdge SAM-FS Troubleshooting Guide.


Checking File System Integrity and Repairing File Systems

Sun StorEdge SAM-FS file systems write validation data in the following records, which are critical to file system operations: directories, indirect blocks, and inodes. If the file system detects corruption while searching a directory, it issues an EDOM error, and the directory is not processed. If an indirect block is not valid, the file system issues an ENOCSI error, and the file is not processed. TABLE A-1 summarizes these error indicators.


TABLE A-1 Error Indicators

Error    Solaris OS Meaning               Sun StorEdge SAM-FS Meaning
EDOM     Argument is out of domain.       Values in validation records are out of range.
ENOCSI   No CSI structure is available.   Links between structures are invalid.


In addition, inodes are validated and cross-checked with directories.

You should monitor the following files for error conditions:

If a discrepancy is noted, you should unmount the file system and check it using the samfsck(1M) command.



Note - The samfsck(1M) command can be issued on a mounted file system, but the results cannot be trusted. Because of this, you are encouraged to run the command on an unmounted file system only.




To Check a File System

Use the samfsck(1M) command to perform a file system check.

Use this command in the following format:


samfsck -V family-set-name

For family-set-name, specify the name of the file system as specified in the mcf(4) file.

You can send output from samfsck(1M) to both your screen and a file by piping it through the tee(1) command.
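One common way to do this from a Bourne-compatible shell is to merge standard error into standard output and pipe the result through tee(1). The following is a minimal sketch under stated assumptions, not a command taken from the product documentation: the helper name run_and_log, the family set name samfs1, and the log path are all hypothetical.

```shell
#!/bin/sh
# Sketch: run a command, show its output on screen, and keep a copy in a file.
# run_and_log is a hypothetical helper name; it is not part of SAM-FS.
run_and_log() {
    logfile=$1
    shift
    # 2>&1 merges stderr into stdout so that tee(1) captures both streams.
    "$@" 2>&1 | tee "$logfile"
}

# Example invocation (hypothetical family set name and log path):
# run_and_log /var/tmp/samfsck.samfs1.out samfsck -V samfs1
```

Note that the exit status of the pipeline is that of tee(1), not of the command being logged, so review the saved output rather than relying on the return code.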

Nonfatal errors returned by samfsck(1M) are preceded by NOTICE. Nonfatal errors are lost blocks and orphans. The file system is still consistent if NOTICE errors are returned. You can repair these nonfatal errors during a convenient, scheduled maintenance outage.

Fatal errors are preceded by ALERT. These errors include duplicate blocks, invalid directories, and invalid indirect blocks. The file system is not consistent if these errors occur. Notify Sun if the ALERT errors cannot be explained by a hardware malfunction.

If the samfsck(1M) command detects file system corruption and returns ALERT messages, you should determine the reason for the corruption. If hardware is faulty, repair it before repairing the file system.
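If the samfsck(1M) output has been saved to a file, a quick count of NOTICE and ALERT lines can help you decide whether the repair can wait for a maintenance window. The following is a rough sketch only: summarize_samfsck_log is a hypothetical helper, and it assumes that the words NOTICE and ALERT appear literally on the relevant output lines, so verify that against your own samfsck(1M) output before relying on it.

```shell
#!/bin/sh
# Sketch: count NOTICE (nonfatal) and ALERT (fatal) lines in a saved samfsck log.
# summarize_samfsck_log is a hypothetical helper name; the exact message
# layout is an assumption, so the match is on the keyword anywhere in a line.
summarize_samfsck_log() {
    logfile=$1
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 2
    fi
    # grep -c prints the number of matching lines; it exits nonzero when the
    # count is 0, so "|| true" keeps that case from looking like a failure.
    notices=$(grep -c 'NOTICE' "$logfile" || true)
    alerts=$(grep -c 'ALERT' "$logfile" || true)
    echo "NOTICE lines: $notices"
    echo "ALERT lines: $alerts"
    # Return 0 only when no ALERT lines were found.
    [ "$alerts" -eq 0 ]
}

# Example invocation (hypothetical log path):
# summarize_samfsck_log /var/tmp/samfsck.samfs1.out
```

A nonzero return from this helper means at least one ALERT line was seen (or the file could not be read), which is the case that calls for investigating the cause before attempting a repair.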

For more information about the samfsck(1M) and tee(1) commands, see the samfsck(1M) and tee(1) man pages.


To Repair a File System

1. Use the umount(1M) command to unmount the file system.

Run the samfsck(1M) command when the file system is not mounted. For information about unmounting a file system, see Unmounting a File System.

2. Use the samfsck(1M) command to repair the file system. If you are repairing a shared file system, issue the command from the metadata server.

You can issue the samfsck(1M) command in the following format to repair a file system:


# samfsck -F -V fsname

For fsname, specify the name of the file system as specified in the mcf(4) file.
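The two steps of this procedure can be sketched as a small wrapper that prints the commands before you commit to running them. This is an illustrative sketch, not a supported tool: repair_samfs and run_step are hypothetical names, the DRY_RUN variable is an assumption of this example, and the file system name samfs1 and mount point /samfs1 are placeholders.

```shell
#!/bin/sh
# Sketch: unmount a file system, then repair it with samfsck -F -V.
# run_step and repair_samfs are hypothetical helpers. With DRY_RUN unset or
# set to 1, the commands are only printed; set DRY_RUN=0 to execute them.
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

repair_samfs() {
    fsname=$1       # file system name as it appears in the mcf(4) file
    mountpoint=$2   # where the file system is currently mounted (assumed)
    # Step 1: unmount; stop if this fails so samfsck never runs on a mounted file system.
    run_step umount "$mountpoint" || return 1
    # Step 2: repair, with verbose output.
    run_step samfsck -F -V "$fsname"
}

# Example invocation (placeholder names, dry run by default):
# repair_samfs samfs1 /samfs1
```

For a shared file system, such a wrapper would need to be run on the metadata server, as noted in step 2.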