System Administration Guide: Basic Administration

Chapter 43 Checking UFS File System Consistency (Tasks)

This chapter provides overview information and step-by-step instructions about checking UFS file system consistency.

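This chapter contains the following step-by-step instructions: How to See If a File System Needs Checking, How to Check File Systems Interactively, How to Preen a UFS File System, and How to Restore a Bad Superblock.

The overview information in this chapter covers File System Consistency, How the File System State Is Recorded, What the fsck Command Checks and Tries to Repair, Interactively Checking and Repairing a UFS File System, Restoring a Bad Superblock, and Syntax and Options for the fsck Command.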

For information about fsck error messages, see “Resolving UFS File System Inconsistencies (Tasks)” in System Administration Guide: Advanced Administration.

For background information on the UFS file system structures referred to in this chapter, see Chapter 44, UFS File System (Reference).

File System Consistency

The UFS file system relies on an internal set of tables to keep track of inodes used and available blocks. When these internal tables are not properly synchronized with data on a disk, inconsistencies result and file systems need to be repaired.

File systems can be inconsistent because of abrupt termination of the operating system in these ways:

File system inconsistencies, while serious, are not common. When a system is booted, a check for file system consistency is automatically performed (with the fsck command). Most of the time, this file system check repairs problems it encounters.

The fsck command places files and directories that are allocated but unreferenced in the lost+found directory. An inode number is assigned as the name of each unreferenced file and directory. If the lost+found directory does not exist, the fsck command creates it. If there is not enough space in the lost+found directory, the fsck command increases its size.

For a description of inodes, see Inodes.

How the File System State Is Recorded

The fsck command uses a state flag, which is stored in the superblock, to record the condition of the file system. This flag is used by the fsck command to determine whether a file system needs to be checked for consistency. The flag is used by the /sbin/rcS script during booting and by the fsck -m command. If you ignore the result from the fsck -m command, all file systems can be checked regardless of the setting of the state flag.

For a description of the superblock, see The Superblock.

The possible state flag values are described in the following table.

Table 43–1 Values of File System State Flags

State Flag Values 

Description 

FSACTIVE

When a file system is mounted and then modified, the state flag is set to FSACTIVE. The file system might contain inconsistencies. A file system is marked as FSACTIVE before any modified metadata is written to the disk. When a file system is unmounted gracefully, the state flag is set to FSCLEAN. A file system with the FSACTIVE flag must be checked by the fsck command because it might be inconsistent.

FSBAD

If the root (/) file system is mounted when its state is not FSCLEAN or FSSTABLE, the state flag is set to FSBAD. The kernel will not change this file system state to FSCLEAN or FSSTABLE. If a root (/) file system is flagged FSBAD as part of the boot process, it will be mounted read-only. You can run the fsck command on the raw root device. Then remount the root (/) file system with read and write access.

FSCLEAN

If a file system is unmounted properly, the state flag is set to FSCLEAN. Any file system with an FSCLEAN state flag is not checked when the system is booted.

FSLOG

If a file system is mounted with UFS logging, the state flag is set to FSLOG. Any file system with an FSLOG state flag is not checked when the system is booted.

FSSTABLE

The file system is (or was) mounted but has not changed since the last checkpoint (sync or fsflush) that normally occurs every 30 seconds. For example, the kernel periodically checks if a file system is idle and, if so, flushes the information in the superblock back to the disk and marks it as FSSTABLE. If the system crashes, the file system structure is stable, but users might lose a small amount of data. File systems that are marked as FSSTABLE can skip the checking before mounting. The mount command will not mount a file system for read and write access if the file system state is not FSCLEAN, FSSTABLE, or FSLOG.
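The mount rule stated at the end of the table can be sketched as a small shell function. This is an illustration only: the function name can_mount_rw is hypothetical, and the function works on a flag name passed as text rather than reading a real superblock.

```shell
#!/bin/sh
# Hypothetical helper: decides, from a state flag name, whether the
# mount rule described above would allow read and write access.
# It does not read a real superblock; the flag name is passed in as text.
can_mount_rw() {
    case "$1" in
        FSCLEAN|FSSTABLE|FSLOG) return 0 ;;   # mountable read-write
        *)                      return 1 ;;   # FSACTIVE, FSBAD, or unknown
    esac
}

for flag in FSACTIVE FSBAD FSCLEAN FSLOG FSSTABLE; do
    if can_mount_rw "$flag"; then
        echo "$flag: read-write mount allowed"
    else
        echo "$flag: run fsck first"
    fi
done
```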

The following table shows how the state flag is modified by the fsck command, based on its initial state.

Table 43–2 How the State Flag is Modified by fsck

Initial State:    State After fsck
Before fsck       No Errors    All Errors Corrected    Uncorrected Errors

unknown           FSSTABLE     FSSTABLE                unknown
FSACTIVE          FSSTABLE     FSSTABLE                FSACTIVE
FSSTABLE          FSSTABLE     FSSTABLE                FSACTIVE
FSCLEAN           FSCLEAN      FSSTABLE                FSACTIVE
FSBAD             FSSTABLE     FSSTABLE                FSBAD
FSLOG             FSLOG        FSLOG                   FSLOG
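The transitions in Table 43–2 can be written as a lookup function, which might make the pattern easier to see. This is a sketch: the function name state_after_fsck and the outcome keywords none, fixed, and unfixed are hypothetical labels for the three "State After fsck" columns.

```shell
#!/bin/sh
# Hypothetical lookup of Table 43-2.
# $1 = initial state (unknown, FSACTIVE, FSSTABLE, FSCLEAN, FSBAD, FSLOG)
# $2 = outcome: none (no errors), fixed (all errors corrected),
#      unfixed (uncorrected errors remain)
state_after_fsck() {
    case "$2" in
        none|fixed|unfixed) ;;
        *) echo "unknown outcome: $2" >&2; return 1 ;;
    esac
    case "$1:$2" in
        FSLOG:*)          echo FSLOG ;;      # logging file systems stay FSLOG
        FSCLEAN:none)     echo FSCLEAN ;;    # clean and no errors: unchanged
        unknown:unfixed)  echo unknown ;;
        FSBAD:unfixed)    echo FSBAD ;;
        FSACTIVE:unfixed|FSSTABLE:unfixed|FSCLEAN:unfixed)
                          echo FSACTIVE ;;
        unknown:*|FSACTIVE:*|FSSTABLE:*|FSCLEAN:*|FSBAD:*)
                          echo FSSTABLE ;;   # every remaining table cell
        *) echo "unknown state: $1" >&2; return 1 ;;
    esac
}

state_after_fsck FSCLEAN fixed
```

Most initial states end up as FSSTABLE after a successful check. Only FSCLEAN with no errors and FSLOG keep their original value.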

What the fsck Command Checks and Tries to Repair

This section describes what happens in the normal operation of a file system, what can go wrong, what problems the fsck command (the checking and repair utility) looks for, and how this command corrects the inconsistencies it finds.

Why Inconsistencies Might Occur

Every working day, hundreds of files might be created, modified, and removed. Each time a file is modified, the operating system performs a series of file system updates. These updates, when written to the disk reliably, yield a consistent file system.

When a user program does an operation to change the file system, such as a write, the data to be written is first copied into an in-core buffer in the kernel. Normally, the disk update is handled asynchronously. The user process is allowed to proceed even though the data write might not happen until long after the write system call has returned. Thus, at any given time, the file system, as it resides on the disk, lags behind the state of the file system that is represented by the in-core information.

The disk information is updated to reflect the in-core information when the buffer is required for another use or when the kernel automatically runs the fsflush daemon (at 30-second intervals). If the system is halted without writing out the in-core information, the file system on the disk might be in an inconsistent state.

A file system can develop inconsistencies in several ways. The most common causes are operator error and hardware failures.

Problems might result from an unclean shutdown, that is, when a system is shut down improperly or when a mounted file system is taken offline improperly. To prevent unclean shutdowns, the current state of the file systems must be written to disk (that is, “synchronized”) before you shut down the system, physically take a disk pack out of a drive, or take a disk offline.

Inconsistencies can also result from defective hardware. Blocks can become damaged on a disk drive at any time, or a disk controller can stop functioning correctly.

The UFS Components That Are Checked for Consistency

This section describes the kinds of consistency checks that the fsck command applies to these UFS file system components: superblock, cylinder group blocks, inodes, indirect blocks, and data blocks.

For information about UFS file system structures, see The Structure of Cylinder Groups for UFS File Systems.

Superblock Checks

The superblock stores summary information, which is the most commonly corrupted component in a UFS file system. Each change to the file system inodes or data blocks also modifies the superblock. If the CPU is halted and the last command is not a sync command, the superblock almost certainly becomes corrupted.

The superblock is checked for inconsistencies in the following:

File System Size and Inode List Size Checks

The file system size must be larger than the number of blocks used by the superblock and the list of inodes. The number of inodes must be less than the maximum number allowed for the file system. An inode represents all the information about a file. The file system size and layout information are the most critical pieces of information for the fsck command. There is no way to actually check these sizes, because they are statically determined when the file system is created. However, the fsck command can check that the sizes are within reasonable bounds. All other file system checks require that these sizes be correct. If the fsck command detects corruption in the static parameters of the primary superblock, it requests the operator to specify the location of an alternate superblock.

For more information about the structure of the UFS file system, see The Structure of Cylinder Groups for UFS File Systems.

Free Block Checks

Free blocks are stored in the cylinder group block maps. The fsck command checks that all the blocks marked as free are not claimed by any files. When all the blocks have been accounted for, the fsck command checks whether the number of free blocks plus the number of blocks that are claimed by the inodes equals the total number of blocks in the file system. If anything is wrong with the block maps, the fsck command rebuilds them, leaving out blocks already allocated.

The summary information in the superblock includes a count of the total number of free blocks within the file system. The fsck command compares this count to the number of free blocks it finds within the file system. If the counts do not agree, the fsck command replaces the count in the superblock with the actual free-block count.

Free Inode Checks

The summary information in the superblock contains a count of the free inodes within the file system. The fsck command compares this count to the number of free inodes it finds within the file system. If the counts do not agree, fsck replaces the count in the superblock with the actual free inode count.

Inodes

The list of inodes is checked sequentially starting with inode 2 (inode 0 and inode 1 are reserved). Each inode is checked for inconsistencies in the following:

Format and Type of Inodes

Each inode contains a mode word, which describes the type and state of the inode. Inodes might be one of nine types:

Inodes might be in one of three states:

When the file system is created, a fixed number of inodes are set aside, but they are not allocated until they are needed. An allocated inode is one that points to a file. An unallocated inode does not point to a file and, therefore, should be empty. The partially allocated state means that the inode is incorrectly formatted. An inode can get into this state if, for example, bad data is written into the inode list because of a hardware failure. The only corrective action the fsck command can take is to clear the inode.

Link Count Checks

Each inode contains a count of the number of directory entries linked to it. The fsck command verifies the link count of each inode by examining the entire directory structure, starting from the root directory, and calculating an actual link count for each inode.
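The comparison of stored and actual link counts can be illustrated on toy data. This is a sketch under stated assumptions: the function name check_link_counts and the input format are hypothetical, with "I inode stored-count" lines standing in for inodes and "D directory-inode name target-inode" lines standing in for directory entries. The real fsck command walks the on-disk directory structure instead.

```shell
#!/bin/sh
# Hypothetical illustration: compares the link count stored in each inode
# with the number of directory entries that actually reference it.
# Input lines on stdin:
#   I <inode> <stored-count>
#   D <dir-inode> <name> <target-inode>
check_link_counts() {
    awk '
        $1 == "I" { stored[$2] = $3 }
        $1 == "D" { actual[$4]++ }      # one more entry names this inode
        END {
            for (ino in stored)
                if (stored[ino] != (actual[ino] + 0))
                    print "inode " ino ": stored " stored[ino] ", actual " (actual[ino] + 0)
        }
    ' | sort
}

# Toy data: inode 5 claims two links, but only one entry refers to it.
printf '%s\n' "I 2 2" "I 5 2" "D 2 . 2" "D 2 .. 2" "D 2 notes 5" |
    check_link_counts
```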

Discrepancies between the link count stored in the inode and the actual link count as determined by the fsck command might be of three types:

Duplicate Block Checks

Each inode contains a list, or pointers to lists (indirect blocks), of all the blocks claimed by the inode. Because indirect blocks are owned by an inode, inconsistencies in indirect blocks directly affect the inode that owns the indirect block.

The fsck command compares each block number claimed by an inode to a list of allocated blocks. If another inode already claims a block number, the block number is put on a list of duplicate blocks. Otherwise, the list of allocated blocks is updated to include the block number.

If there are any duplicate blocks, the fsck command makes a second pass of the inode list to find the other inode that claims each duplicate block. (A large number of duplicate blocks in an inode might be caused by an indirect block not being written to the file system.) It is not possible to determine with certainty which inode is in error. The fsck command prompts you to choose which inode should be kept and which should be cleared.
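The duplicate-block bookkeeping can be illustrated with awk on a toy list of "inode block" pairs. This is a simplified sketch, not the fsck implementation: the function name find_dup_blocks and the input format are hypothetical, and the sketch remembers the first claimant of each block in memory instead of making a second pass over the inode list.

```shell
#!/bin/sh
# Hypothetical illustration: reads "inode block" pairs on stdin and
# reports every block that is claimed more than once.
find_dup_blocks() {
    awk '
        {
            inode = $1; block = $2
            if (block in owner) {
                # Block is already on the allocated list: record a duplicate.
                dups[block] = dups[block] " " inode
            } else {
                owner[block] = inode      # first claim: mark as allocated
            }
        }
        END {
            for (block in dups)
                print "block " block " claimed by inodes " owner[block] dups[block]
        }
    ' | sort
}

# Toy data: inodes 12 and 40 both claim block 905.
printf '%s\n' "12 904" "12 905" "40 905" "41 906" | find_dup_blocks
```

The output names both claimants but, as the text notes, cannot say which inode is the one in error.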

Bad Block Number Checks

The fsck command checks each block number claimed by an inode to see that its value is higher than that of the first data block and lower than that of the last data block in the file system. If the block number is outside this range, it is considered a bad block number.

Bad block numbers in an inode might be caused by an indirect block not being written to the file system. The fsck command prompts you to clear the inode.

Inode Size Checks

Each inode contains a count of the number of data blocks that it references. The number of actual data blocks is the sum of the allocated data blocks and the indirect blocks. The fsck command computes the number of data blocks and compares that block count against the number of blocks that the inode claims. If an inode contains an incorrect count, the fsck command prompts you to fix it.

Each inode contains a 64-bit size field. This field shows the number of characters (data bytes) in the file associated with the inode. A rough check of the consistency of the size field of an inode is done by using the number of characters shown in the size field to calculate how many blocks should be associated with the inode, and then comparing that to the actual number of blocks claimed by the inode.

Indirect Blocks

Indirect blocks are owned by an inode. Therefore, inconsistencies in an indirect block affect the inode that owns it. Inconsistencies that can be checked are the following:

These consistency checks are also performed for indirect blocks.

Data Blocks

An inode can directly or indirectly reference three kinds of data blocks. All referenced blocks must be of the same kind. The three types of data blocks are the following:

Plain data blocks contain the information stored in a file. Symbolic-link data blocks contain the path name stored in a symbolic link. Directory data blocks contain directory entries. The fsck command can check only the validity of directory data blocks.

Directories are distinguished from regular files by an entry in the mode field of the inode. Data blocks associated with a directory contain the directory entries. Directory data blocks are checked for inconsistencies involving the following:

Directory Unallocated Checks

If the inode number in a directory data block points to an unallocated inode, the fsck command removes the directory entry. This condition can occur if the data blocks that contain a new directory entry are modified and written out, but the inode does not get written out, for example, when the CPU is shut down abruptly.

Bad Inode Number Checks

If a directory entry inode number points beyond the end of the inode list, the fsck command removes the directory entry. This condition can occur when bad data is written into a directory data block.

Incorrect “.” and “..” Entry Checks

The directory inode number entry for “.” must be the first entry in the directory data block. The directory inode number must reference itself; that is, its value must be equal to the inode number for the directory data block.

The directory inode number entry for “..” must be the second entry in the directory data block. The directory inode number value must be equal to the inode number of the parent directory (or the inode number of itself if the directory is the root directory).

If the directory inode numbers for “.” and “..” are incorrect, the fsck command replaces them with the correct values. If there are multiple hard links to a directory, the first hard link found is considered the real parent to which “..” should point. In this case, the fsck command recommends that you let it delete the other names.

Disconnected Directories

The fsck command checks the general connectivity of the file system. If a directory is found that is not linked to the file system, the fsck command links the directory to the lost+found directory of the file system. This condition can occur when inodes are written to the file system, but the corresponding directory data blocks are not.

Regular Data Blocks

Data blocks associated with a regular file hold the contents of the file. The fsck command does not attempt to check the validity of the contents of a regular file's data blocks.

The fsck Summary Message

When you run the fsck command interactively and it completes successfully, a message similar to the following is displayed:


# fsck /dev/rdsk/c0t0d0s7
** /dev/rdsk/c0t0d0s7
** Last Mounted on /export/home
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
2 files, 9 used, 2833540 free (20 frags, 354190 blocks, 0.0% fragmentation)
# 

The last line of fsck output describes the following information about the file system:

# files            Number of inodes in use
# used             Number of fragments in use
# free             Number of unused fragments
# frags            Number of unused non-block fragments
# blocks           Number of unused full blocks
% fragmentation    Percentage of fragmentation, where: free fragments x 100 / total fragments in the file system

For information about fragments, see Fragment Size.
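The fragmentation figure can be recomputed from the other fields of the summary line. The following sketch assumes that "free fragments" in the formula means the # frags value and that the total is # used plus # free. Both assumptions are inferred from the sample outputs in this chapter, where they reproduce the printed percentages. The function name fsck_fragmentation is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads an fsck summary line on stdin and recomputes
# the fragmentation percentage as frags x 100 / (used + free).
# Assumption: total fragments = used + free, which matches the sample
# output in this chapter.
fsck_fragmentation() {
    awk '{
        gsub(/[(),%]/, " ")          # drop punctuation so fields split cleanly
        used = $3; free = $5; frags = $7
        printf "%.1f\n", frags * 100 / (used + free)
    }'
}

echo "929 files, 8928 used, 2851 free (75 frags, 347 blocks, 0.6% fragmentation)" |
    fsck_fragmentation
```

For this line, 75 x 100 / (8928 + 2851) is approximately 0.64, which rounds to the 0.6% that the fsck command printed. The same samples also satisfy free = frags + 8 x blocks, which suggests eight fragments per block for those file systems.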

Interactively Checking and Repairing a UFS File System

You might need to interactively check file systems in the following instances:

When an in-use file system develops inconsistencies, error messages might be displayed in the console window or the system might crash.

Before using the fsck command, you might want to refer to Syntax and Options for the fsck Command and “Resolving UFS File System Inconsistencies (Tasks)” in System Administration Guide: Advanced Administration for more information.

How to See If a File System Needs Checking

  1. Become superuser or assume an equivalent role.

  2. Unmount the file system if it is mounted.


    # umount /mount-point
    
  3. Check the file system.


    # fsck -m /dev/rdsk/device-name
    

    The state flag in the superblock of the file system you specify is checked to see whether the file system is clean or requires checking.

    If you omit the device argument, all the UFS file systems listed in the /etc/vfstab file with a fsck pass value greater than 0 are checked.
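To preview which file systems such a run would cover, you might filter vfstab-formatted text with awk. This is a sketch: the function name ufs_fsck_candidates is hypothetical, and the field order (device to mount, device to fsck, mount point, FS type, fsck pass, mount at boot, mount options) is the standard vfstab layout, assumed here rather than stated in this chapter.

```shell
#!/bin/sh
# Hypothetical helper: lists the raw devices of UFS entries whose fsck pass
# field is greater than 0, reading vfstab-formatted text on stdin.
# Field order assumed: device-to-mount, device-to-fsck, mount point,
# FS type, fsck pass, mount-at-boot, mount options.
ufs_fsck_candidates() {
    awk '
        /^[ \t]*#/ || NF < 7 { next }          # skip comments and short lines
        $4 == "ufs" && $5 ~ /^[0-9]+$/ && $5 > 0 { print $2 }
    '
}

# Usage on a live system (not run here):
#   ufs_fsck_candidates < /etc/vfstab
```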

Example—Seeing If a File System Needs Checking

The following example shows that the file system needs checking.


# fsck -m /dev/rdsk/c0t0d0s6
** /dev/rdsk/c0t0d0s6
ufs fsck: sanity check: /dev/rdsk/c0t0d0s6 needs checking
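In a script, the exit status of the fsck -m command is easier to test than its message text. The following sketch maps a status value to a short description. The status values are assumptions taken from the fsck(1M) manual page as commonly documented (0 for a file system that is okay, 32 for an unmounted file system that needs checking, 33 for a file system that is already mounted), so confirm them on your release. The function name describe_fsck_m_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: turns an fsck -m exit status into a short message.
# The status values are assumptions taken from the fsck(1M) manual page
# (0 = okay, 32 = unmounted and needs checking, 33 = already mounted);
# confirm them against the manual page on your release.
describe_fsck_m_status() {
    case "$1" in
        0)  echo "clean: no check needed" ;;
        32) echo "needs checking" ;;
        33) echo "already mounted" ;;
        *)  echo "other status $1: see fsck(1M)" ;;
    esac
}

# Usage on a live system (not run here):
#   fsck -m /dev/rdsk/c0t0d0s6 >/dev/null 2>&1
#   describe_fsck_m_status $?
```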

How to Check File Systems Interactively

  1. Become superuser or assume an equivalent role.

  2. Unmount the local file systems except root (/) and /usr.


    # umountall -l
    
  3. Check the file systems.


    # fsck
    

    All file systems in the /etc/vfstab file with entries in the fsck pass field greater than 0 are checked. You can also specify the mount point directory or /dev/rdsk/device-name as arguments to the fsck command. Any inconsistency messages are displayed.

    For information about how to respond to the error message prompts while interactively checking one or more UFS file systems, see “Resolving UFS File System Inconsistencies (Tasks)” in System Administration Guide: Advanced Administration.


    Caution –

    Running the fsck command on a mounted file system might cause the system to crash if the fsck command makes any changes. Run the fsck command on a mounted file system only when a procedure explicitly says to do so, such as when you run the fsck command in single-user mode to repair a file system.


  4. If you corrected any errors, type fsck and press Return.

    The fsck command might be unable to fix all errors in one execution. If you see the message FILE SYSTEM STATE NOT SET TO OKAY, run the command again. If that does not work, see Fixing a UFS File System That the fsck Command Cannot Repair.

  5. Rename and move any files put in the lost+found directory.

    Individual files put in the lost+found directory by the fsck command are renamed with their inode numbers. If possible, rename the files and move them where they belong. You might be able to use the grep command to match phrases with individual files and the file command to identify file types. When whole directories are put into the lost+found directory, it is easier to figure out where they belong and to move them back.
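The "run the command again" advice in step 4 can be sketched as a bounded retry loop. This is a sketch only: the function name rerun_until_okay is hypothetical, and because the loop captures the command output, it suits noninteractive runs (for example, with the -y or -o p options) rather than interactive sessions where you answer prompts.

```shell
#!/bin/sh
# Hypothetical helper: reruns a check command while its output contains
# "FILE SYSTEM STATE NOT SET TO OKAY", up to a fixed number of attempts.
# $1 = maximum attempts; remaining arguments = the command to run.
rerun_until_okay() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        # The exit status is ignored; only the message text is examined.
        out=$("$@" 2>&1) || :
        printf '%s\n' "$out"
        case "$out" in
            *"FILE SYSTEM STATE NOT SET TO OKAY"*)
                attempt=$((attempt + 1)) ;;
            *)
                return 0 ;;
        esac
    done
    return 1    # still not okay after the last attempt
}

# Usage on a live system (not run here):
#   rerun_until_okay 3 fsck -y /dev/rdsk/c0t0d0s6
```

A nonzero return means the file system still was not marked okay, which is the point at which the guidance in Fixing a UFS File System That the fsck Command Cannot Repair applies.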

Example—Checking File Systems Interactively

The following example shows how to check the /dev/rdsk/c0t0d0s6 file system and correct an incorrect block count.


# fsck /dev/rdsk/c0t0d0s6
checkfilesys: /dev/rdsk/c0t0d0s6
** Phase 1 - Check Block and Sizes
INCORRECT BLOCK COUNT I=2529 (6 should be 2)
CORRECT? y

** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Cylinder Groups
929 files, 8928 used, 2851 free (75 frags, 347 blocks, 0.6% fragmentation)
/dev/rdsk/c0t0d0s6 FILE SYSTEM STATE SET TO OKAY
 
***** FILE SYSTEM WAS MODIFIED *****

Preening UFS File Systems

The fsck -o p command (p is for preen) checks UFS file systems and automatically fixes the problems that normally result from an unexpected system shutdown. This command exits immediately if it encounters a problem that requires operator intervention. This command also permits parallel checking of file systems.

You can run the fsck -o p command to preen the file systems after an unclean shutdown. In this mode, the fsck command does not look at the clean flag and does a full check. These actions are a subset of the actions that the fsck command takes when it runs interactively.

How to Preen a UFS File System

  1. Become superuser or assume an equivalent role.

  2. Unmount the UFS file system.


    # umount /mount-point
    
  3. Check the UFS file system with the preen option.


    # fsck -o p /dev/rdsk/device-name
    

    You can preen individual file systems by using /mount-point or /dev/rdsk/device-name as arguments to the fsck command.

Example—Preening a UFS File System

The following example shows how to preen the /usr file system.


# fsck -o p /usr

Fixing a UFS File System That the fsck Command Cannot Repair

Sometimes, you need to run the fsck command a few times to fix a file system because problems corrected on one pass might uncover other problems not found in earlier passes. The fsck command does not keep running until it comes up clean, so you must rerun it manually.

Pay attention to the information displayed by the fsck command. This information might help you fix the problem. For example, the messages might point to a damaged directory. If you delete the directory, you might find that the fsck command runs cleanly.

If the fsck command still cannot repair the file system, you can try to use the fsdb, ff, clri, and ncheck commands to figure out and fix what is wrong. For information about how to use these commands, see fsdb(1M), ff(1M), clri(1M), and ncheck(1M). You might, ultimately, need to re-create the file system and restore its contents from backup media.

For information about restoring complete file systems, see Chapter 49, Restoring Files and File Systems (Tasks).

If you cannot fully repair a file system but you can mount it read-only, try using the cp, tar, or cpio commands to retrieve all or part of the data from the file system.
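One common way to copy a whole tree with the tar command is to pipe one tar process into another. The following is a minimal sketch: the function name salvage_tree is hypothetical, and on a damaged file system the tar command might report read errors for individual files while still copying the rest.

```shell
#!/bin/sh
# Hypothetical helper: copies a directory tree with tar, preserving layout.
# $1 = source directory (for example, a file system mounted read-only)
# $2 = existing destination directory on a healthy file system
salvage_tree() {
    (cd "$1" && tar cf - .) | (cd "$2" && tar xf -)
}

# Usage on a live system (not run here; both paths are placeholders):
#   salvage_tree /mnt /export/rescue
```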

If hardware disk errors are causing the problem, you might need to reformat and divide the disk into slices again before re-creating and restoring file systems. Hardware errors usually display the same error again and again across different commands. The format command tries to work around bad blocks on the disk. If the disk is too severely damaged, however, the problems might persist, even after reformatting. For information about using the format command, see format(1M). For information about installing a new disk, see Chapter 34, SPARC: Adding a Disk (Tasks) or Chapter 35, x86: Adding a Disk (Tasks).

Restoring a Bad Superblock

When the superblock of a file system becomes damaged, you must restore it. The fsck command tells you when a superblock is bad. Fortunately, copies of the superblock are stored within a file system. You can use the fsck -o b command to replace the superblock with one of the copies.

For more information about the superblock, see The Superblock.

If the superblock in the root (/) file system becomes damaged and you cannot restore it, you have two choices:

How to Restore a Bad Superblock

  1. Become superuser or assume an equivalent role.

  2. Determine whether the bad superblock is in the root (/) or /usr file system and select one of the following:

    1. Stop the system and boot from the network or a locally connected CD if the bad superblock is in the root (/) or /usr file system.

      From a locally connected CD, use the following command:


      ok boot cdrom -s
      

      From the network where a boot or install server is already set up, use the following command:


      ok boot net -s
      

      If you need help stopping the system, see SPARC: How to Stop the System for Recovery Purposes or x86: How to Stop a System for Recovery Purposes.

    2. Change to a directory outside the damaged file system and unmount the file system if the bad superblock is not in the root (/) or /usr file system.


      # umount /mount-point
      

      Caution –

      Be sure to use the newfs -N command in the next step. If you omit the -N option, you will create a new, empty file system and lose the existing data.


  3. Display the superblock values with the newfs -N command.


    # newfs -N /dev/rdsk/device-name
    

    The output of this command displays the block numbers that were used for the superblock copies when the newfs command created the file system, unless the file system was created with special parameters. For information on creating a customized file system, see Custom File System Parameters.

  4. Provide an alternate superblock with the fsck command.


    # fsck -F ufs -o b=block-number /dev/rdsk/device-name
    

    The fsck command uses the alternate superblock you specify to restore the primary superblock. You can always try 32 as an alternate block, or use any of the alternate blocks shown by the newfs -N command.

Example—Restoring a Bad Superblock

The following example shows how to restore the primary superblock from the superblock copy at block 5264.


# newfs -N /dev/rdsk/c0t3d0s7
/dev/rdsk/c0t3d0s7: 163944 sectors in 506 cylinders of 9 tracks, 36 sectors
 83.9MB in 32 cyl groups (16 c/g, 2.65MB/g, 1216 i/g)
super-block backups (for fsck -b #) at:
 32, 5264, 10496, 15728, 20960, 26192, 31424, 36656, 41888,
 47120, 52352, 57584, 62816, 68048, 73280, 78512, 82976, 88208,
 93440, 98672, 103904, 109136, 114368, 119600, 124832, 130064, 135296,
 140528, 145760, 150992, 156224, 161456,
# fsck -F ufs -o b=5264 /dev/rdsk/c0t3d0s7
Alternate superblock location: 5264.
** /dev/rdsk/c0t3d0s7
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
36 files, 867 used, 75712 free (16 frags, 9462 blocks, 0.0% fragmentation)
/dev/rdsk/c0t3d0s7 FILE SYSTEM STATE SET TO OKAY
 
***** FILE SYSTEM WAS MODIFIED *****
# 
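If you want to try several alternate superblocks in turn, you might first extract the block numbers from the newfs -N output. The following sketch assumes the output layout shown in the example above, with the numbers on the lines that follow the "super-block backups" line. The function name list_backup_superblocks is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads newfs -N output on stdin and prints one
# backup superblock number per line.
list_backup_superblocks() {
    awk '
        /super-block backups/ { found = 1; next }   # numbers start on the next line
        found {
            gsub(/,/, " ")
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+$/) print $i
        }
    '
}

# Usage on a live system (not run here). Note the -N option:
#   newfs -N /dev/rdsk/c0t3d0s7 | list_backup_superblocks
```

Each number printed can then be supplied to the fsck -F ufs -o b=block-number command, as in step 4 of the procedure.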

Syntax and Options for the fsck Command

The fsck command checks and repairs inconsistencies in file systems. If you run the fsck command without any options, it interactively asks for confirmation before making repairs. This command has four options:

Command and Option    Description

fsck -m               Checks whether a file system can be mounted
fsck -y               Assumes a yes response for all repairs
fsck -n               Assumes a no response for all repairs
fsck -o p             Noninteractively preens the file system, fixing all expected (innocuous) inconsistencies, but exits when a serious problem is encountered