The software described in this documentation is either in Extended Support or Sustaining Support. See https://www.oracle.com/us/support/library/enterprise-linux-support-policies-069172.pdf for more information.
Oracle recommends that you upgrade the software described by this documentation as soon as possible.

6.3.4 Debugging File System Locks

If an OCFS2 volume hangs, you can use the following steps to determine which locks are busy and which processes are likely to be holding those locks.

  1. Mount the debug file system.

    # mount -t debugfs debugfs /sys/kernel/debug
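
    The debug file system might already be mounted on the node. The following check is a minimal convenience sketch, not part of the original procedure; it mounts debugfs only if no existing mount is found:

    # grep -qw debugfs /proc/mounts || mount -t debugfs debugfs /sys/kernel/debug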

  2. Dump the lock statuses for the file system device (/dev/sdx1 in this example).

    # echo "fs_locks" | debugfs.ocfs2 /dev/sdx1 >/tmp/fslocks 62
    Lockres: M00000000000006672078b84822 Mode: Protected Read
    Flags: Initialized Attached
    RO Holders: 0 EX Holders: 0
    Pending Action: None Pending Unlock Action: None
    Requested Mode: Protected Read Blocking Mode: Invalid

    The Lockres field is the lock name used by the DLM. The lock name is a combination of a lock-type identifier, an inode number, and a generation number. The following table shows the possible lock types.

    Identifier   Lock Type
    ----------   -----------
    D            File data.
    M            Metadata.
    R            Rename.
    S            Superblock.
    W            Read-write.
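
    As an illustration only, the sample lock name above can be split into its fields with a short bash sketch. Based on the sample output, this assumes that the final eight hexadecimal digits are the generation number and that the digits between the type identifier and the generation are the inode number in hexadecimal; the decoding is shown for clarity and is not a documented interface:

    # LOCK=M00000000000006672078b84822
    # TYPE=${LOCK:0:1}               # lock-type identifier (M = metadata)
    # GEN=${LOCK: -8}                # generation number in hexadecimal
    # INODE=${LOCK:1:${#LOCK}-9}     # inode number in hexadecimal
    # printf "type=%s inode=%d generation=%d\n" "$TYPE" "0x$INODE" "0x$GEN"
    type=M inode=419616 generation=2025343010

    The inode and generation values match those reported by the stat command in the next step.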

  3. Use the Lockres value to obtain the inode number and generation number for the lock.

    # echo "stat <M00000000000006672078b84822>" | debugfs.ocfs2 -n /dev/sdx1
    Inode: 419616   Mode: 0666   Generation: 2025343010 (0x78b84822)
    ... 

  4. Determine the file system object to which the inode number relates by using the following command.

    # echo "locate <419616>" | debugfs.ocfs2 -n /dev/sdx1
    419616 /linux-2.6.15/arch/i386/kernel/semaphore.c

  5. Obtain the lock names that are associated with the file system object.

    # echo "encode /linux-2.6.15/arch/i386/kernel/semaphore.c" | \
      debugfs.ocfs2 -n /dev/sdx1
    M00000000000006672078b84822 D00000000000006672078b84822 W00000000000006672078b84822  

    In this example, a metadata lock, a file data lock, and a read-write lock are associated with the file system object.

  6. Determine the DLM domain of the file system.

    # echo "stats" | debugfs.ocfs2 -n /dev/sdX1 | grep UUID: | while read a b ; do echo $b ; done
    82DA8137A49A47E4B187F74E09FBBB4B  
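
    Equivalently, and only as a hedged convenience, an awk one-liner prints the same UUID field:

    # echo "stats" | debugfs.ocfs2 -n /dev/sdx1 | awk '/UUID:/ {print $2}'
    82DA8137A49A47E4B187F74E09FBBB4B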

  7. Use the values of the DLM domain and the lock name with the following command, which enables debugging for the DLM.

    # echo R 82DA8137A49A47E4B187F74E09FBBB4B \
      M00000000000006672078b84822 > /proc/fs/ocfs2_dlm/debug  

  8. Examine the debug messages.

    # dmesg | tail
    struct dlm_ctxt: 82DA8137A49A47E4B187F74E09FBBB4B, node=3, key=965960985
      lockres: M00000000000006672078b84822, owner=1, state=0 last used: 0, 
      on purge list: no granted queue:
          type=3, conv=-1, node=3, cookie=11673330234144325711, ast=(empty=y,pend=n), 
          bast=(empty=y,pend=n) 
        converting queue:
        blocked queue:  

    The DLM supports three lock modes: no lock (type=0), protected read (type=3), and exclusive (type=5). In this example, the lock is owned by node 1 (owner=1), and node 3 has been granted a protected-read lock on the file system resource.

  9. Run the following command, and look for processes that are in an uninterruptible sleep state, as shown by the D flag in the STAT column.

    # ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN

    At least one of the processes in the uninterruptible sleep state will be responsible for the hang on the other node.
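
    To narrow the listing to the processes of interest, you can filter on the STAT column, for example with the following sketch (the awk filter is an added convenience, not part of the original procedure):

    # ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN | awk 'NR==1 || $2 ~ /^D/'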

If a process is waiting for I/O to complete, the problem could be anywhere in the I/O subsystem, from the block device layer through the drivers to the disk array. If the hang concerns a user lock (flock()), the problem could lie in the application. If possible, kill the holder of the lock. If the hang is due to a lack of memory or fragmented memory, you can free up memory by killing non-essential processes. The most immediate solution is to reset the node that is holding the lock. The DLM recovery process can then clear all the locks that the dead node owned, allowing the cluster to continue operating.
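
If the hang does involve a user lock taken with flock(), the /proc/locks file can help to identify the holder: the fifth field of each FLOCK entry is the PID of the owning process. The following one-liner is a minimal sketch of that lookup, not part of the original procedure:

# awk '$2 == "FLOCK" {print "holder PID:", $5}' /proc/locks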