4.2.4 Parameters that Control Kernel Panics

The following parameters control the circumstances under which a kernel panic can occur:

kernel.hung_task_panic

If set to 1, the kernel panics if any user or kernel thread sleeps in the TASK_UNINTERRUPTIBLE state (D state) for more than kernel.hung_task_timeout_secs seconds. A process remains in D state while waiting for I/O to complete. You cannot kill or interrupt a process in this state.

The default value is 0, which disables the panic.

Tip

To diagnose a hung thread, you can examine /proc/PID/stack, which displays the kernel stack for both kernel and user threads.
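
As a rough illustration of this tip, the following Python sketch lists the processes whose main thread is currently in D state and reads a kernel stack. The helper names (parse_state, d_state_pids, kernel_stack) and the proc_root parameter are invented for this example. The only interfaces assumed are /proc/PID/stat and /proc/PID/stack, and reading the latter normally requires root privileges.

```python
import os


def parse_state(stat_text):
    """Return the one-letter state from the contents of /proc/PID/stat."""
    # The command name is enclosed in parentheses and may itself contain
    # spaces or parentheses, so split on the last ')' instead of whitespace.
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]


def d_state_pids(proc_root="/proc"):
    """Return the PIDs whose main thread is in D state (hypothetical helper).

    Other threads of a process are listed under /proc/PID/task and are
    not examined by this sketch.
    """
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                if parse_state(f.read()) == "D":
                    pids.append(int(entry))
        except OSError:
            # The process exited between listdir() and open().
            continue
    return sorted(pids)


def kernel_stack(pid, proc_root="/proc"):
    """Return the kernel stack text for a PID (usually needs root)."""
    with open(os.path.join(proc_root, str(pid), "stack")) as f:
        return f.read()
```

For example, calling kernel_stack(pid) for each value returned by d_state_pids() prints the point in the kernel at which each blocked process is waiting.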

kernel.hung_task_timeout_secs

Specifies how long a user or kernel thread can remain in D state before a message is generated or the kernel panics (if the value of kernel.hung_task_panic is 1). The default value is 120 seconds.
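
The way the two hung-task parameters combine can be sketched in Python as follows. The function names and the root parameter are assumptions made for this example. The sketch relies only on the convention that a parameter such as kernel.hung_task_panic is exposed as the file /proc/sys/kernel/hung_task_panic.

```python
import os


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted parameter name to its file under /proc/sys."""
    # Replacing dots with path separators is sufficient for the parameters
    # in this section; names with a dot inside one component need more care.
    return os.path.join(root, *name.split("."))


def read_sysctl_int(name, root="/proc/sys"):
    """Read an integer-valued kernel parameter."""
    with open(sysctl_path(name, root)) as f:
        return int(f.read().strip())


def hung_task_policy(panic, timeout_secs):
    """Summarize what happens to a thread that stays in D state too long."""
    action = "panic" if panic == 1 else "log a message"
    return f"{action} after {timeout_secs} seconds in D state"
```

With the default values described here (0 and 120), hung_task_policy(0, 120) returns "log a message after 120 seconds in D state". On a running system, the two arguments could be obtained by calling read_sysctl_int("kernel.hung_task_panic") and read_sysctl_int("kernel.hung_task_timeout_secs").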

kernel.panic

Specifies the number of seconds that the system waits after a panic before it automatically resets itself.

If the value is 0, the system hangs, which allows you to collect detailed information about the panic for troubleshooting. This is the default value.

To enable automatic reset, set a non-zero value. If you require a memory image (vmcore), allow enough time for Kdump to create this image. The suggested value is 30 seconds, although large systems will require a longer time.
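
The following Python sketch shows one way to turn the guidance above into configuration text. The function names are invented for this example, and the output uses the "name = value" line format that sysctl configuration files (typically /etc/sysctl.conf) accept. Negative values are deliberately left out of scope.

```python
def describe_panic_timeout(seconds):
    """Explain the effect of a kernel.panic value, per the text above."""
    if seconds == 0:
        return "hang after a panic (no automatic reset)"
    if seconds > 0:
        return f"reset {seconds} seconds after a panic"
    raise ValueError("negative values are not covered by this sketch")


def render_sysctl(settings):
    """Render a dict of integer settings as 'name = value' lines."""
    lines = []
    for name, value in sorted(settings.items()):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name}: expected an integer, got {value!r}")
        lines.append(f"{name} = {value}")
    return "\n".join(lines) + "\n"
```

For the suggested value, render_sysctl({"kernel.panic": 30}) produces the single line "kernel.panic = 30", and describe_panic_timeout(30) returns "reset 30 seconds after a panic".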

kernel.panic_on_oops

If set to 0, the system tries to continue operations when the kernel encounters an oops or BUG condition. If set to 1 (default), the kernel panics on an oops or BUG condition, after first delaying a few seconds to give the kernel log daemon, klogd, time to record the oops output.

In an OCFS2 cluster, set the value to 1 to specify that a system must panic if a kernel oops occurs. If a kernel thread required for cluster operation crashes, the system must reset itself. Otherwise, another node might not be able to tell whether a node is slow to respond or unable to respond, causing cluster operations to hang.
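
A check of a cluster node's panic settings might look like the following Python sketch. The function name and the shape of the settings dictionary are hypothetical. The check on kernel.panic is an inference from the requirement that the system must reset itself, combined with the description of kernel.panic above (a value of 0 leaves the system hung), rather than a rule stated for OCFS2 in this section.

```python
def ocfs2_panic_issues(settings):
    """Return a list of problems found in a node's panic settings.

    settings maps parameter names to integer values, for example
    {"kernel.panic_on_oops": 1, "kernel.panic": 30}.
    """
    issues = []
    if settings.get("kernel.panic_on_oops") != 1:
        issues.append("kernel.panic_on_oops should be 1 so that an oops causes a panic")
    # Assumption: a value of 0 (also assumed when the key is missing)
    # means the node hangs after the panic instead of resetting.
    if settings.get("kernel.panic", 0) == 0:
        issues.append("kernel.panic should be non-zero so that the node resets after a panic")
    return issues
```

An empty list means that both conditions hold. For example, ocfs2_panic_issues({"kernel.panic_on_oops": 1, "kernel.panic": 30}) returns [].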

vm.panic_on_oom

If set to 0 (default), the kernel’s OOM-killer scans through the entire task list and attempts to kill a memory-hogging process to avoid a panic.

If set to 1, the kernel normally panics when an OOM condition occurs, but the system can survive under certain conditions. If a process limits allocations to certain nodes by using memory policies or cpusets, and those nodes reach memory exhaustion status, the OOM-killer can kill one process. No panic occurs in this case because other nodes’ memory might be free and the system as a whole might not yet be out of memory.

If set to 2, the kernel always panics when an OOM condition occurs.

Settings of 1 and 2 are intended for use with clusters, depending on your preferred failover policy.
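
The three settings can be summarized as a small decision function. This Python sketch only restates the behavior described above. The function name and the constrained flag (true when the allocation is restricted to certain nodes by a memory policy or cpuset) are invented for the example and do not correspond to a kernel interface.

```python
def oom_outcome(panic_on_oom, constrained=False):
    """Return "kill" or "panic" for an OOM condition.

    constrained is True when the failing allocation is limited to
    certain nodes by a memory policy or cpuset.
    """
    if panic_on_oom == 0:
        return "kill"  # the OOM-killer picks a process to kill
    if panic_on_oom == 1:
        # Only a system-wide OOM condition causes a panic.
        return "kill" if constrained else "panic"
    if panic_on_oom == 2:
        return "panic"  # always panic, even for constrained allocations
    raise ValueError("vm.panic_on_oom must be 0, 1, or 2")
```

The only case in which settings 1 and 2 differ is a constrained allocation: oom_outcome(1, constrained=True) returns "kill", whereas oom_outcome(2, constrained=True) returns "panic".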