This section describes how to troubleshoot a totally unresponsive Directory Server process. A totally unresponsive process is called a hang, and there are two types of hang you might experience:
Active hang, when the CPU level is at or near 100%. For example, the process enters an infinite loop and never finishes servicing a request.
Passive hang, when the CPU level is at or near 0%. For example, the process encounters a deadlock, where two or more threads of the process each wait for another to finish, so none of them ever does.
The remainder of this section describes how to troubleshoot each of these types of process hang.
A hang is active if the top or vmstat 1 output shows CPU levels of over 95%.
This section describes the causes of an active hang, how to collect information about an active hang, and how to analyze this data.
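The threshold check can be sketched as a small shell function. This is only an illustration: the function name classify_hang is hypothetical, the CPU percentage is assumed to be read by hand from the top or vmstat 1 output, and the 5% cutoff for a passive hang is an assumption, because this section only says that CPU levels are low in that case.

```shell
# Hypothetical helper: classify a hang from a CPU percentage (0-100).
# The >95 rule comes from the text; the <5 cutoff is an assumed value.
classify_hang() {
    awk -v cpu="$1" 'BEGIN {
        if (cpu > 95)     { print "active" }
        else if (cpu < 5) { print "passive" }
        else              { print "indeterminate" }
    }'
}
```

For example, classify_hang 99.2 reports an active hang, while a value between the two cutoffs is reported as indeterminate and needs a closer look.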
Possible causes of an active hang include the following:
An infinite loop
Retry of an unsuccessful operation, such as a replication operation or a bad commit
On a Solaris system, use the Solaris pstack utility to collect several stack traces of the hanging Directory Server process. Run the command from the root-dir/bin/slapd/server directory. Also collect statistics about the active process using the Solaris prstat utility. You must collect this information while the server is hanging.
Collect the consecutive pstack and prstat samples at one-second intervals.
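The collection described above could be scripted along the following lines. This is a minimal sketch, not a supported tool: the function name, the default of five samples, and the output file names are all assumptions, and it presumes that pstack and prstat can be found on the PATH.

```shell
# Hypothetical helper; names and defaults are illustrative only.
# Usage: collect_active_hang_data PID [COUNT] [INTERVAL_SECONDS] [OUTPUT_DIR]
collect_active_hang_data() {
    ah_pid=$1
    ah_count=${2:-5}       # number of samples (assumed default)
    ah_interval=${3:-1}    # seconds between samples, as the text advises
    ah_dir=${4:-.}         # where the sample files are written
    ah_i=1
    while [ "$ah_i" -le "$ah_count" ]; do
        # Stack trace of every thread in the hanging process.
        pstack "$ah_pid" > "$ah_dir/pstack.$ah_pid.$ah_i" 2>&1
        # Per-thread (LWP) statistics: a single report for this process.
        prstat -L -p "$ah_pid" 1 1 > "$ah_dir/prstat.$ah_pid.$ah_i" 2>&1
        if [ "$ah_i" -lt "$ah_count" ]; then
            sleep "$ah_interval"
        fi
        ah_i=$((ah_i + 1))
    done
}
```

Calling collect_active_hang_data with the process ID of the hanging server writes numbered pstack and prstat files, so that each stack trace can be matched with the statistics taken at the same moment.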
A hang is passive if the top or vmstat 1 output shows low CPU levels.
Possible causes of a passive hang include the following:
A deadlock resulting from locks or condition variables
A defunct thread
On a Solaris system, use the Solaris pstack utility to collect several stack traces of the hanging Directory Server process. Run the command from the root-dir/bin/slapd/server directory. You must collect this information while the server is hanging. Collect the consecutive pstack traces at three-second intervals.
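One way to take the repeated traces is a loop that appends each one to a single log with a timestamp header, which makes it easier to compare the traces side by side later. The sketch below is illustrative only: the function name, the default of three traces, and the log file name are assumptions, and pstack is presumed to be on the PATH.

```shell
# Hypothetical helper; names and defaults are illustrative only.
# Usage: log_pstack_traces PID [COUNT] [INTERVAL_SECONDS] [LOGFILE]
log_pstack_traces() {
    pt_pid=$1
    pt_count=${2:-3}               # number of traces (assumed default)
    pt_interval=${3:-3}            # seconds between traces, as the text advises
    pt_log=${4:-pstack.$1.log}     # single log file that collects all traces
    pt_i=1
    while [ "$pt_i" -le "$pt_count" ]; do
        {
            # Header line so consecutive traces can be told apart.
            echo "=== trace $pt_i of $pt_count at $(date) ==="
            pstack "$pt_pid"
        } >> "$pt_log" 2>&1
        if [ "$pt_i" -lt "$pt_count" ]; then
            sleep "$pt_interval"
        fi
        pt_i=$((pt_i + 1))
    done
}
```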
Collect several core files that show the state of the server threads while the server is hanging. Do this by generating a core file using the gcore command, changing the name of the core file, waiting 30 seconds, and generating another core file. Repeat the process at least once to get a minimum of three sets of core files and related data.
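The generate, rename, and wait cycle described above might be automated as follows. Treat this as a sketch under stated assumptions: the function name and the numbered file names are hypothetical, and the code assumes that gcore, when given only a process ID, writes a file named core.pid in the current directory.

```shell
# Hypothetical helper; names are illustrative only.
# Usage: collect_core_files PID [COUNT] [WAIT_SECONDS]
collect_core_files() {
    cf_pid=$1
    cf_count=${2:-3}    # minimum of three core files, as the text advises
    cf_wait=${3:-30}    # seconds to wait between core files
    cf_i=1
    while [ "$cf_i" -le "$cf_count" ]; do
        # Assumption: gcore writes ./core.<pid> for the given process.
        gcore "$cf_pid" || return 1
        # Rename the file so that the next core does not overwrite it.
        mv "core.$cf_pid" "core.$cf_pid.$cf_i" || return 1
        if [ "$cf_i" -lt "$cf_count" ]; then
            sleep "$cf_wait"
        fi
        cf_i=$((cf_i + 1))
    done
}
```

Core files can be large, so it is worth running the function from a directory on a file system with enough free space for all of them.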
For more information about generating a core file, see Generating a Core File.