Troubleshooting a Hung or Unresponsive Directory Proxy Server Process (Sun Java System Directory Server Enterprise Edition 6.1 Troubleshooting Guide)

Troubleshooting a Hung or Unresponsive Directory Proxy Server Process

This section describes how to troubleshoot an unresponsive or hung Directory Proxy Server process. A totally unresponsive process is called a hang. The remainder of this section describes how to collect and analyze data about a hang.

Collecting Data About a Directory Proxy Server 6.1 Hang on Solaris

The jstat tool tells you the amount of CPU being used for each thread. If you collect a thread stack using the jstack utility at the same time you run the jstat tool, you can then use the jstack output to see what the thread was doing when it had trouble. If you run the jstack and jstat tools simultaneously several times, you can see over time if the same thread was causing the problem and if it was encountering the problem during the same function call.

To get the process ID of the running Directory Proxy Server, use the jps command. For example, the command is run as follows on Solaris:

# jps
8393 DistributionServerMain
2115 ContainerPrivate
21535 startup.jar
16672 Jps
13953 swupna.jar
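
When the jps output is long, the PID can be picked out by filtering on the main class name. The following is a minimal sketch. It assumes that DistributionServerMain, as shown in the example output above, is the entry for Directory Proxy Server on your system. The function name dps_pid_from_jps is hypothetical.

```shell
#!/bin/sh
# Sketch only: reads jps output on standard input and prints the PID of
# every entry whose main class is DistributionServerMain.
# Assumption: DistributionServerMain identifies the Directory Proxy Server.
dps_pid_from_jps() {
    awk '$2 == "DistributionServerMain" { print $1 }'
}

# Typical use (requires jps on the PATH):
#   jps | dps_pid_from_jps
```

If more than one proxy instance is running, the function prints one PID per line, and you must decide which instance is the unresponsive one.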

Collect usage information as follows:

# ./scp DPS-PID

The DPS-PID field specifies the PID of the unresponsive process.
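
For a Java-based proxy, a collection loop built on jstack and jstat might look like the following sketch. It is an illustration, not the script shipped or documented for the product. The function name, the default count of 10, the /tmp output directory, and the choice of the jstat -gcutil option are all assumptions. Note that -gcutil reports garbage collection utilization for the JVM as a whole, so choose whichever jstat option suits the data you need.

```shell
#!/bin/sh
# Sketch only: take paired jstack/jstat snapshots of a JVM, one second apart.
# Assumptions: jstack and jstat are on the PATH; the function name, the
# default count, and the output directory are hypothetical choices.
collect_java_snapshots() {
    pid=$1
    count=${2:-10}
    out_dir=${3:-/tmp}
    i=0
    while [ "$i" -lt "$count" ]
    do
        # Same stamp for both files so that they can be paired later.
        # The loop index keeps names unique within the same second.
        stamp=`date "+%y%m%d:%H%M%S"`.$i
        # Stack of every thread in the JVM.
        jstack "$pid" > "$out_dir/jstack.$stamp" 2>&1
        # One sample of JVM statistics (garbage collection utilization here).
        jstat -gcutil "$pid" > "$out_dir/jstat.$stamp" 2>&1
        i=`expr $i + 1`
        sleep 1
    done
}

# Typical use:
#   collect_java_snapshots DPS-PID 10 /tmp
```

Writing both files with the same stamp keeps each stack dump next to the statistics taken at nearly the same moment.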

On Solaris and other UNIX platforms, show the system calls that occur during the hang using the truss command as follows:

truss -o /tmp/trace.txt -ealf -rall -wall -vall -p 21362

The value 21362 corresponds to the PID of the unresponsive process.

Collecting and Analyzing Data About a Directory Proxy Server 5.x Hang on Solaris

The prstat tool tells you the amount of CPU being used for each thread. If you collect a process stack using the pstack utility at the same time you run the prstat tool, you can then use the pstack output to see what the thread was doing when it had trouble. If you run the prstat and pstack tools simultaneously several times, then you can see over time if the same thread was causing the problem and if it was encountering the problem during the same function call.

Note –

On Linux, use the lsstack or pstack commands instead of the Solaris pstack utility.

The following script automates the process of running these tools:

cat scp
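#!/bin/sh
# Usage: ./scp PID
# Takes 10 paired prstat/pstack snapshots of the process, one second apart.

i=0
while [ "$i" -lt "10" ]
do
         echo "$i\n"
         # Timestamp shared by both output files so they can be matched later.
         date=`date "+%y%m%d:%H%M%S"`
         # Per-thread (LWP) CPU usage, printed as a single report.
         prstat -L -p "$1" 0 1 > /tmp/prstat.$date
         # Stack of every thread at nearly the same moment.
         pstack "$1" > /tmp/pstack.$date
         i=`expr $i + 1`
         sleep 1
done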

The value 10 in the [ "$i" -lt "10" ] line can be increased or decreased to suit the time during which the problem you are troubleshooting occurs. This adjustment lets you collect a full set of process data around the issue.

Collect usage information as follows:

# ./scp DPS-PID

The DPS-PID field specifies the PID of the unresponsive process.
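
Once several prstat snapshots exist, the per-thread figures can be summed to see whether the same thread dominates over time. The following sketch assumes the usual prstat -L column layout, in which the ninth column is the CPU percentage and the last column has the form PROCESS/LWPID. The function name top_lwps is hypothetical.

```shell
#!/bin/sh
# Sketch only: sum the CPU percentage per LWP across prstat -L output files
# and list the busiest LWPs first.
# Assumption: column 9 is CPU (for example 45%) and the last column is
# PROCESS/LWPID (for example ldapfwd/7). Header and Total lines are skipped
# because they do not match both patterns.
top_lwps() {
    awk '$NF ~ /\// && $9 ~ /%$/ {
        split($NF, part, "/")        # part[2] is the LWP ID
        total[part[2]] += $9 + 0     # "45%" + 0 evaluates to 45
    }
    END {
        for (lwp in total) printf "%s %.1f\n", lwp, total[lwp]
    }' "$@" | sort -k2,2nr
}

# Typical use:
#   top_lwps /tmp/prstat.*
```

Each output line holds an LWP ID followed by its summed CPU percentage. An LWP that stays at the top across the run is the one to look up in the matching pstack files.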

On Solaris and other UNIX platforms, show the system calls that occur during the hang using the truss command as follows:

truss -o /tmp/trace.txt -ealf -rall -wall -vall -p 21362

The value 21362 corresponds to the PID of the unresponsive ldapfwd process.

Analyzing Data About a Hang

Whenever the Directory Proxy Server crashes, it generates a core. With this core file and the process stack of the core file you obtained, you can analyze the problem. For information about analyzing a core file, see Examining a Core File on Solaris. However, rather than running the utility from the ns-slapd binary directory, you must run it from the

For example, suppose the output of the truss command shows that no system calls were being made at the time of the hang, suggesting a passive hang. Looking at the core file and the jstack or pstack information, you can identify several threads that are waiting for a lock to be freed before they can continue processing. By comparing the output of the various tools, you can reasonably infer that the cause of the problem is a deadlock. With this information, Sun Support can better help you resolve your problem in a timely fashion.
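
The comparison described above can be partly automated for jstack snapshots. The following sketch counts, per thread name, how many snapshots show that thread waiting to enter a monitor. It assumes the usual jstack thread header, which begins with the quoted thread name and contains the text "waiting for monitor entry" for a thread blocked on a lock. The function name blocked_threads is hypothetical, and the sketch does not apply to pstack output.

```shell
#!/bin/sh
# Sketch only: list thread names that appear as "waiting for monitor entry"
# in jstack snapshots, with the number of snapshots in which each is blocked.
# Assumption: thread header lines start with the thread name in double quotes.
blocked_threads() {
    grep -h 'waiting for monitor entry' "$@" |
        sed 's/^"\([^"]*\)".*/\1/' |
        sort | uniq -c | sort -rn
}

# Typical use:
#   blocked_threads /tmp/jstack.*
```

Threads that are blocked in every snapshot are candidates for a deadlock, but the stack traces themselves still need to be read to confirm which locks are involved.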