This section describes how to troubleshoot a unresponsive or hung Directory Proxy Server process. A totally unresponsive process is called a hang. The remainder of this section describes how to collect and analyze data about a hang.
The jstat tool tells you the amount of CPU being used for each thread. If you collect a thread stack using the jstack utility at the same time you run the jstat tool, you can then use the jstack output to see what the thread was doing when it had trouble. If you run the jstack and jstat tools simultaneously several times, you can see over time if the same thread was causing the problem and if it was encountering the problem during the same function call.
To get the process ID of the running Directory Proxy Server, use the jps command. For example, the command is run as follows on Solaris:
# jps 8393 DistributionServerMain 2115 ContainerPrivate 21535 startup.jar 16672 Jps 13953 swupna.jar |
Collect usage information as follows:
# ./scp DPS-PID |
The DPS-PID field specifies the PID of the unresponsive process.
On Solaris and other UNIX platforms, show system calls that occur during the crash using the truss command as follows:
truss -o /tmp/trace.txt -ealf -rall -wall -vall -p 21362 |
The value 21362 corresponds to the PID of the unresponsive process.
The prstat tool tells you the amount of CPU being used for each thread. If you collect a process stack using the pstack utility at the same time you run the prstat tool, you can then use the pstack output to see what the thread was doing when it had trouble. If you run the prstat and pstack tools simultaneously several times, then you can see over time if the same thread was causing the problem and if it was encountering the problem during the same function call.
On Linux, use the lsstack or pstack commands instead of the Solaris pstack utility.
The following script automates the process of running these tools:
cat scp #!/bin/sh i=0 while [ "$i" -lt "10" ] do echo "$i\n" date=`date "+%y%m%d:%H%M%S"` prstat -L -p $1 0 1 > /tmp/prstat.$date pstack $1 > /tmp/pstack.$date i=`expr $i + 1`; sleep 1 done |
The value 10 in the [ "$i" -lt "10" ] line can be increased or decreased to suit the time during which the problem you are troubleshooting occurs. This adjustment allows to you collect a full set of process data to help troubleshoot the issue. Thus enabling a full process data set to be captured around the issue.
Collect usage information as follows:
# ./scp DPS-PID |
The DPS-PID field specifies the PID of the unresponsive process.
On Solaris and other UNIX platforms, show system calls that occur during the crash using the truss command as follows:
truss -o /tmp/trace.txt -ealf -rall -wall -vall -p 21362 |
The value 21362 corresponds to the PID of the unresponsive ldapfwd process.
Whenever the Directory Proxy Server crashes, it generates a core. With this core file and the process stack of the core file you obtained, you can analyze the problem. For information about analyzing a core file, see Examining a Core File on Solaris. However, rather than running the utility from the ns-slapd binary directory, you must run it from the
For example, the output of the truss command shows that no systems calls have been made at the time of the crash, suggesting a passive hang. Looking at the core file and the jstack or pstack information, you can identify several threads that are waiting for a lock to be freed to continue processing. By comparing the out put of the various tools you can safely guess that the cause of the problem is a deadlock. With this information, Sun Support can better help you resolve your problem in a timely fashion.