Some tools are provided with Solaris and Java that can help you troubleshoot process issues. The following sections provide an overview of the most useful of these tools.
As Directory Proxy Server 6.3 is a pure Java application, you can use the Java tools that are delivered with the JDK 1.5 to help troubleshoot problems. These tools include the following:
jstack. This tool provides information about the Directory Proxy Server thread stack.
jmap. This tool provides information about memory. For example, running jmap -histo PID prints a histogram of the heap.
jinfo. This tool provides you with information about the JVM environment.
jstat. This tool displays performance statistics for a JVM.
On Solaris, you can find these tools in the following location:
/usr/lang/JAVA/jdk1.5.0_03/solaris-sparc/bin
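For example, assuming the Directory Proxy Server process has PID 8393 (the PID shown in the jps example later in this section), you might run these tools as follows. The output file names are illustrative only.

# jstack 8393 > /tmp/jstack.txt
# jmap -histo 8393 > /tmp/histo.txt
# jinfo 8393
# jstat -gcutil 8393 1000 5

The jstat command in this sketch samples garbage collection utilization every 1000 milliseconds, five times.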
The JDK also includes a graphical monitoring tool called the Java Monitoring and Management Console (JConsole). JConsole uses Java Management Extensions (JMX) technology to report on the performance and resource consumption of applications running on the Java platform. It provides information and charts about memory use, thread use, class loading, and JVM parameters.
On UNIX platforms, if the kill -QUIT process-id command does not produce a thread dump, use jstack instead.
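For example, again assuming PID 8393:

# kill -QUIT 8393
# jstack 8393 > /tmp/jstack.txt

The kill -QUIT command writes the thread dump to the standard output of the Java process.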
Solaris includes a collection of process tools to help you collect more information about process problems, such as a hung process, crashed process, or memory usage problems. These tools include the following:
pmap — shows the process map, which includes a list of virtual addresses, where the dynamic libraries are loaded, and where the variables are declared.
pstack — shows the process stack. For each thread in the process, it shows the exact stack of instructions the thread was executing at the moment the process died or the pstack command was run.
pfiles — reports information about all open files in each process.
pldd — lists the dynamic libraries linked into each process.
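For example, assuming the process has PID 3167 (the PID used in the core file example later in this section), you might run these tools as follows. The output file names are illustrative only.

# pmap 3167 > /tmp/pmap.txt
# pstack 3167 > /tmp/pstack.txt
# pfiles 3167 > /tmp/pfiles.txt
# pldd 3167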
This section describes how to troubleshoot an unresponsive or hung Directory Proxy Server process. A totally unresponsive process is called a hang. The remainder of this section describes how to collect and analyze data about a hang.
The jstat tool reports JVM performance statistics, such as garbage collection activity. If you collect a thread stack using the jstack utility at the same time that you run jstat, you can use the jstack output to see what each thread was doing when the problem occurred. If you run the jstack and jstat tools together several times, you can see over time whether the same thread was causing the problem and whether it was encountering the problem during the same function call.
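The following sketch, modeled on the scp script shown later in this section, collects jstat and jstack output together ten times at one-second intervals. It assumes the Directory Proxy Server process ID, obtained with jps as described next, is passed as the first argument.

#!/bin/sh
# Collect JVM statistics and a thread stack every second, ten times,
# saving time-stamped output in /tmp for later correlation.
i=0
while [ "$i" -lt "10" ]
do
date=`date "+%y%m%d:%H%M%S"`
jstat -gcutil $1 > /tmp/jstat.$date
jstack $1 > /tmp/jstack.$date
i=`expr $i + 1`
sleep 1
done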
To get the process ID of the running Directory Proxy Server, use the jps command. For example, the command is run as follows on Solaris:
# jps
8393 DistributionServerMain
2115 ContainerPrivate
21535 startup.jar
16672 Jps
13953 swupna.jar
The prstat tool tells you the amount of CPU being used for each thread. If you collect a process stack using the pstack utility at the same time you run the prstat tool, you can then use the pstack output to see what the thread was doing when it had trouble. If you run the prstat and pstack tools simultaneously several times, then you can see over time if the same thread was causing the problem and if it was encountering the problem during the same function call.
On Linux, use the lsstack or pstack commands instead of the Solaris pstack utility.
The following script automates the process of running these tools:
cat scp
#!/bin/sh
# Run prstat and pstack against the process whose PID is given as $1,
# ten times at one-second intervals, saving time-stamped output in /tmp.
i=0
while [ "$i" -lt "10" ]
do
echo "$i\n"
date=`date "+%y%m%d:%H%M%S"`
prstat -L -p $1 0 1 > /tmp/prstat.$date
pstack $1 > /tmp/pstack.$date
i=`expr $i + 1`;
sleep 1
done
The value 10 in the [ "$i" -lt "10" ] line can be increased or decreased to suit the time during which the problem you are troubleshooting occurs. Adjusting this value lets you capture a full set of process data around the issue.
Collect usage information as follows:
# ./scp DPS-PID
The DPS-PID field specifies the PID of the unresponsive process.
On Solaris and other UNIX platforms, show the system calls made by the unresponsive process by using the truss command as follows:
truss -o /tmp/trace.txt -ealf -rall -wall -vall -p 21362
The value 21362 corresponds to the PID of the unresponsive ldapfwd process.
Whenever the Directory Proxy Server crashes, it generates a core file. With this core file and the process stack obtained from it, you can analyze the problem. For information about analyzing a core file, see Examining a Core File on Solaris. However, rather than running the utility from the ns-slapd binary directory, you must run it from the
For example, if the output of the truss command shows that no system calls were made at the time of the problem, this suggests a passive hang. Looking at the core file and the jstack or pstack information, you can identify several threads that are waiting for a lock to be freed before they can continue processing. By comparing the output of the various tools, you can safely guess that the cause of the problem is a deadlock. With this information, Sun Support can better help you resolve your problem in a timely fashion.
Core files and crash dumps are generated when a process or application terminates abnormally. Analyzing these files can help you identify the cause of the problem.
This section includes the following topics:
Analyzing the Directory Proxy Server 6.3 Core Data on Solaris
Analyzing the Directory Proxy Server 5.x Core Data on Solaris
Analyzing the Directory Proxy Server 5.x Core Data on Windows
Get all the libraries and binaries associated with the Directory Proxy Server process for core file analysis. You must configure your system to allow Directory Proxy Server to generate a core file if the server crashes. For more information about generating a core file, see Generating a Core File.
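For example, one possible way to enable core file generation on Solaris is shown below. The core file path is illustrative only; see Generating a Core File for the recommended procedure.

# ulimit -c unlimited
# coreadm -g /var/core/core.%f.%p -e global

The ulimit command removes the core file size limit for processes started from the current shell, and coreadm enables global core dumps named after the executable (%f) and process ID (%p).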
Collect the libraries by using the pkg_app script. The pkg_app script packages an executable and all of its shared libraries into one compressed tar file. You provide the process ID of the application and, optionally, the name of the core file to be opened. For more information about the pkg_app script, see Using the pkg_app Script on Solaris.
As superuser, run the pkg_app script as follows:
# pkg_app pid core-file
You can also run the pkg_app script without a core file, which reduces the size of the script output. If you do, you must later set the variable to the correct location of the core file.
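For example, assuming PID 3167 and the core file shown later in this section:

# pkg_app 3167 /var/core/core_dps-dr-zone1_ldapfwd_0_0_1156942096_3167
# pkg_app 3167

The first form packages the core file together with the binaries and libraries; the second form omits the core file.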
Once you have obtained a core file, run the jstack and jmap Java tools on the file.
Run the jstack utility as follows:
# jstack process-ID
For example, the jstack utility creates the following output:
# jstack 8393
Attaching to process ID 8393, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 1.5.0_03-b07
Thread t@1: (state = BLOCKED)

Thread t@42: (state = IN_NATIVE)
- sun.nio.ch.ServerSocketChannelImpl.accept0(java.io.FileDescriptor, \
java.io.FileDescriptor, java.net.InetSocketAddress[]) (Interpreted frame)
- sun.nio.ch.ServerSocketChannelImpl.accept0(java.io.FileDescriptor, \
java.io.FileDescriptor, java.net.InetSocketAddress[]) (Interpreted frame)
- sun.nio.ch.ServerSocketChannelImpl.accept() @bci=130, line=145 \
(Interpreted frame)
- com.sun.directory.proxy.extensions.ExtendedTCPClientListener.run() \
@bci=267, line=190 (Interpreted frame)

Thread t@41: (state = IN_NATIVE)
- sun.nio.ch.ServerSocketChannelImpl.accept0(java.io.FileDescriptor, \
java.io.FileDescriptor, java.net.InetSocketAddress[]) (Interpreted frame)
- sun.nio.ch.ServerSocketChannelImpl.accept0(java.io.FileDescriptor, \
java.io.FileDescriptor, java.net.InetSocketAddress[]) (Interpreted frame)
- sun.nio.ch.ServerSocketChannelImpl.accept() @bci=130, line=145 \
(Interpreted frame)
- com.sun.directory.proxy.extensions.ExtendedTCPClientListener.run() \
@bci=267, line=190 (Interpreted frame)
Once you have obtained a core file, run the pstack and pmap Solaris utilities on the file.
Run the pstack utility as follows:
# pstack core-file > /tmp/pstack.txt
For example, the pstack utility creates the following output:
core '/var/core/core_dps-dr-zone1_ldapfwd_0_0_1156942096_3167' of 3167:
./ldapfwd -t /var/opt/mps/serverroot/dps-dps-dr-zone1/etc/tailor.txt
----------------- lwp# 1 / thread# 1 --------------------
 fedc0b6c __pollsys (ffbff680, 2, ffbff610, 0, 0, 1770) + 8
 fed5cea8 poll (ffbff680, 2, 1770, 10624c00, 0, 0) + 7c
 ff19c610 _pr_poll_with_poll (1770, ffbff680, 927c0, ffbff91c, 2, 1) + 42c
 00039504 __1cLCAI_LF_CmgrDrun6F_v_ (75, 115d74, 11202c, 2, 88c00, 116510) + 1a8
 00062070 ldapfwdMain (0, ffbffa84, c, 9952c, feb60740, feb60780) + 1c
 0002f968 _start (0, 0, 0, 0, 0, 0) + 108
----------------- lwp# 3 / thread# 3 --------------------
 fedc0b6c __pollsys (fea19b70, 3, fea19b00, 0, 0, 3e8) + 8
 fed5cea8 poll (fea19b70, 3, 3e8, 10624c00, 0, 0) + 7c
 ff19c610 _pr_poll_with_poll (3e8, fea19b70, 186a0, fea19e8c, 3, 2) + 42c
 0005de38 __1cIidarPollEwait6MI_i_ (fea19e6c, 186a0, 16e360, 1, 0, 1) + 20
 000639ec __1cJHeartbeatJheartbeat6Fpv_1_ (158f40, 1, 14da70, faa36, 16e360, faa5d) + 114
 fedbfd9c _lwp_start (0, 0, 0, 0, 0, 0)
----------------- lwp# 136734 / thread# 136734 --------------------
 fedbfe40 __lwp_park (0, 0, 116710, 0, 0, 1) + 14
 00076548 __1cMCAI_LF_MutexHacquire6M_v_ (116708, fd2bc, 0, 46, fea24800, 1000) + 34
 00076158 __1cPCAI_LF_RefCountGAddRef6M_v_ (116708, 156510, 2400, 26dc, 11667c, 800) + 24
 0006ddb0 __1cVCAI_LF_ReferralServer2t5B6MpnPCAI_LF_ConnPair_pnVCAI_LF_RequestMessage_pnNCAI_LF_Server_ipcibbi_v_ (197198, fe7790d8, 1541c0, 156510, 185, 1565b8) + 50
 00046324 __1cPCAI_LF_ConnPairYstart_referral_operation6MipCpnNCAI_LF_Server_i_i_ (fe7790d8, 1, 1565b8, 156510, 1, 197198) + 26c
 0004f2a0 __1cPCAI_LF_ConnPairLsend_result6MrnOCAI_LF_Message_rnVCAI_LF_RequestMessage__v_ (fe7790d8, 156eb8, 1541c0, 2400, 0, 0) + 354
 00044758 __1cPCAI_LF_ConnPairOinner_activity6MpnOCAI_LF_Message__v_ (fe7790d8, 156eb8, 11202c, f4f27, fe7790f8, 1) + 114c
 00045c24 __1cPCAI_LF_ConnPairDrun6M_v_ (fe7797c8, 198383, 170c78, fe7790d8, fe779e54, fe779740) + 6cc
 00046cd0 CAI_LF_StartFunction (157500, 11202c, 1, 0, 46ab4, 1) + 21c
 fedbfd9c _lwp_start (0, 0, 0, 0, 0, 0)
----------------- lwp# 136735 / thread# 136735 --------------------
 fedc0b6c __pollsys (fe9e8d98, 3, fe9e8d28, 0, 0, 1d4c0) + 8
 fed5cea8 poll (fe9e8d98, 3, 1d4c0, 10624c00, 0, 0) + 7c
 ff19c610 _pr_poll_with_poll (1d4c0, fe9e8d98, b71b00, fe9e9e74, 3, 2) + 42c
 0005de38 __1cIidarPollEwait6MI_i_ (fe9e9e54, b71b00, 1, f4f27, fe9e90f8, 12d8d0) + 20
 00045cbc __1cPCAI_LF_ConnPairDrun6M_v_ (fe9e97c8, 17bb43, 130ff8, fe9e90d8, fe9e9e54, fe9e9740) + 764
 00046cd0 CAI_LF_StartFunction (157500, 11202c, 1, 0, 46ab4, 1) + 21c
 fedbfd9c _lwp_start (0, 0, 0, 0, 0, 0)
----------------- lwp# 136738 / thread# 136738 --------------------
 fedc0b6c __pollsys (fe9a8d98, 3, fe9a8d28, 0, 0, 1d4c0) + 8
 fed5cea8 poll (fe9a8d98, 3, 1d4c0, 10624c00, 0, 0) + 7c
 ff19c610 _pr_poll_with_poll (1d4c0, fe9a8d98, b71b00, fe9a9e74, 3, 2) + 42c
 0005de38 __1cIidarPollEwait6MI_i_ (fe9a9e54, b71b00, 1, f4f27, fe9a90f8, 0) + 20
 00045cbc __1cPCAI_LF_ConnPairDrun6M_v_ (fe9a97c8, 198123, 130fd8, fe9a90d8, fe9a9e54, fe9a9740) + 764
 00046cd0 CAI_LF_StartFunction (157500, 11202c, 1, 0, 46ab4, 1) + 21c
 fedbfd9c _lwp_start (0, 0, 0, 0, 0, 0)
----------------- lwp# 136788 / thread# 136788 --------------------
 00197d68 ???????? (116708, 1, 156510, 0, 0, 1000)
 0006dee4 __1cVCAI_LF_ReferralServer2T6M_v_ (155780, 1000, fedecbc0, fea25c00, fe779780, 0) + 34
 000707c4 __SLIP.DELETER__A (155780, 1, 4, a6c, 1, 2400) + 4
 00046a94 CAI_LF_ReferralStartFunction (155780, fe6da000, 0, 0, 46a64, 1) + 30
 fedbfd9c _lwp_start (0, 0, 0, 0, 0, 0)
You can also use the mdb or adb command instead of the pstack command to show the stack of the core. The mdb command is a modular debugger and the adb command is a general purpose debugger that is part of Solaris. Run the mdb command as follows:
# mdb $path-to-executable $path-to-core
$C to show the core stack
$q to quit
The output of the mdb and pstack commands provides helpful information about the process stack at the time of the crash. The mdb $C command output identifies the exact thread that was executing at the time of the crash.
On Solaris 8 and 9, the first thread of the pstack output often contains the thread responsible for the crash. On Solaris 10, use mdb to find the crashing thread or, if using the pstack command, analyze the stack by looking for threads that do not contain lwp_park, poll, or pollsys calls.
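For example, the following sketch prints only the thread stacks from a saved pstack output file that contain none of those calls. It assumes GNU awk (gawk), because it relies on a multi-character record separator.

# Treat each dashed-line-delimited thread stanza as one record and print
# only the stanzas that mention none of the idle-wait calls.
gawk 'BEGIN { RS = "-----------------" } !/lwp_park|poll/' /tmp/pstack.txt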
On Solaris, you can also use the dbx symbolic debugger, a developer tool available free of charge from http://sun.com/. The dbx tool provides symbolic debugging and lets you examine and manipulate variables. For example, the dbx debugger produces the following output:
current thread: t@2482121
=>[1] 0x0(0x156138, 0x0, 0xff000000, 0xfedefad4, 0x2, 0x2), at 0xffffffffffffffff
  [2] CAI_LF_RefCount::Release(0x116708, 0x1, 0x156138, 0x0, 0x0, 0x1000), at 0x7629c
  [3] CAI_LF_ReferralServer::~CAI_LF_ReferralServer(0x241270, 0x1000, 0xfedecbc0, 0xfe097c00, 0xfe8d9780, 0x0), at 0x6dee4
  [4] __SLIP.DELETER__A(0x241270, 0x1, 0x4, 0xa6c, 0x1, 0x2400), at 0x707c4
  [5] CAI_LF_ReferralStartFunction(0x241270, 0xfe81a000, 0x0, 0x0, 0x46a64, 0x1), at 0x46a94
On Linux, use the lsstack or pstack commands instead of the Solaris pstack utility. For example, run the lsstack command as follows:
# lsstack /tmp/core-file
You can also use the GNU project debugger, gdb, to see what is happening at the time of the crash. Run this debugger as follows:
# gdb ./ldapfwd /tmp/core-file
For more information about the useful commands available with the gdb tool, see the gdb man page.
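For example, the following commonly used gdb commands show what each thread was doing at the time of the crash:

(gdb) bt
(gdb) info threads
(gdb) thread apply all bt

The bt command prints a back trace of the current thread, info threads lists all threads, and thread apply all bt prints a back trace for every thread.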
As on Linux, on HP-UX you can use the GNU project debugger to see what is happening at the time of the crash. Run this debugger as follows:
# gdb ./ldapfwd /tmp/core-file
For more information about the useful commands available with the gdb tool, see the gdb man page.
On Windows, you can use the WinDbg debugger, which provides a user interface for both kernel-mode and user-mode debugging. You run this debugger on a Windows crash dump file.
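For example, after opening a crash dump in WinDbg, the following commonly used commands help locate the failing thread. The prompt shown is illustrative.

0:000> !analyze -v
0:000> ~* kb
0:000> lm

The !analyze -v command performs an automated analysis of the crash, ~* kb displays a stack back trace for every thread, and lm lists the loaded modules.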