Sun Java System Directory Server Enterprise Edition 6.2 Troubleshooting Guide

Chapter 4 Troubleshooting Directory Proxy Server

This chapter describes how to troubleshoot problems you encounter with Directory Proxy Server. It contains the following sections:

Collecting Generic Directory Proxy Server Data

No matter what type of problem you encounter, there is a minimum set of data that must be collected and, if necessary, provided to Sun Support.

Collecting Version Information for Directory Proxy Server

The following sections describe how to collect version information on current and previous versions of Directory Proxy Server.

Collecting Directory Proxy Server 6.2 Version Information

Collect the Directory Proxy Server 6.2 version information. This information is available in the instance-dir/logs/errors file. For example, the error log displays the version information as follows:


[21/May/2007:18:01:27 +0200] - STARTUP    - INFO  - \
Sun-Java(tm)-System-Directory-Proxy-Server/6.1 B2007.134.2156 started \
on host server1 in directory /local/dps.3333
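To pull the version line out of a long error log, you can filter on the server name string. The following is a minimal sketch: the /tmp/errors.sample file stands in for instance-dir/logs/errors and holds a copy of the STARTUP line shown above.

```shell
# Create a stand-in for instance-dir/logs/errors with the STARTUP line.
printf '%s\n' \
  '[21/May/2007:18:01:27 +0200] - STARTUP    - INFO  - Sun-Java(tm)-System-Directory-Proxy-Server/6.1 B2007.134.2156 started on host server1 in directory /local/dps.3333' \
  > /tmp/errors.sample

# Filter for the line carrying the version string.
grep 'Directory-Proxy-Server/' /tmp/errors.sample
```

Against a live instance, point grep at the real errors file instead of the sample.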

Collecting Directory Proxy Server 5.x Version Information

If you are using migrated Directory Proxy Server 5.x instances, collect the version information as follows:


# install-path/bin/dps/server/bin/ldapfwd -v

On UNIX and Linux systems, you might see the following error:


ld.so.1: ldapfwd: fatal: libnss3.so: open failed: No such file or directory

If you see this error, set the LD_LIBRARY_PATH environment variable so that the Directory Proxy Server libraries are in your load path. For example, if you use sh, use the following command:


# export LD_LIBRARY_PATH=install-path/lib
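If you use csh or tcsh instead, the equivalent command is the following sketch, where install-path is the same placeholder used above:

```shell
# csh/tcsh equivalent of the sh export shown above
setenv LD_LIBRARY_PATH install-path/lib
```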

Running the dpadm Command in Verbose Mode

Running the dpadm command in verbose mode provides information to help troubleshoot problems that occur during instance creation or deletion, data backup, and so on. Run the dpadm command in verbose mode as follows:


# dpadm -v

Collecting Directory Proxy Server Configuration Information

The following sections describe how to collect configuration information on current and previous versions of Directory Proxy Server.

Collecting Configuration Information on Directory Proxy Server 6.2

Collect the Directory Proxy Server 6.2 configuration information. This information is available in the instance-dir/logs/errors file. For example, the error log displays the configuration information as follows:


[user@server1 local]$ more dps.3333/logs/errors
[21/May/2007:18:01:27 +0200] - STARTUP    - INFO  - Global log level INFO (from config)
[21/May/2007:18:01:27 +0200] - STARTUP    - INFO  - Logging Service configured
[21/May/2007:18:01:27 +0200] - STARTUP    - INFO  - Java Version: 1.5.0_09 (Java Home: \
/local/jre)
[21/May/2007:18:01:27 +0200] - STARTUP    - INFO  - Java Heap Space: Total Memory (-Xms) \
= 246MB, Max Memory (-Xmx) = 246MB
[21/May/2007:18:01:27 +0200] - STARTUP    - INFO  - Operating System: \
Linux/i386 2.6.17-1.2139_FC5smp

Collecting Configuration Information on Directory Proxy Server 5.x

Collect the Directory Proxy Server 5.x configuration information as follows:


# cd install-path/bin/dps_utilities
# ./dpsconfig2ldif -t install-path/dps-name/etc/tailor.txt.backup \
-o /tmp/DPS_tailor_Config.ldif

The DPS_tailor_Config.ldif file contains configuration information formatted as follows:


Begin configuration_url:file:///server-root/instance/
etc/tailor.ldif#cn=dps-instance1,cn=Sun ONE Directory Proxy Server,
cn=ServerGroup (1),cn=instance1.example.com,ou=example.com,
o=NetscapeRoot End

Collecting Directory Proxy Server Log Information

Collect the Directory Proxy Server logs. By default, the logs are stored in the following directory:


instance-path/logs

If you are providing this information to Sun Support, you should also include the generic Directory Server data from the various Directory Servers involved. This generic data includes the Directory Server version and the Directory Server access, error, and audit logs. For more information about collecting the Directory Server generic information, see Collecting Generic Data.

Include generic information about any other backend servers you may be using, such as JDBC backends, a SQL database, or an Oracle database.
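To package the logs for transmission to Sun Support, one approach is to archive the logs directory into a single compressed file. The following is a sketch: the instance path is an assumption, and the demo creates it if it does not already exist.

```shell
# Package the instance logs into a single compressed archive.
# INSTANCE_PATH is an example; point it at your own instance-path.
INSTANCE_PATH=${INSTANCE_PATH:-/tmp/dps-example}
mkdir -p "$INSTANCE_PATH/logs"      # ensures the demo path exists
tar -cf /tmp/dps-logs.tar -C "$INSTANCE_PATH" logs
gzip -f /tmp/dps-logs.tar
ls -l /tmp/dps-logs.tar.gz
```

The resulting /tmp/dps-logs.tar.gz file can then be attached to your support case.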

Troubleshooting Directory Proxy Server Installation Problems

This section describes procedures to help you debug problems installing Directory Proxy Server. It includes the following:

Troubleshooting Directory Proxy Server 5.2 Installation Failures

Installation may fail if the password contains a dollar sign ($) character, such as pa$$word. For example, you might get the following error message:


[4] stderr > Can't read "word" no such variable

When the installer parses the password, it interprets the text after the dollar sign, $word, as a variable, and this variable does not exist.

To work around the problem, change the password so that it does not include the dollar sign character.
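The installer's own parser, not sh, produces the error shown above, but sh illustrates the same class of problem: text after a dollar sign is read as a variable reference, and an undefined variable expands to nothing.

```shell
# Text after "$" is read as a variable reference; an undefined
# variable expands to nothing, silently corrupting the value.
unset word
password="pa$word"      # double quotes permit variable expansion
echo "$password"        # prints "pa" -- the rest of the value is gone
```

Single-quoting the value would prevent the expansion in sh, but the safest fix for the installer remains avoiding the dollar sign entirely.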

Troubleshooting Problems Starting Directory Proxy Server 5.2 on Windows

If Directory Proxy Server 5.2 fails to start on Windows, check the following:

  1. Check that no related keys remain in the registry, such as dps, sunOne, idar, dar, and iplanet.

  2. Remove the product registry file located in the C:\WINNT\System32 directory.

  3. Reboot your machine.

  4. Try to reinstall Directory Proxy Server.

You may also need to manually remove the Localmachine->system->controlset001->services->admin52 entry. This entry may be causing your problem if you receive the following error in the Admin server install log:


Error: Writing Administration Server service keys to the Windows registry... failed. 

Troubleshooting Problems With the Directory Proxy Server Process

This section describes procedures for the following:

Overview of Process Troubleshooting Tools

Solaris and Java provide tools that can help you troubleshoot process issues. The following sections provide an overview of some of the most useful tools.

Using Java Tools With Directory Proxy Server 6.2

As Directory Proxy Server 6.2 is a pure Java application, you can use the Java tools that are delivered with the JDK 1.5 to help troubleshoot problems. These tools include the following:

On Solaris, you can find these tools in the following location:


/usr/lang/JAVA/jdk1.5.0_03/solaris-sparc/bin

The JVM also includes a graphical monitoring tool called the Java Monitoring and Management Console (JConsole). JConsole uses Java Management Extensions (JMX) technology to report on the performance and resource consumption of applications running on the Java platform, providing information and charts about memory use, thread use, class loading, and JVM parameters.

On UNIX platforms, if the kill -QUIT process-id command does not produce a thread dump, use jstack instead.

Using Solaris Tools With Directory Proxy Server 5.x

Solaris includes a collection of process tools to help you collect more information about process problems, such as a hung process, crashed process, or memory usage problems. These tools include the following:

Troubleshooting a Hung or Unresponsive Directory Proxy Server Process

This section describes how to troubleshoot an unresponsive or hung Directory Proxy Server process. A totally unresponsive process is called a hang. The remainder of this section describes how to collect and analyze data about a hang.

Collecting Data About a Directory Proxy Server 6.2 Hang on Solaris

The jstat tool reports statistics about a running JVM, such as garbage collection and class-loading activity. If you collect a thread stack using the jstack utility at the same time you run the jstat tool, you can use the jstack output to see what a thread was doing when it had trouble. If you run the jstack and jstat tools simultaneously several times, you can see over time whether the same thread was causing the problem and whether it encountered the problem during the same function call.

To get the process ID of the running Directory Proxy Server, use the jps command. For example, the command is run as follows on Solaris:


# jps
8393 DistributionServerMain
2115 ContainerPrivate
21535 startup.jar
16672 Jps
13953 swupna.jar
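To capture the PID programmatically, you can filter the jps output. The following is a sketch: the sample file stands in for live jps output, and DistributionServerMain is the main class shown in the example above.

```shell
# Extract the Directory Proxy Server PID from jps-style output.
# /tmp/jps.sample stands in for a live "jps" invocation.
cat > /tmp/jps.sample <<'EOF'
8393 DistributionServerMain
2115 ContainerPrivate
16672 Jps
EOF
DPS_PID=`awk '/DistributionServerMain/ { print $1 }' /tmp/jps.sample`
echo "$DPS_PID"
```

Against a live server, replace the sample file with `jps` itself, for example `jps | awk '/DistributionServerMain/ { print $1 }'`.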

Collect usage information with a data-collection script, such as the scp script shown in Collecting and Analyzing Data About a Directory Proxy Server 5.x Hang on Solaris, as follows:


# ./scp DPS-PID

The DPS-PID field specifies the PID of the unresponsive process.

On Solaris and other UNIX platforms, show the system calls that occur during the hang by using the truss command as follows:


truss -o /tmp/trace.txt -ealf -rall -wall -vall -p 21362

The value 21362 corresponds to the PID of the unresponsive process.

Collecting and Analyzing Data About a Directory Proxy Server 5.x Hang on Solaris

The prstat tool tells you the amount of CPU being used for each thread. If you collect a process stack using the pstack utility at the same time you run the prstat tool, you can then use the pstack output to see what the thread was doing when it had trouble. If you run the prstat and pstack tools simultaneously several times, then you can see over time if the same thread was causing the problem and if it was encountering the problem during the same function call.


Note –

On Linux, use the lsstack or pstack commands instead of the Solaris pstack utility.


The following script automates the process of running these tools:


cat scp
#!/bin/sh  

i=0 
while [ "$i" -lt "10" ] 
do
         echo "$i\n"
         date=`date "+%y%m%d:%H%M%S"`
         prstat -L -p $1 0 1 > /tmp/prstat.$date
         pstack $1 > /tmp/pstack.$date
         i=`expr $i + 1`;
         sleep 1 
done  

The value 10 in the [ "$i" -lt "10" ] line can be increased or decreased to cover the period during which the problem you are troubleshooting occurs. This adjustment allows you to collect a full set of process data around the issue.

Collect usage information as follows:


# ./scp DPS-PID

The DPS-PID field specifies the PID of the unresponsive process.

On Solaris and other UNIX platforms, show the system calls that occur during the hang by using the truss command as follows:


truss -o /tmp/trace.txt -ealf -rall -wall -vall -p 21362

The value 21362 corresponds to the PID of the unresponsive ldapfwd process.

Analyzing Data About a Hang

Whenever Directory Proxy Server crashes, it generates a core file. With this core file and the process stack that you obtained from it, you can analyze the problem. For information about analyzing a core file, see Examining a Core File on Solaris. However, rather than running the utility from the ns-slapd binary directory, you must run it from the

For example, if the output of the truss command shows that no system calls were being made at the time of the problem, a passive hang is suggested. Looking at the core file and the jstack or pstack information, you can identify several threads that are waiting for a lock to be freed before they can continue processing. By comparing the output of the various tools, you can reasonably conclude that the cause of the problem is a deadlock. With this information, Sun Support can better help you resolve your problem in a timely fashion.

Troubleshooting a Crashed Directory Proxy Server Process

Core files and crash dumps are generated when a process or application terminates abnormally. Analyzing these files can help you identify the cause of your problem.

This section includes the following topics:

Getting the Core and Shared Libraries

Get all the libraries and binaries associated with the Directory Proxy Server process for core file analysis. You must configure your system to allow Directory Proxy Server to generate a core file if the server crashes. For more information about generating a core file, see Generating a Core File.
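One common prerequisite, assuming a sh-family shell, is to lift the per-process core file size limit in the environment that starts the server; Solaris also provides coreadm(1M) for system-wide core file policy. A sketch:

```shell
# Remove the per-process core file size limit so that a crashing
# server process is able to write a core file.
ulimit -c unlimited
ulimit -c        # verify the new limit; prints "unlimited"
```

The limit must be raised in the shell (or startup script) that launches the server process, before the server starts.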

Collect the libraries using the pkg_app script. The pkg_app script packages an executable and all of its shared libraries into one compressed tar file. You provide the process ID of the application and, optionally, the name of the core file to be opened. For more information about the pkg_app script, see Using the pkg_app Script on Solaris.

As superuser, run the pkg_app script as follows:


# pkg_app pid core-file

Note –

You can also run the pkg_app script without a core file, which reduces the size of the script output. You then need to set the variable to the correct location of the core file later.


Analyzing the Directory Proxy Server 6.2 Core Data on Solaris

Once you have obtained a core file, run the jstack and jmap Java tools on the file.

Run the jstack utility as follows:


# jstack process-ID

For example, the jstack utility creates the following output:


# jstack 8393
Attaching to process ID 8393, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 1.5.0_03-b07
Thread t@1: (state = BLOCKED)

Thread t@42: (state = IN_NATIVE)
 - sun.nio.ch.ServerSocketChannelImpl.accept0(java.io.FileDescriptor, java.io.FileDescriptor, \
java.net.InetSocketAddress[]) (Interpreted frame)
 - sun.nio.ch.ServerSocketChannelImpl.accept0(java.io.FileDescriptor, java.io.FileDescriptor, \
java.net.InetSocketAddress[]) (Interpreted frame)
 - sun.nio.ch.ServerSocketChannelImpl.accept() @bci=130, line=145 (Interpreted frame)
 - com.sun.directory.proxy.extensions.ExtendedTCPClientListener.run() @bci=267, \
line=190 (Interpreted frame)

Thread t@41: (state = IN_NATIVE)
 - sun.nio.ch.ServerSocketChannelImpl.accept0(java.io.FileDescriptor, java.io.FileDescriptor, \
java.net.InetSocketAddress[]) (Interpreted frame)
 - sun.nio.ch.ServerSocketChannelImpl.accept0(java.io.FileDescriptor, java.io.FileDescriptor, \
java.net.InetSocketAddress[]) (Interpreted frame)
 - sun.nio.ch.ServerSocketChannelImpl.accept() @bci=130, line=145 (Interpreted frame)
 - com.sun.directory.proxy.extensions.ExtendedTCPClientListener.run() @bci=267, \
line=190 (Interpreted frame)

Analyzing the Directory Proxy Server 5.x Core Data on Solaris

Once you have obtained a core file, run the pstack and pmap Solaris utilities on the file.

Run the pstack utility as follows:


# pstack core-file > /tmp/pstack.txt

For example, the pstack utility creates the following output:


core '/var/core/core_dps-dr-zone1_ldapfwd_0_0_1156942096_3167' of 3167: ./ldapfwd 
-t /var/opt/mps/serverroot/dps-dps-dr-zone1/etc/tailor.txt
-----------------  lwp# 1 / thread# 1  --------------------
 fedc0b6c __pollsys (ffbff680, 2, ffbff610, 0, 0, 1770) + 8
 fed5cea8 poll     (ffbff680, 2, 1770, 10624c00, 0, 0) + 7c
 ff19c610 _pr_poll_with_poll (1770, ffbff680, 927c0, ffbff91c, 2, 1) + 42c
 00039504 __1cLCAI_LF_CmgrDrun6F_v_ (75, 115d74, 11202c, 2, 88c00, 116510) + 1a8
 00062070 ldapfwdMain (0, ffbffa84, c, 9952c, feb60740, feb60780) + 1c
 0002f968 _start   (0, 0, 0, 0, 0, 0) + 108
-----------------  lwp# 3 / thread# 3  --------------------
 fedc0b6c __pollsys (fea19b70, 3, fea19b00, 0, 0, 3e8) + 8
 fed5cea8 poll     (fea19b70, 3, 3e8, 10624c00, 0, 0) + 7c
 ff19c610 _pr_poll_with_poll (3e8, fea19b70, 186a0, fea19e8c, 3, 2) + 42c
 0005de38 __1cIidarPollEwait6MI_i_ (fea19e6c, 186a0, 16e360, 1, 0, 1) + 20
 000639ec __1cJHeartbeatJheartbeat6Fpv_1_ (158f40, 1, 14da70, faa36, 16e360, faa5d) + 114
 fedbfd9c _lwp_start (0, 0, 0, 0, 0, 0)
-----------------  lwp# 136734 / thread# 136734  --------------------
 fedbfe40 __lwp_park (0, 0, 116710, 0, 0, 1) + 14
 00076548 __1cMCAI_LF_MutexHacquire6M_v_ (116708, fd2bc, 0, 46, fea24800, 1000) + 34
 00076158 __1cPCAI_LF_RefCountGAddRef6M_v_ (116708, 156510, 2400, 26dc, 11667c, 800) + 24
 0006ddb0 __1cVCAI_LF_ReferralServer2t5B6MpnPCAI_LF_ConnPair_pnVCAI_LF_RequestMessage_pnNCAI_LF_Server
_ipcibbi_v_ (197198, fe7790d8, 1541c0, 156510, 185, 1565b8) + 50
 00046324 __1cPCAI_LF_ConnPairYstart_referral_operation6MipCpnNCAI_LF_Server_i_i_ (fe7790d8, 1, 1565b8
, 156510, 1, 197198) + 26c
 0004f2a0 __1cPCAI_LF_ConnPairLsend_result6MrnOCAI_LF_Message_rnVCAI_LF_RequestMessage__v_ (fe7790d8, 
156eb8, 1541c0, 2400, 0, 0) + 354
 00044758 __1cPCAI_LF_ConnPairOinner_activity6MpnOCAI_LF_Message__v_ (fe7790d8, 156eb8, 11202c, f4f27,
 fe7790f8, 1) + 114c
 00045c24 __1cPCAI_LF_ConnPairDrun6M_v_ (fe7797c8, 198383, 170c78, fe7790d8, fe779e54, fe779740) + 6cc
 00046cd0 CAI_LF_StartFunction (157500, 11202c, 1, 0, 46ab4, 1) + 21c
 fedbfd9c _lwp_start (0, 0, 0, 0, 0, 0)
-----------------  lwp# 136735 / thread# 136735  --------------------
 fedc0b6c __pollsys (fe9e8d98, 3, fe9e8d28, 0, 0, 1d4c0) + 8
 fed5cea8 poll     (fe9e8d98, 3, 1d4c0, 10624c00, 0, 0) + 7c
 ff19c610 _pr_poll_with_poll (1d4c0, fe9e8d98, b71b00, fe9e9e74, 3, 2) + 42c
 0005de38 __1cIidarPollEwait6MI_i_ (fe9e9e54, b71b00, 1, f4f27, fe9e90f8, 12d8d0) + 20
 00045cbc __1cPCAI_LF_ConnPairDrun6M_v_ (fe9e97c8, 17bb43, 130ff8, fe9e90d8, fe9e9e54, fe9e9740) + 764
 00046cd0 CAI_LF_StartFunction (157500, 11202c, 1, 0, 46ab4, 1) + 21c
 fedbfd9c _lwp_start (0, 0, 0, 0, 0, 0)
-----------------  lwp# 136738 / thread# 136738  --------------------
 fedc0b6c __pollsys (fe9a8d98, 3, fe9a8d28, 0, 0, 1d4c0) + 8
 fed5cea8 poll     (fe9a8d98, 3, 1d4c0, 10624c00, 0, 0) + 7c
 ff19c610 _pr_poll_with_poll (1d4c0, fe9a8d98, b71b00, fe9a9e74, 3, 2) + 42c
 0005de38 __1cIidarPollEwait6MI_i_ (fe9a9e54, b71b00, 1, f4f27, fe9a90f8, 0) + 20
 00045cbc __1cPCAI_LF_ConnPairDrun6M_v_ (fe9a97c8, 198123, 130fd8, fe9a90d8, fe9a9e54, fe9a9740) + 764
 00046cd0 CAI_LF_StartFunction (157500, 11202c, 1, 0, 46ab4, 1) + 21c
 fedbfd9c _lwp_start (0, 0, 0, 0, 0, 0)
-----------------  lwp# 136788 / thread# 136788  --------------------
 00197d68 ???????? (116708, 1, 156510, 0, 0, 1000)
 0006dee4 __1cVCAI_LF_ReferralServer2T6M_v_ (155780, 1000, fedecbc0, fea25c00, fe779780, 0) + 34
 000707c4 __SLIP.DELETER__A (155780, 1, 4, a6c, 1, 2400) + 4
 00046a94 CAI_LF_ReferralStartFunction (155780, fe6da000, 0, 0, 46a64, 1) + 30
 fedbfd9c _lwp_start (0, 0, 0, 0, 0, 0) 

You can also use the mdb or adb command instead of the pstack command to show the stack from the core file. The mdb command is a modular debugger and the adb command is a general-purpose debugger; both are part of Solaris. Run the mdb command as follows:


# mdb path-to-executable path-to-core
$C    to show the core stack
$q    to quit

The output of the mdb and pstack commands provides helpful information about the process stack at the time of the crash. The mdb $C command output identifies the exact thread that was executing at the time of the crash.

On Solaris 8 and 9, the first thread of the pstack output often contains the thread responsible for the crash. On Solaris 10, use mdb to find the crashing thread or, if using the pstack command, analyze the stack by looking for threads that do not contain __lwp_park, poll, or __pollsys frames.
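A rough way to triage a large pstack capture on Solaris 10 is to filter out the idle frames. The following is a sketch: the sample file stands in for real pstack output like the example above.

```shell
# Build a stand-in for a real pstack capture with one idle thread
# and one suspicious thread.
cat > /tmp/pstack.sample <<'EOF'
-----------------  lwp# 1 / thread# 1  --------------------
 fedc0b6c __pollsys (ffbff680, 2, ffbff610, 0, 0, 1770) + 8
-----------------  lwp# 136788 / thread# 136788  --------------------
 00197d68 ???????? (116708, 1, 156510, 0, 0, 1000)
EOF

# Drop lines for threads idling in __lwp_park/poll/__pollsys so the
# remaining frames point at more interesting threads.
grep -v -e lwp_park -e poll /tmp/pstack.sample
```

Note that this filters individual lines rather than whole thread stacks; in a real capture, use the surviving lwp# markers to locate the full stacks worth inspecting.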

On Solaris, you can also use the dbx symbolic debugger, a developer tool available free from http://sun.com/. The dbx tool provides symbolic debugging and lets you examine and manipulate variables. For example, the dbx debugger produces the following output:


current thread: t@2482121
=>[1] 0x0(0x156138, 0x0, 0xff000000, 0xfedefad4, 0x2, 0x2), at 0xffffffffffffffff
  [2] CAI_LF_RefCount::Release(0x116708, 0x1, 0x156138, 0x0, 0x0, 0x1000), at 0x7629c
  [3] CAI_LF_ReferralServer::~CAI_LF_ReferralServer(0x241270, 0x1000, 0xfedecbc0, 0xfe097c00, 
0xfe8d9780, 0x0), at 0x6dee4
  [4] __SLIP.DELETER__A(0x241270, 0x1, 0x4, 0xa6c, 0x1, 0x2400), at 0x707c4
  [5] CAI_LF_ReferralStartFunction(0x241270, 0xfe81a000, 0x0, 0x0, 0x46a64, 0x1), at 0x46a94

Analyzing the Directory Proxy Server 5.x Core Data on Linux

On Linux, use the lsstack or pstack commands instead of the Solaris pstack utility. For example, run the lsstack command as follows:


# lsstack /tmp/core-file

You can also use the GNU project debugger, gdb, to see what is happening at the time of the crash. Run this debugger as follows:


# gdb ./ldapfwd /tmp/core-file

For more information about the useful commands available with the gdb tool, see the gdb man page.

Analyzing the Directory Proxy Server 5.x Core Data on HP-UX

As on Linux, on HP-UX you can use the GNU project debugger to see what was happening at the time of the crash. Run this debugger as follows:


# gdb ./ldapfwd /tmp/core-file

For more information about the useful commands available with the gdb tool, see the gdb man page.

Analyzing the Directory Proxy Server 5.x Core Data on Windows

On Windows, you can use the WinDbg debugger, which provides a user interface for kernel and NT debugging. It can function as both a kernel-mode and a user-mode debugger. You run this debugger on a Windows crash dump file.