This technical note describes how to use Sun™ Gathering Debug Data (Sun GDD or GDD) to collect data that the Sun Support Center requires in order to debug problems with Sun Java™ System Directory Server software.
By collecting this data before you open a Service Request, you can substantially reduce the time needed to analyze and resolve the problem. For more information on how this document and its associated scripts can help you deal with Directory Server problems, see:
http://www.sun.com/services/gdd/index.html
This document is intended for anyone who needs to open a Service Request about Directory Server software with the Sun Support Center.
Version | Date | Description of Changes |
---|---|---|
1.0 | December 2006 | Initial release of this technical note. |
1.1 | January 2007 | |
This document covers Sun Java System Directory Server 5 on all supported platforms.
You can use this document in all types of environments, including test, pre-production, and production. To avoid performance impact, verbose debugging is used only when necessary. In some cases, the problem might disappear when you configure logging for debug mode. Even so, this level of logging is the minimum needed to understand the problem. In the majority of cases, the debug data described in this document is sufficient to analyze the problem.
This document does not provide workarounds, or techniques or tools for analyzing debug data. It provides some troubleshooting information, but you should not use this guide as a general approach to troubleshooting Directory Server problems.
If your problem does not conveniently fit into any of the specific categories, supply the general information described in To Collect Required Debug Data For Any Directory Server Problem and clearly explain your problem.
If the information you initially provide is not sufficient to find the root cause of the problem, Sun will ask for more details, as needed.
The prerequisites for collecting debug data for Directory Server are as follows.
Make sure you have superuser privileges.
On the Solaris OS platform, obtain the dirtracer and pkg_app scripts from the following location.
http://www.sun.com/bigadmin/scripts/indexSjs.html
See To Run the pkg_app Script for instructions on using pkg_app. See the documentation bundled with dirtracer for information on using that script.
On the Windows platform, download the free Debugging Tools for Windows to help in analyzing process hang problems. The debugger Dr. Watson is not useful for process hang problems because it cannot generate a crash dump on a running process. Download the free Debugging Tools from the following location.
http://www.microsoft.com/whdc/devtools/debugging/default.mspx
Install the latest version of Debugging Tools and the OS symbols for your version of Windows. You must also set the environment variable _NT_SYMBOL_PATH.
Use the command drwtsn32 -i to select Dr. Watson as the default debugger. Use the command drwtsn32, check all options, and choose the path for crash dumps.
The following variables are used in the procedures in this document. Determine their values before you start the procedures.
server-root: The base file system location for Directory Server.
win-dbg-root: The base file system location for Windows Debugging Tools.
Many paths specified in this document use the forward slash format of UNIX. If you are running software on a Windows system, use the equivalent backslash format.
Collecting debug data for any Directory Server problem involves these basic operations:
Collecting basic problem and system information.
Collecting specific problem information.
Creating a tar.gz file of all the information and uploading the file to the Sun Support Center.
Creating a Service Request with the Sun Support Center.
When you create a Service Request with the Sun Support Center, either online or by phone, provide the following information:
A clear problem description
Details of the state of the system, both before and after the problem started
Impact on end users
All recent software and hardware changes
Any actions already attempted
Whether the problem is reproducible; when reproducible, provide the detailed test case
Whether a pre-production or test environment is available
Name and location of the archive file containing the debug data
Upload your debug data archive file to one of the following locations:
When opening a Service Request by phone with the Sun Support Center, provide a summary of the problem. Also provide the details in a text file named Description.txt. Be sure to include Description.txt in the archive along with the rest of your debug data.
For more information on how to upload files, see: http://supportfiles.sun.com/show?target=faq
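The packaging step described above can be sketched as two small shell functions. This is a minimal sketch and not part of the GDD scripts: the function names `write_description_template` and `make_debug_archive`, and the idea of keeping all collected data in one directory, are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the GDD scripts).

# Write a skeleton Description.txt with the items the Support Center asks for.
write_description_template() {
    # $1: directory that holds the collected debug data
    cat > "$1/Description.txt" <<'EOF'
Problem description:
System state before and after the problem started:
Impact on end users:
Recent software and hardware changes:
Actions already attempted:
Reproducible (yes/no, with test case):
Pre-production or test environment available (yes/no):
EOF
}

# Pack a debug data directory into a gzip-compressed tar archive.
make_debug_archive() {
    # $1: directory to pack, $2: archive file to create
    data_dir=$1
    archive=$2
    parent=$(dirname "$data_dir")
    name=$(basename "$data_dir")
    # Pipe tar through gzip instead of relying on a tar -z option,
    # which not every platform's tar provides.
    (cd "$parent" && tar cf - "$name") | gzip > "$archive"
}

# Usage:
#   write_description_template /tmp/gdd-data
#   make_debug_archive /tmp/gdd-data /tmp/gdd-data.tar.gz
```

Fill in Description.txt by hand before you create the archive, so that the archive contains the completed description.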
This section describes the kinds of debug data you need to provide based on the problem you are experiencing.
This section contains the following tasks:
To Collect Required Debug Data For Any Directory Server Problem
To Collect Required Debug Data For Directory Server Installation Problems
To Collect Required Debug Data For an Unresponsive or Hung Directory Server Process
To Collect Required Debug Data For a Crashed Directory Server Process
To Collect Required Debug Data For Directory Server Replication Problems
To Collect Required Debug Data For Directory Server Schema Problems
All problems described in this technical note need basic information collected about when the problem occurred and about the system having the problem. Use this task to collect that basic information.
Note the time or times the problem occurred.
If possible, collect explorer output from SUNWexplo software on the system where the problem occurred.
Provide a graphical representation of your deployment.
The graphical representation of your deployment is key to understanding the context of the problem. Show the following in the graphical representation.
All computers involved, with their IP addresses, hostnames, roles in the deployment, operating systems, and versions used.
All other relevant systems, including load balancers, firewalls, and so forth.
If possible, collect a copy of the database on spare disk space or backup media to be used later if necessary.
Database files and the transaction log are often indispensable for diagnosing deadlocks involving database locks.
Note the operating system version.
Solaris OS:
uname -a
HP-UX:
uname -r
Red Hat:
cat /etc/redhat-release
Windows:
"C:\Program Files\Common Files\Microsoft Shared\MSInfo\msinfo32.exe" /report C:\report.txt
Note the patch level.
Solaris OS:
showrev -p
HP-UX:
swlist
Red Hat:
rpm -qa
Windows:
Already provided in C:\report.txt.
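On UNIX and Linux systems, the choice of patch-listing command can be scripted. The following is a minimal sketch, assuming that `uname -s` reports `SunOS`, `HP-UX`, or `Linux` on the respective platforms. The function name `patch_level_command` and the output path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: choose the patch-listing command named in this step
# from the kernel name reported by `uname -s`.
patch_level_command() {
    # $1: kernel name, for example the output of `uname -s`
    case "$1" in
        SunOS) echo "showrev -p" ;;
        HP-UX) echo "swlist" ;;
        Linux) echo "rpm -qa" ;;
        *)     return 1 ;;   # unknown platform: let the caller decide
    esac
}

# Usage (runs the chosen command and saves its output):
#   cmd=$(patch_level_command "$(uname -s)") && $cmd > /tmp/patch-level.out
```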
Collect Directory Server version information.
server-root/bin/slapd/server/ns-slapd -D instance-dir -V
The server executable is slapd.exe on Windows systems.
Collect the Directory Server configuration file, server-root/slapd-serverID/config/dse.ldif.
Collect Directory Server access, errors, and audit logs.
By default, you find these logs in the following locations:
server-root/slapd-serverID/logs/access
server-root/slapd-serverID/logs/errors
server-root/slapd-serverID/logs/audit (if enabled)
If these log files are not in the default locations, examine the Directory Server configuration file, server-root/slapd-serverID/config/dse.ldif, to find the paths to the logs. The paths are specified as the values of attributes nsslapd-accesslog, nsslapd-errorlog, and nsslapd-auditlog.
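Looking up the log paths in dse.ldif can be scripted. The following is a minimal sketch, assuming a POSIX awk (nawk on Solaris) and assuming the attribute value is neither base64-encoded nor folded across continuation lines. The function name `dse_attr` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first value of an attribute from an LDIF
# file such as dse.ldif. Attribute names are matched case-insensitively.
# Assumes the value is not base64-encoded ("attr:: ...") and is not folded.
dse_attr() {
    # $1: path to dse.ldif, $2: attribute name
    awk -v attr="$2" '
        BEGIN { prefix = tolower(attr) ": " }
        tolower(substr($0, 1, length(prefix))) == prefix {
            print substr($0, length(prefix) + 1)
            exit
        }
    ' "$1"
}

# Usage:
#   dse_attr server-root/slapd-serverID/config/dse.ldif nsslapd-errorlog
```

The same function can be used for any other single-valued configuration attribute.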
This section describes what data to collect when you cannot complete Directory Server installation.
If the problem concerns a general installation failure for Java Enterprise System, first check the installation troubleshooting chapter in the Installation Guide for your version of Java Enterprise System.
For compressed archive installations, collect installation output showing system calls.
Reinstall while using the appropriate command for your system from the following list. Collect the output of the command that is displayed during installation.
Solaris OS:
truss -ealf -rall -wall -vall -o /tmp/install-directory-truss.out ./setup
HP-UX:
tusc -v -fealT -rall -wall -o /tmp/install-directory-tusc.out ./setup
Red Hat:
strace -fv -o /tmp/install-directory-strace.out ./setup
Windows:
Use DebugView.
DebugView is available at http://www.sysinternals.com/Utilities/DebugView.html.
For Java Enterprise System installations, collect installation error logs.
The log file is named after the date and time that the installation failed. For example, a log file for an installation that failed on December 16 at 3:32 p.m. would have a name like Java_Enterprise_System*_install.B12161532.
On Solaris systems, installation logs are located under /var/sadm/install/logs.
On Red Hat and HP-UX systems, installation logs are located under /var/opt/sun/install/logs.
On Windows systems, installation logs are located under C:\Documents and Settings\current-user\Local Settings\Temp.
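To locate the log for a known failure time, you can build the time stamp part of the file name. The following is a minimal sketch that assumes the suffix has the form B followed by month, day, hour, and minute, as inferred from the single example above. The function name `install_log_suffix` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the time stamp suffix of an installation log
# name, assuming the pattern BMMDDhhmm inferred from the example
# B12161532 (December 16, 3:32 p.m.).
install_log_suffix() {
    # $1: month, $2: day, $3: hour (24-hour clock), $4: minute
    # Pass plain numbers without leading zeros (8, not 08).
    printf 'B%02d%02d%02d%02d\n' "$1" "$2" "$3" "$4"
}

# Usage on Solaris:
#   ls /var/sadm/install/logs/*"$(install_log_suffix 12 16 15 32)"*
```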
This procedure describes what data to collect when Directory Server is still running, but is no longer responding to client application requests.
Collect the data described in this procedure while the server is hanging.
Note the time during which the hang is seen to occur.
Collect information about the port used during the hang.
Solaris OS, HP-UX, and Red Hat:
netstat -an | grep ds-port
Windows:
netstat -an
Collect statistics about the system running Directory Server.
Solaris OS:
ps -aux | grep server-root
vmstat 5 5
iostat -x
top
uptime
HP-UX:
ps -aux | grep server-root
vmstat 5 5
iostat -x
top
sar
Red Hat:
ps -aux | grep server-root
vmstat 5 5
top
uptime
sar
Windows:
Get the process ID using the tlist.exe command, then get process details using the same command.
win-dbg-root\tlist.exe pid
Collect swap information.
Solaris OS:
swap -l
HP-UX:
swapinfo
Red Hat:
free
Windows:
Already provided in C:\report.txt.
On Solaris systems, collect output from pstack and pmap five times, once every ten seconds.
pstack pid
pmap -x pid
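The repeated sampling can be scripted so that each run is saved to its own file. The following is a minimal sketch. The function name `sample_command` and the output file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command several times at a fixed interval,
# saving each run (stdout and stderr) to its own numbered file.
sample_command() {
    # $1: number of samples, $2: seconds between samples,
    # $3: output file prefix, remaining arguments: command to run
    count=$1; interval=$2; prefix=$3
    shift 3
    i=1
    while [ "$i" -le "$count" ]; do
        "$@" > "$prefix.$i" 2>&1
        # Sleep between samples, but not after the last one.
        [ "$i" -lt "$count" ] && sleep "$interval"
        i=$((i + 1))
    done
}

# Usage on Solaris, where pid is the ns-slapd process ID:
#   sample_command 5 10 /tmp/pstack.out pstack pid
#   sample_command 5 10 /tmp/pmap.out pmap -x pid
```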
Get output showing system calls during the hang, by letting each of the commands listed here run for about a minute, then stopping them by typing Control-C.
Solaris OS:
truss -ealf -rall -wall -vall -o /tmp/truss.out -p pid
HP-UX:
tusc -v -fealT -rall -wall -o /tmp/truss.out -p pid
Red Hat:
strace -fv -o /tmp/strace.out -p pid
Windows:
Use DebugView.
DebugView is available at http://www.sysinternals.com/Utilities/DebugView.html.
Collect core files or crash dumps, and related command output.
When the server is hanging, attempt to get several core files that show the state of the server threads over time. You can do this by generating a core file, changing the name of the core file, waiting 30 seconds to a minute, and generating another core file. Repeat the process at least once to get a minimum of three sets of core files and related data.
Solaris OS:
cd server-root/bin/slapd/server
gcore -o /tmp/directory-core pid
pstack /tmp/directory-core.pid
For each core file on Solaris OS, collect output from the following commands.
cd server-root/bin/slapd/server
file core-file
pstack core-file
pmap core-file
pflags core-file
For at least one of the core files on Solaris OS, collect output from the pkg_app script.
./pkg_app.ksh pid /tmp/directory-core.pid
Here, pid is the server process ID number. See To Run the pkg_app Script for instructions on using pkg_app.
HP-UX:
cd server-root/bin/slapd/server
If you have the patches PHKL_31876 and PHCO_32173 installed, generate the core file using the gcore command.
gcore -p pid
Otherwise, use the gdb command to generate the core file.
gdb
(gdb) attach pid
Attaching to process pid
No executable file name was specified
(gdb) dumpcore
Dumping core to the core file core.pid
(gdb) quit
The program is running. Quit anyway (and detach it)? (y or n) y
Detaching from program: , process pid
Red Hat:
cd server-root/bin/slapd/server
gdb
(gdb) attach pid
Attaching to process pid
No executable file was specified
(gdb) gcore
Saved corefile core.pid
(gdb) backtrace
(gdb) quit
Windows:
win-dbg-root\tlist.exe
win-dbg-root\adplus.vbs -hang -p pid -o C:\dump-dir
Collect everything in the folder under C:\dump-dir.
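Where a gcore command that accepts `-o name pid` is available, the repeated core file collection for a hung server can be scripted. The following is a minimal sketch. The function name `collect_cores` and the `GCORE` override variable are hypothetical, and the sketch does not cover the HP-UX or Windows commands shown above.

```shell
#!/bin/sh
# Hypothetical helper: take several core files of a running process,
# each under a distinct name, pausing between snapshots.
# GCORE names the core-generating command; it defaults to gcore.
GCORE=${GCORE:-gcore}

collect_cores() {
    # $1: process ID, $2: number of core files, $3: seconds between them,
    # $4: output name prefix
    pid=$1; count=$2; pause=$3; prefix=$4
    n=1
    while [ "$n" -le "$count" ]; do
        # With "-o name", gcore writes the core file to name.pid, so a
        # distinct prefix per snapshot keeps earlier core files intact.
        "$GCORE" -o "$prefix-$n" "$pid" || return 1
        [ "$n" -lt "$count" ] && sleep "$pause"
        n=$((n + 1))
    done
}

# Usage: three core files, 30 seconds apart
#   collect_cores pid 3 30 /tmp/directory-core
```

Giving each snapshot its own prefix replaces the manual renaming step, because no snapshot overwrites the previous one.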
Collect output from idsktune.
The idsktune command provides information on system parameters, patch level, tuning recommendations, and so forth. The command is described in the product documentation.
server-root/bin/slapd/server/idsktune
This procedure describes what data to collect when a Directory Server process unexpectedly dies.
Try to restart Directory Server and record the results.
Collect statistics about the system running Directory Server.
Solaris OS:
ps -aux | grep server-root
vmstat 5 5
iostat -x
top
uptime
HP-UX:
ps -aux | grep server-root
vmstat 5 5
iostat -x
top
sar
Red Hat:
ps -aux | grep server-root
vmstat 5 5
top
uptime
sar
Windows:
Get the process ID using the tlist.exe command, then get process details using the same command.
win-dbg-root\tlist.exe pid
Collect swap information.
Solaris OS:
swap -l
HP-UX:
swapinfo
Red Hat:
free
Windows:
Already provided in C:\report.txt.
Collect system logs.
Solaris OS:
/var/adm/messages
/var/log/syslog
HP-UX:
/var/adm/syslog/syslog.log
Red Hat:
/var/log/messages
/var/log/syslog
Windows:
Retrieve the Event log files.
From the Control Panel, open the Event Viewer. Select the event log. Then select Action > Save log file as to save the log.
Collect core files.
For instructions on preparing your system to produce core files or crash dumps in the event of a crash, see Configuring the Operating System to Generate Core Files.
For each core file on Solaris OS, collect output from the following commands.
cd server-root/bin/slapd/server
file core-file
pstack core-file
pmap core-file
pflags core-file
For at least one of the core files on Solaris OS, collect output from the pkg_app script.
./pkg_app.ksh pid core-file
Here, pid is the server process ID number. See To Run the pkg_app Script for instructions on using pkg_app.
This procedure describes what data to collect when experiencing inconsistencies and other issues involving Directory Server replication.
Directory Server commands used here are described in the product documentation.
Collect the hostname, IP address, and serverID for each server instance in the replication topology.
Collect a copy of the schema folder, server-root/slapd-serverID/config/schema, and all the files in the folder.
Collect access logs configured to show replication information.
Before you collect access logs as described in To Collect Required Debug Data For Any Directory Server Problem, adjust the log level to keep replication information.
You can use the Console to adjust the access log level.
Alternatively, you can use the ldapmodify command as follows.
$ ldapmodify -h host -p port -D "cn=Directory Manager" -w password
dn: cn=config
changetype: modify
replace: nsslapd-infolog-area
# nsslapd-errorlog-level in 5.1
nsslapd-infolog-area: 8192
To return to the default log level, use the following command.
$ ldapmodify -h host -p port -D "cn=Directory Manager" -w password
dn: cn=config
changetype: modify
replace: nsslapd-infolog-area
# nsslapd-errorlog-level in 5.1
nsslapd-infolog-area: 0
You must restart Directory Server for the change to take effect.
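Because the two ldapmodify invocations differ only in the value they set, the LDIF input can be generated by one function. The following is a minimal sketch. The function name `infolog_area_ldif` is hypothetical, and the attribute name is the one used above (the text notes that version 5.1 uses nsslapd-errorlog-level instead).

```shell
#!/bin/sh
# Hypothetical helper: print the LDIF that sets nsslapd-infolog-area
# to the given value, suitable for piping into ldapmodify.
infolog_area_ldif() {
    # $1: log area value: 8192 to keep replication information,
    #     0 to return to the default
    cat <<EOF
dn: cn=config
changetype: modify
replace: nsslapd-infolog-area
nsslapd-infolog-area: $1
EOF
}

# Usage:
#   infolog_area_ldif 8192 | ldapmodify -h host -p port -D "cn=Directory Manager" -w password
#   infolog_area_ldif 0    | ldapmodify -h host -p port -D "cn=Directory Manager" -w password
```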
Provide a listing of the changelog directory.
Solaris OS, HP-UX, and Red Hat:
ls -la changelog-dir
Windows:
dir changelog-dir
If you cannot find the change log, examine the Directory Server configuration file, server-root/slapd-serverID/config/dse.ldif, to find the path. The path is specified as the value of the attribute nsslapd-changelogdir.
Collect output from the insync command.
The insync command indicates the state of synchronization between a master replica and one or more consumer replicas. The following command shows the state over a period of 30 seconds.
server-root/shared/bin/insync -s "cn=Directory Manager:password@hostname1:ldap-port" -c "cn=Directory Manager:password@hostname2:ldap-port" 30
Collect output from the repldisc command.
The repldisc command displays the replication topology, building a graph of all known replicas, then showing the result as a matrix.
server-root/shared/bin/repldisc -D "cn=Directory Manager" -w password -b base-dn -s host:ldap-port
Here, base-dn is the DN of the replicated suffix.
This procedure describes what data to collect when experiencing Directory Server schema violation errors.
Collect a copy of the schema folder, server-root/slapd-serverID/config/schema, and all the files in the folder.
Indicate whether you have the same problem both in the Console and on the command line.
Provide a list of the last changes made to the schema files.
If the schema violation occurred during an add, modify, delete, or replace operation, provide an LDIF representation of the changes and a list of the commands used.
Core files and crash dumps are generated when a process or application terminates abnormally. You can also generate core files and crash dumps to help diagnose why a process does not respond to client application requests.
This procedure shows you how to use the coreadm command to configure the system so that all process core files are placed in a single system directory. This means it is easier to track problems by examining core files in a specific directory whenever a Solaris OS process or daemon terminates abnormally.
Make sure that the /var file system where the core files are generated has sufficient space. Once you configure Solaris OS to generate core files as shown here, all processes that crash write a core file to the /var/cores directory.
Run the following commands as root.
mkdir -p /var/cores
coreadm -g /var/cores/%f.%n.%p.%t.core -e global -e global-setid -e log -d process -d proc-setid
In this command:
-g: Specifies the global core file name pattern. Unless a per-process pattern or setting overrides it, core files are stored in the specified directory with a name such as program.node.pid.time.core, for example: mytest.myhost.1234.1102010309.core.
-e: Specifies options to enable. The preceding command enables:
Use of the global (that is, system-wide) core file name pattern (and thereby location)
Capability of setuid programs to also dump core as per the same pattern
Generation of a syslog message by any attempt to dump core (successful or not)
-d: Specifies options to disable. The preceding command disables:
Core dumps per the per-process core file pattern
Per-process core dumps of setuid programs
The preceding command stores all core dumps in a central location with names identifying what process dumped core and when. These changes only impact processes started after you run the coreadm command. Use the coreadm -u command after the preceding command to apply the settings to all existing processes.
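To see which file name a given pattern produces, the tokens can be expanded by hand. The following is a minimal sketch that handles only the four tokens used in this procedure. The function name `expand_core_pattern` is hypothetical, and the sketch imitates the naming only; it does not call coreadm.

```shell
#!/bin/sh
# Hypothetical helper: show the file name that a coreadm pattern yields.
# Handles only the four tokens used in this procedure:
#   %f executable name, %n node name, %p process ID, %t time stamp.
# Values must not contain the characters | or &, which sed would interpret.
expand_core_pattern() {
    # $1: pattern, $2: executable, $3: node, $4: pid, $5: time
    printf '%s\n' "$1" |
        sed -e "s|%f|$2|g" -e "s|%n|$3|g" -e "s|%p|$4|g" -e "s|%t|$5|g"
}

# Usage:
#   expand_core_pattern '/var/cores/%f.%n.%p.%t.core' mytest myhost 1234 1102010309
```

With the values from the example above, the function prints /var/cores/mytest.myhost.1234.1102010309.core.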
Display the core configuration.
# coreadm
global core file pattern: /var/cores/%f.%n.%p.%t.core
init core file pattern: core
global core dumps: enabled
per-process core dumps: disabled
global setid core dumps: enabled
per-process setid core dumps: disabled
global core dump logging: enabled
See the coreadm man page for further information.
Set the file size as large as possible, using the ulimit command.
# ulimit -c unlimited
# ulimit -a
coredump(blocks) unlimited
See the ulimit(1) man page for details.
Verify that applications can in fact generate core files.
# cd /var/cores
# sleep 100000 &
[1] pid
# kill -8 pid
# ls
You should see a core file for the sleep process you killed.
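Before running the manual test, a script can confirm that the shell's core file size limit is not zero. The following is a minimal sketch, assuming a shell whose `ulimit` built-in supports the `-c` option (for example ksh or bash). The function name `core_files_enabled` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the current shell allows core files,
# that is, if the core file size limit reported by ulimit -c is not zero.
core_files_enabled() {
    limit=$(ulimit -c)
    [ "$limit" != "0" ]
}

# Usage:
#   if core_files_enabled; then
#       echo "core files can be written"
#   else
#       echo "core file size limit is 0; run: ulimit -c unlimited"
#   fi
```

This checks only the size limit of the current shell. It does not verify the coreadm settings or the free space under /var/cores.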
On Red Hat systems, you can enable core files to be generated on a per user basis.
Open the ~/.bash_profile file for the server user in a text editor.
Search for a line using the ulimit command as follows.
ulimit -S -c 0 > /dev/null 2>&1
Either comment out the line, or set your own limit for core file size.
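Commenting out the line can also be done with sed. The following is a minimal sketch. The function name `enable_core_in_profile` and the .orig backup suffix are hypothetical, and the sketch matches only lines that begin with the exact command shown above.

```shell
#!/bin/sh
# Hypothetical helper: comment out any line that sets the core file size
# limit to zero with "ulimit -S -c 0" in the given profile file.
# A copy of the original file is kept next to it with a .orig suffix.
enable_core_in_profile() {
    # $1: path to the profile file, for example ~/.bash_profile
    cp "$1" "$1.orig" || return 1
    sed 's/^\([[:space:]]*\)\(ulimit -S -c 0.*\)$/\1# \2/' "$1.orig" > "$1"
}

# Usage, as the server user:
#   enable_core_in_profile "$HOME/.bash_profile"
```

The change takes effect the next time the server user logs in and the profile is read.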
The native debug tool on Windows systems, Dr. Watson, allows you to generate crash dumps.
However, Dr. Watson does not allow generation of crash dumps on a running process. To generate crash dumps from a running process, install the Debugging Tools. The Debugging Tools are freely available from the Windows web site at http://www.microsoft.com/whdc/devtools/debugging/default.mspx.
You can use Dr. Watson for crash dumps generated when a process dies.
Use the Windows Debugging Tools to generate crash dumps of a running process.
Enable generation of a crash dump for your application.
Get the process ID of the application using the tlist.exe command, then enable the crash dump.
win-dbg-root\tlist.exe
win-dbg-root\adplus.vbs -crash -FullOnFirst -p pid -o C:\dump-dir
The adplus.vbs command tracks the application with process ID pid. The adplus.vbs command generates a dmp file in the event of a crash.
When collecting crash dump information, take the complete folder generated under C:\dump-dir.
The pkg_app script packages an executable and all of its shared libraries into one compressed tar file given the process ID of the application, and optionally the name of the core file to be opened. The files are stripped of their directory paths, and are stored under a relative directory named app/ with their names only, allowing them to be unpacked in one directory.
On Solaris 9 and Solaris 10, the list of files is derived from the core file rather than the process image if it is specified. You must still provide the process ID of the running application to assist in path resolution.
Two scripts are created to facilitate opening the core file when the tar file is unpacked.
opencore. This is the script to be executed once unpacked. It sets the name of the core file and the linker path to use the app/ directory, and then invokes dbx with the dbxrc file as the argument.
dbxrc. This script contains the dbx initialization commands to open the core file.
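The flat layout under app/ can be illustrated with a short shell function. This is an illustration of the idea only and is not the pkg_app script: it does not discover shared libraries or create the opencore and dbxrc scripts. The function name `package_flat` is hypothetical, and the sketch assumes that `mktemp -d` is available.

```shell
#!/bin/sh
# Illustration only: this is not the pkg_app script. It sketches the idea of
# storing files under app/ by base name so that they unpack into one directory.
package_flat() {
    # $1: archive to create, remaining arguments: files to include
    archive=$1
    shift
    stage=$(mktemp -d) || return 1
    mkdir "$stage/app"
    for f in "$@"; do
        # Strip the directory path; files with the same base name would collide.
        cp "$f" "$stage/app/$(basename "$f")" || { rm -rf "$stage"; return 1; }
    done
    (cd "$stage" && tar cf - app) | gzip > "$archive"
    rm -rf "$stage"
}

# Usage:
#   package_flat /tmp/app.tar.gz /path/to/executable /path/to/libexample.so
```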
Copy the script to a temporary directory on the system where the server is installed.
Become superuser.
Run the pkg_app script in one of the following ways.
Use the following email aliases to report problems with this document and its associated scripts:
For feedback on this document
To report problems in gathering debug data
The Sun web site provides information about the following additional resources:
Documentation (http://www.sun.com/documentation/)
Support (http://www.sun.com/support/)
Training (http://www.sun.com/training/)
Third-party URLs are referenced in this document and provide additional, related information.
Sun is not responsible for the availability of third-party web sites mentioned in this document. Sun does not endorse and is not responsible or liable for any content, advertising, products, or other materials that are available on or through such sites or resources. Sun will not be responsible or liable for any actual or alleged damage or loss caused or alleged to be caused by or in connection with use of or reliance on any such content, goods, or services that are available on or through such sites or resources.
Besides searching for Sun product documentation from the docs.sun.com web site, you can use a search engine of your choice by typing the following syntax in the search field:
search-term site:docs.sun.com
For example, to search for Directory Server, type the following:
"Directory Server" site:docs.sun.com
To include other Sun web sites in your search, such as java.sun.com, www.sun.com, and developers.sun.com, use sun.com in place of docs.sun.com in the search field.
Sun is interested in improving its documentation and welcomes your comments and suggestions. To share your comments, go to http://docs.sun.com and click Send Comments. In the online form, provide the full document title and part number. The part number is a 7-digit or 9-digit number that can be found on the book's title page or in the document's URL. For example, the part number of this book is 820-0437-10.