Oracle Directory Server Enterprise Edition Troubleshooting Guide 11g Release 1 (11.1.1.5.0) |
4. Troubleshooting Directory Proxy Server
Troubleshooting Problems With the Directory Proxy Server Process
Overview of Process Troubleshooting Tools
Using Java Tools With Directory Proxy Server 11g Release 1 (11.1.1.5.0)
Using Solaris Tools With Directory Proxy Server
Troubleshooting a Hung or Unresponsive Directory Proxy Server Process
Collecting Data About a Directory Proxy Server 11g Release 1 (11.1.1.5.0) Hang on Solaris
Troubleshooting Directory Proxy Server for Refused Connections
Troubleshooting Directory Proxy Server Using Data Under cn=monitor
This section describes procedures for the following:
Troubleshooting a Hung or Unresponsive Directory Proxy Server Process
Troubleshooting Directory Proxy Server for Refused Connections
Solaris and the JDK provide tools that can help you troubleshoot process issues. The following sections give an overview of the most useful tools.
As Directory Proxy Server 11g Release 1 (11.1.1.5.0) is a pure Java application, you can use the Java tools that are delivered with the JDK 1.5 to help troubleshoot problems. These tools include the following:
jstack. This tool provides information about the Directory Proxy Server thread stack.
jmap. This tool provides information about memory. For example, running jmap -histo PID prints a histogram of the heap.
jinfo. This tool provides you with information about the JVM environment.
jstat. This tool displays performance statistics for a JVM.
The JDK also includes a graphical monitoring tool, the Java Monitoring and Management Console (JConsole). JConsole uses Java Management Extensions (JMX) technology to provide information about the performance and resource consumption of applications running on the Java platform, including information and charts about memory use, thread use, class loading, and JVM parameters.
On UNIX platforms, if the kill -QUIT process-id command does not produce a thread dump, use jstack instead.
Solaris includes a collection of process tools to help you collect more information about process problems, such as a hung process, crashed process, or memory usage problems. These tools include the following:
pmap. Shows the process map, which includes a list of virtual addresses, where the dynamic libraries are loaded, and where the variables are declared.
pstack. Shows the process stack. For each thread in the process, it shows the exact stack of instructions that the thread was executing when the process died or when the pstack command was run.
pfiles. Reports information about all open files in each process.
pldd. Lists the dynamic libraries linked into each process.
This section describes how to troubleshoot an unresponsive or hung Directory Proxy Server process. A totally unresponsive process is called a hang. The remainder of this section describes how to collect and analyze data about a hang.
On Solaris, the prstat -L command reports the amount of CPU used by each thread (LWP) of a process. If you collect a thread stack with the pstack or jstack utility at the same time as you run prstat, you can use the stack output to see what a busy thread was doing when it had trouble. If you run both tools together several times, you can see whether the same thread causes the problem each time and whether it encounters the problem in the same function call.
To get the process ID of the running Directory Proxy Server, use the jps command. For example, the command is run as follows on Solaris:
# jps
8393 DistributionServerMain
2115 ContainerPrivate
21535 startup.jar
16672 Jps
13953 swupna.jar
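Picking the Directory Proxy Server PID out of the jps output can be scripted. The following is a minimal sketch, not a tool shipped with the product: the function name dps_pid is hypothetical, and it assumes that the process is listed with the class name DistributionServerMain, as in the sample output above.

```shell
#!/bin/sh
# dps_pid (hypothetical helper): read jps-style output on stdin and print
# the PID of every line whose second field is DistributionServerMain.
dps_pid() {
    awk '$2 == "DistributionServerMain" { print $1 }'
}

# Demonstration on the sample jps output shown above; prints 8393.
# On a live system the equivalent would be:  jps | dps_pid
dps_pid <<'EOF'
8393 DistributionServerMain
2115 ContainerPrivate
21535 startup.jar
16672 Jps
13953 swupna.jar
EOF
```

If more than one Directory Proxy Server instance runs on the host, the function prints one PID per line, so check which instance you mean before passing a PID to other tools.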
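The following script automates the collection of prstat and pstack data:
cat scpTools
#!/bin/sh
# Take 10 snapshots, one second apart, of the process whose PID is $1.
i=0
while [ "$i" -lt "10" ]
do
    echo "$i\n"
    # Timestamp used to pair the prstat and pstack files of one snapshot.
    date=`date "+%y%m%d:%H%M%S"`
    prstat -L -p $1 0 1 > /tmp/prstat.$date
    pstack $1 > /tmp/pstack.$date
    i=`expr $i + 1`
    sleep 1
done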
The value 10 in the [ "$i" -lt "10" ] line can be increased or decreased to cover the period during which the problem occurs, so that you collect a full set of process data around the issue.
Collect usage information as follows:
# ./scpTools DPS-PID
DPS-PID is the PID of the unresponsive process. In the jps output, the Directory Proxy Server process is the one listed as DistributionServerMain.
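Once snapshots are collected, the thread that uses the most CPU in a prstat file can be matched to its stack in the pstack file that has the same timestamp. The following sketch is an illustration only: the function name busiest_lwp is hypothetical, the sample rows are invented for the demonstration rather than taken from a real system, and the parsing assumes the usual prstat -L layout in which the last two columns are CPU (such as 45%) and PROCESS/LWPID (such as java/23).

```shell
#!/bin/sh
# busiest_lwp (hypothetical helper): read one prstat -L snapshot on stdin
# and print the LWP id with the highest CPU percentage, followed by that value.
busiest_lwp() {
    awk '
        # Data rows end with "name/lwpid" preceded by a percentage.
        $NF ~ /\/[0-9]+$/ && $(NF-1) ~ /%$/ {
            cpu = $(NF-1); sub(/%$/, "", cpu)
            n = split($NF, part, "/")
            if (!seen || cpu + 0 > max) { max = cpu + 0; lwp = part[n]; seen = 1 }
        }
        END { if (seen) print lwp, max "%" }
    '
}

# Illustrative input only (not real measurements); prints "23 45%".
# On a live system:  busiest_lwp < /tmp/prstat.TIMESTAMP
busiest_lwp <<'EOF'
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/LWPID
  8393 root      512M  300M sleep   59    0   0:00:02 0.1% java/1
  8393 root      512M  300M cpu1    10    0   0:05:12  45% java/23
  8393 root      512M  300M sleep   59    0   0:00:10 1.2% java/17
Total: 1 processes, 3 lwps, load averages: 0.52, 0.40, 0.31
EOF
```

The pstack output labels each stack with its lwp number, so the reported id can be searched for in the matching /tmp/pstack file. If the same LWP comes out on top across several snapshots, its stack is the first place to look.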
On Solaris and other UNIX platforms, use the truss command to show the system calls that occur while the process is unresponsive, as follows:
truss -o /tmp/trace.txt -ealf -rall -wall -vall -p 21362
The value 21362 corresponds to the PID of the unresponsive process.
With the help of the following diagram, this section describes how operations are processed within the server and which resources are involved in that processing. The resource usage can be dumped to the error log file by sending a USR2 signal to the Directory Proxy Server process.
The Clientlistener detects new incoming connections from clients and stores them in a buffer of pending connections. From time to time, the ConnectionHandler fetches all the pending connections and puts them in the list of connections to process (a Java Selector). The following resource dump excerpt shows some figures about incoming connections:
0.0.0.0:2389 useSSL:false
Thread[Connection Handler 0 for Listener Thread 0.0.0.0:2389,5,main]
ConnectionHandler pending connections = 0
ConnectionHandler pending connections 2 = 0
ConnectionHandler connections in selector = 1
Thread[Connection Handler 1 for Listener Thread 0.0.0.0:2389,5,main]
ConnectionHandler pending connections = 0
ConnectionHandler pending connections 2 = 0
ConnectionHandler connections in selector = 0
By default, Directory Proxy Server has two client listeners, one for normal connections and one for secure connections, and each client listener has two connection handlers.
The ConnectionHandler reads bytes from the file descriptor and, once it has a full LDAP operation, puts the operation in the WorkQueue. The operations in the queue are retrieved by the WorkerThreads for processing. At any time, the WorkQueue keeps the following information available to the resource dumper:
WorkQueue Norm inQ = 0            number of operations in the Q
WorkQueue Norm peak = 1           the peak of operations in the Q
WorkQueue Norm totalIn = 1875     the total # of operations put by the connection handlers
WorkQueue Norm totalOut = 1875    the total # of operations retrieved by the workers
WorkQueue High inQ = 0            -- same but for the "high priority" Q
WorkQueue High peak = 0           -- same but for the "high priority" Q
WorkQueue High totalIn = 0        -- same but for the "high priority" Q
WorkQueue High totalOut = 0       -- same but for the "high priority" Q
WorkQueue abandonRequests = 0     the number of abandon requests
WorkQueue abandonSuccess = 0      the number of succeeded abandons
When the WorkQueue is empty, the WorkerThreads are idle. As soon as a WorkerThread gets an operation from the WorkQueue, it becomes busy. The resource dumper provides the state of the WorkerThreads:
WorkerThread: idle = 49    --> all the WorkerThreads are idle but 1
WorkerThread: busy = 1
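The queue and worker figures can be read together to see whether the worker threads are keeping up. The following sketch is a rule of thumb rather than a documented check: the function name worker_state is hypothetical, and the "saturated" verdict (no idle worker while operations wait in the normal queue) is an assumption derived from the description above. It expects dump lines in the form shown above, without the explanatory annotations.

```shell
#!/bin/sh
# worker_state (hypothetical helper): read resource dump lines on stdin and
# summarize the normal work queue depth and the worker thread states.
worker_state() {
    awk '
        $1 == "WorkQueue" && $2 == "Norm" && $3 == "inQ" { inq = $5 }
        $1 == "WorkerThread:" && $2 == "idle"            { idle = $4 }
        $1 == "WorkerThread:" && $2 == "busy"            { busy = $4 }
        END {
            # Assumed heuristic: operations are queued and no worker is idle.
            state = (idle == 0 && inq > 0) ? "saturated" : "ok"
            printf "queued=%d idle=%d busy=%d %s\n", inq, idle, busy, state
        }
    '
}

# Figures taken from the excerpts above; prints "queued=0 idle=49 busy=1 ok".
worker_state <<'EOF'
WorkQueue Norm inQ = 0
WorkQueue Norm peak = 1
WorkerThread: idle = 49
WorkerThread: busy = 1
EOF
```

A single dump is only a snapshot, so a "saturated" result is more meaningful when it repeats across several dumps taken a few seconds apart.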
In the first step of processing, the WorkerThread gets a list of data views to which the operation can be routed. This step is not described here. Each selected data view then goes through a data source pool to get an LDAP server. The LDAP server is chosen by the load balancing algorithm. For example, if proportional load balancing is in use, the statistics look like the following:
Data Source Pool pool1
pool1 - ProportionalLB - total connections - Bind (provided=0 refused=0)
pool1 - ProportionalLB - total connections - Add (provided=0 refused=0)
pool1 - ProportionalLB - total connections - Search (provided=0 refused=0)
pool1 - ProportionalLB - total connections - Compare (provided=0 refused=0)
pool1 - ProportionalLB - total connections - Delete (provided=0 refused=0)
pool1 - ProportionalLB - total connections - Modify (provided=0 refused=0)
pool1 - ProportionalLB - total connections - ModifyDN (provided=0 refused=0)
pool1 - ProportionalLB - Connections per server for Bind
pool1 - ProportionalLB - ds1 (provided=0 refused=0)
pool1 - ProportionalLB - Connections per server for Add
pool1 - ProportionalLB - ds1 (provided=0 refused=0)
pool1 - ProportionalLB - Connections per server for Search
pool1 - ProportionalLB - ds1 (provided=0 refused=0)
pool1 - ProportionalLB - Connections per server for Compare
pool1 - ProportionalLB - ds1 (provided=0 refused=0)
pool1 - ProportionalLB - Connections per server for Delete
pool1 - ProportionalLB - ds1 (provided=0 refused=0)
pool1 - ProportionalLB - Connections per server for Modify
pool1 - ProportionalLB - ds1 (provided=0 refused=0)
pool1 - ProportionalLB - Connections per server for ModifyDN
pool1 - ProportionalLB - ds1 (provided=0 refused=0)
The chosen LDAP server is asked to provide a connection to the remote backend. Connections to remote backends are managed through two pools of connections (ConnectionPool): one pool for normal connections and another for secure connections. If Directory Proxy Server is configured to use only secure connections to remote backends, the second pool is not used and the first pool contains the secure connections. Each pool contains connections dedicated to BIND operations, READ operations, and WRITE operations. For each of these sets, the resource dumper reports the current number of connections in the pool and the number of connections available. The number of connections can grow when needed but cannot exceed the maximum number of connections, which is 1024 by default.
BackendConnectionPool [woz:8389/:pool1-DS1] BIND (max=1024 cur=10 avail=10)
BackendConnectionPool [woz:8389/:pool1-DS1] READ (max=1024 cur=10 avail=10)
BackendConnectionPool [woz:8389/:pool1-DS1] WRITE (max=1024 cur=10 avail=10)
BackendConnectionPool [woz:8389/:pool1-DS1] Bound connections = 0
BackendConnectionPool [woz:8389/:pool2-DS1] BIND (max=1024 cur=0 avail=0)
BackendConnectionPool [woz:8389/:pool2-DS1] READ (max=1024 cur=0 avail=0)
BackendConnectionPool [woz:8389/:pool2-DS1] WRITE (max=1024 cur=0 avail=0)
BackendConnectionPool [woz:8389/:pool2-DS1] Bound connections = 0
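The BackendConnectionPool lines can be summarized to spot a pool that is close to its limit. The following is a sketch under stated assumptions: the function name pool_usage is hypothetical, the number of connections in use is inferred as cur minus avail, and a set is flagged EXHAUSTED only when cur has reached max and avail is 0.

```shell
#!/bin/sh
# pool_usage (hypothetical helper): read BackendConnectionPool dump lines on
# stdin and print, for each BIND/READ/WRITE set, the connections in use.
pool_usage() {
    awk '
        $1 == "BackendConnectionPool" && $4 ~ /^\(max=/ {
            max = $4; cur = $5; avail = $6
            # Keep only the digits of "(max=1024", "cur=10" and "avail=10)".
            gsub(/[^0-9]/, "", max); gsub(/[^0-9]/, "", cur); gsub(/[^0-9]/, "", avail)
            # Assumption: in-use = cur - avail; exhausted = at max with none free.
            state = (cur + 0 >= max + 0 && avail + 0 == 0) ? "EXHAUSTED" : "ok"
            print $2, $3, "in-use=" (cur - avail), "max=" max, state
        }
    '
}

# Two lines from the excerpt above. The first prints
# "[woz:8389/:pool1-DS1] READ in-use=0 max=1024 ok"; the second is skipped.
pool_usage <<'EOF'
BackendConnectionPool [woz:8389/:pool1-DS1] READ (max=1024 cur=10 avail=10)
BackendConnectionPool [woz:8389/:pool1-DS1] Bound connections = 0
EOF
```

Lines such as "Bound connections = 0" do not carry max, cur, and avail values, so the function ignores them.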
The LDAP server keeps some statistics about the usage of the pools.
bindConnectionsRequested = 0
bindConnectionsProvided = 0
bindConnectionsRefused = 0
bindConnectionWaitsRequired = 0
bindConnectionsReturnedValid = 0
bindConnectionsReturnedInvalid = 0
readConnectionsRequested = 0
readConnectionsProvided = 0
readConnectionsRefused = 0
readConnectionWaitsRequired = 0
readConnectionsReturnedValid = 0
readConnectionsReturnedInvalid = 0
writeConnectionsRequested = 0
writeConnectionsProvided = 0
writeConnectionsRefused = 0
writeConnectionWaitsRequired = 0
writeConnectionsReturnedValid = 0
writeConnectionsReturnedInvalid = 0
At any time, the number of connections requested equals the number provided plus the number refused (requested = provided + refused). Sometimes the WorkerThread has to wait briefly for a connection to become available. After completing its job, the WorkerThread returns the connection to the pool. If the connection is no longer valid, it is returned as invalid and cannot be reused.
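The relation requested = provided + refused can be checked mechanically on a dump. The following sketch assumes counter lines of the form "name = value" or "name: value"; the function name check_counters is hypothetical.

```shell
#!/bin/sh
# check_counters (hypothetical helper): read connection statistics on stdin
# and verify requested = provided + refused for bind, read and write.
check_counters() {
    awk -F'[=:]' '
        # Keep only simple "name = value" or "name: value" lines.
        NF == 2 {
            key = $1; val = $2
            gsub(/[ \t]/, "", key); gsub(/[ \t]/, "", val)
            v[key] = val + 0
        }
        END {
            n = split("bind read write", kind, " ")
            for (i = 1; i <= n; i++) {
                k = kind[i]
                req = v[k "ConnectionsRequested"]
                sum = v[k "ConnectionsProvided"] + v[k "ConnectionsRefused"]
                printf "%s: requested=%d provided+refused=%d %s\n", k, req, sum, (req == sum ? "consistent" : "MISMATCH")
            }
        }
    '
}

# Counters from the statistics excerpt above (all zero, so all consistent).
check_counters <<'EOF'
bindConnectionsRequested = 0
bindConnectionsProvided = 0
bindConnectionsRefused = 0
readConnectionsRequested = 0
readConnectionsProvided = 0
readConnectionsRefused = 0
writeConnectionsRequested = 0
writeConnectionsProvided = 0
writeConnectionsRefused = 0
EOF
```

Counters that are missing from the input are treated as 0, so feed the function a complete set of statistics to get a meaningful result.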
These figures about connections to the backend can help in tuning server resources. For example:
totalReadConnections: 1024
availableReadConnections: 0
readConnectionsRequested: 2121
readConnectionsProvided: 1612
readConnectionsRefused: 509
readConnectionWaitsRequired: 1019
readConnectionsReturnedValid: 1612
readConnectionsReturnedInvalid: 0
Analyzing these figures leads to the following conclusions:
There are no more connections available in the pool, and the pool has reached its maximum size, that is, 1024 connections.
There are 2121 requests but only 1612 connections were provided, so the remaining 509 requests were refused, which is bad for scalability.
The worker threads had to wait 1019 times for a connection to be available, which is bad for performance.
Any refused connection will end with a SERVER_ERROR returned to the client.
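The conclusions above can be expressed as ratios, which are easier to compare between dumps. The following sketch is an illustration: the function name read_pool_rates is hypothetical, and expressing refusals and waits as a percentage of requests is a presentation choice made here, not a figure that the resource dumper reports.

```shell
#!/bin/sh
# read_pool_rates (hypothetical helper): read "name: value" statistics on
# stdin and print read-connection refusals and waits per 100 requests.
read_pool_rates() {
    awk -F':' '
        NF == 2 { key = $1; gsub(/[ \t]/, "", key); v[key] = $2 + 0 }
        END {
            req = v["readConnectionsRequested"]
            if (req == 0) { print "no read connections requested"; exit }
            # The pool counts as exhausted only if the counter is present and 0.
            full = (("availableReadConnections" in v) && v["availableReadConnections"] == 0) ? "yes" : "no"
            printf "refused=%.1f%% waits=%.1f%% exhausted=%s\n",
                100 * v["readConnectionsRefused"] / req,
                100 * v["readConnectionWaitsRequired"] / req,
                full
        }
    '
}

# Figures from the example above; prints "refused=24.0% waits=48.0% exhausted=yes".
read_pool_rates <<'EOF'
totalReadConnections: 1024
availableReadConnections: 0
readConnectionsRequested: 2121
readConnectionsProvided: 1612
readConnectionsRefused: 509
readConnectionWaitsRequired: 1019
EOF
```

In this example roughly one request in four was refused and about half of the requests involved a wait, which matches the scalability and performance concerns listed above.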
To avoid refused connections, raise the maximum number of connections allowed in a pool so that the available connections are not exhausted. If this cannot be done, for example because the server does not have enough file descriptors, reduce the number of WorkerThreads by using the following command:
$ dpconf set-server-prop -e -h host -p port number-of-worker-threads:number
This command sets the numWorkerThreads attribute in cn=config in the conf.ldif file.
The client no longer receives the SERVER_ERROR status code, but at the expense of longer response times.