Sun ONE Identity Synchronization for Windows Installation and Configuration Guide

Chapter 8
Troubleshooting

This chapter provides Identity Synchronization for Windows troubleshooting information.

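It includes the following sections:

  Troubleshooting Checklist
  Troubleshooting Connectors
  Troubleshooting Components
  Troubleshooting Subcomponents
  Troubleshooting Sun ONE Message Queue
  Troubleshooting SSL Problems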


Troubleshooting Checklist


Note

Administrators: When you are debugging problems, adjust the logging level (as described in Logs and Status) to ensure the log reflects all events that may be causing problems.

Some events (such as the program failing to synchronize a user change because the user was not included in the SUL) are not included in a log file until you adjust the log level to FINE or higher. The log level should be left at INFO during all idsync linkusers and idsync resync operations.


  1. Are there any problems reported in the central error.log?

    isw-hostname/logs/central/error.log

    Almost all errors are reported in this log file. More information on any error is usually available in the audit.log file. To ease correlation of related log entries, the audit.log file also includes all entries in the error log.

  2. The Release Notes document many known issues. Is this problem explained there?
  3. Was the installation performed on a clean machine? Problems can occur when this product is reinstalled if the previous configuration was not completely uninstalled. Refer to Removing the Software for instructions on how to clean up previous installations.
  4. Was the core properly installed? If core installation completed successfully, then log files exist in the isw-hostname/logs/central/ directory.
  5. Was the Directory Server running during resource configuration?
  6. Is the core, including the Sun ONE Message Queue and the System Manager, currently running? On Windows, check for the appropriate service name. On Solaris, check for the appropriate daemon name. Use the idsync printstat command to verify that the Sun ONE Message Queue and System Manager are active.
  7. Was a configuration saved successfully? If the idsync printstat command lists connectors, then a configuration was saved successfully.
  8. Were all connectors installed? One connector must be installed for each directory source being synchronized.
  9. Were all subcomponents installed? Sun ONE Directory Server and Windows NT connectors require subcomponents to be installed after the connector installation. The Sun ONE Directory Server plugin must be installed in each Sun ONE Directory Server replica.
  10. Were post-installation procedures followed? The Sun ONE Directory Server must be restarted after the Directory Server plugin is installed. The Windows NT Primary Domain Controller must be rebooted after the Windows NT subcomponents are installed.
  11. Was synchronization started either from the console or command line?
  12. Are all connectors currently running?
  13. Verify that all connectors are in the SYNCING state using the console or idsync printstat.
  14. Are the directory sources being synchronized currently running?
  15. Verify using the console that modifications and/or creates are synchronized in the expected direction(s).
  16. If synchronizing users that existed in only one directory source, were these users created in the other directory source using the idsync resync command?

    Note

    You must run idsync resync whenever there are existing users (even after running idsync linkusers). If you do not resynchronize existing users, resynchronization behavior remains undefined.

  17. If synchronizing users that existed in both directory sources, were these users linked using the idsync linkusers command?
  18. If user creates fail from Active Directory or Windows NT to the Sun ONE Directory Server, verify that all mandatory attributes in the Sun ONE Directory Server objectclass are specified as creation attributes and that values for the corresponding attributes are present in the original user entry.
  19. If synchronizing creates from Directory Server to Windows NT and the user creation succeeded, but the account is unusable, verify that the user name does not violate Windows NT requirements.

    For example, if you specify a name that exceeds the maximum allowable length for Windows NT, the user will be created on NT but will remain unusable and uneditable until you rename the user (User -> Rename).

  20. For the Windows NT SAM Change Detector subcomponent to be effective, you must turn on the NT audit log. Under Start > Programs > Administrative Tools > User Manager, select Policies > Audit Policies.

    Select Audit These Events and then both the Success and Failure boxes for User and Group Management.

    In the Event Viewer, under Event Log Settings > Event Log Wrapping, select Overwrite Events as Needed.

  21. Are the users that fail to synchronize within a Synchronization User List? That is, do they match the base DN and filter of a Synchronization User List? In deployments that include Active Directory, on-demand password synchronization fails silently if the Sun ONE Directory Server entry is not in any Synchronization User List. This most often occurs because the filter on the Synchronization User List is incorrect.
  22. Were the synchronization settings changed? If the settings were changed from synchronizing users only from Active Directory to the Sun ONE Directory Server to synchronizing users from the Sun ONE Directory Server to Active Directory, then the Active Directory SSL CA certificate must be added to the connector’s certificate database. The idsync certinfo command reports which SSL certificates must be installed based on the current SSL settings.
  23. Are all host names properly specified and resolvable in DNS? The Active Directory domain controller should be DNS-resolvable from the machine where the Active Directory connector is running and from the machine where the Sun ONE Directory Server plugin is running.
  24. Does the IP address of the Active Directory domain controller resolve to the same name that the connector uses to connect to it?
  25. Does the source connector detect the change to the user? Use the central audit.log to determine whether the connector for the directory source where the user was added or modified detects the modification.
  26. Does the destination connector process this modification?
  27. Are multiple Synchronization User Lists configured? If so, are these in conflict? More specific Synchronization User Lists should be ordered before less specific ones using the console.
  28. If flow is set to bidirectional or from Sun to Windows and there are Active Directory data sources in your deployment, are the connectors configured to use SSL communication?
  29. If you suspect memory problems in a Solaris environment, check the processes. To see which components are running as separate processes, enter:

    /usr/ucb/ps -gauxwww | grep com.sun.directory.wps

    The output gives the full details, including the IDs of the connectors, the system manager, and the central logger. This can be useful to see whether any of the processes are consuming excessive memory.

  30. If you are creating or editing the Sun ONE Directory source, and the Directory Server does not appear in the Choose a known server drop-down list, check that the Directory Server is running. The Directory Server must be running to appear in the drop-down list of available hosts.

    If the server in question is down temporarily, type the host and port into the Specify a server by providing a hostname and port field.

  31. Do you receive the following error while running the uninstaller program?

    ./runInstaller.sh

    IOException while making /tmp/SolarisNativeToolkit_5.5.1_1 executable:java.io.IOException: Not enough space

    java.io.IOException: Not enough space

    If so, increase the size of the swap file mounted at /tmp.
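
The two DNS questions in the checklist (host names resolvable, and the domain controller's IP address resolving back to the name the connector uses) can be checked with a forward lookup followed by a reverse lookup. The following is a minimal Python sketch, not part of the product; the function name check_dns_consistency is hypothetical, and the resolver functions are passed in as parameters so the check can also be exercised without a live DNS server.

```python
import socket


def check_dns_consistency(hostname, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return (ok, detail) for a forward-then-reverse lookup of hostname."""
    try:
        ip = forward(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, f"{hostname} does not resolve: {exc}"
    try:
        reverse_name = reverse(ip)[0]
    except OSError as exc:  # socket.herror is a subclass of OSError
        return False, f"{ip} has no reverse mapping: {exc}"
    # Compare case-insensitively and ignore a trailing dot.
    if reverse_name.lower().rstrip(".") != hostname.lower().rstrip("."):
        return False, f"{ip} resolves back to {reverse_name}, not {hostname}"
    return True, f"{hostname} <-> {ip}"
```

Run it on the machine where the Active Directory connector runs and again on the machine where the Directory Server plugin runs, passing the domain controller name exactly as it was entered in the configuration.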


Troubleshooting Connectors

How to determine the ID of a connector managing a directory source

Using the central logs

Determine the connector IDs of the directory sources being synchronized by looking in the central audit log. At startup, the central logger logs the IDs of each connector and the directory source that it manages. Look for the last instance of the startup banner for the most recent information. For example, in the following log message there are two connectors: CNN101 is a Sun Directory connector that manages dc=airius,dc=com, and CNN100 is an Active Directory connector that manages the airius.com domain.

[2003/03/19 00:00:00.722 -0600] INFO 16 "System Component Information: SysMgr_100 is the system manager (CORE); console is the Product Console User Interface; CNN101 is the connector that manages [dc=airius,dc=com (ldap://host1.airius.com:389)]; CNN100 is the connector that manages [airius.com (ldaps://host2.airius.com:636)];"
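
When the audit log is large, the connector IDs can be pulled out of the startup banner with a short script. This Python sketch assumes the banner wording is exactly as in the sample message above; the function names are hypothetical and the pattern may need adjusting if your log lines differ.

```python
import re

# Matches fragments such as:
#   CNN101 is the connector that manages [dc=airius,dc=com (ldap://host1.airius.com:389)]
CONNECTOR_RE = re.compile(
    r"(CNN\d+) is the connector that manages \[(.+?) \((ldaps?://[^)]+)\)\]"
)


def latest_banner(lines):
    """Return the last line containing the startup banner, or None."""
    banners = [line for line in lines if "System Component Information:" in line]
    return banners[-1] if banners else None


def connectors_from_banner(line):
    """Map connector ID -> (directory source, URL) for one banner line."""
    return {cid: (source, url) for cid, source, url in CONNECTOR_RE.findall(line)}
```

Pass the lines of the central audit.log to latest_banner, then pass the result to connectors_from_banner.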

Using idsync printstat

The connector IDs and status are also available from the idsync printstat command. A sample output of this command is shown below.


Connector ID: CNN100
  Type: Active Directory
  Manages: airius.com (ldaps://host2.airius.com:636)
  State: READY

Connector ID: CNN101
  Type: Sun ONE Directory
  Manages: dc=airius,dc=com (ldap://host1.airius.com:389)
  State: READY

Sun ONE Message Queue Status: Started

Checking the System Manager status over the Sun ONE Message Queue.

System Manager Status: Started

SUCCESS
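
If you capture the idsync printstat output to a file, a small script can list the connectors that are not yet in the SYNCING state. This Python sketch assumes the output has the "Connector ID / Type / Manages / State" layout of the sample above; parse_printstat and not_syncing are hypothetical helper names, not part of the product.

```python
def parse_printstat(text):
    """Parse 'Connector ID / Type / Manages / State' blocks into dicts."""
    connectors = []
    for raw in text.splitlines():
        # Split on the first ': ' so that URLs such as ldaps://... stay intact.
        key, sep, value = raw.strip().partition(": ")
        if not sep:
            continue
        if key == "Connector ID":
            connectors.append({"id": value})
        elif key in ("Type", "Manages", "State") and connectors:
            connectors[-1][key.lower()] = value
    return connectors


def not_syncing(connectors):
    """Return the IDs of connectors whose state is not SYNCING."""
    return [c["id"] for c in connectors if c.get("state") != "SYNCING"]
```

For the sample output above, both CNN100 and CNN101 would be reported, because both are in the READY state.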

How to determine a connector’s current state.

Determine the current state of the connectors involved in the synchronization. This can be done using the status pane in the console, the idsync printstat command as shown above, or by looking in the central audit.log. Search for the last message in the audit.log that reports the state of the connector. For example, in this log message we see that connector CNN101 is in the READY state.

[2003/03/19 10:20:16.889 -0600] INFO 13 SysMgr_100 host1 "Connector [CNN101] is now in state "READY"."
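
Finding the last state message for each connector can be automated. The Python sketch below assumes state changes are logged with exactly the wording of the sample message above; latest_states is a hypothetical helper name.

```python
import re

STATE_RE = re.compile(r'Connector \[(\w+)\] is now in state "(\w+)"')


def latest_states(lines):
    """Return {connector ID: most recently logged state} from audit.log lines."""
    states = {}
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            # Lines are read in file order, so later entries overwrite earlier ones.
            states[match.group(1)] = match.group(2)
    return states
```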

Table 8-1  Connector State Meanings

State         Meaning
UNINSTALLED   The connector has not been installed.
INSTALLED     The connector has been installed, but it has not received its configuration.
READY         The connector has been installed and has received its configuration, but it has not started to synchronize.
SYNCING       The connector has been installed, has received its configuration, and has attempted to start synchronizing.

What to do if the connector is in the UNINSTALLED state.

Install the connector.

What to do if the connector is in the INSTALLED state.

If a connector remains in the INSTALLED state for a long period of time, then most likely it is not running, or it is unable to communicate with the Sun ONE Message Queue.

At the machine where the connector was installed, look in the connector’s logs (audit.log and error.log) for potential errors. If the connector cannot connect to the Sun ONE Message Queue, then that error will be reported here. If this is the case, see "Troubleshooting Sun ONE Message Queue" for possible causes.

If the most recent messages in the audit log are old, then perhaps the connector is not running. See "Troubleshooting Components".

What to do if the connector is in the READY state.

A connector remains in the READY state until synchronization has been started and all of its subcomponents have been installed and have connected to the connector. If synchronization has not been started, then start it using the console or command line utility.

If synchronization has been started, but a connector does not enter the SYNCING state, then there is likely a problem with a subcomponent. See "Troubleshooting Subcomponents".

What to do if the connector is in the SYNCING state.

If all connectors are in the SYNCING state, but modifications are not being synchronized, then verify that the synchronization settings are correct, as described in the Troubleshooting Checklist.


Troubleshooting Components

On Windows:

Using the Service control panel, check that the “Sun ONE Identity Synchronization for Windows” service is started. If it is not started, then Identity Synchronization for Windows is not running on that machine and should be started. If the service is started, then use the Task Manager to verify that pswwatchdog.exe is running and that the expected number of java.exe processes is running.

On Solaris:

The command /usr/ucb/ps -auxww | grep com.sun.directory.wps will list all of the Identity Synchronization for Windows processes running. This table shows which processes should be running.

Table 8-2  Identity Synchronization for Windows Processes

Java Process Class Name                                     Component         When Present
com.sun.directory.wps.watchdog.server.WatchDog              System Watchdog   always
com.sun.directory.wps.centrallogger.CentralLoggerManager    Central Logger    only where core is installed
com.sun.directory.wps.manager.SystemManager                 System Manager    only where core is installed
com.sun.directory.wps.controller.AgentHarness               Connector         one for each connector installed

If the expected number of processes is not running, then issue the following commands to restart all Identity Synchronization for Windows processes:

# /etc/init.d/isw stop

# /etc/init.d/isw start

If the WatchDog process is running, but the expected number of java.exe processes is not running, then see the “Examining WatchList.properties” section below to verify that all components were installed properly.
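
Comparing the ps listing against Table 8-2 by eye is error-prone when several connectors are installed. The following Python sketch counts processes per component from captured ps output; it relies only on the class names listed in Table 8-2, and count_components is a hypothetical helper name.

```python
from collections import Counter

# Class names and component names as listed in Table 8-2.
COMPONENT_CLASSES = {
    "com.sun.directory.wps.watchdog.server.WatchDog": "System Watchdog",
    "com.sun.directory.wps.centrallogger.CentralLoggerManager": "Central Logger",
    "com.sun.directory.wps.manager.SystemManager": "System Manager",
    "com.sun.directory.wps.controller.AgentHarness": "Connector",
}


def count_components(ps_output):
    """Count running processes per component from ps output text."""
    counts = Counter()
    for line in ps_output.splitlines():
        for class_name, component in COMPONENT_CLASSES.items():
            if class_name in line:
                counts[component] += 1
                break  # a process line belongs to at most one component
    return counts
```

On a machine with the core and two connectors, the expected result is one System Watchdog, one Central Logger, one System Manager, and two Connector processes.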

Like other system components, the Sun ONE Directory Server plugin sends log records over the bus that are managed by the central logger for end-user viewing. However, the plugin also logs some messages that may not show up over the bus (for instance when the subcomponent cannot contact the connector). In this case the log messages only show up in the plugin’s log directory on the file system, which should look something like <server root>/isw-<host>/logs/SUBC<id>.

Because the plugin runs in process with the directory server, the plugin might be unable to write to its log directory. This happens if the directory server runs as a different user than the owner of the log directory. In this case, it may be necessary to give the plugin permission explicitly by changing the directory’s permissions or owner using native operating system tools.

Examining WatchList.properties

On each machine where an Identity Synchronization for Windows component is installed, the isw-<machine-name>/resources/WatchList.properties file enumerates the components that should run on that machine. The process.name[n] properties name the components that should be running.

On machines where core is installed, WatchList.properties will include entries for the Central Logger and System Manager:

process.name[1]=Central Logger

process.name[2]=System Manager

On machines where connectors are installed, WatchList.properties will include a separate entry for each connector. The process.name property is the connector ID:

process.name[3]=CNN100

process.name[4]=CNN101

If there is a mismatch between the entries in WatchList.properties and the actively running processes, then restart the Identity Synchronization for Windows daemon or service.

If there are fewer than expected entries in WatchList.properties (e.g. only one connector entry even though two were installed), then examine the installation logs for possible installation failures. On Solaris, these logs are in /var/sadm/install/logs/ and on Windows, they are in the %TEMP% directory.
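
The comparison between WatchList.properties and the connectors you installed can be scripted. This Python sketch reads only the process.name[n] properties described above and ignores everything else in the file; the function names are hypothetical.

```python
import re

PROCESS_NAME_RE = re.compile(r"^process\.name\[(\d+)\]\s*=\s*(.+?)\s*$")


def watched_components(properties_text):
    """Return the process.name[n] values, ordered by n."""
    names = {}
    for line in properties_text.splitlines():
        match = PROCESS_NAME_RE.match(line.strip())
        if match:
            names[int(match.group(1))] = match.group(2)
    return [names[index] for index in sorted(names)]


def missing_connectors(properties_text, installed_ids):
    """Return installed connector IDs that have no WatchList entry."""
    watched = set(watched_components(properties_text))
    return sorted(set(installed_ids) - watched)
```

A non-empty result from missing_connectors suggests checking the installation logs for the connectors it names.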


Troubleshooting Subcomponents

  1. Have all subcomponents been installed?

    Subcomponents must be installed after the connector is installed:

      For Active Directory connectors, no subcomponents are installed.
      For Sun ONE Directory connectors, the plugin must be installed at the Sun ONE Directory Server being synchronized.
      For Windows NT connectors, the Windows change detector and password filter plugins must be installed on the primary domain controller for each Windows NT domain being synchronized. These two subcomponents are installed together after the Windows NT connector has been installed.

    Note

    For the Windows NT SAM Change Detector subcomponent to be effective, you must turn on the NT audit log. Under Start > Programs > Administrative Tools > User Manager, select Policies > Audit Policies.

    Select Audit These Events and then both the Success and Failure boxes for User and Group Management.

    In the Event Viewer, under Event Log Settings > Event Log Wrapping, select Overwrite Events as Needed.

  2. Have the subcomponent post-installation steps been followed?

    After the Directory Server plugin has been installed at the Sun ONE Directory Server, the server must be restarted. After the NT change detector and password filter have been installed on the primary domain controller, the server must be rebooted.

  3. Are the subcomponents running?

    Is the Sun ONE Directory Server where the plugin was installed running? Is the Primary Domain Controller where the Change Detector and Password Filter were installed running?

  4. Have the subcomponents established a network connection to the connector?

    On the machine where the connector is running, verify that the connector is listening for the subcomponent’s connection by running netstat -n -a. The following examples show the results of this command for three different scenarios. (The connector was configured to listen on port 9999.)

    1. The connector is listening for incoming connections, and the subcomponent has successfully connected:

      # netstat -n -a | grep 9999

      *.9999 *.* 0 0 65536 0 LISTEN

      12.13.1.2.44397 12.13.1.2.9999 73620 0 73620 0 ESTABLISHED

      12.13.1.2.9999 12.13.1.2.44397 73620 0 73620 0 ESTABLISHED

      This is the expected result.

    2. The connector is listening for incoming connections, but the subcomponent has not connected:

      # netstat -n -a | grep 9999

      *.9999 *.* 0 0 65536 0 LISTEN

      After verifying that the subcomponent is running, examine the subcomponent’s local logs for potential problems.

    3. The connector is not listening for incoming connections:

      # netstat -n -a | grep 9999

      <no output>

      Verify that the correct port number was specified. Verify that the connector is running and is in the READY state. Examine the connector’s local logs for potential problems.
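
The three netstat scenarios above can be told apart mechanically. This Python sketch assumes Solaris-style netstat lines as in the examples, where the local address is the first column and the port follows the last dot; classify_port and its return values are hypothetical names chosen for illustration.

```python
def classify_port(netstat_output, port):
    """Classify a connector port as "connected", "listening" or "not listening"."""
    suffix = f".{port}"
    listening = established = False
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only consider rows whose local address (first column) ends in .<port>.
        if len(fields) < 2 or not fields[0].endswith(suffix):
            continue
        if fields[-1] == "LISTEN":
            listening = True
        elif fields[-1] == "ESTABLISHED":
            established = True
    if listening and established:
        return "connected"
    if listening:
        return "listening"
    return "not listening"
```

A result of "listening" corresponds to the second scenario (check the subcomponent), and "not listening" to the third (check the port number and the connector).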


Troubleshooting Sun ONE Message Queue

Verify that the Sun ONE Message Queue broker is running. Issuing a telnet command to the machine and port where the Sun ONE Message Queue broker is running will return a list of the active Message Queue services:

# telnet localhost 7676
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
101 psw-broker 3.0.1
cluster tcp CLUSTER 32914
admin tcp ADMIN 32912
portmapper tcp PORTMAPPER 7676
ssljms tls NORMAL 32913
jms tcp NORMAL 32911
.

Connection closed by foreign host.
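
If you save the portmapper listing to a file, the service table can be parsed rather than read by eye. This Python sketch assumes each service line has the four columns shown above (name, protocol, type, port); parse_portmapper is a hypothetical helper name.

```python
def parse_portmapper(output):
    """Return {service name: (protocol, type, port)} from a portmapper listing."""
    services = {}
    for line in output.splitlines():
        fields = line.split()
        # Service lines have exactly four columns and end in a numeric port.
        if len(fields) == 4 and fields[3].isdigit():
            name, protocol, kind, port = fields
            services[name] = (protocol, kind, int(port))
    return services
```

With the sample listing, looking up "ssljms" in the result gives the tls service on port 32913; a missing "ssljms" key indicates the condition described in the next paragraph.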

If the “ssljms tls NORMAL” service is not listed in the output, then examine the Sun ONE Message Queue logs for potential problems. If the core was installed on Solaris, then the Sun ONE Message Queue broker’s log is /var/imq/instances/psw-broker/log/log.txt. Otherwise, if the core was installed on Windows, then the broker’s log is:

<installation-root>\isw-machine-name\imq\var\instances\isw-broker\log\log.txt

If the telnet command fails, then either the broker is not running or the wrong port was specified. Verify the port number by checking the broker’s log. The broker’s port is specified in the following line:

[13/Mar/2003:18:17:09 CST] [B1004]: “Starting the portmapper service using tcp [ 7676, 50 ] with min threads 1 and max threads of 1”
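
The port can also be extracted from the broker log with a script. This Python sketch assumes the B1004 message has exactly the wording shown above; portmapper_port is a hypothetical helper name.

```python
import re

PORTMAPPER_RE = re.compile(
    r"\[B1004\]:.*portmapper service using \w+ \[\s*(\d+)\s*,"
)


def portmapper_port(log_lines):
    """Return the port from the last B1004 portmapper line, or None."""
    port = None
    for line in log_lines:
        match = PORTMAPPER_RE.search(line)
        if match:
            port = int(match.group(1))  # keep the most recent startup
    return port
```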

If the broker is not running, then it can be started on Solaris by running /etc/init.d/imq start and on Windows by starting the iMQ Broker Windows service.

Troubleshooting Broker Configuration Directory Communication

The Sun ONE Message Queue broker authenticates clients against the Sun ONE Directory Server that stores the Identity Synchronization configuration. If the broker is unable to connect to this directory server, no clients will be able to connect to the Sun ONE Message Queue, and the broker log will mention some javax.naming exception, such as “javax.naming.CommunicationException” or “javax.naming.NameNotFoundException”. If this occurs, do the following

Troubleshooting Broker Memory Settings

During normal operation, the Sun ONE Message Queue broker consumes a modest amount of memory. However, during idsync linkusers and idsync resync operations, the broker’s memory requirements increase. If the broker reaches its memory limit, undelivered messages accumulate, the idsync linkusers or idsync resync operation slows down dramatically or stops completely, and the Identity Synchronization system might be unresponsive afterward. When the broker enters a low memory state, the following messages appear in its log:

[03/Nov/2003:14:07:51 CST] [B1089]: In low memory condition, Broker is attempting to free up resources

[03/Nov/2003:14:07:51 CST] [B1088]: Entering Memory State [B0024]: RED from previous state [B0023]: ORANGE - current memory is 1829876K, 90% of total memory
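
To watch for these transitions during a long idsync resync run, the broker log can be scanned for B1088 entries. This Python sketch assumes the message wording shown above; memory_transitions is a hypothetical helper name.

```python
import re

MEMORY_STATE_RE = re.compile(
    r"\[B1088\]: Entering Memory State \[B\d+\]: (\w+) from previous state "
    r"\[B\d+\]: (\w+) - current memory is (\d+)K, (\d+)% of total memory"
)


def memory_transitions(log_lines):
    """Return one dict per B1088 memory-state transition, in log order."""
    transitions = []
    for line in log_lines:
        match = MEMORY_STATE_RE.search(line)
        if match:
            new, previous, kilobytes, percent = match.groups()
            transitions.append({
                "new": new,
                "previous": previous,
                "kilobytes": int(kilobytes),
                "percent": int(percent),
            })
    return transitions
```

A transition whose "new" value is RED matches the low memory condition shown in the sample log.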

To avoid this situation,

If the broker does run out of memory, follow these steps to recover:

  1. Verify that the broker has a backlog of undelivered messages by examining its persistent message store. On Solaris, the broker’s persistent message store is in the /var/imq/instances/psw-broker/filestore/message/ directory, and on Windows it is in the <installation-root>\isw-machine-name\imq\var\instances\isw-broker\filestore\message\ directory. Each file in this directory contains a single undelivered message. If there are more than 10000 files in this directory, then the broker has a backlog of messages. [1] Otherwise, there is another problem with the broker.

    The backlogged messages are most likely only log messages related to an idsync linkusers or idsync resync operation and can safely be removed.

  2. Stop the Sun ONE Message Queue broker as described in Starting and Stopping Services.
  3. Remove all files in the persistent message store. This can most easily be done by recursively removing the message/ directory and then recreating it.
  4. Restart the Sun ONE Message Queue broker.
  5. Follow the steps above to make sure the broker does not run out of memory again.
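
The backlog check in the first recovery step can be scripted instead of listing the directory by hand. This Python sketch counts files under the message store and compares the count with the 10000-file figure given above; the function names are hypothetical, and the store path is whatever applies to your platform.

```python
import os

BACKLOG_THRESHOLD = 10000  # figure given in the recovery steps above


def count_message_files(store_dir):
    """Count regular files under store_dir, including subdirectories."""
    return sum(len(files) for _, _, files in os.walk(store_dir))


def has_backlog(store_dir, threshold=BACKLOG_THRESHOLD):
    """Return True if the store holds more files than the threshold."""
    return count_message_files(store_dir) > threshold
```

For example, on Solaris you would call has_backlog("/var/imq/instances/psw-broker/filestore/message/"). The script only reads the directory; removing the files remains a manual step performed while the broker is stopped.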


Troubleshooting SSL Problems

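When diagnosing problems with SSL, also see Configuring Security, which describes how to set up SSL between components in Sun ONE Identity Synchronization for Windows. This section contains:

  SSL Between Core Components
  SSL between Connectors and the Sun ONE Directory Server or Active Directory
  SSL between the Sun ONE Directory Server Plugin and Active Directory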

SSL Between Core Components

The Identity Synchronization for Windows installer cannot verify that the SSL port provided during core installation is correct. If you mistype the SSL port during core installation, then the core components will not be able to communicate properly. You may not notice any problem until you try to save the configuration for the first time. The console will alert you with the following warning: “The configuration was successfully saved, however, the System Manager could not be notified of the new configuration.”

The system manager logs will have the following entry:

[10/Nov/2003:10:24:35.137 -0600] WARNING 14 example "Failed to connect to the configuration registry because "Unable to connect: (-5981) Connection refused by peer.". Will retry shortly."

In this situation, uninstall the core and install it again with the correct SSL port number.

SSL between Connectors and the Sun ONE Directory Server or Active Directory

If a connector is unable to connect over SSL to the Sun ONE Directory Server or Active Directory, then this message will appear in the central error log:

[06/Oct/2003:14:02:48.911 -0600] WARNING 14 CNN100 host1 "failed to open connection to ldaps://host2.airius.com:636."

Untrusted Certificates

More information will be available in the central audit log. For example, if the LDAP server’s SSL certificate is not trusted, this message will be logged:

[06/Oct/2003:14:02:48.951 -0600] INFO 14 CNN100 host1 "failed to open connection to ldaps://host2.airius.com:636, error(91): Cannot connect to the LDAP server, reason: SSL_ForceHandshake failed: (-8179) Peer’s Certificate issuer is not recognized."

In most situations, the CA certificate has not been added to the connector’s certificate database. This can be confirmed by running the certutil program that ships with the Sun ONE Directory Server. [2] In this example, the certificate database contains no certificates: [3]

# /usr/sunone/servers/shared/bin/certutil -L -d /usr/sunone/servers/isw-host1/etc/CNN100

Certificate Name                                      Trust Attributes

p Valid peer
P Trusted peer (implies p)
c Valid CA
T Trusted CA to issue client certs (implies c)
C Trusted CA to certs(only server certs for ssl) (implies c)
u User cert
w Send warning

In the following example, the certificate database contains only the Active Directory CA certificate:

# /usr/sunone/servers/shared/bin/certutil -L -d /usr/sunone/servers/isw-host1/etc/CNN100

Certificate Name                                 Trust Attributes

airius.com CA                                    C,c,

p Valid peer
P Trusted peer (implies p)
c Valid CA
T Trusted CA to issue client certs (implies c)
C Trusted CA to certs(only server certs for ssl) (implies c)
u User cert
w Send warning

As shown here, the trust flags of the CA certificate must be “C,,”. If the certificate exists and the trust flags are set properly, but the connector still cannot connect, then first verify that the connector was restarted after adding the certificate, and then use the ldapsearch command that ships with the Sun ONE Directory Server to help diagnose the problem. If ldapsearch does not accept the certificate, then neither will the connector. For example, ldapsearch can reject certificates if they are not trusted:

# /usr/sunone/servers/shared/bin/ldapsearch -Z -P /usr/sunone/servers/isw-host1/etc/CNN100 -h host2 -b "" -s base "(objectclass=*)"
ldap_search: Can’t contact LDAP server
    SSL error -8179 (Peer’s Certificate issuer is not recognized.)

The -P option directs ldapsearch to use connector CNN100’s certificate database for SSL certificate validation. After the correct certificate is added to the connector’s certificate database, verify that ldapsearch accepts the certificate, and then restart the connector.

Expired Certificates

If the server’s certificate has expired, this message will be logged:

[06/Oct/2003:14:06:47.130 -0600] INFO 20 CNN100 host1 "failed to open connection to ldaps://host2.airius.com:636, error(91): Cannot connect to the LDAP server, reason: SSL_ForceHandshake failed: (-8181) Peer’s Certificate has expired."

In this case, the server must be issued a new certificate.
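
The two handshake failures described above differ only in the numeric code inside the log message, so they can be triaged by code. This Python sketch covers only the two codes that appear in this section (-8179 and -8181) and reports anything else as unknown; diagnose_ssl_failure is a hypothetical helper name.

```python
import re

SSL_ERROR_RE = re.compile(r"SSL_ForceHandshake failed: \((-\d+)\)")

# Only the two codes documented in this section.
HINTS = {
    -8179: "issuer not recognized: add the CA certificate to the certificate database",
    -8181: "certificate expired: issue the server a new certificate",
}


def diagnose_ssl_failure(log_line):
    """Return (code, hint) for an SSL handshake failure line, or None."""
    match = SSL_ERROR_RE.search(log_line)
    if match is None:
        return None
    code = int(match.group(1))
    return code, HINTS.get(code, "unknown code: check the log message text")
```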

SSL between the Sun ONE Directory Server Plugin and Active Directory

By default, the Sun ONE Directory Server does not communicate with Active Directory over SSL when performing on-demand password synchronization. If the default is overridden to protect this communication with SSL, then the Active Directory CA certificate must be added to the Sun ONE Directory Server certificate database of each master replica as described in Configuring Security. If this certificate is not added, users will fail to bind to the Sun ONE Directory Server with the error “DSA is unwilling to perform.”, and the plugin’s log (e.g. isw-hostname/logs/SUBC100/pluginwps_log_0.txt) will report:

[06/Nov/2003:15:56:16.310 -0600] INFO td=0x0376DD74 logCode=81 ADRepository.cpp:310 "unable to open connection to Active Directory server at ldaps://host2.airius.com:636, reason: "

In this situation, the Active Directory CA certificate must be added to the directory server’s certificate database and the directory server restarted.

[1] Even if all messages have been delivered, the broker might maintain up to 10000 message files to avoid the performance penalty of creating and deleting files.

[2] Before running this command on Solaris, the <installation-root>/lib directory must be added to the LD_LIBRARY_PATH environment variable.

[3] The default certificate databases for the Sun ONE Directory Server and Windows NT connectors include two certificates, saint-cert100 and saintRootCA. These certificates are not used in this release.





Copyright 2003 Sun Microsystems, Inc. All rights reserved.