Sun Directory Server Enterprise Edition 7.0 Troubleshooting Guide

Analyzing Replication Halt Data

Use the data you collected to determine if the replication halt is the result of a problem on the supplier or the consumer.

Use the nsds50ruv attribute output that you collected to determine the last CSN that was replicated to a particular consumer. Then, with the consumer's access and errors logs set to collect replication-level output, use those logs to confirm the last CSN that the consumer received. From this CSN, you can determine the next CSN that the replication process is failing to provide. For example, replication may be failing because the supplier is not sending the CSN, because the network is blocking the CSN, or because the consumer is refusing to accept the update.
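When comparing CSNs from the nsds50ruv output with CSNs in the logs, it can help to split a CSN into its fields. The following is a minimal sketch, not part of the product. It assumes the common 20-hex-digit CSN layout (8 digits of timestamp, 4 of sequence number, 4 of replica ID, 4 of sub-sequence number); verify this layout against your own data before relying on it. The function name decode_csn is hypothetical.

```shell
# decode_csn CSN
# Splits a 20-hex-digit CSN into decimal fields.
# ASSUMPTION: layout is timestamp(8) + sequence(4) + replica ID(4) + sub-sequence(4).
decode_csn() {
    csn=$1
    if [ "${#csn}" -ne 20 ]; then
        echo "decode_csn: expected 20 hex digits, got '$csn'" >&2
        return 1
    fi
    # Cut the fixed-width hexadecimal fields out of the string.
    ts=$(printf '%s' "$csn" | cut -c1-8)
    seq=$(printf '%s' "$csn" | cut -c9-12)
    rid=$(printf '%s' "$csn" | cut -c13-16)
    sub=$(printf '%s' "$csn" | cut -c17-20)
    # printf converts the 0x-prefixed hexadecimal values to decimal.
    printf 'timestamp=%d seq=%d replica=%d subseq=%d\n' \
        "0x$ts" "0x$seq" "0x$rid" "0x$sub"
}
```

The timestamp field is printed as seconds since the epoch, and the replica field shows which supplier originated the change, which tells you whose logs to inspect next.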

The supplier might be unable to apply the CSN on the consumer. Use the grep command to search the consumer's access log for the CSN that the supplier cannot update, as follows:

grep csn=xxxxxxxx consumer-access-log

If you do not find the CSN, search for the last CSN that was successfully committed on both the supplier and the consumer that are currently failing. Using CSNs, you can narrow your search for the error.
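The search described above can be repeated across several log files at once. The following is a minimal sketch that wraps grep; the function name find_csn and the log file names in the usage note are hypothetical.

```shell
# find_csn CSN LOGFILE...
# Prints every line that mentions the given CSN, prefixed with the
# file name and line number. Returns non-zero if nothing matches.
find_csn() {
    csn=$1
    shift
    # -F treats the pattern as a fixed string, -n adds line numbers.
    # /dev/null forces grep to print file names even for one log file.
    grep -n -F -- "csn=$csn" "$@" /dev/null
}
```

For example, find_csn xxxxxxxx consumer-access-log consumer-errors-log searches both consumer logs for the same CSN, so that the matching lines can be compared side by side.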

By using the grep command to search for CSNs in the access and errors logs, you can determine if an error is only transient. Always match the error messages in the errors log with their corresponding access log activity.

If analysis proves that replication is always looping on the same CSN with an etime=0 and an err=32 or err=16, the replication halt is likely to be a critical error. If the replication halt arises from a problem on the consumer, you can run the replck tool to fix the problem by patching the contents of the looping entry in the physical database.
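To see whether replication is looping on one CSN, you can count how often that CSN appears together with err=32 or err=16. The following is a minimal sketch. It assumes that the CSN and the err= value appear on the same log line; if your access log records them on separate lines, correlate them by connection and operation number instead. The function name count_csn_failures is hypothetical.

```shell
# count_csn_failures CSN LOGFILE
# Prints the number of lines that mention the CSN and report
# err=16 or err=32.
# ASSUMPTION: the csn= and err= fields share a single log line.
count_csn_failures() {
    csn=$1
    log=$2
    # First keep only lines for this CSN, then count the failing ones.
    # The trailing group stops err=32 from matching err=320 and similar.
    grep -F -- "csn=$csn" "$log" | grep -E -c 'err=(16|32)([^0-9]|$)'
}
```

A count that keeps growing between two runs for the same CSN suggests a loop rather than a transient error.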

If instead analysis shows that the CSN never appears in the consumer logs, then the problem is likely on the supplier side or in the network. If the problem originates with the supplier, you can sometimes restart replication by forcing the replication agreement to send updates to the remote replica or by restarting the supplier. Otherwise, a reinitialization may be required.

To force updates to the remote replica from the local suffix, use the following command:

# dsconf update-repl-dest-now -h host -p port suffix-DN host:port

Resolving a Problem With the Schema

If the errors log contains messages indicating a problem with the schema, then collect further schema-related information. Before changes are sent from a supplier to a consumer, the supplier verifies that the change adheres to the schema. When an entry does not comply with the schema and the supplier tries to update this entry, a loop can occur.

To remedy a problem that arises because of the schema, get a single supplier that can act as the master reference for schema. Take the contents of its /install-path/resources/schema directory. Tar the directory as follows:

# tar -cvf schema.tar schema

Use FTP to copy this tar file to all of the other suppliers and consumers in your topology. On each of these servers, remove the /install-path/resources/schema directory and replace it with the contents of the tar file that you created on the master schema reference.
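The replacement step on each server can be sketched as follows. This is an illustration under assumptions, not a product procedure: the function name replace_schema_dir is hypothetical, the archive is assumed to contain a top-level schema directory, and the sketch renames the old directory to schema.bak instead of removing it so that the change can be undone.

```shell
# replace_schema_dir PARENT_DIR ARCHIVE
# PARENT_DIR is the directory that contains "schema"
# (for example /install-path/resources).
# ARCHIVE must be an absolute path to the tar file.
replace_schema_dir() {
    parent=$1
    archive=$2
    if [ ! -f "$archive" ]; then
        echo "replace_schema_dir: no such archive: $archive" >&2
        return 1
    fi
    if [ -e "$parent/schema.bak" ]; then
        echo "replace_schema_dir: $parent/schema.bak already exists" >&2
        return 1
    fi
    # Keep the old schema as a backup rather than deleting it outright.
    if [ -d "$parent/schema" ]; then
        mv "$parent/schema" "$parent/schema.bak" || return 1
    fi
    # Extract in a subshell so the caller's working directory is unchanged.
    ( cd "$parent" && tar -xf "$archive" )
}
```

After you confirm that the server works with the new schema, the schema.bak directory can be deleted.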