Sun Java System Directory Server Enterprise Edition 6.3 Troubleshooting Guide

About Replica Update Vectors (RUVs)

Every replica in a replication topology stores its current replication state in a replica update vector (RUV). A running server holds the RUV in memory, where it records exactly what this replica knows about itself and about every other participant in the topology. The RUV entry on a given server contains one line for each master participating in the topology. Each line contains the identifier of one of the masters, the URL of that replica, and the CSNs (change sequence numbers) of the first and last changes made on that master. These CSNs record only the first and last changes known to this server, so the last CSN is not necessarily the most recent change made by the master.

The RUV is written to disk every 30 seconds, in the following entry:


nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,suffix-name

The in-memory RUV can be read by running ldapsearch against the cn=replica,cn=suffix,cn=mapping tree,cn=config entry. For example, an ldapsearch for the ou=people suffix might yield the following results:


# ldapsearch -h host1 -p 1389 -D "cn=Directory Manager" -w secret \
-b "cn=replica,cn=ou=people,cn=mapping tree,cn=config" -s base "objectclass=*" nsds50ruv
nsds50ruv: {replicageneration} 45e8296c000000010000
nsds50ruv: {replica 1 ldap://server1:1389} 45ed8751000000010000 4600f252000000010000
nsds50ruv: {replica 2 ldap://server1:2389} 45eec0e1000000020000 45f03214000000020000
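
Output in this form is straightforward to post-process when you need to compare several servers. The following Python sketch turns the nsds50ruv lines into a dictionary keyed by replica ID. The function name parse_ruv and the regular expression are illustrative assumptions based only on the sample output above, not a documented format specification.

```python
import re

# One RUV element: {replica <id> <url>} followed by an optional first and last CSN.
# The 20-hex-digit CSN width is taken from the sample output, as an assumption.
RUV_LINE = re.compile(
    r"^nsds50ruv:\s*\{replica\s+(\d+)\s+(\S+)\}"
    r"\s*([0-9a-fA-F]{20})?\s*([0-9a-fA-F]{20})?\s*$"
)

def parse_ruv(text):
    """Return {replica_id: (url, first_csn, last_csn)} from ldapsearch output."""
    ruv = {}
    for line in text.splitlines():
        match = RUV_LINE.match(line.strip())
        if match:  # the {replicageneration} line does not match and is skipped
            rid, url, first, last = match.groups()
            ruv[int(rid)] = (url, first, last)
    return ruv

sample = """\
nsds50ruv: {replicageneration} 45e8296c000000010000
nsds50ruv: {replica 1 ldap://server1:1389} 45ed8751000000010000 4600f252000000010000
nsds50ruv: {replica 2 ldap://server1:2389} 45eec0e1000000020000 45f03214000000020000
"""

ruv = parse_ruv(sample)
print(ruv[2][2])  # last CSN known for replica ID 2: 45f03214000000020000
```

A replica line that carries no CSNs yet is still returned, with None for both CSN fields.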

For clarity, we will simplify the RUV syntax to CSNchange-number-replica-id. The change-number shows where a change falls in the sequence of changes that occurred on the master. For example, 45ed8751000000010000 can be written as CSN05-1. In the previous example, master 1 contains the following RUV:


replica 1: CSN05-1 CSN43-1
replica 2: CSN05-2 CSN40-2

The first line provides information about the first change and the last change that this replica knows about from itself, master 1, as indicated by the replica ID 1. The second line provides information about the first change and the last change that it knows about from master 2. The information that is most interesting to us is the last change. In normal operations, master 1 should know more about its own updates than master 2 does. We confirm this by looking at the RUV for master 2:


replica 2: CSN05-2 CSN50-2
replica 1: CSN01-1 CSN35-1

Looking at the last change, we see that master 2 knows more about its own last change (CSN50-2) than master 1 does (master 1 shows the last change from master 2 as CSN40-2). Conversely, master 1 knows more about its own last change (CSN43-1) than master 2 does (CSN35-1).
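
The comparison above can be sketched in Python using the simplified notation, where each RUV is reduced to a mapping from replica ID to the last known change number. The function name lag and the dictionary representation are assumptions made for illustration. With real CSNs the difference is not a count of changes, so only the ordering would be meaningful.

```python
def lag(local, remote):
    """For each replica ID, report how far `local` is behind `remote`.

    Both arguments map replica ID -> last known change number
    (the simplified CSN notation used in this section).
    """
    behind = {}
    for rid, remote_last in remote.items():
        local_last = local.get(rid, 0)
        if remote_last > local_last:
            behind[rid] = remote_last - local_last
    return behind

# RUVs from the example: last change known for each replica ID.
master1 = {1: 43, 2: 40}
master2 = {1: 35, 2: 50}

print(lag(master1, master2))  # {2: 10}  master 1 is behind on changes from master 2
print(lag(master2, master1))  # {1: 8}   master 2 is behind on changes from master 1
```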

When troubleshooting problems with replication, the CSNs can be useful in identifying the problem. Master 1 should always know at least as much about its own replica ID as any other participant in the replication topology because the change was first applied on master 1 and then replicated. So, CSN43-1 should be the highest value attributed to replica ID 1 in the topology.

A problem is identified if, for example, after 30 minutes the last change for replica ID 2 in the RUV on master 1 is still CSN40-2, but on master 2 it has advanced significantly to CSN67-2. This indicates that replication is not happening from master 2 to master 1.
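
This check can be expressed as a comparison of two snapshots of the same RUV taken some time apart. The following Python sketch again uses the simplified notation. The function name stalled_replica_ids and the snapshot dictionaries are hypothetical, and the change number 43 for replica ID 1 on master 2 is made up for the example.

```python
def stalled_replica_ids(consumer_then, consumer_now, supplier_now):
    """Replica IDs whose last change did not advance on the consumer
    between two snapshots, although the supplier has newer changes."""
    return [
        rid
        for rid, now in consumer_now.items()
        if now == consumer_then.get(rid) and supplier_now.get(rid, 0) > now
    ]

# Master 1 at the start and after 30 minutes: nothing new from replica ID 2.
master1_then = {1: 43, 2: 40}
master1_now = {1: 43, 2: 40}
# Master 2 after 30 minutes: its own changes have advanced to CSN67-2.
master2_now = {1: 43, 2: 67}

print(stalled_replica_ids(master1_then, master1_now, master2_now))  # [2]
```

An unchanged value alone is not a problem. It only indicates a stall when another server shows newer changes for the same replica ID.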

If a failure occurs and you need to reinitialize the topology while saving as much data as possible, you can use the RUVs to determine which machine contains the most recent changes. For example, suppose the replication topology described previously includes a hub that contains the following RUV:


2: CSN05-2 CSN50-2
1: CSN05-1 CSN43-1

In this case, hub 1 seems like a good candidate for providing the most recent changes, because it holds the latest known change from each master (CSN50-2 and CSN43-1).
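
The selection can be sketched as follows: compute the highest last change seen anywhere for each replica ID, then keep the servers that have reached that value for every replica ID. This is a minimal Python illustration in the simplified notation. The function name most_current and the server labels are assumptions, not part of any Directory Server tool.

```python
def most_current(ruvs):
    """Return the servers whose RUV is at least as recent as every other
    server's RUV, for every replica ID in the topology.

    `ruvs` maps server name -> {replica ID: last known change number}.
    """
    best = {}
    for ruv in ruvs.values():
        for rid, last in ruv.items():
            best[rid] = max(best.get(rid, 0), last)
    return [
        name
        for name, ruv in ruvs.items()
        if all(ruv.get(rid, 0) >= last for rid, last in best.items())
    ]

topology = {
    "master1": {1: 43, 2: 40},
    "master2": {1: 35, 2: 50},
    "hub1": {1: 43, 2: 50},
}

print(most_current(topology))  # ['hub1']
```

If the result is empty, no single server holds every latest change, and choosing a source for reinitialization means accepting the loss of some changes.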

Using the nsds50ruv Attribute to Troubleshoot 5.x Replication Problems

When a server stops, the nsds50ruv attribute is not stored in the cn=replica entry. At least every 30 seconds, it is stored as an LDAP subentry in the nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,suffix-name entry, as described above. This information is stored in the suffix rather than in the configuration file because that is the only way to export it into a file. Topology initialization takes place while the servers are offline: the data is exported into an LDIF file and then reimported. If this attribute were not stored in the exported file, the new replica would not have the correct information after the import.

Whenever you use the db2ldif -r command, the nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,suffix-name entry is included.
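
To check that an exported file carries the RUV, you can look for that entry in the LDIF. The Python sketch below is a minimal illustration. The function name find_ruv_entry and the sample LDIF, including the dc=example,dc=com suffix and the object classes shown, are made up for the example. It handles folded LDIF lines but not base64-encoded DNs.

```python
RUV_RDN = "nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,"

def find_ruv_entry(ldif_text):
    """Return the lines of the RUV subentry in an LDIF export, or None."""
    # Unfold LDIF continuation lines (a newline followed by a single space).
    unfolded = ldif_text.replace("\r\n", "\n").replace("\n ", "")
    for block in unfolded.split("\n\n"):
        lines = [l for l in block.splitlines() if l and not l.startswith("#")]
        if lines and lines[0].lower().startswith("dn: " + RUV_RDN):
            return lines
    return None

# Hypothetical export fragment; the last line is folded as LDIF allows.
sample_ldif = """\
dn: dc=example,dc=com
objectClass: top
objectClass: domain
dc: example

dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=example,dc=com
objectClass: top
nsds50ruv: {replicageneration} 45e8296c000000010000
nsds50ruv: {replica 1 ldap://server1:1389} 45ed8751000000010000 4600f2
 52000000010000
"""

entry = find_ruv_entry(sample_ldif)
print(entry is not None)  # True
```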

Using the nsds50ruv and ds6ruv Attributes to Troubleshoot 6.x Replication Problems

In 6.0 and later versions of Directory Server, you can also use the nsds50ruv attribute to see the internal state of the consumer, as described in the previous section. If you are using the replication priority feature, you can use the ds6ruv attribute, which contains information about the priority operations. When replication priority is configured, you create replication rules to specify that certain changes, such as updating the user password, are replicated with high priority. For example, the RUV appears as follows:


nsds50ruv: {replicageneration} 4405697d000000010000
nsds50ruv: {replica 2 ldap://server1:2389}
nsds50ruv: {replica 1 ldap://server1:1390} 440569aa000000010000 44056a23000200010000
ds6ruv: {PRIO 2 ldap://server1:2389}
ds6ruv: {PRIO 1 ldap://server1:1390} 440569b6000100010000 44056a30000800010000 
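
The CSNs in this sample carry non-zero digits after the timestamp, so it can help to split a CSN into its fields when comparing values. The following Python sketch assumes a 20-digit layout of 8 hexadecimal digits of timestamp (seconds since the epoch), 4 of sequence number, 4 of replica ID, and 4 of subsequence number. That layout is consistent with the samples in this guide but is stated here as an assumption. The names CSN and decode_csn are illustrative.

```python
from datetime import datetime, timezone
from typing import NamedTuple

class CSN(NamedTuple):
    timestamp: int   # seconds since the epoch (assumed)
    seqnum: int      # orders changes made within the same second (assumed)
    replica_id: int  # ID of the master where the change originated
    subseqnum: int

def decode_csn(csn: str) -> CSN:
    """Split a 20-hex-digit CSN string into its assumed fields."""
    if len(csn) != 20:
        raise ValueError("expected 20 hexadecimal digits, got %r" % csn)
    return CSN(
        int(csn[0:8], 16),
        int(csn[8:12], 16),
        int(csn[12:16], 16),
        int(csn[16:20], 16),
    )

last = decode_csn("44056a23000200010000")       # last CSN for replica 1
last_prio = decode_csn("44056a30000800010000")  # last priority CSN for replica 1

print(last.seqnum, last.replica_id)            # 2 1
print(last_prio.seqnum, last_prio.replica_id)  # 8 1
# Human-readable time of the change, under the timestamp assumption above.
print(datetime.fromtimestamp(last.timestamp, tz=timezone.utc).isoformat())
```

Because the fields are fixed-width hexadecimal, two CSN strings written in the same letter case can also be compared directly as strings.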

To see the replication information, export the suffix to an LDIF file:


# dsadm export instance-path suffix-dn [suffix-dn] ldif-file