Sun Java System Directory Server Enterprise Edition 6.0 Reference

Chapter 4 Directory Server Replication

This chapter includes the following sections:

Introduction to Replication

This introduction to replication addresses the following topics:

Types of Replica

A database that participates in replication is called a replica. There are three kinds of replica:

A master replica is a read-write database that contains a master copy of the directory data. A master replica can perform the following tasks:
- Respond to update requests and read requests from directory clients
- Maintain historical information and a change log for the replica
- Initiate replication to consumers or hubs
A consumer replica is a read-only database that contains a copy of the information held in a master replica. A consumer replica can perform the following tasks:
- Respond to read requests
- Maintain historical information for the replica
- Refer update requests to servers that contain a master replica
A hub replica is a read-only database, like a consumer replica, but stored on a directory server that supplies one or more consumer replicas. A hub replica can perform the following tasks:
- Respond to read requests
- Maintain historical information and a change log for the replica
- Initiate replication to consumers
- Refer update requests to servers that contain a master replica

A single instance of Directory Server can be configured to manage several replicas.

A replica can act as a supplier of updates, or a consumer of updates, or both.

A supplier is a replica that copies information to another replica.

A master replica can be a supplier to a hub replica and a consumer replica. A hub replica can be a supplier to a consumer replica. In multi-master replication, one master replica can be a supplier to another master replica.
A consumer is a replica to which another replica copies information.

A hub replica and a consumer replica can be consumers of a master replica. A consumer replica can be a consumer of a hub replica. In multi-master replication, one master replica can be a consumer of another master replica.

A replica can be promoted or demoted to change its behavior with respect to other replicas. Dedicated consumers can be promoted to hubs, and hubs can be promoted to masters. Masters can be demoted to hubs, and hubs can be demoted to dedicated consumers.

A server that contains a consumer replica only is called a dedicated consumer.

Unit of Replication

The smallest logical unit of replication is a suffix, also known as a naming context. The term suffix arises from the way the base DN for the naming context is a suffix for all DNs in that context. For example, the suffix dc=example,dc=com contains all directory entries in the Example.com naming context.

The replication mechanism requires one suffix to correspond to one database. The unit of replication applies to both suppliers and consumers. Therefore, two suffixes on a master replica cannot be replicated to one suffix on a consumer replica, and vice versa.

Replica Identity

Master replicas require a unique replica identifier that is a 16-bit integer between 1 and 65534. Consumer and hub replicas all have the replica ID of 65535. The replica ID identifies the replica on which changes are made.

If multiple suffixes are configured on one master, you can use the same replica ID for each suffix on the master. In this way, when a change is made on that replica ID, it is possible to identify the server on which change was made.

Replication Agreements

Replication agreements define the relationships between a supplier and a consumer. The replication agreement is configured on the supplier. A replication agreement contains the following replication parameters:

The suffix to replicate.
The consumer server to which the data is pushed.
The replication schedule.
The bind DN and credentials the master must use to bind to the consumer.
How the connection is secured.
Which attributes to exclude or include in fractional replication, if fractional replication is configured.
The group and window sizes to configure the number of changes you can group into one request and the number of requests that can be sent before consumer acknowledgement is required.
Information about the replication status for this agreement.
The level of compression used in replication on Solaris and Linux systems.

Replication Authentication

Before a master can update a consumer, the consumer authenticates the master by using a special entry called the Replication Manager entry. The master uses the Replication Manager entry to bind to the consumer.

The Replication Manager entry has a special user profile that bypasses all access control rules defined on the consumer server. The special user profile is only valid in the context of replication.

The Replication Manager entry has the following characteristics.

On a consumer server, the Replication Manager is the user who is allowed to perform updates. The entry for Replication Manager must be present for all replicas.
The bind DN of the Replication Manager entry is set in the replication agreement. The bind DN must be configured for hubs, or masters to point to an existing Replication Manager entry.
For initialization and security reasons, the Replication Manager entry cannot be part of the replicated data.

The Replication Manager entry is created by default when you configure replication through the browser-based interfaceDirectory Service Control Center. You can also create your own Replication Manager entry. For information about how to create a Replication Manager entry, see Using a Non-Default Replication Manager in Sun Java System Directory Server Enterprise Edition 6.0 Administration Guide.

Authentication can be performed in the following ways for SSL with replication.

For SSL server authentication, you must have a Replication Manager entry, and its associated password, in the server you are authenticating to.
For SSL client authentication, you must have an entry that contains a certificate in the server you are authenticating to. This entry may or may not be mapped to the Replication Manager entry.

Replication Change Log

All modifications received by a master replica are recorded in a change log. A change log is maintained on all master replicas and hub replicas.

In earlier versions of Directory Server, the change log was accessible over LDAP. In this version of the product, the change log is not accessible over LDAP but is stored in its own database. If your application needs to read the change log, use the retro change log plug-in for backward compatibility. For more information about the retro change log plug-in, see Replication and the Retro Change Log Plug-In.

Change Sequence Number

Each change to a master replica is identified by a change sequence number, CSN. The CSN is generated by the master server and is not visible to the client application. The CSN contains the timestamp, a sequence number, the replica ID, and a subsequence number. The change log is ordered by the CSN.

Replica Update Vector

The replica update vector, RUV, identifies the state of each replica in a topology. Stored on the supplier and on the consumer, the RUV is used to establish which changes need to be replicated. The RUV stores the URL of the supplier, the ID of the supplier, the minimum CSN, and the maximum CSN.

RUVs can be read through nsds50ruv(5dsconf) and nsds50ruv(5dsconf) attributes.

Deleted Entries: Tombstones

Directory entries deleted on one replica are maintained by Directory Server until no longer needed for replication. Such deleted entries are called tombstones, as they have objectclass: nsTombstone. In rare cases, you might need to remove tombstones manually over LDAP.

Tombstones visible only to Directory Manager. Furthermore, tombstones show up only in a search with filter (objectclass=nsTombstone). The following ldapsearch command returns tombstone entries under dc=example,dc=com.

$ ldapsearch -D "cn=Directory Manager" -b dc=example,dc=com "(objectclass=nsTombstone)"

Consumer Initialization and Incremental Updates

During consumer initialization, or total update, all data is physically copied from a master to a consumer. When you have created a replication agreement, the consumer defined by that agreement must be initialized. When a consumer has been initialized, the master can begin to replay, or replicate, update operations to the consumer. Under normal circumstances, the consumer should not require further initialization. However, if the data on a master is restored from a backup, it might be necessary to re-initialize the consumers that depend on that master.

In a multi-master replication topology, the default behavior of a read-write replica that has been re-initialized from a backup or from an LDIF file, is to refuse client update requests. By default, the replica remains in read-only mode until it is configured to accept updates again. You set the suffix property repl-accept-client-update-enabled to on using the dsconf set-suffix-prop command when the oldest updates are on the read-only replica.

When a consumer has been initialized, replication updates are sent to the consumer when the modifications are made on the supplier. These updates are called incremental updates. A consumer can be incrementally updated by several suppliers at once, provided that the updates originate from different replica IDs.

The binary copy feature can be used to clone master replicas or consumer replicas by using the binary backup files of one server to restore another server. For information about how to use binary copy for replication, see Initializing a Replicated Suffix by Using Binary Copy in Sun Java System Directory Server Enterprise Edition 6.0 Administration Guide.

Referrals and Replication

When a consumer receives a request to modify data, it does not forward the request to the server that contains the master replica. Instead, it returns to the client a list of the URLs of the masters that can satisfy the request. These URLs are called referrals.

The replication mechanism automatically configures consumers to return referrals for all known masters in the replication topology. However, you can also add your own referrals and overwrite the referrals set automatically by the server. The ability to control referrals helps enables you to perform the following tasks:

Point referrals to secure ports only
Point to a Directory Proxy Server instead for load balancing
Redirect to local servers only in the case of servers separated by a WAN
Limit referrals to a subset of masters in four-way multi-master topologies

Directory Proxy Server is able to follow referrals.

Replication Configurations

This section covers the following topics:

For information about planning your replication, see the Sun Java System Directory Server Enterprise Edition 6.0 Deployment Planning Guide.

Multi-Master Replication

In multi-master replication, replicas of the same data exist on more than one server. For information about multi-master replication, see the following sections:

Concepts of Multi-Master Replication

In a multi-master configuration, data is updated on multiple masters. Each master maintains a change log, and the changes made on each master are replicated to the other servers. Each master plays the role of supplier and consumer.

Multi-master replication can cause synchronization conflicts. Conflicts are usually resolved automatically by using the timestamp associated with each change, where the most recent change takes precedence. Some rare conflicts must be resolved manually. For more information, see Solving Common Replication Conflicts in Sun Java System Directory Server Enterprise Edition 6.0 Administration Guide.

Each supplier in a multi-master environment must have a replication agreement. The following figure shows two master servers and their replication agreements.

Figure 4–1 Multi-Master Replication Configuration (Two Masters)

Figures shows multi-master replication with two master
servers and their replication agreements.

In the preceding figure, Master A and Master B have a master replica of the same data. Each master has a replication agreement that specifies the replication flow. Master A acts as a master in the scope of Replication Agreement 1, and as a consumer in the scope of Replication Agreement 2.

Multi-master replication can be used for the following tasks:

To replicate updates by using the replica ID.

Updates by using the replica ID make it possible for a consumer to be updated by multiple suppliers at the same time, provided that the updates originate from different replica IDs.
To enable or disable a replication agreement.

Replication agreements can be configured but left disabled, then enabled rapidly when required. This feature provides flexibility in replication configuration. This can be done whether you use multiple masters or not.

Multi-Master Replication Over Wide Area Networks

Directory Server supports multi-master replication over WANs, enabling multi-master replication configurations across geographical boundaries in international, multiple data center deployments.

The replication protocol provides full asynchronous support, window and grouping mechanisms, and support for compression. These features render multi-master replication over WAN a viable deployment possibility.

In a multi-master replication over WAN configuration, all instances of Directory Server separated by a WAN must support multi-master replication over WANs.

Group Mechanism and Window Mechanism

The group mechanism and window mechanism can be used to group changes rather than send them individually. The group mechanism and window mechanism can also be used to specify a number of requests that can be sent to the consumer without the supplier waiting for an acknowledgement from the consumer.

For information about how to adjust the group size and window size, see Configuring Network Parameters in Sun Java System Directory Server Enterprise Edition 6.0 Administration Guide.

Replication Compression Mechanisms

Replication compression helps to streamline replication flow and avoid bottlenecks caused by limited bandwidth.

Fully Meshed Multi-Master Topology

In a fully meshed multi-master topology, each master is connected to each of the other masters. A fully meshed topology provides high availability and guaranteed data integrity. The following figure shows a fully meshed, four-way, multi-master replication topology with some consumers.

Figure 4–2 Fully Meshed, Four-Way, Multi-Master Replication Configuration

Figure shows a fully meshed, four-way, multi-master replication
topology

In Figure 4–2, the suffix is held on four masters to ensure that it is always available for modification requests. Each master maintains its own change log. When one of the masters processes a modification request from a client, it records the operation in its change log. The master then sends the replication update to the other masters, and in turn to the other consumers. Each master also stores a Replication Manager entry used to authenticate the other masters when they bind to send replication updates.

Each consumer stores one or more entries that correspond to the Replication Manager entries. The consumers use the entries to authenticate the masters when they bind to send replication updates. It is possible for each consumer to have just one Replication Manager entry that enables all masters to use the same Replication Manager entry for authentication. By default, the consumers have referrals set up for all masters in the topology. When consumers receive modification requests from the clients, they send the referrals to back to the client. For more information about referrals, see Referrals and Replication.

Figure 4–3 presents a detailed view of the replication agreements, change logs, and Replication Manager entries that must be set up on Master A.Figure 4–4 provides the same detailed view for Consumer E.

Figure 4–3 Replication Configuration for Master A (Fully Meshed Topology)

Figure shows the replication agreements, change logs,
and Replication Manager entries in a fully meshed replication topology.

Figure 4–4 Replication Configuration for Consumer Server E (Fully Meshed Topology)

Figure shows a detailed view of the Replication Manager
entries that must be set up on Consumer E in a fully meshed topology.

Master A requires the following:

A master replica
A change log
Replication Manager entries for Masters B, C, and D, unless you use the same Replication Manager entry on each replica
Replication agreements for Masters B, C, and D, and for Consumers E, and F

Consumer E requires the following:

A consumer replica
Replication Manager entries to authenticate Masters A, and B when they bind to send replication updates

Cascading Replication

In a cascading replication configuration, a server acting as a hub receives updates from a server acting as a supplier. The hub replays those updates to consumers. The following figure illustrates a cascading replication configuration.

Figure 4–5 Cascading Replication Configuration

Cascading replication is useful in the following scenarios:

When there are a lot of consumers.

Because the masters in a replication topology handle all update traffic, it could put them under a heavy load to support replication traffic to the consumers. You can off-load replication traffic to several hubs that can each service replication updates to a subset of the consumers.
To reduce connection costs by using a local hub in geographically distributed environments.

The following figure shows cascading replication to a large number of consumers.

Figure 4–6 Cascading Replication to a Large Number of Consumers

Figure shows cascading replication to a large number
of consumers with replication traffic through several hubs..

In Figure 4–6, hubs 1 and 2 relay replication updates to consumers 1 through 10, leaving the master replicas with more resources to process directory updates.

The masters and the hubs maintain a change log. However, only the masters can process directory modification requests from clients. The hubs contains a Replication Manager entry for each master that sends updates to them. Consumers 1 through 10 contain Replication Manager entries for hubs 1 and 2.

The consumers and hubs can process search requests received from clients, but cannot process modification requests. The consumers and hubs refer modification requests to the masters.

Prioritized Replication

In previous versions of Directory Server, updates were replicated in chronological order. In this version of the product, updates can be prioritized for replication. Priority is a boolean feature, it is on or off. There are no levels of priority. In a queue of updates waiting to be replicated, updates with priority are replicated before updates without priority.

Priority rules are configured with the following replication priority rule properties:

The identity of the client, bind-dn.
The type of update, op-tyupe.
The entry or subtree that was updated, base-dn.
The attributes changed by the update, att.

For information about these properties, see repl-priority(5dsconf).

When the master replicates an update to one or more hubs or consumer replicas, the priority of the update is the same across all of the hubs and consumer replicas. If one parameter is configured in a priority rule for prioritized replication, all updates that match that parameter are prioritized for replication. If two or more parameters are configured in a priority rule for prioritized replication, all updates that match all parameters are prioritized for replication.

In the following scenario, it is possible that a master replica attempts to replicate an update to an entry before it has replicated the addition of the entry:

The entry is added on the master replica and then updated on the master replica
The update operation has replication priority but the add operation does not have replication priority

In this scenario, the update operation cannot be replicated until the add operation is replicated. The update waits for its chronological turn, after the add operation, to be replicated.

Fractional Replication

Fractional replication can be used to replicate a subset of the attributes of all entries in a suffix or sub-suffix. Fractional replication can be configured, per agreement, to include attributes in the replication or to exclude attributes from the replication. Usually, fractional replication is configured to exclude attributes. The interdependency between features and attributes make managing a list of included attributes difficult.

Fractional replication can be used for the following purposes:

To filter content for synchronization between intranet and extranet servers
To reduce replication costs when a deployment requires only certain attributes to be available everywhere

Fractional replication is configured with the replication agreement properties repl-fractional-include-attr and repl-fractional-exclude-attr attributes. For information about these properties, see repl-agmt(5dsconf). For information about how to configure fractional replication, see Fractional Replication in Sun Java System Directory Server Enterprise Edition 6.0 Administration Guide.

Fractional replication is not backward compatible with versions of Directory Server prior to Directory Server 5.2. If you are using fractional replication, ensure that no other instances of Directory Server are prior to Directory Server 5.2.

Replication and the Retro Change Log Plug-In

The retro change log is a plug-in used by LDAP clients for maintaining application compatibility with earlier versions of Directory Server. The retro change log is stored in a separate database from the Directory Server change log, under the suffix cn=changelog.

A retro change log can be enabled on a standalone server or on each server in a replication topology. When the retro change log is enabled on a server, updates to all suffixes on that server are logged by default.

For information about how to use the retro change log, see Using the Retro Change Log in Sun Java System Directory Server Enterprise Edition 6.0 Administration Guide.

Retro Change Log and Multi-Master Replication

When a retro change log is enabled with replication, the retro change log receives updates from all master replicas in the topology. The updates from each master replica are combined in the retro change log. The following figure illustrates the retro change log on two servers in a multi-master topology.

Figure 4–7 Retro Change Log and Multi-Master Replication

Figure shows the retro change log on two servers in a
multi-master topology.

The retro change log uses the following attributes during replication:

changeNumber (cN): Identifies the order in which an update is logged to the retro change log
replicationCSN (CSN): Identifies the time when an update is made to a given replica
replicaIdentifier (RI): Identifies the replica that is updating the retro change log

The diagram shows that the retro change logs, RCL1 and RCL2, contain the same list of updates, but that the updates do not have the same order. However, for a given replicaIdentifier, updates are logged in the same order on each retro change log. The order in which updates are logged to the retro change log is given by the changeNumber attribute.

Failover of the Retro Change Log

The following figure illustrates a simplified replication topology where a client reads a retro change log on a consumer server.

Figure 4–8 Simplified Topology for Replication of the Retro Change Log

Figure shows a simplified replication topology where
a client reads a retro change log on a consumer server.

All of the updates made to each master replica in the topology are logged to each retro change log in the topology.

The client application reads the retro change log of Directory Server 3 and stores the last CSN for each replica identifier. The last CSN for each replica identifier is given by the replicationCSN attribute.

The following figure shows the client redirecting its reads to Directory Server 2 after the failure of Directory Server 3.

Figure 4–9 Failover of the Retro Change Log

Figure shows a client redirecting its reads to Directory Server 2
after the failure of Directory Server 3.

After failover, the client application must use the retro change log (RCL2) of Directory Server 2 to manage its updates. Because the order of the updates in RCL2 is not the same as the order in RCL3, the client must synchronize its updates with RCL2.

The client examines RCL2 to identify the cN that corresponds to its record of the last CSN for each replica identifier. In the example in Failover of the Retro Change Log, the client identifies the following correspondence between last CSN and cN:

CSN 1 from R1 corresponds to cN4 on RCL2
CSN 2 from R2 corresponds to cN5 on RCL2
CSN 3 from R3 corresponds to cN7 on RCL2
CSN 1 from R4 corresponds to cN6 on RCL2

The client identifies the update corresponding to the lowest cN in this list. In the example in Failover of the Retro Change Log, the lowest cN in the list is cN4. To ensure that the client processes all updates, it must process all updates logged to RCL2 after cN4. The client does not process updates logged to RCL2 before cN4 nor does it process the update corresponding to cN4.

Replication Conflicts and the Retro Change Log

When a replication conflict occurs, Directory Server performs operations to resolve the conflict. When the retro change log is running and the changeIsReplFixupOp attribute is set to true, the following information about the operations is logged in the changeHasReplFixupOp attribute:

Target DN of the operation
The type of update
The change made

For more information about these attributes, see the Sun Java System Directory Server Enterprise Edition 6.0 Man Page Reference.

Restrictions on Using the Retro Change Log

Observe the following restrictions when you use the retro change log:

A master replica running this version of Directory Server cannot be a supplier to a consumer replica running Directory Server 4.x.
In a replicated topology, the retro change logs on replicated servers must be up-to-date with each other. This allows switchover of the retro change log. Using the example in Failover of the Retro Change Log, the last CSN for each replica ID on RCL3 must be present on RCL2.