Sun Java System Directory Server Enterprise Edition 6.1 Deployment Planning Guide

Distributing Data Lower Down in a DIT

In many cases, data distribution is not required at the top of the DIT. However, entries further up the tree might be required by the entries in the portion of the tree that has been distributed. This section provides a sample scenario that shows how to design a distribution strategy in this case.

Logical View of Distributed Data

Example.com has one subtree for groups and a separate subtree for people. The number of group definitions is small and fairly static, while the number of person entries is large and continues to grow. Example.com therefore requires only the people entries to be distributed across three servers. However, the group definitions, their ACIs, and the ACIs located at the top of the naming context are required to access all entries under the people subtree.

The following illustration provides a logical view of the data distribution requirements.

Figure 10–10 Logical View of Distributed Data

Figure shows how the ou=people branch must be distributed.

Physical View of Data Storage

The ou=people subtree is split across three servers, according to the first letter of the sn attribute for each entry. The naming context (dc=example,dc=com) and the ou=groups container are stored in one database on each server. This database is accessible to entries under ou=people. The ou=people container is stored in its own database.
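The split by first letter of sn can be illustrated with a short Python sketch. The letter boundaries (a–h, i–q, r–z) and the server names are assumptions made for illustration only; this section does not state which letters are assigned to which server.

```python
# Hypothetical letter ranges and server names; the guide does not
# state the actual boundaries used by Example.com.
SN_RANGES = [
    ("a", "h", "server1"),
    ("i", "q", "server2"),
    ("r", "z", "server3"),
]


def server_for_sn(sn: str) -> str:
    """Return the server that stores a person entry, chosen by the
    first letter of the entry's sn attribute (case-insensitive)."""
    if not sn:
        raise ValueError("sn must not be empty")
    first = sn[0].lower()
    for low, high, server in SN_RANGES:
        if low <= first <= high:
            return server
    raise ValueError(f"no server configured for sn starting with {first!r}")
```

With these assumed ranges, an entry with sn=Jensen would be stored on server2, while the naming context and ou=groups entries would be present on all three servers.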

The following illustration shows how the data is stored on the individual Directory Servers.

Figure 10–11 Physical View of Data Storage

Figure shows physical storage of distributed data.

Note that the ou=people container is not a subsuffix of the top container.

Directory Server Configuration for Sample Distribution Scenario

Each server described previously can be understood as a distribution chunk. The suffix that contains the naming context and the entries under ou=groups is the same on each chunk. A multi-master replication agreement is therefore set up for this suffix across each of the three chunks.

For availability, each chunk is also replicated. At least two master replicas are therefore defined for each chunk.

The following illustration shows the Directory Server configuration with three replicas defined for each chunk. For simplicity, the replication agreements are shown for only one chunk, although they are the same for the other two chunks.

Figure 10–12 Directory Server Configuration

Figure shows replication topology for distributed data.

Directory Proxy Server Configuration for Sample Distribution Scenario

Client access to directory data through Directory Proxy Server is provided through data views. For information about data views, see Chapter 17, Directory Proxy Server Distribution, in Sun Java System Directory Server Enterprise Edition 6.1 Reference.

For this scenario, one data view is required for each distributed suffix, and one data view is required for the naming context (dc=example,dc=com) and the ou=groups subtree.
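The following Python sketch models how a request might be matched to one of these four data views. It is a simplified illustration, not a description of how Directory Proxy Server is implemented. The view names, the letter ranges, and the select_views function are all hypothetical.

```python
from typing import List, Optional

ROOT_BASE = "dc=example,dc=com"
PEOPLE_BASE = "ou=people," + ROOT_BASE

# Hypothetical view names and letter ranges, for illustration only.
ROOT_VIEW = "root-view"
PEOPLE_VIEWS = [
    ("a", "h", "people-view-1"),
    ("i", "q", "people-view-2"),
    ("r", "z", "people-view-3"),
]


def _normalize(dn: str) -> str:
    """Lowercase the DN and strip spaces around commas (naive normalization)."""
    return ",".join(part.strip() for part in dn.lower().split(","))


def _is_under(dn: str, base: str) -> bool:
    return dn == base or dn.endswith("," + base)


def select_views(dn: str, sn: Optional[str] = None) -> List[str]:
    """Return the names of the data views that could serve a request for dn.

    Requests under ou=people go to the view whose range covers the first
    letter of sn. If sn is unknown, every people view is a candidate.
    All other requests under the naming context go to the single root view.
    """
    target = _normalize(dn)
    if _is_under(target, PEOPLE_BASE):
        if sn is None:
            return [view for _, _, view in PEOPLE_VIEWS]
        first = sn[:1].lower()
        for low, high, view in PEOPLE_VIEWS:
            if low <= first <= high:
                return [view]
        raise ValueError(f"no people view covers sn {sn!r}")
    if _is_under(target, ROOT_BASE):
        return [ROOT_VIEW]
    raise ValueError(f"{dn!r} is outside {ROOT_BASE}")
```

In this model, a request for a group entry resolves to the single view for the naming context, while a request for a person entry resolves to exactly one of the three distributed views when the sn value is known.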

The following illustration shows the configuration of Directory Proxy Server data views to provide access to the distributed data.

Figure 10–13 Directory Proxy Server Configuration

Figure shows data view configuration for distributed data.

Considerations for Data Growth

Distributed data is split according to a distribution algorithm. When you decide which distribution algorithm to use, bear in mind that the volume of data might change, and that your distribution strategy must be scalable. Do not use an algorithm that necessitates complete redistribution of data.

A numeric distribution algorithm based on uid, for example, can be scaled fairly easily. If you start with two data segments of uid=0–999 and uid=1000–1999, it is easy to add a third segment of uid=2000–2999 at a later stage.
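A minimal Python sketch shows why this kind of growth does not force redistribution: adding a segment leaves the mapping of every existing uid unchanged. The server names and the add_segment helper are hypothetical, and the bounds are treated as inclusive.

```python
from typing import List, Tuple

Segment = Tuple[int, int, str]  # (lower bound, upper bound, server), inclusive

# The two initial segments from the example; server names are hypothetical.
INITIAL_SEGMENTS: List[Segment] = [
    (0, 999, "server1"),
    (1000, 1999, "server2"),
]


def server_for_uid(uid: int, segments: List[Segment]) -> str:
    """Return the server whose numeric range contains uid."""
    for low, high, server in segments:
        if low <= uid <= high:
            return server
    raise LookupError(f"no segment covers uid {uid}")


def add_segment(segments: List[Segment], low: int, high: int, server: str) -> List[Segment]:
    """Return a new segment list with one more range, rejecting overlaps
    so that no existing uid changes its server."""
    if low > high:
        raise ValueError("lower bound must not exceed upper bound")
    for existing_low, existing_high, _ in segments:
        if low <= existing_high and existing_low <= high:
            raise ValueError("new segment overlaps an existing segment")
    return segments + [(low, high, server)]


# Growing the deployment: add uid=2000-2999 on a third server.
GROWN_SEGMENTS = add_segment(INITIAL_SEGMENTS, 2000, 2999, "server3")
```

Because the new range does not overlap the existing ranges, entries already stored on the first two servers stay where they are, and only new entries with a uid in the added range go to the third server.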