In a global deployment, access to directory services is required in more than one geographical location, or data center. This chapter provides strategies for effectively deploying Directory Server Enterprise Edition across multiple data centers. The strategies ensure that the quality of service requirements identified in Chapter 5, Defining Service Level Agreements, are not compromised.
One of the goals of replication is to enable geographic distribution of the LDAP service. Replication enables you to have identical copies of information on multiple servers and across more than one data center. Replication concepts are outlined in Chapter 10, Designing a Scaled Deployment in this guide, and described in detail in Chapter 4, Directory Server Replication, in Sun Java System Directory Server Enterprise Edition 6.2 Reference. This chapter focuses on the replication features that are used in a global deployment.
Directory Server supports multi-master replication over a WAN. This feature enables multi-master replication configurations across geographical boundaries in international, multiple data center deployments.
Generally, if the Number of hosts calculated in Assessing Initial Replication Requirements is less than 16, or not significantly larger, your topology should include only master servers in a fully connected topology; that is, every master replicates to every other master in the topology. In a multi-master replication over WAN configuration, all Directory Server instances separated by a WAN must run Directory Server 5.2 or later. A multi-master topology with more than four masters requires Directory Server 6.x.
The replication protocol provides full asynchronous support, as well as window, grouping, and compression mechanisms. These features make multi-master replication over a WAN viable. Replication data transfer rates are always lower than the bandwidth that the physical medium provides. If the update volume between replicas cannot physically fit into the available bandwidth, tuning will not prevent replicas from diverging under heavy update load. Replication delay and update performance depend on many factors, including modification rate, entry size, server hardware, average latency, and average bandwidth.
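Whether the update volume can fit into the available bandwidth can be estimated with simple arithmetic before any tuning is attempted. The following Python sketch is illustrative only: the function names, the 20 percent protocol overhead, and the share of the link reserved for replication are assumptions, not values taken from Directory Server.

```python
def required_bandwidth_bps(mods_per_second, avg_change_bytes, protocol_overhead=1.2):
    """Estimate the bits per second needed to send every change to one remote replica.

    protocol_overhead is an assumed multiplier for protocol framing.
    """
    return mods_per_second * avg_change_bytes * 8 * protocol_overhead


def fits_link(mods_per_second, avg_change_bytes, link_bps, usable_fraction=0.5):
    """Return True if the estimated traffic fits the share of the WAN link
    that replication may use (usable_fraction is an assumption)."""
    needed = required_bandwidth_bps(mods_per_second, avg_change_bytes)
    return needed <= link_bps * usable_fraction


if __name__ == "__main__":
    # Hypothetical load: 200 modifications per second, 2000 bytes per change.
    for link_bps in (2_000_000, 10_000_000):
        print(link_bps, fits_link(200, 2000, link_bps))
```

With these assumed numbers the estimate is roughly 3.84 Mbit/s per remote replica. That does not fit into half of a 2 Mbit/s link, so no window or group setting would keep the replicas from diverging. The estimate ignores compression and latency.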
Internal parameters of the replication mechanism are optimized by default for WANs. However, if you experience slow replication due to the factors mentioned above, you may wish to empirically adjust the window size and group size parameters. You may also be able to schedule your replication to avoid peak network times, thus improving your overall network usage. Finally, Directory Server supports the compression of replication data to optimize bandwidth usage.
When you replicate data over a WAN link, some form of security to ensure data integrity and confidentiality is advised. For more information on security methods available in Directory Server, see Chapter 2, Directory Server Security, in Sun Java System Directory Server Enterprise Edition 6.2 Reference.
Directory Server provides group and window mechanisms to optimize replication flow. The group mechanism enables you to specify that changes are sent in groups, rather than individually. The group size represents the maximum number of data modifications that can be bundled into a single update message. If the network connection appears to be the bottleneck for replication, increase the group size and check replication performance again. For information on configuring the group size, see Configuring Group Size in Sun Java System Directory Server Enterprise Edition 6.2 Administration Guide.
The window mechanism specifies that a certain number of update requests are sent to the consumer, without the supplier having to wait for an acknowledgement from the consumer before continuing. The window size represents the maximum number of update messages that can be sent without immediate acknowledgement from the consumer. It is more efficient to send many messages in quick succession instead of waiting for an acknowledgement after each one. Using the appropriate window size, you can eliminate the time replicas spend waiting for replication updates or acknowledgements to arrive. If your consumer replica is lagging behind the supplier, increase the window size to a higher value than the default, such as 100, and check replication performance again before making further adjustments. When the replication update rate is high and the time between updates is therefore small, even replicas connected by a LAN can benefit from a higher window size. For information on configuring the window size, see Configuring Window Size in Sun Java System Directory Server Enterprise Edition 6.2 Administration Guide.
Both the group and window mechanisms are based on change size. Therefore, optimizing replication performance with these mechanisms might be impractical if the size of your changes varies considerably. If the size of your changes is relatively constant, you can use the group and window mechanisms to optimize incremental and total updates.
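The combined effect of the group and window settings can be approximated with a back-of-the-envelope model. The sketch below assumes that the supplier is limited only by network round-trip time, that every update message carries a full group, and that an acknowledgement arrives one round trip after a message is sent. These are simplifying assumptions for illustration, not a description of the server's internal algorithm. The model also ignores change size and bandwidth, and the numeric values are examples only.

```python
def max_changes_per_second(window_size, group_size, round_trip_seconds):
    """Upper bound on replicated changes per second in the simplified model.

    At most window_size messages can be unacknowledged at once, and each
    message carries up to group_size changes, so one round trip can deliver
    at most window_size * group_size changes.
    """
    if window_size < 1 or group_size < 1:
        raise ValueError("window_size and group_size must be at least 1")
    if round_trip_seconds <= 0:
        raise ValueError("round_trip_seconds must be positive")
    return window_size * group_size / round_trip_seconds


if __name__ == "__main__":
    # Hypothetical WAN with a 200 ms round trip and one change per message.
    for window in (1, 10, 100):
        rate = max_changes_per_second(window, 1, 0.2)
        print("window %3d: about %.0f changes per second" % (window, rate))
```

In this model the achievable rate scales linearly with the window size until some other limit, such as bandwidth or consumer processing speed, takes over. That is why the settings should be adjusted empirically and checked after each change.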
In addition to the grouping and window mechanisms, you can configure replication compression on Solaris and Linux platforms. Replication compression streamlines replication flow, which substantially reduces the incidence of bottlenecks in replication over a WAN. Compression of replicated data can increase replication performance in specific cases, such as when servers have spare CPU capacity but the network has low bandwidth, or when there are bulk changes to be replicated. You can also benefit from replication compression when initializing a remote replica with large entries. Do not enable replication compression on a local area network (LAN) where bandwidth is plentiful, because the compression and decompression computations will slow down replication.
The replication mechanism uses the Zlib compression library. Empirically test and select the compression level that gives you the best results in your WAN environment for your expected replication usage.
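One way to get a first feel for the trade-off between compression ratio and CPU time is to run zlib directly on representative data. The following sketch uses Python's standard `zlib` module. The payload is a hypothetical, LDIF-like sample, and the levels 1, 6, and 9 are zlib's own levels; how they correspond to the Directory Server compression setting is not assumed here. Measurements on the actual replication link remain the deciding factor.

```python
import time
import zlib


def compare_levels(payload, levels=(1, 6, 9)):
    """Compress payload at each zlib level; return {level: (size, seconds)}."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(payload, level)
        elapsed = time.perf_counter() - start
        # Sanity check: the data must survive a round trip unchanged.
        if zlib.decompress(compressed) != payload:
            raise RuntimeError("round trip failed at level %d" % level)
        results[level] = (len(compressed), elapsed)
    return results


def sample_payload(entries=500):
    """Build a hypothetical LDIF-like payload; real replication traffic differs."""
    lines = []
    for i in range(entries):
        lines.append("dn: uid=user%d,ou=People,dc=example,dc=com" % i)
        lines.append("changetype: modify")
        lines.append("replace: telephoneNumber")
        lines.append("telephoneNumber: +1 555 %04d" % i)
        lines.append("")
    return "\n".join(lines).encode("utf-8")


if __name__ == "__main__":
    payload = sample_payload()
    for level, (size, seconds) in compare_levels(payload).items():
        print("level %d: %d -> %d bytes in %.4f s"
              % (level, len(payload), size, seconds))
```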
For more information on configuring replication compression, see Configuring Replication Compression in Sun Java System Directory Server Enterprise Edition 6.2 Administration Guide.
A global topology (with data centers in different countries) might require restricting replication for security or compliance reasons. For example, legal restrictions might state that specific employee information cannot be copied outside of the U.S.A. Alternatively, a site in Australia might require only Australian employee details.
The fractional replication feature enables only a subset of the attributes that are present in an entry to be replicated. Attribute lists are used to determine which attributes can and cannot be replicated. Fractional replication can only be applied to read-only consumers.
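The effect of an attribute list can be pictured as a filter applied to each entry before it is sent to the consumer. The following Python sketch illustrates the idea only; it is not the server's implementation. The function name, the dictionary representation of an entry, and the rule that exactly one list is supplied are assumptions made for the example.

```python
def filter_attributes(entry, include=None, exclude=None):
    """Return the subset of an entry that would be replicated.

    entry   -- dict mapping attribute names to lists of values
    include -- if given, only these attributes are kept
    exclude -- if given, these attributes are dropped
    Exactly one of include or exclude must be supplied (an assumption
    of this sketch).
    """
    if (include is None) == (exclude is None):
        raise ValueError("supply exactly one of include or exclude")
    # LDAP attribute names are case-insensitive, so compare in lower case.
    if include is not None:
        wanted = {name.lower() for name in include}
        return {k: v for k, v in entry.items() if k.lower() in wanted}
    unwanted = {name.lower() for name in exclude}
    return {k: v for k, v in entry.items() if k.lower() not in unwanted}


if __name__ == "__main__":
    # Hypothetical employee entry.
    employee = {
        "cn": ["Kim Lee"],
        "mail": ["kim.lee@example.com"],
        "homePostalAddress": ["1 Example Street"],
    }
    print(filter_attributes(employee, exclude=["homepostaladdress"]))
```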
For detailed information about how fractional replication works, see Fractional Replication in Sun Java System Directory Server Enterprise Edition 6.2 Reference. For information about how to configure fractional replication, see Fractional Replication in Sun Java System Directory Server Enterprise Edition 6.2 Administration Guide.
Prioritized replication can be used when there is a strong business requirement to have tighter consistency for replicated data on specific attributes. In 5.x versions of Directory Server, updates were replicated in the order in which they were received. With prioritized replication, you can specify that updates to certain attributes take precedence when they are replicated to other servers in the topology.
Prioritized replication provides the following benefits:
Improved security. Prioritized replication is used by default for account lockout. Imagine for example that an employee leaves your organization, and you lock the employee's account. To ensure that the employee cannot log in to a remote server to which the account lockout has not been replicated, account lockout changes are replicated before other changes are replicated.
Improved consistency. Directory Server replication is loosely consistent. With prioritized replication, you can assure stronger consistency for certain attributes that are considered important in your organization.
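The ordering behavior described above can be sketched with a priority queue: changes that touch a prioritized attribute are sent first, and all other changes keep their arrival order. This is a conceptual illustration, not the server's implementation. The class name is hypothetical, and the attribute used in the example merely stands in for whatever attributes a deployment chooses to prioritize.

```python
import heapq
import itertools


class ReplicationQueue:
    """Queue that releases prioritized changes before ordinary ones."""

    def __init__(self, priority_attrs):
        self._priority = {name.lower() for name in priority_attrs}
        self._heap = []
        # The sequence number preserves arrival order within each rank.
        self._sequence = itertools.count()

    def push(self, dn, modified_attrs):
        prioritized = any(a.lower() in self._priority for a in modified_attrs)
        rank = 0 if prioritized else 1
        heapq.heappush(
            self._heap, (rank, next(self._sequence), dn, tuple(modified_attrs))
        )

    def pop(self):
        _rank, _seq, dn, attrs = heapq.heappop(self._heap)
        return dn, attrs

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    # "nsAccountLock" is used here only as an example of a prioritized attribute.
    queue = ReplicationQueue(["nsAccountLock"])
    queue.push("uid=a,dc=example,dc=com", ["mail"])
    queue.push("uid=b,dc=example,dc=com", ["nsAccountLock"])
    queue.push("uid=c,dc=example,dc=com", ["cn"])
    while len(queue):
        print(queue.pop())
```

In the example, the change to the second entry is released first even though it arrived second, and the remaining changes follow in arrival order.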
In this scenario, an enterprise has two major data centers, one in London and the other in New York, separated by a WAN. The scenario assumes that the network is very busy during normal business hours.
In this scenario, the Number of hosts has been calculated to be eight. A fully connected, 4-way multi-master topology is deployed in each of the two data centers. These two topologies are also fully connected to each other. For ease of comprehension, not all replication agreements between the two data centers are shown in the following diagram.
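The number of replication agreements in a fully connected topology grows quickly with the number of masters. A short calculation makes this concrete. It assumes one replication agreement per direction between each pair of masters; the function names are illustrative.

```python
def full_mesh_agreements(masters):
    """Agreements when every master replicates to every other master."""
    return masters * (masters - 1)


def cross_site_agreements(masters_site_a, masters_site_b):
    """Agreements that cross the WAN between two fully connected sites."""
    return 2 * masters_site_a * masters_site_b


if __name__ == "__main__":
    total = full_mesh_agreements(8)
    over_wan = cross_site_agreements(4, 4)
    print("total agreements:", total)
    print("agreements over the WAN:", over_wan)
    print("agreements inside the data centers:", total - over_wan)
```

For the eight masters in this scenario the calculation gives 56 agreements in total, 32 of which cross the WAN link between London and New York.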
The replication strategy for this scenario includes the following:
Master copies of directory data are held on servers in both data centers.
A multi-master replication topology is deployed between the data centers to provide high availability and write-failover across the deployment.
Replication across the WAN link is scheduled so that it occurs only during off-peak hours to optimize bandwidth.
To increase performance, client applications are directed to local servers. Clients in the U.S. read from and write to masters in the New York data center. Clients in the UK read from and write to masters in the London data center.
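The off-peak scheduling in this strategy comes down to a time-window check. The sketch below shows only the window logic, including windows that run past midnight; the actual schedule is configured in Directory Server. The function name and the window of 22:00 to 06:00 are hypothetical, and a real schedule would have to account for business hours in both London and New York.

```python
from datetime import time


def in_replication_window(now, start, end):
    """Return True if now falls inside the window [start, end).

    If start is later than end, the window is taken to cross midnight.
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


if __name__ == "__main__":
    window_start, window_end = time(22, 0), time(6, 0)  # hypothetical window
    for moment in (time(23, 30), time(3, 0), time(12, 0)):
        print(moment, in_replication_window(moment, window_start, window_end))
```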
In a global enterprise, a centralized data model can cause scalability and performance issues. Directory Proxy Server can be used in such a situation to distribute data efficiently and to route search and update requests appropriately.
In the architecture shown here, a large financial institution has its headquarters in London. The organization has data centers in London, New York, and Hong Kong. Currently, the vast majority of the data that is available to employees resides centrally in legacy RDBMS repositories in London. All access to this data from the financial institution’s client community is over the WAN.
The organization is experiencing scalability and performance problems with this centralized model and decides to move to a distributed data model. The organization also decides to deploy an LDAP directory infrastructure at the same time. Because the data in question is considered “mission critical,” it must be deployed in a highly available, fault-tolerant infrastructure.
An analysis of client application profiles has revealed that the data is customer-based and that 95 percent of the data accessed by a geographical client community is specific to that community. Clients in Asia rarely access data for a customer in North America, although such access does occur. The client community must also update customer information from time to time.
The following figure shows the logical architecture of the distributed solution.
Given the profile of 95 percent local data access, the organization decides to distribute the directory infrastructure geographically. Multiple directory consumers are deployed in each geographical location: Hong Kong, New York, and London. London consumers are not shown in the diagram for ease of understanding. Each of these consumers is configured to hold the customer data specific to the location. Data for European and Middle East customers is held in the London consumers. Data for North and South American customers is held in the New York consumers. Data for Asian and Pacific Rim customers is held in the Hong Kong consumers.
With this deployment, the overwhelming majority of the data that a local client community requires is located within that community. This strategy provides significant performance improvements over the centralized model. Client requests are processed locally, reducing network overhead. The local directory servers effectively partition the directory infrastructure, which provides increased directory server performance and scalability. Each set of consumer directory servers is configured to return referrals if a client submits an update request. Referrals are also returned if a client submits a search request for data that is located elsewhere.
Client LDAP requests are sent to Directory Proxy Server through a hardware load balancer. The hardware load balancer ensures that clients always have access to at least one Directory Proxy Server. The locally deployed Directory Proxy Server initially routes all requests to the array of local directory servers that hold the local customer data. The instances of Directory Proxy Server are configured to load balance across the array of directory servers. This load balancing provides automatic failover and failback.
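The failover and failback behavior of load balancing across an array of servers can be illustrated with a small round-robin pool that skips servers marked as unavailable. This is a conceptual sketch, not Directory Proxy Server's algorithm or configuration; the class name, method names, and host names are hypothetical.

```python
class RoundRobinPool:
    """Rotate over servers, skipping any that are currently marked down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._down = set()
        self._next = 0

    def mark_down(self, server):
        self._down.add(server)      # failover: stop routing to this server

    def mark_up(self, server):
        self._down.discard(server)  # failback: resume routing to it

    def pick(self):
        # Try each server at most once, starting from the rotation point.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server not in self._down:
                return server
        raise RuntimeError("no directory server available")


if __name__ == "__main__":
    pool = RoundRobinPool(["ds1.example.com", "ds2.example.com", "ds3.example.com"])
    pool.mark_down("ds2.example.com")
    print([pool.pick() for _ in range(4)])
    pool.mark_up("ds2.example.com")
    print([pool.pick() for _ in range(3)])
```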
Client search requests for local customer information are satisfied by a local directory server. Appropriate responses are returned to the client through Directory Proxy Server. Client search requests for geographically “foreign” customer information are initially handled by the local directory server, which returns a referral to Directory Proxy Server.
This referral contains an LDAP URL that points to the appropriate geographically distributed Directory Proxy Server instance. The local Directory Proxy Server processes the referral on behalf of the local client. The local Directory Proxy Server then sends the search request to the appropriate distributed instance of Directory Proxy Server. The distributed Directory Proxy Server forwards the search request on to the distributed Directory Server and receives the appropriate response. This response is then returned to the local client through the distributed and the local instances of Directory Proxy Server.
Update requests received by the local Directory Proxy Server are also handled initially by the local Directory Server, which returns a referral. Directory Proxy Server follows the referral on behalf of the local client. However, this time the proxy forwards the update request to the supplier Directory Server located in London. The supplier Directory Server applies the update to the supplier database and sends a response back to the local client through the local Directory Proxy Server. Subsequently, the supplier Directory Server propagates the update to the appropriate consumer Directory Server.
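The request flow for searches and updates can be summarized as a routing decision. The following Python sketch collapses the referral step into a single function for readability: in the architecture described here, the local directory server returns a referral and Directory Proxy Server follows it. The region names, site names, and function name are hypothetical labels, not product configuration.

```python
# Hypothetical mapping of customer regions to the data center that holds them.
REGION_HOME = {
    "emea": "london",
    "americas": "new-york",
    "apac": "hong-kong",
}

SUPPLIER_SITE = "london"  # the supplier directory server is located in London


def route_request(operation, customer_region, local_site):
    """Return (target, site) for a request arriving at the local proxy."""
    if operation not in ("search", "update"):
        raise ValueError("unsupported operation: %r" % operation)
    home_site = REGION_HOME[customer_region]
    if operation == "update":
        # Consumers refer all updates; the proxy forwards them to the supplier.
        return ("supplier", SUPPLIER_SITE)
    if home_site == local_site:
        # Local data: answered by the local array of consumers.
        return ("local-consumers", local_site)
    # Foreign data: forwarded to the proxy in the site that holds it.
    return ("remote-proxy", home_site)


if __name__ == "__main__":
    print(route_request("search", "apac", "hong-kong"))
    print(route_request("search", "americas", "hong-kong"))
    print(route_request("update", "apac", "hong-kong"))
```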