Sun Java(TM) System Directory Server 5.2 2005Q1 Technical Overview
Directory Server Scalability
This chapter describes the features that make Directory Server scalable. It is divided into the following sections:
- Multiple Database, Multiple Server, and Data Distribution Possibilities
- Scalable Data Management
- Directory Proxy Server and Scalability
Multiple Database, Multiple Server, and Data Distribution Possibilities
One obvious prerequisite for scalability is the ability to distribute your data across multiple databases and server instances. This section addresses the following topics:
- Distributing Data Across Multiple Databases and Servers
- How to Manage Your Distributed Data
Distributing Data Across Multiple Databases and Servers
Distributing data enables you to scale your directory across multiple databases and server instances. Depending on performance requirements, the server instances may or may not be hosted on several machines. A distributed directory can therefore hold a much larger number of entries than would be possible with a single server. What is more, the distribution details can be hidden from the user or client application, which means that as far as directory clients are concerned, a single directory answers their directory queries.
Directory Server's multiple database architecture makes these scalable, distributed naming contexts possible. It lets you support millions of users on a single system, improves backup, restore, and load balancing, and simplifies administration. In addition, Directory Server can run as a 64-bit application on Solaris SPARC, which improves performance for high-volume deployments.
Depending on the shape of your directory tree, you may want to distribute it across separate databases and also use multiple databases to hold certain large branches of the tree. The database functionality in Directory Server provides flexible scalability in that you can:
- add databases dynamically without having to take the entire directory off-line
- view databases independently for ease of database management
- configure databases independently (referrals or not, attribute encryption or not, indexes) for finer granularity, irrespective of how large your deployment might be
If these databases are then distributed across multiple servers, which generally means several physical machines, each server has less work to do, and the directory can scale to a much larger number of entries than would otherwise be possible.
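As a rough illustration of how branches of a directory tree can be assigned to separate databases, the following Python sketch picks the database whose suffix is the longest match for an entry DN. The suffixes, the database names, and the `pick_database` helper are all hypothetical, and the naive string handling stands in for real DN normalization. It is a toy model, not a description of how Directory Server is implemented.

```python
# Toy model: each suffix (naming context) is held by its own database.
# Suffixes and database names are made up for illustration.
SUFFIX_TO_DB = {
    "dc=example,dc=com": "userRoot",
    "ou=people,dc=example,dc=com": "peopleDB",
    "ou=groups,dc=example,dc=com": "groupsDB",
}


def normalize(dn):
    """Lower-case a DN and strip spaces around RDN separators (naive)."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def pick_database(dn):
    """Return the database whose suffix is the longest match for dn."""
    target = normalize(dn)
    best = None
    for suffix, db in SUFFIX_TO_DB.items():
        s = normalize(suffix)
        if target == s or target.endswith("," + s):
            if best is None or len(s) > len(best[0]):
                best = (s, db)
    if best is None:
        raise LookupError("no database holds " + dn)
    return best[1]
```

In this model, an entry under ou=people lands in its own database, while any entry not covered by a more specific suffix falls back to the database holding the root suffix.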
How to Manage Your Distributed Data
Once your data is distributed, you must define the relationships between the distributed data. You do this using pointers to directory information held in different databases. Directory Server provides the referral and chaining mechanisms to help you link distributed data into a single directory tree:
- Referrals: The server returns a piece of information to the client application indicating that the client application needs to contact another server to fulfill the request.
- Chaining: The server contacts other servers on behalf of the client application and returns the combined results to the client application after finishing the operation.
The main difference between using referrals and using chaining is the location of the intelligence that knows how to locate the distributed information. In a chained system, the intelligence is implemented in the servers. In a system that uses referrals, the intelligence is implemented in the client application.
Chaining reduces client complexity. It also affords dynamic management possibilities, in that you can remove a part of the directory from the system while the entire system remains available to client applications. In addition, it provides access control functionality: the chained suffix impersonates the client application, providing the appropriate authorization identity to the remote server. However, this increased functionality comes at the cost of increased server complexity.
With referrals, the client must handle locating the referral and collating search results. However, referrals offer more flexibility for the writers of client applications and allow developers to provide better feedback to users about the progress of a distributed directory operation.
The method, or combination of methods, you choose will depend on the specific needs of your directory. For more detail on referrals and chaining, and on how to decide between the two, see the Referrals and Chaining section in the Directory Server Deployment Planning Guide.
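The difference in where the "intelligence" lives can be sketched in a few lines of Python. The two in-memory servers, their entries, and every function name below are invented for illustration. No LDAP protocol details are modeled, so treat this as a conceptual sketch rather than the behavior of the product.

```python
# Toy model of two servers, each holding one part of the tree.
# Server names and entries are invented for illustration.
SERVERS = {
    "serverA": {"uid=alice,ou=people,dc=example,dc=com": {"cn": "Alice"}},
    "serverB": {"uid=bob,ou=partners,dc=example,dc=com": {"cn": "Bob"}},
}
# serverA knows that the partners branch lives on serverB.
REMOTE_BRANCHES = {"serverA": {"ou=partners,dc=example,dc=com": "serverB"}}


def _remote_for(server, dn):
    """Return the server holding dn's branch, or None if it is local."""
    for branch, remote in REMOTE_BRANCHES.get(server, {}).items():
        if dn.endswith(branch):
            return remote
    return None


def lookup_with_referral(server, dn):
    """The server answers locally or tells the client where to go next."""
    remote = _remote_for(server, dn)
    if remote is not None:
        return ("referral", remote)
    return ("entry", SERVERS[server].get(dn))


def lookup_with_chaining(server, dn):
    """The server contacts the remote server itself and relays the result."""
    remote = _remote_for(server, dn)
    if remote is not None:
        return ("entry", SERVERS[remote].get(dn))
    return ("entry", SERVERS[server].get(dn))


def client_following_referrals(server, dn):
    """With referrals, the client carries the logic to chase the pointer."""
    kind, value = lookup_with_referral(server, dn)
    while kind == "referral":
        kind, value = lookup_with_referral(value, dn)
    return value
```

With chaining, the client makes one call and never learns that a second server was involved. With referrals, the extra loop lives in the client, which is also where progress feedback to the user could be added.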
Scalable Data Management
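For data management functionality to satisfy enterprise directory service demands, scalability is essential. To prevent data management from becoming an onerous task, Directory Server provides the following scalability features:
- Scalable Grouping of Entries
- Scalable Attribute Management
- Effective Rights Management
- Scalable Schema Management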
Scalable Grouping of Entries
The hierarchical grouping mechanism provided by the directory information tree is not optimal for associations between dispersed entries, frequently changing organizations, or data repeated in many entries. Directory Server therefore provides two additional features to ease identity and relationship management in large directory deployments: groups and roles. Both mechanisms allow for more flexible and thus more scalable data management. Each is examined below.
A group is an entry that identifies the other entries that are its members, and its scope can encompass the entire directory. There are two types of groups, static and dynamic. Static groups explicitly name their member entries and are suitable for groups with relatively few members, such as your directory administrators. Dynamic groups, on the other hand, specify a filter, and all entries that match the filter are considered members of the group. They are called dynamic because membership is determined each time the filter is evaluated. You can also place groups inside other groups by using the DN of another group as a member attribute of a dynamic group.
Roles are the second entry grouping mechanism. They enable you to determine role membership as soon as an entry is retrieved from the directory. Each role has members, or entries that possess the role. Every entry that belongs to a role is given the nsRole virtual attribute, whose values are the DNs of all roles of which the entry is a member. The attribute is termed virtual because it is generated on the fly by the server and not stored in the directory. There are three different types of roles, which allow you to specify roles either explicitly or dynamically depending on which type of role you use.
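The following Python sketch contrasts the ideas described above: a static group that lists its members, a dynamic group evaluated from a filter, and an nsRole value generated when an entry is read. The entries, role DNs, and helper names are hypothetical, and Python predicates stand in for LDAP filter strings. The sketch only models the concepts and makes no claim about the server's internal algorithms.

```python
# Toy entries keyed by DN; all names and attributes are invented.
ENTRIES = {
    "uid=alice,ou=people,dc=example,dc=com": {"ou": "sales"},
    "uid=bob,ou=people,dc=example,dc=com": {"ou": "sales"},
    "uid=carol,ou=people,dc=example,dc=com": {"ou": "admin"},
}

# Static group: members are named explicitly.
STATIC_ADMINS = {"uid=carol,ou=people,dc=example,dc=com"}


def dynamic_members(matches):
    """Dynamic group: membership is recomputed from a filter on each call.

    A Python predicate stands in for an LDAP filter string here.
    """
    return {dn for dn, attrs in ENTRIES.items() if matches(attrs)}


# Role definitions: role DN -> predicate deciding who possesses the role.
ROLES = {
    "cn=salesRole,dc=example,dc=com": lambda dn, a: a.get("ou") == "sales",
    "cn=adminRole,dc=example,dc=com": lambda dn, a: dn in STATIC_ADMINS,
}


def read_entry(dn):
    """Return a copy of the entry plus an nsRole value generated at read time."""
    attrs = dict(ENTRIES[dn])
    attrs["nsRole"] = sorted(r for r, test in ROLES.items() if test(dn, attrs))
    return attrs
```

Note that nsRole appears only in the copy returned to the caller and is never written back to the stored entry, which mirrors the "virtual" nature of the attribute.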
Tailoring Grouping Mechanisms To Scalability Requirements
Because the groups and roles mechanisms provide a degree of overlapping functionality, it is important to understand their advantages and disadvantages in order to optimize the degree of scalability they can both provide. Generally speaking, the more recent roles mechanism is designed to provide frequently required functionality more efficiently. However, because the choice of a grouping mechanism influences server complexity and determines how clients process membership information, you must plan your grouping mechanism carefully. You must have a detailed understanding of the typical set membership queries and set management operations you will need to perform, to be able to decide which mechanism is more suitable.
Favor the groups mechanism if your priorities include:
- Obtaining a membership list, when the size of the group is less than 20,000 members (above this size of group, Directory Server will perform better using roles)
- Assigning and removing members (unlike with roles, no special access rights are required to add a user to a group)
- Minimizing computation overheads for performance reasons
- Adding sets to, or removing sets from, existing sets (the groups mechanism has none of the nesting restrictions that roles have)
- Maintaining flexibility of scope for grouping entries (roles can only extend their scope via a nested role on the same server instance)
- Using the chaining functionality to distribute your data (as roles are subject to chaining restrictions)
- Supporting LDAP client applications which require group entries to be of the groupOfNames or groupOfUniqueNames object classes
Likewise, favor the roles mechanism if your priorities include:
- Rapidly enumerating the members of a set and calculating set membership for an entry
(Roles push membership information out to the user entry, where it can be cached to make subsequent membership tests more efficient. The server performs all computations, and the client only needs to read the values of the nsRole attribute. In addition, all types of roles appear in this attribute, allowing the client to process all roles uniformly. Roles can perform both operations more efficiently and with simpler clients than is possible with groups.)
- Integrating your grouping mechanism with existing Directory Server functionality such as Password Policy, Account Inactivation, and Access Control, thus taking advantage of the role membership computations that the server performs automatically.
For a list of other implementation considerations surrounding roles and groups, refer to the "Grouping Directory Entries" and "Managing Attribute" sections of the Directory Server Deployment Planning Guide.
Scalable Attribute Management
Directory Server also provides an attribute management mechanism called Class of Service (CoS) for sharing attributes between entries in a way that is invisible to applications. With CoS, some attribute values need not be stored with the entry itself. Instead, they are generated by the CoS logic as the entry is sent to the client application. To the client application, these attributes appear just like all other attributes. CoS allows related entries to share data, which improves coherence and saves space.
Imagine a directory containing thousands of entries that all have the same value for the facsimileTelephoneNumber attribute. Traditionally, to change the fax number, you would update each entry individually, a time-consuming job for administrators. Using CoS, the fax number is stored in a single place, and Directory Server automatically generates the facsimileTelephoneNumber attribute on every concerned entry as it is returned. CoS therefore contributes directly to the scalability of your directory deployment: administrators manage a single fax value instead of thousands. There are three different types of CoS, which provide three slightly different types of attribute management functionality. For a detailed insight into the different types of CoS, see the Managing Attributes with Class of Service section in the Directory Server Deployment Planning Guide.
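The fax number scenario can be modeled in a few lines of Python. In this sketch the shared value lives in one template and is merged into each entry as it is read. The template, the sample entries, the fax numbers, and the `read_with_cos` helper are all hypothetical, and the rule that a value stored on the entry overrides the template is a simplifying assumption of the sketch, not a statement about how the CoS types behave.

```python
# Toy model: one template holds the shared value; entries do not store it.
COS_TEMPLATE = {"facsimileTelephoneNumber": "+1 555 0100"}  # invented value

STORED = {
    "uid=alice,ou=people,dc=example,dc=com": {"cn": "Alice"},
    "uid=bob,ou=people,dc=example,dc=com": {"cn": "Bob"},
}


def read_with_cos(dn):
    """Return the stored attributes plus any template-generated ones.

    In this sketch a value stored on the entry itself wins over the
    template, which is a simplifying assumption.
    """
    attrs = dict(COS_TEMPLATE)
    attrs.update(STORED[dn])
    return attrs
```

Changing `COS_TEMPLATE["facsimileTelephoneNumber"]` once changes what every subsequent `read_with_cos` call returns, without touching any stored entry.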
Additional scalability comes from the fact that generated CoS attributes can be multi-valued, that roles and CoS can be used together to provide role-based attributes, and that CoS priorities can be set for cases where CoS schemes compete with each other to provide an attribute value. Scalability is enhanced given that CoS can bring:
CoS Implementation Considerations
Bear in mind that CoS virtual attributes are not indexed, so referencing them in an LDAP search filter may affect performance. CoS schemes are also resource intensive and should only be used when strictly necessary. Because the server caches CoS information, modifications to CoS definitions do not take effect immediately. As a result, frequent CoS modifications carry a risk that outdated data is accessed while the cache is being rebuilt. The cache management performance improvements in Sun Java System Directory Server 5.2 2005Q1 do not completely eliminate this risk. For other implementation considerations related to access control and distribution, see "CoS Limitations" in the Directory Server Deployment Planning Guide.
Effective Rights Management
The access control model provided by Directory Server is powerful in that access can be granted to users via many different mechanisms. However, this flexibility can make it fairly complex to determine what your security policy comprises. For this access control model to scale to complex deployment requirements, it must remain manageable: it must be simple to administer users and to debug the access control policy in place. An effective rights retrieval functionality is therefore essential. The ability to request the rights of a given user to directory entries and attributes is provided by the Effective Rights LDAP control, which is supported by both the console and LDAP search.
When using the control, you specify the DN of the user whose effective rights you want to know, as well as any additional attributes. The information returned includes:
It is important to note that this rights information corresponds to the ACIs effective at the time of your request, the authentication method used, and the host machine name and address from which your request was made. Also note that viewing effective rights is itself a directory operation that should be protected by placing ACIs on the effective rights attributes. For further information regarding the Effective Rights functionality refer to the "Requesting Effective Rights Information" section of the Directory Server Deployment Planning Guide.
Scalable Schema Management
Directory Server comes with a standard schema that includes hundreds of object classes and attributes. In terms of scalable schema management Directory Server provides you with the following functionality:
- Online Schema Extension
You can extend the existing schema by creating new object classes and attributes, using either the Console or basic LDAP modify commands (from the command line or from an LDAP-enabled application), and you can do so without taking the entire directory off-line.
- Schema Checking
This schema checking functionality allows you to ensure that the object classes and attributes you are using are defined in the directory schema, that the attributes required for an object class are contained in the entry, and that only attributes allowed by the object class are contained in the entry (thus avoiding any resulting inconsistencies).
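The three checks described above (known object classes, required attributes present, only allowed attributes present) can be sketched in Python as follows. The `SCHEMA` dictionary is a tiny hand-written subset for illustration, `check_entry` is a hypothetical helper, and attribute names are compared case-sensitively for simplicity. This is not the server's actual validation code.

```python
# Toy schema: object class -> required and allowed attributes.
# A tiny subset written for illustration, not the server's real schema.
SCHEMA = {
    "person": {"must": {"cn", "sn"}, "may": {"telephoneNumber", "description"}},
}


def check_entry(entry):
    """Return a list of schema violations for an entry (empty if valid)."""
    errors = []
    present = set(entry) - {"objectClass"}
    must, allowed = set(), set()
    for oc in entry.get("objectClass", []):
        if oc not in SCHEMA:
            errors.append("unknown object class: " + oc)
            continue
        must |= SCHEMA[oc]["must"]
        allowed |= SCHEMA[oc]["must"] | SCHEMA[oc]["may"]
    # Attributes the object classes require but the entry lacks.
    for attr in sorted(must - present):
        errors.append("missing required attribute: " + attr)
    # Attributes the entry carries that no object class permits.
    for attr in sorted(present - allowed):
        errors.append("attribute not allowed: " + attr)
    return errors
```

An entry that passes returns an empty list. Any other result names each inconsistency, which is the kind of feedback that helps keep a large directory coherent.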
Directory Proxy Server and Scalability
Directory Proxy Server brings additional scalability to Directory Server, which is often essential to the success of your directory service. A brief outline of what Directory Proxy Server can offer in terms of scalability follows.
Directory Proxy Server helps Directory Server to scale in that it is designed to support high availability directory deployments by providing automatic load balancing, failover, and failback among a set of replicated LDAP Directory Servers. These capabilities become increasingly valuable as deployments grow in both size and complexity.
Functionally, Directory Proxy Server is an "LDAP access router" located between LDAP clients and LDAP Directory Servers. Requests from LDAP clients can be filtered and routed to LDAP Directory Servers based on rules defined in the Directory Proxy Server configuration. Results from Directory Server can be filtered and passed back to clients, again based on rules defined in the Directory Proxy Server configuration. This process is totally transparent to the LDAP clients, which connect to Directory Proxy Server just as they would to any LDAP Directory Server.
For extranet and intranet environments it is often necessary to ensure that mission-critical directory-enabled clients and applications have 24x7 access to directory data. Directory Proxy Server maintains connection state information for all Directory Servers that it knows about, and is able to dynamically perform proportional load balancing of LDAP operations across a set of configured Directory Servers. Should one or more Directory Servers become unavailable, the load is proportionally redistributed among the remaining servers. When a Directory Server comes back on line, the load is proportionally reallocated.
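The proportional redistribution described above can be illustrated with a short Python sketch. The `distribute` function, the server names, and the weights are hypothetical. The sketch only shows the arithmetic of sharing load in proportion to configured weights among the servers that are currently reachable, and makes no claim about how Directory Proxy Server tracks connection state or schedules operations.

```python
def distribute(weights, available):
    """Share load among available servers in proportion to their weights.

    weights: server name -> configured weight (any positive number).
    available: set of server names currently reachable.
    Returns server name -> fraction of the load (fractions sum to 1).
    """
    live = {name: w for name, w in weights.items() if name in available}
    total = sum(live.values())
    if total == 0:
        raise RuntimeError("no Directory Server available")
    return {name: w / total for name, w in live.items()}
```

For example, with weights of 50, 25, and 25 for three servers, losing the first server leaves the other two with half of the load each, and the original split is restored as soon as the first server is listed as available again.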
It is important to note, however, that to prevent Directory Proxy Server from becoming a single point of failure for your directory deployment, you should use at least two Directory Proxy Servers with an IP appliance in front of them.