The type of data in your directory determines how you structure the directory, who can access the data, and how access is granted. Data types can include, among others, user names, email addresses, telephone numbers, and information about groups to which users belong.
This chapter explains how to locate, categorize, structure, and organize data. It also explains how to map data to the Directory Server schema.
The first step in categorizing existing data is to identify where that data comes from and who owns it.
Identify organizations that provide information.
Locate all the organizations that manage information essential to your enterprise. Typically, these organizations include your information services, human resources, payroll, and accounting departments.
Identify tools and processes that are information sources.
Common sources for information include the following:
Network operating systems, such as Windows, Novell NetWare, and UNIX® NIS
PBX or telephone switching systems
Human resources applications
Determine how centralizing each piece of data affects the management of data.
Centralized data management might require new tools and new processes. Issues can arise when centralization requires increasing staff in some organizations and decreasing staff in others.
Data ownership refers to the person or organization that is responsible for ensuring that data is up to date. During the data design phase, decide who can write data to the directory. Common strategies for determining data ownership include the following:
Allow read-only access to the directory for everyone except a small group of directory content managers.
Allow individual users to manage strategic subsets of information.
These subsets of information might include their passwords, descriptive information about themselves, and their role within the organization.
Allow a person’s manager to write to some strategic subset of that person’s information, such as contact information or job title.
Allow an organization’s administrator to create and manage entries for that organization.
Organization administrators in effect become your directory content managers.
Create roles that give groups of people read or write access privileges.
For example, you might create roles for human resources, finance, or accounting. Allow each of these roles to have read access, write access, or both to the data needed by the group. This data might include salary information, government identification numbers, home phone numbers, and home addresses.
For more information about roles and grouping entries, see Grouping Directory Data and Managing Attributes, Chapter 10, Directory Server Groups, Roles, and CoS, in Sun Java System Directory Server Enterprise Edition 6.3 Administration Guide and Chapter 8, Directory Server Groups and Roles, in Sun Java System Directory Server Enterprise Edition 6.3 Reference.
As you determine who can write to the data, you might find that multiple individuals require write access to the same information. For example, an information systems or directory management group should have write access to employee passwords. You might also want all employees to have write access to their own passwords. While you generally must give multiple people write access to the same information, try to keep this group small and easy to identify. Small groups help to ensure your data’s integrity.
For information about setting access control for your directory, see Chapter 7, Directory Server Access Control, in Sun Java System Directory Server Enterprise Edition 6.3 Administration Guide and How Directory Server Provides Access Control in Sun Java System Directory Server Enterprise Edition 6.3 Reference.
To distinguish between data used to configure Directory Server and other Java Enterprise System servers and the actual user data stored in the directory, do the following:
Provide different backup strategies for user and configuration data.
Provide different high availability standards for user and configuration data.
Shut down, restore, and power up configuration servers quickly.
Keep configuration servers up while performing maintenance on other Directory Server instances.
When determining data sources, ensure that you include data from other data sources, including legacy data sources. This data might not be stored in the directory. However, Directory Server might need to have some knowledge of, or control over, the data.
Directory Proxy Server provides a virtual directory feature that aggregates information, in real time, from multiple data repositories. These repositories include LDAP directories, data that complies with the JDBC specification, and LDIF flat files.
The virtual directory supports complex filters that handle attributes from different data sources. It also supports modifications that combine attributes from different data sources.
During the data analysis phase, you might find that the same data is required by several applications, but in a different format. Instead of duplicating this information, it is preferable to have the applications transform it for their requirements.
The directory information tree (DIT) provides a way to structure directory data so that the data can be referred to by client applications. The DIT interacts closely with other design decisions, including how you distribute, replicate, or control access to directory data.
A well-designed DIT provides the following:
Simplified directory data maintenance
Flexibility in creating replication policies and access controls
Support for the applications that use the directory
Simplified directory navigation for users
The DIT structure follows the hierarchical LDAP model. The DIT organizes data, for example, by group, by people, or by geographical location. It also determines how data is partitioned across multiple servers.
DIT design has an impact on replication configuration and on how you use Directory Proxy Server to distribute data. If you want to replicate or distribute certain portions of a DIT, consider replication and the requirements of Directory Proxy Server at design time. Also, decide at design time whether you require access controls on branch points.
A DIT is defined in terms of suffixes, subsuffixes, and chained suffixes. A suffix is a branch or subtree whose entire contents are treated as a unit for administrative tasks. Indexing is defined for an entire suffix, and an entire suffix can be initialized in a single operation. A suffix is also usually the unit of replication. Data that you want to access and manage in the same way should be located in the same suffix. A suffix can be located at the root of the directory tree, where it is called a root suffix.
Because data can only be partitioned at the suffix level, an appropriate directory tree structure is required to spread data across multiple servers.
The following figure shows a directory with two root suffixes. Each suffix represents a separate corporate entity.
A suffix might also be a branch of another suffix, in which case it is called a subsuffix. The parent suffix does not include the contents of the subsuffix for administrative operations. The subsuffix is managed independently of its parent. Because LDAP operation results contain no information about suffixes, directory clients are unaware of whether entries are part of root suffixes or subsuffixes.
The following figure shows a directory with a single root suffix and multiple subsuffixes for a large corporate entity.
A suffix corresponds to an individual database within the server. However, databases and their files are managed internally by the server and database terminology is not used.
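The relationship between a root suffix and its subsuffixes can be sketched as a longest-match lookup. The following Python fragment is a minimal illustration, not the server's implementation. The function names, the `ou=Contractors` subsuffix, and the naive comma split (which assumes no escaped commas in DNs) are assumptions made for the example.

```python
from typing import List, Optional


def rdns(dn: str) -> List[str]:
    """Split a DN into lowercased RDNs (naive: assumes no escaped commas)."""
    return [part.strip().lower() for part in dn.split(",") if part.strip()]


def owning_suffix(entry_dn: str, suffixes: List[str]) -> Optional[str]:
    """Return the most specific suffix that contains entry_dn, or None."""
    entry = rdns(entry_dn)
    best = None
    for suffix in suffixes:
        parts = rdns(suffix)
        # A suffix contains the entry when the entry's DN ends with the suffix DN
        if len(parts) <= len(entry) and entry[len(entry) - len(parts):] == parts:
            # Prefer the longest match so that a subsuffix wins over its parent
            if best is None or len(parts) > len(rdns(best)):
                best = suffix
    return best


suffixes = ["dc=example,dc=com", "ou=Contractors,dc=example,dc=com"]
print(owning_suffix("uid=kvaughan,ou=People,dc=example,dc=com", suffixes))
print(owning_suffix("uid=jdoe,ou=Contractors,dc=example,dc=com", suffixes))
```

An entry under the subsuffix resolves to the subsuffix rather than to its parent, which mirrors the statement that the parent suffix does not include the contents of the subsuffix for administrative operations.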
Chained suffixes create a virtual DIT by referencing suffixes on other servers. With chained suffixes, Directory Server performs the operation on the remote suffix. The directory then returns the result as if the operation had been performed locally. The location of the data is transparent. The client is unaware that the suffix is chained and that the data is retrieved from a remote server. A root suffix on one server can have subsuffixes that are chained to another server. In this scenario, the client is aware of a single tree structure.
In the special case of cascading chaining, the chained suffix might reference another chained suffix on the remote server, and so on. Each server forwards the operation and eventually returns the result to the server that handles the client’s request.
DIT design involves choosing a suffix to contain your data, determining the hierarchical relationship between data entries, and naming the entries in the DIT hierarchy. The following sections describe the design process in more detail.
The suffix is the name of the entry at the root of the DIT. If you have two or more DITs that do not have a natural common root, you can use multiple suffixes. The default Directory Server installation contains multiple suffixes. One suffix is used to store user data. The other suffixes are for data that is needed by internal directory operations, such as configuration information and directory schema.
All directory entries must be located below a common base entry, the suffix. Each suffix name must be as follows:
Static, so that the name rarely changes
Short, so that entries beneath the suffix are easier to read online
Easy for a person to type and remember
It is generally considered best practice to map your enterprise domain name to a Distinguished Name (DN). For example, an enterprise with the domain name example.com would use a DN of dc=example,dc=com.
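The domain-to-DN mapping is mechanical, as the following Python sketch suggests. The function name `domain_to_suffix` is hypothetical and is used only for illustration.

```python
def domain_to_suffix(domain: str) -> str:
    """Map a DNS domain name to a dc-style suffix DN."""
    # Drop surrounding whitespace and any trailing dot, then split into labels
    labels = [label for label in domain.strip().strip(".").split(".") if label]
    if not labels:
        raise ValueError("empty domain name")
    # Each DNS label becomes one domain component (dc) RDN
    return ",".join(f"dc={label}" for label in labels)


print(domain_to_suffix("example.com"))
```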
The structure of a DIT can be flat or hierarchical. Although a flat tree is easier to manage, a degree of hierarchy might be required for data partitioning, replication management, and access control.
A branch point is a point at which you define a new subdivision within the DIT. When deciding on branch points, avoid potentially problematic name changes. The likelihood of a name changing is proportional to the number of components in the name that can potentially change. The more hierarchical the DIT, the more components in the names, and the more likely the names are to change.
Use the following guidelines when defining and naming branch points:
Branch your tree to represent only the largest organizational subdivisions in your enterprise.
Limit branch points to divisions, such as Corporate Information Services, Customer Support, Sales, and Professional Services. Make sure that your divisions are stable. Do not perform this kind of branching if your enterprise reorganizes frequently.
Use functional or generic names rather than actual organizational names.
Names change and you do not want to have to change your DIT every time your enterprise renames its divisions. Instead, use generic names that represent the function of the organization. For example, use Engineering instead of Widget Research and Development.
If you have multiple organizations that perform similar functions, create a single branch point for that function instead of branching based on divisional lines.
For example, even if you have multiple marketing organizations that are responsible for a specific product line, create a single Marketing subtree. All marketing entries then belong to that tree.
Try to use only the traditional branch point attributes that are shown in the following table.
Traditional attributes increase the likelihood of retaining compatibility with third-party LDAP client applications. In addition, traditional attributes are known to the default directory schema, which simplifies the construction of entries for the branch distinguished name (DN).
Branch according to the type of data stored in the directory.
For example, you might create separate branches for people, groups, services, and devices.
c: A country name.
o: An organization name. This attribute is typically used to represent a large divisional branching. The branching might include a corporate division, academic discipline, subsidiary, or other major branching within the enterprise. You should also use this attribute to represent a domain name.
ou: An organizational unit. This attribute is typically used to represent a smaller divisional branching of your enterprise than an organization. Organizational units are generally subordinate to the preceding organization.
st: A state or province name.
l: A locality, such as a city, country, office, or facility name.
dc: A domain component.
Be consistent when choosing attributes for branch points. Some LDAP client applications might fail if the DN format is inconsistent across your DIT. If l (localityName) is subordinate to o (organizationName) in one part of your DIT, ensure that l is subordinate to o in all other parts of your directory.
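A consistency rule such as "l is always subordinate to o" can be checked before data is loaded. The following Python sketch is one possible check, under the assumption that DNs contain no escaped commas. The function names and sample DNs are hypothetical.

```python
def rdn_types(dn):
    """Return the attribute types in a DN, leaf first (naive comma split)."""
    return [rdn.split("=", 1)[0].strip().lower()
            for rdn in dn.split(",") if rdn.strip()]


def inconsistent_dns(dns, child="l", parent="o"):
    """List DNs in which the child attribute is not subordinate to the parent."""
    bad = []
    for dn in dns:
        types = rdn_types(dn)
        if child in types and parent in types:
            # In a DN string, subordinate RDNs appear to the left of their parents
            if types.index(child) > types.index(parent):
                bad.append(dn)
    return bad


dns = [
    "uid=a,l=Paris,o=Example,c=FR",
    "uid=b,o=Example,l=London,c=GB",   # l appears above o here
]
print(inconsistent_dns(dns))
```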
When designing a DIT, consider which entries will be replicated to other servers. If you want to replicate a specific group of entries to the same set of servers, those entries should fall below a specific subtree. To describe the set of entries to be replicated, specify the DN at the top of the subtree. For more information about replicating entries, see Chapter 4, Directory Server Replication, in Sun Java System Directory Server Enterprise Edition 6.3 Reference.
A DIT hierarchy can enable certain types of access control. As with replication, it is easier to group similar entries and to administer the entries from a single branch.
A hierarchical DIT also enables distributed administration. For example, you can use the DIT to give an administrator from the marketing department access to marketing entries, and an administrator from the sales department access to sales entries.
You can also set access controls based on directory content, rather than the DIT. Use the ACI filtered target mechanism to define a single access control rule. This rule states that a directory entry has access to all entries that contain a particular attribute value. For example, you can set an ACI filter that gives the sales administrator access to all entries that contain the attribute ou=Sales.
However, ACI filters can be difficult to manage. You must decide which method of access control is best suited to your directory: organizational branching in the DIT hierarchy, ACI filters, or a combination of the two.
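To make the filtered target mechanism concrete, the following Python sketch formats an ACI value that selects its target entries with a filter instead of a subtree. The helper name, the rule name, the list of rights, and the administrator DN are assumptions for illustration. Check the resulting value against the access control documentation for your server version before using it.

```python
def filtered_target_aci(name, target_filter, rights, user_dn):
    """Format an ACI value that targets entries by filter rather than by subtree."""
    return (
        f'(targetattr="*")(targetfilter="{target_filter}")'
        f'(version 3.0; acl "{name}"; '
        f'allow ({",".join(rights)}) userdn="ldap:///{user_dn}";)'
    )


# Hypothetical sales administrator with access to every entry that has ou=Sales
print(filtered_target_aci(
    "Sales administrator",
    "(ou=Sales)",
    ["read", "search", "compare", "write"],
    "uid=salesadmin,ou=People,dc=example,dc=com",
))
```

Because the rule depends on an attribute value rather than on the location of the entry, changing `ou` on an entry changes who can administer that entry, which is one reason such rules require careful management.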
The directory information tree organizes entries hierarchically. This hierarchy is a type of grouping mechanism. The hierarchy is not well suited for associations between dispersed entries, for organizations that change frequently, or for data that is repeated in many entries. Directory Server groups and roles offer more flexible associations between entries. The class of service (CoS) mechanism enables you to manage attributes so that the attributes are shared between entries. This sharing is done in a way that is invisible to applications.
These entry grouping and attribute management mechanisms are described in detail in Chapter 8, Directory Server Groups and Roles, in Sun Java System Directory Server Enterprise Edition 6.3 Reference and in Chapter 9, Directory Server Class of Service, in Sun Java System Directory Server Enterprise Edition 6.3 Reference.
This section provides an overview of the grouping mechanisms that is sufficient to design an administrative strategy. It does not explain how the mechanisms work or how to set them up.
Directory Server distinguishes among static, dynamic, and nested groups.
Although groups may identify members anywhere in the directory, the group definitions themselves should be located under an appropriately named node such as ou=Groups. This makes them easy to find, for example, when defining access control instructions (ACIs) that grant or restrict access when the bind credentials are members of a group.
Static groups explicitly name their member entries. For example, a group of directory administrators would name the specific people who formed part of that group, as shown in the following illustration.
The following LDIF extract shows how the members of this static group would be defined.
dn: cn=Directory Administrators, ou=Groups, dc=example,dc=com
...
member: uid=kvaughan, ou=People, dc=example,dc=com
member: uid=rdaugherty, ou=People, dc=example,dc=com
member: uid=hmiller, ou=People, dc=example,dc=com
Dynamic groups specify a filter and all entries that match the filter are members of the group. These groups are dynamic because membership is defined each time the filter is evaluated.
Imagine, for example, that all management employees and their assistants are located on the third floor of your building, and that the room number of each employee begins with the number of the floor. To create a group that contains just the employees on the third floor, you could use the room number to select these employees, as shown in the following illustration.
The following LDIF extract shows how the members of this dynamic group would be defined.
dn: cn=3rd Floor, ou=Groups, dc=example,dc=com
...
memberURL: ldap:///dc=example,dc=com??sub?(roomnumber=3*)
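The way a memberURL value selects members can be sketched in Python. This is a simplified model, not the server's evaluation logic. It handles only a single `(attribute=pattern)` filter, treats every scope as a subtree search, compares DNs naively, and uses hypothetical room numbers.

```python
from fnmatch import fnmatchcase


def parse_member_url(url):
    """Split ldap:///base?attrs?scope?filter into (base, scope, filter)."""
    prefix = "ldap:///"
    if not url.startswith(prefix):
        raise ValueError("expected an ldap:/// URL")
    fields = url[len(prefix):].split("?")
    fields += [""] * (4 - len(fields))      # pad any omitted trailing fields
    base, _attrs, scope, filt = fields[:4]
    # RFC 4516 defaults: base scope and a filter that matches every entry
    return base, scope or "base", filt or "(objectClass=*)"


def matches_simple_filter(entry, filt):
    """Evaluate one (attr=pattern) filter; '*' is the only wildcard intended."""
    attr, pattern = filt.strip("()").split("=", 1)
    values = entry.get(attr.lower(), [])
    return any(fnmatchcase(value.lower(), pattern.lower()) for value in values)


def dynamic_members(entries, member_url):
    """Return the DNs of entries under the base DN that match the filter."""
    base, _scope, filt = parse_member_url(member_url)
    base = base.lower().replace(" ", "")
    members = []
    for entry in entries:
        dn = entry["dn"].lower().replace(" ", "")
        if dn.endswith(base) and matches_simple_filter(entry, filt):
            members.append(entry["dn"])
    return members


entries = [
    {"dn": "uid=kvaughan,ou=People,dc=example,dc=com", "roomnumber": ["3021"]},
    {"dn": "uid=hmiller,ou=People,dc=example,dc=com", "roomnumber": ["4012"]},
]
print(dynamic_members(entries, "ldap:///dc=example,dc=com??sub?(roomnumber=3*)"))
```

Membership is recomputed on every call, which reflects why these groups are called dynamic: no member list is stored in the group entry.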
Nested groups use the DN of another group as a value of the member or uniqueMember attribute of a static or dynamic group to place groups inside other groups. Directory Server also supports mixed groups, that is, groups that reference individual entries, static groups, and dynamic groups.
Imagine, for example, that you want a group containing all directory administrators, and all management employees and their assistants. You could use a combination of the two groups defined earlier to create one nested group, as shown in the following illustration.
The following LDIF extract shows how the members of this nested group would be defined.
dn: cn=Admins and 3rd Floor, ou=Groups, dc=example,dc=com
...
member: cn=Directory Administrators, ou=Groups, dc=example,dc=com
member: cn=3rd Floor, ou=Groups, dc=example,dc=com
Nested groups are not the most efficient grouping mechanism. Dynamic nested groups incur an even greater performance cost. To avoid these performance problems, consider using roles instead.
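The cost of nesting becomes clearer when membership resolution is written out. The following Python sketch resolves the members of a nested group recursively. It assumes that each group is represented as a list of member DNs and that dynamic groups have already been evaluated to such a list. The `uid=jwallace` entry is hypothetical.

```python
def expand_members(group_dn, groups, seen=None):
    """Return the user DNs reachable from group_dn, following nested groups."""
    seen = set() if seen is None else seen
    if group_dn in seen:            # guard against groups that contain each other
        return set()
    seen.add(group_dn)
    users = set()
    for member in groups.get(group_dn, []):
        if member in groups:        # the member is itself a group, so recurse
            users |= expand_members(member, groups, seen)
        else:
            users.add(member)
    return users


ADMINS = "cn=Directory Administrators, ou=Groups, dc=example,dc=com"
FLOOR3 = "cn=3rd Floor, ou=Groups, dc=example,dc=com"
BOTH = "cn=Admins and 3rd Floor, ou=Groups, dc=example,dc=com"

groups = {
    ADMINS: ["uid=kvaughan, ou=People, dc=example,dc=com",
             "uid=rdaugherty, ou=People, dc=example,dc=com",
             "uid=hmiller, ou=People, dc=example,dc=com"],
    # Stand-in for the evaluated dynamic group; its membership is hypothetical
    FLOOR3: ["uid=kvaughan, ou=People, dc=example,dc=com",
             "uid=jwallace, ou=People, dc=example,dc=com"],
    BOTH: [ADMINS, FLOOR3],
}
print(sorted(expand_members(BOTH, groups)))
```

Each level of nesting adds another group lookup per member, and a nested dynamic group adds a filter evaluation as well.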
Roles are an entry grouping mechanism. Roles enable you to determine role membership as soon as an entry is retrieved from the directory. Each role has members, or entries that possess the role. As with groups, you can specify role members explicitly or dynamically.
Directory Server supports the following three types of roles:
Managed roles. Explicitly assign a role to member entries.
Filtered roles. Automatically make entries members if the entries match a specified LDAP filter. In this way, the role depends on the attributes contained in each entry.
Nested roles. Enable you to create roles that contain other roles.
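The three role types can be modeled in a few lines of Python to show how a list of roles is derived for a single entry. This is an illustrative sketch only. The role DNs are hypothetical, and a filtered role is represented by a Python callable instead of a real LDAP filter.

```python
def compute_ns_role(entry, roles):
    """Return the sorted DNs of all roles that the entry possesses."""

    def has_role(role_dn, visiting):
        role = roles[role_dn]
        kind = role["type"]
        if kind == "managed":
            # Managed: the entry itself lists the role DN in nsRoleDN
            return role_dn in entry.get("nsRoleDN", [])
        if kind == "filtered":
            # Filtered: membership follows from the entry's attribute values
            return role["matches"](entry)
        if kind == "nested":
            if role_dn in visiting:     # guard against circular nesting
                return False
            return any(has_role(inner, visiting | {role_dn})
                       for inner in role["contains"])
        raise ValueError(f"unknown role type: {kind}")

    return sorted(dn for dn in roles if has_role(dn, frozenset()))


MANAGERS = "cn=Managers,ou=People,dc=example,dc=com"
SALES = "cn=Sales,ou=People,dc=example,dc=com"
LEADS = "cn=Leads,ou=People,dc=example,dc=com"

roles = {
    MANAGERS: {"type": "managed"},
    SALES: {"type": "filtered", "matches": lambda e: "Sales" in e.get("ou", [])},
    LEADS: {"type": "nested", "contains": [MANAGERS, SALES]},
}
entry = {"nsRoleDN": [MANAGERS], "ou": ["Engineering"]}
print(compute_ns_role(entry, roles))
```

The result plays the part of the computed role list that a client reads from the entry: all role types appear in one list, so the client does not need to know how each membership was established.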
The functionality of the groups and roles mechanisms overlaps somewhat. Both mechanisms have advantages and disadvantages. Generally, the roles mechanism is designed to provide frequently required functionality more efficiently. Because the choice of a grouping mechanism influences server complexity and determines how clients process membership information, you must plan your grouping mechanism carefully. To decide which mechanism is more suitable, you need to understand the typical membership queries and management operations that are performed.
Groups have the following advantages:
Static groups are the only standards-based grouping mechanism. Static groups are therefore interoperable with most client applications and LDAP servers.
If you only need to enumerate members of a given set, static groups are less costly. Enumerating members of a static group by retrieving the member attribute is easier than recovering all entries that share a role. In Directory Server, significant performance improvements have been made for large multi-valued attributes. Equality matching and modify operations on these attributes are greatly improved, specifically in relation to static groups. Membership testing for group entries has also been improved. These improvements remove some of the previous restrictions on static groups, specifically the restriction on group size.
Directory Server also provides group membership directly in user entries, with the isMemberOf operational attribute. This feature applies to static groups only but includes nested groups. For more information, see Managing Groups in Sun Java System Directory Server Enterprise Edition 6.3 Administration Guide.
Static groups are preferable to roles for management operations such as assigning and removing members.
Static groups are the simplest mechanism for assigning a user to a set or removing a user from a set. Special access rights are not required to add the user to the group.
The right to create the group entry automatically gives you the right to assign members to that group. This is not the case for managed and filtered roles. In these roles, the administrator must also have the right to write the nsRoleDN attribute to the user entry. The same access right restrictions also apply indirectly to nested roles. The ability to create a nested role implies the ability to pull together other roles that have already been defined.
Dynamic groups are preferable to roles for use in filter-based ACIs.
If you only need to find all members based on a filter, such as for designating bind rules in ACIs, use dynamic groups. Although filtered roles are similar to dynamic groups, filtered roles trigger the roles mechanism and generate the virtual nsRole attribute. If your client does not need the nsRole value, use dynamic groups to avoid the overhead of this computation.
Groups are preferable to roles for adding sets to, or removing sets from, existing sets.
If you want to add a set to an existing set, or remove a set from an existing set, the groups mechanism is simplest. The groups mechanism presents no nesting restrictions. The roles mechanism only allows nested roles to receive other roles.
Groups are preferable to roles if flexibility of scope for grouping entries is critical.
Groups are flexible in terms of scope because the scope for possible members is the entire directory, regardless of where the group definition entries are located. Although roles can also extend their scope beyond a given subtree, they can only do so by adding the scope-extending attribute nsRoleScopeDN to a nested role.
Roles have the following advantages:
Roles are preferable to dynamic groups if you want to enumerate members of a set and find all sets of which a given entry is a member. Static groups also provide this functionality with the isMemberOf attribute.
Roles push membership information out to the user entry where this information can be cached to make subsequent membership tests more efficient. The server performs all computations, and the client only needs to read the values of the nsRole attribute. In addition, all types of roles appear in this attribute, allowing the client to process all roles uniformly. Roles can perform both operations more efficiently and with simpler clients than is possible with dynamic groups.
Roles are preferable to groups if you want to integrate your grouping mechanism with existing Directory Server functionality such as CoS, Password Policy, Account Inactivation, and ACIs.
If you want to use the membership of a set “naturally” in the server, roles are a better option. This implies that you use the membership computations that the server does automatically. Roles can be used in resource-oriented ACIs, as a basis for CoS, as part of more complex search filters, and with Password Policy, Account Inactivation, and so forth. Groups do not allow this kind of integration.
Be aware of the following issues when using roles:
The nsRole attribute can only be assigned by the roles mechanism. While this attribute cannot be assigned or modified by any directory user, it is potentially readable by any directory user. Define access controls to keep this attribute from being read by unauthorized users.
The nsRoleDN attribute defines managed role membership. You need to decide whether users can add themselves to, or remove themselves from, the role. To keep users from modifying their own roles, you must define an ACI to that effect.
Filtered roles determine membership through filters that are based on the existence or the values of attributes in user entries. Assign the user permissions of these attributes carefully to control who can define membership in the filtered role.
The Class of Service (CoS) mechanism allows attributes to be shared between entries. Like the role mechanism, CoS generates virtual attributes on the entries as the entries are retrieved. CoS does not define membership, but it does allow related entries to share data for coherency and space considerations. CoS values are calculated dynamically when the values are requested. CoS functionality and the various types of CoS are described in detail in the Sun Java System Directory Server Enterprise Edition 6.3 Reference.
The following sections examine the ways in which you can use the CoS functionality as intended, while avoiding performance pitfalls.
CoS generation always impacts performance. Client applications that search for more attributes than they actually need can compound the problem.
If you can influence how client applications are written, remind developers that client applications perform much better when looking up only those attribute values that they actually need.
CoS provides substantial benefits for relatively low cost when you need the same attribute value to appear on numerous entries in a subtree.
Imagine, for example, a directory for MyCompany, Inc. in which every user entry under ou=People has a companyName attribute. Contractors have real values for companyName attributes on their entries, but all regular employees have a single CoS-generated value, MyCompany, Inc., for companyName. The following figure demonstrates this example with pointer CoS. Notice that CoS generates companyName values for all permanent employees without overriding real, not CoS-generated, companyName values stored for contractor employees. The company name is generated only for those entries for which companyName is an allowed attribute.
In cases where many entries share the same value, pointer CoS works particularly well. The ease of maintaining companyName for permanent employees offsets the additional processing cost of generating attribute values. Deep directory information trees (DITs) tend to bring together entries that share common characteristics. Pointer CoS can be used in deep DITs to generate common attribute values by placing CoS definitions at appropriate branches in the tree.
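The override behavior of pointer CoS in the companyName example can be modeled as follows. This Python sketch only imitates the visible result. It assumes the default behavior in which real values take precedence, and it does not model the schema check that limits generation to entries that allow the attribute. The contractor's company name is hypothetical.

```python
def with_pointer_cos(entry, template, cos_attribute):
    """Return the entry as a client would see it with pointer CoS applied."""
    view = dict(entry)              # the stored entry itself is never modified
    # Generate the value only when the entry holds no real value of its own
    if cos_attribute not in view and cos_attribute in template:
        view[cos_attribute] = template[cos_attribute]
    return view


template = {"companyName": "MyCompany, Inc."}
employee = {"uid": "kvaughan"}
contractor = {"uid": "jdoe", "companyName": "Other Company, Ltd."}

print(with_pointer_cos(employee, template, "companyName"))
print(with_pointer_cos(contractor, template, "companyName"))
```

Changing the single template value changes what every permanent employee entry returns, which is the maintenance benefit that offsets the processing cost.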
CoS also provides substantial data administration benefits when directory data has natural relationships.
Consider an enterprise directory in which every employee has a manager. Every employee shares a mail stop and fax number with the nearest administrative assistant. Figure 4–4 demonstrates the use of indirect CoS to retrieve the department number from the manager entry. In Figure 4–5, the mail stop and fax number are retrieved from the administrative assistant entry.
In this implementation, the manager’s entry has a real value for departmentNumber, and this real value overrides any generated value. Directory Server does not generate attribute values from CoS-generated attribute values. Thus, in the Figure 4–4 example, the department number attribute value needs to be managed only on the manager's entry. Likewise, for the example shown in Figure 4–5, mail stop and fax number attributes need to be managed only on the administrative assistant’s entry.
A single CoS definition entry can be used to exploit relationships such as these for many different entries in the directory.
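Indirect CoS can be sketched in the same style. In this simplified Python model, the `specifier` argument names the attribute on the target entry that points to another entry, such as `manager`, and only real values on that other entry are used. The DNs and the department number are hypothetical.

```python
def with_indirect_cos(entry, directory, specifier, cos_attribute):
    """Return the entry view with a value taken from the entry it points to."""
    view = dict(entry)
    if cos_attribute in view:           # a real value overrides any generated value
        return view
    source = directory.get(entry.get(specifier), {})
    # Only real values stored on the referenced entry are used; values that
    # would themselves be CoS-generated are not followed
    if cos_attribute in source:
        view[cos_attribute] = source[cos_attribute]
    return view


MANAGER_DN = "uid=rdaugherty,ou=People,dc=example,dc=com"
directory = {MANAGER_DN: {"departmentNumber": "318842"}}
employee = {"uid": "kvaughan", "manager": MANAGER_DN}

print(with_indirect_cos(employee, directory, "manager", "departmentNumber"))
```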
Another natural relationship is service level. Consider an Internet service provider that offers customers standard, silver, gold, and platinum packages. A customer’s disk quota, number of mailboxes, and rights to prepaid support levels depend on the service level purchased. The following figure demonstrates how a classic CoS scheme enables this functionality.
One CoS definition might be associated with multiple CoS template entries.
Directory Server optimizes CoS when one classic CoS definition entry is associated with multiple CoS template entries. Directory Server does not optimize CoS if many CoS definitions potentially apply. Instead, Directory Server checks each CoS definition to determine whether the definition applies. This behavior leads to performance problems if you have thousands of CoS definitions.
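A classic CoS scheme with one definition and several templates can be modeled as a lookup keyed on the specifier value. The following Python sketch assumes that each template entry is named by the specifier value under a common template base DN. The attribute names `serviceLevel`, `diskQuota`, and `mailboxCount`, the template DNs, and the values are hypothetical.

```python
def with_classic_cos(entry, templates, template_base, specifier, cos_attributes):
    """Return the entry view with values from the template that matches it."""
    view = dict(entry)
    level = entry.get(specifier)
    # One definition selects among many templates through the specifier value
    template = templates.get(f"cn={level},{template_base}", {})
    for attr in cos_attributes:
        if attr not in view and attr in template:
            view[attr] = template[attr]
    return view


BASE = "cn=serviceTemplates,dc=example,dc=com"
templates = {
    f"cn=standard,{BASE}": {"diskQuota": "1GB", "mailboxCount": "1"},
    f"cn=gold,{BASE}": {"diskQuota": "10GB", "mailboxCount": "5"},
}
customer = {"uid": "acme", "serviceLevel": "gold"}

print(with_classic_cos(customer, templates, BASE, "serviceLevel",
                       ["diskQuota", "mailboxCount"]))
```

A single definition with many templates amounts to one keyed lookup per entry. Many separate definitions would instead require the same check to be repeated once per definition.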
This situation can arise in a modified version of the example shown in Figure 4–6. Consider an Internet service provider that offers customers delegated administration of their customers’ service level. Each customer provides definition entries for standard, silver, gold, and platinum service levels. Ramping up to 1000 customers means creating 1000 classic CoS definitions. Directory Server performance would be affected as it runs through the list of 1000 CoS definitions to determine which apply. If you must use CoS in this sort of situation, consider indirect CoS. In indirect CoS, customers’ entries identify the entries that define their class of service allotments.
When you start approaching the limit of having different CoS schemes for every target entry or two, you are better off updating the real values. You then achieve better performance by reading real, not CoS-generated, values.
The directory schema describes the types of data that can be stored in a directory. During schema design, each data element is mapped to an LDAP attribute. Related elements are gathered into LDAP object classes. A well-designed schema helps maintain data integrity by imposing constraints on the size, range, and format of data values. You decide what types of entries your directory contains and the attributes that are available to each entry.
The predefined schema that is included with Directory Server contains the Internet Engineering Task Force (IETF) standard LDAP schema. The schema contains additional application-specific schema to support the features of the server. It also contains Directory Server-specific schema extensions. While this schema meets most directory requirements, you might need to extend the schema with new object classes and attributes that are specific to your directory.
Schema design involves doing the following:
Mapping your data to the default schema.
To map existing data to the default schema, identify the type of object that each data element describes, then select a similar object class from the default schema. Use the common object classes, such as groups, people, and organizations. Then select the attribute from that object class that best matches the data element.
Identifying unmatched data.
Extending the default schema to define new elements to meet your remaining needs.
If data elements exist that do not match the object classes and attributes defined by the default directory schema, you can customize the schema. You can also extend the schema to impose additional constraints on the existing schema. For more information, see About Custom Schema in Sun Java System Directory Server Enterprise Edition 6.3 Administration Guide.
Planning for schema maintenance.
Where possible, use the existing schema elements that are defined in the default Directory Server schema. Standard schema elements help to ensure compatibility with directory-enabled applications. Because the schema is based on the LDAP standard, it has been reviewed and agreed to by a large number of directory users.
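The mapping and unmatched-data steps can be sketched as a simple table lookup. In the following Python fragment, the source field names on the left are hypothetical fields from a human resources export. The attribute names on the right are standard attributes that are available to inetOrgPerson entries.

```python
DEFAULT_MAPPING = {
    "first_name": "givenName",
    "last_name": "sn",
    "email": "mail",
    "phone": "telephoneNumber",
    "employee_id": "employeeNumber",
}


def map_record(record, mapping=DEFAULT_MAPPING):
    """Split a source record into mapped attributes and unmatched field names."""
    mapped, unmatched = {}, []
    for field, value in record.items():
        attribute = mapping.get(field)
        if attribute is None:
            unmatched.append(field)     # candidate for a schema extension
        else:
            mapped[attribute] = value
    return mapped, unmatched


record = {"first_name": "Kirsten", "last_name": "Vaughan", "badge_color": "blue"}
mapped, unmatched = map_record(record)
print(mapped)
print(unmatched)
```

Fields that end up in the unmatched list are the data elements to review when you decide whether to extend the default schema.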
Consistent data assists LDAP client applications in locating directory entries. For each type of information that is stored in the directory, select the required object classes and attributes to support that information. Always use the same object classes and attributes. If you use schema objects inconsistently, it is difficult to locate information.
You can maintain schema consistency in the following ways:
Use schema checking to ensure that attributes and object classes conform to the schema rules.
For more information about schema checking, see Chapter 12, Directory Server Schema, in Sun Java System Directory Server Enterprise Edition 6.3 Administration Guide.
Select and apply a consistent data format.
The LDAP schema allows you to place any data on any attribute value. However, you should store data consistently in the DIT by selecting a format appropriate for your LDAP client applications and directory users. With the LDAP protocol and Directory Server, you must represent data using the data formats specified in RFC 4517.
For more information about the standard LDAP schema, and about designing a DIT, see the following sites:
RFC 4510: Lightweight Directory Access Protocol (LDAP): Technical Specification Road Map http://www.ietf.org/rfc/rfc4510.txt
RFC 4511: Lightweight Directory Access Protocol (LDAP): The Protocol http://www.ietf.org/rfc/rfc4511.txt
RFC 4512: Lightweight Directory Access Protocol (LDAP): Directory Information Models http://www.ietf.org/rfc/rfc4512.txt
RFC 4513: Lightweight Directory Access Protocol (LDAP): Authentication Methods and Security Mechanisms http://www.ietf.org/rfc/rfc4513.txt
RFC 4514: Lightweight Directory Access Protocol (LDAP): String Representation of Distinguished Names http://www.ietf.org/rfc/rfc4514.txt
RFC 4515: Lightweight Directory Access Protocol (LDAP): String Representation of Search Filters http://www.ietf.org/rfc/rfc4515.txt
RFC 4516: Lightweight Directory Access Protocol (LDAP): Uniform Resource Locator http://www.ietf.org/rfc/rfc4516.txt
RFC 4517: Lightweight Directory Access Protocol (LDAP): Syntaxes and Matching Rules http://www.ietf.org/rfc/rfc4517.txt
RFC 4518: Lightweight Directory Access Protocol (LDAP): Internationalized String Preparation http://www.ietf.org/rfc/rfc4518.txt
RFC 4519: Lightweight Directory Access Protocol (LDAP): Schema for User Applications http://www.ietf.org/rfc/rfc4519.txt
Understanding and Deploying LDAP Directory Services. T. Howes, M. Smith, G. Good. Macmillan Technical Publishing, 1999
For a complete list of the RFCs and standards supported by Directory Server Enterprise Edition, see Appendix A, Standards and RFCs Supported by Directory Server Enterprise Edition, in Sun Java System Directory Server Enterprise Edition 6.3 Evaluation Guide.