|Oracle Directory Server Enterprise Edition Deployment Planning Guide 11g Release 1 (220.127.116.11.0)|
Getting the right hardware for a medium to large Directory Server deployment involves some testing with data similar to the data you expect to serve in production, and access patterns similar to those you expect from client applications. When optimizing for particular systems, make sure you understand how system buses, peripheral buses, I/O devices, and supported file systems work. This knowledge helps you take advantage of I/O subsystem features when tuning these features to support Directory Server.
This section looks at how to approach hardware sizing for Directory Server. It covers what to consider when deciding how many processors, how much memory, how much disk space, and what type of network connections to dedicate to Directory Server in your deployment.
Note - Unless indicated otherwise, the server properties described in the following sections can be set with the dsconf command. For more information about using dsconf, see dsconf(1M).
Tuning performance means modifying the default configuration to reflect specific deployment requirements. The following list of process phases covers the key things to think about when tuning Directory Server.
Define specific, measurable objectives for tuning, based on deployment requirements.
Consider the following questions.
Which applications use Directory Server?
Can you dedicate the entire system to Directory Server?
Does the system run other applications?
If so, which other applications run on the system?
How many entries are handled by the deployment?
How large are the entries?
How many searches per second must Directory Server support?
What types of searches are expected?
How many updates per second must Directory Server support?
What types of updates are expected?
What sort of peak update and search rates are expected?
What average rates are expected?
Does the deployment call for repeated bulk import initialization on this system?
If so, how often do you expect to import data? How many entries are imported?
What types of entries?
Must initialization be performed online with the server running?
The list here is not exhaustive. Make sure that your own list of goals covers everything your deployment requires.
Determine how you plan to implement optimizations. Also, determine how you plan to measure and analyze optimizations.
Consider the following questions.
Can you change the hardware configuration of the system?
Are you limited to using hardware that you already have, tuning only the underlying operating system, and Directory Server?
How can you simulate other applications?
How should you generate representative data samples for testing?
How should you measure results?
How should you analyze results?
Carry out the tests that you planned. For large, complex deployments, this phase can take considerable time.
Check whether the potential optimizations tested reach the goals defined at the outset of the process.
If the optimizations reach the goals, document the results.
If the optimizations do not reach the goals, profile and monitor Directory Server.
Profile and monitor the behavior of Directory Server after applying the potential modifications.
Collect measurements of all relevant behavior.
Plot and analyze the behavior that you observed while profiling and monitoring. Attempt to find evidence and to discover patterns that suggest further tests.
You might need to go back to the profiling and monitoring phase to collect more data.
Apply further potential optimizations suggested by your analysis of measurements.
Return to the phase of performing tests.
When the optimizations applied reach the goals defined at the outset of the process, document the optimizations well so the optimizations can be easily reproduced.
How much disk and memory space you devote to Directory Server depends on your directory data. If you already have representative data in LDIF, use that data when sizing hardware for your deployment. Representative data here means sample data that corresponds to the data you expect to use in deployment, but not the actual data you use in deployment. Real data comes with real privacy concerns, can be multiple orders of magnitude larger than the specifications needed to generate representative data, and may not help you exercise all the cases you want to test. Representative data includes entries whose average size is close to the size you expect to see in deployment, whose attributes have values similar to those you expect to see in deployment, and whose numbers are present in proportions similar to those you expect to see in deployment.
Take anticipated growth into account when you are deciding on representative data. It is advisable to include an overhead on current data for capacity planning.
If you do not have representative data readily available, you can use the makeldif(1) command to generate sample LDIF, which you can then import into Directory Server. Chapter 4, Defining Data Characteristics can help you figure out what representative data would be for your deployment. The makeldif command is one of the Directory Server Resource Kit tools.
For deployments expected to serve millions of entries in production, ideally you would load millions of entries for testing. Yet loading millions of entries may not be practical for a first estimate. Start by creating a few sets of representative data, for example 10,000 entries, 100,000 entries, and 1,000,000 entries, import those, and extrapolate from the results you observe to estimate the hardware required for further testing. When you are estimating hardware requirements, make provision for data that will be replicated to multiple servers.
Notice that when you import directory data from LDIF into Directory Server, the resulting database files (including indexes) are larger than the LDIF representation. By default, the database files are located under the instance-path/db/ directory.
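The extrapolation from small sample imports can be sketched as a straight-line fit over the database sizes you measure after each import. This is only an illustration: the measurements below are invented placeholders, the function names are hypothetical, and the sketch assumes that database size grows roughly linearly with the number of entries, which you should verify against your own measurements.

```python
def fit_line(points):
    """Least-squares fit of y = slope * x + intercept over (x, y) pairs."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("measurements must use different entry counts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate_db_size(measurements, target_entries):
    """Estimate database size in bytes for target_entries entries."""
    slope, intercept = fit_line(measurements)
    return slope * target_entries + intercept


# Hypothetical measurements: (entries imported, bytes under instance-path/db/).
samples = [
    (10_000, 60_000_000),
    (100_000, 420_000_000),
    (1_000_000, 4_020_000_000),
]

estimate = extrapolate_db_size(samples, 10_000_000)
print(f"Estimated database size: {estimate / 1e9:.1f} GB")
```

If the measured points do not fall close to a straight line, treat the result as a rough first estimate and load a larger sample before committing to hardware.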
Directory Server default configuration settings are defined for typical small deployments and to make it easy to install and evaluate the product. This section examines some key configuration settings to adjust for medium to large deployments. In medium to large deployments you can often improve performance significantly by adapting configuration settings to your particular deployment.
When Directory Server reads or writes data, it works with fixed blocks of data, called pages. By increasing the page size you increase the size of the block that is read or written in one disk operation.
The page size is related to the size of entries and is a critical element of performance. If you know that the average size of your entries is greater than (db-page-size / 4) - 24, where 24 is the size of the per-page internal binary tree structure, you must increase the database page size. The database page size should also match the file system disk block size.
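The page size rule above can be written as a small check. The sketch below assumes that entry sizes and page sizes are expressed in the same unit, and the function names are hypothetical; it only restates the rule of thumb and does not query a running server.

```python
# Size of the per-page internal binary tree structure, as stated in the text.
PAGE_OVERHEAD = 24


def max_average_entry_size(db_page_size):
    """Largest average entry size the given page size is meant to handle."""
    return db_page_size // 4 - PAGE_OVERHEAD


def needs_larger_page(avg_entry_size, db_page_size):
    """True when the rule of thumb says the page size should be increased."""
    return avg_entry_size > max_average_entry_size(db_page_size)


# Example values chosen for illustration only.
for page_size in (8192, 16384):
    print(page_size, needs_larger_page(3000, page_size))
```

When the check indicates a larger page is needed, also confirm that the new page size matches the file system disk block size.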
Directory Server is designed to respond quickly to client application requests. In order to avoid waiting for directory data to be read from disk, Directory Server caches data in memory. You can configure how much memory is devoted to cache for database files, for directory entries, and for importing directory data from LDIF.
Ideally the hardware on which you run Directory Server allows you to devote enough space to cache all directory data in physical memory. The data should fit comfortably, such that the system has enough physical memory for operation, and the file system has plenty of physical memory for its caching and operation. Once the data are cached, Directory Server has to read data from and write data to disk only when a directory entry changes.
Directory Server supports 64–bit memory addressing, and so can handle total cache sizes as large as a 64–bit processor can address. For small to medium deployments it is often possible to provide enough memory that all directory data can be held in cache. For large deployments, however, caching everything may not be practical or cost effective.
For large deployments, caching everything in memory can cause side effects. Tools such as the pmap command, that traverse the process memory map to gather data, can freeze the server process for a noticeable time. Core files can become so large that writing them to disk during a crash can take several minutes. Startup times can be slow if the server is shut down abruptly and then restarted. Directory Server can also pause and stop responding temporarily when it reaches a checkpoint and has to flush dirty cached pages to disk. When the cache is very large, the pauses can become so long that monitoring software assumes Directory Server is down.
I/O buffers at the operating system level can provide better performance. Very large buffers can compensate for smaller database caches.
Directory Server indexes directory entry attribute values to speed searches for those values. You can configure attributes to be indexed in various ways. For example, indexes can help Directory Server determine quickly whether an attribute has a value, whether it has a value equal to a given value, and whether it has a value containing a given substring.
Indexes can add to search performance, but they can also impact write performance. When an attribute is indexed, Directory Server has to update the index as values of the attribute change.
Directory Server saves index data to files. The more indexes you configure, the more disk space required. Directory Server indexes and data files are found, by default, under the instance-path/db/ directory.
For a detailed discussion of indexing and index settings, read Chapter 9, Directory Server Indexing, in Oracle Directory Server Enterprise Edition Reference.
Some Directory Server administration files can potentially become very large. These files include the LDIF files containing directory data, backups, core files, and log files.
Depending on your deployment, you may use LDIF both to import Directory Server data, and to serve as auxiliary backup. A standard text format, LDIF allows you to export binary data as well as strings. LDIF can occupy significant disk space in large deployments. For example, a directory containing 10 million entries having an average size of 2 kilobytes, would in LDIF representation occupy 20 gigabytes on disk. You might maintain multiple LDIF files of that size if you use the format for auxiliary backup.
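The LDIF arithmetic in the example above is easy to reproduce for your own figures. The sketch below uses decimal units (1 kilobyte = 1000 bytes), as the example appears to; the function name and the number of retained copies are assumptions made for illustration.

```python
def ldif_disk_bytes(entries, avg_entry_bytes, copies=1):
    """Approximate disk space for `copies` LDIF files of the same directory."""
    return entries * avg_entry_bytes * copies


# 10 million entries averaging 2 kilobytes each, one LDIF file.
single = ldif_disk_bytes(10_000_000, 2_000)
print(f"One LDIF file: {single / 1e9:.0f} GB")

# Keeping three LDIF files as auxiliary backups (hypothetical retention).
print(f"Three LDIF files: {ldif_disk_bytes(10_000_000, 2_000, copies=3) / 1e9:.0f} GB")
```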
Binary backup files also occupy space on disk, at least until you move them somewhere else for safekeeping. Backup files produced with Directory Server utilities consist of binary copies of the directory database files. Alternatively for large deployments you can put Directory Server in frozen mode and take a snapshot of the file system. Either way, you must have disk space available for the backup.
By default Directory Server writes log messages to instance-path/logs/access and instance-path/logs/errors. By default Directory Server requires one gigabyte of local disk space for access logging, and another 200 megabytes of local disk space for errors logging.
For a detailed discussion of Directory Server logging, read Chapter 10, Directory Server Logging, in Oracle Directory Server Enterprise Edition Reference.
Directory Server lets you replicate directory data for availability and load balancing between the servers in your deployment. Directory Server allows you to have multiple read-write (master) replicas deployed together.
Internally, the server makes this possible by keeping track of changes to directory data. When the same data are modified on more than one read-write replica, Directory Server can resolve the changes correctly on all replicas. The data that tracks these changes must be retained until it is no longer needed for replication. Changes are retained for a period of time specified by the purge delay, whose default value is seven days. If your directory data undergoes much modification, especially of large multi-valued attributes, this tracking data can grow quite large.
Because the level of growth is dependent on several factors, there is no catch-all formula to calculate potential data growth. The best approach is to test typical modifications and measure the growth. The following factors have an effect on data growth as a result of entry modification:
The type of entries and the types of attributes that are modified.
Multi-valued attributes cause larger growth. If the attribute values are small, the growth is more visible.
The workload applied to the entry.
Adding and deleting entries causes larger growth. Adding an attribute value causes larger growth than replacing an attribute value.
The number of entries that are modified, and the number of attributes that are modified in each entry.
The size of the database page.
After numerous modifications, certain entries can become larger than the database page size.
Note that the replication metadata remains in the entry until the purge delay has passed and the entry is modified again.
For a detailed discussion of Directory Server replication, read Chapter 7, Directory Server Replication, in Oracle Directory Server Enterprise Edition Reference.
Directory Server runs as a multithreaded process, and is designed to scale on multiprocessor systems. You can configure the number of threads Directory Server creates at startup to process operations. By default Directory Server creates 30 threads. The value is set using the dsconf(1M) command to adjust the server property thread-count.
The trick is to keep the threads as busy as possible without incurring undue overhead from having to handle many threads. As long as all directory data fits in cache, better performance is often seen when thread-count is set to twice the number of processors plus the expected number of simultaneous update operations. If only a fraction of a large directory data set fits in cache, Directory Server threads may often have to wait for data being read from disk. In that case you may find performance improves with a much higher thread count, up to 16 times the number of available processors.
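The thread-count guidance above can be turned into a starting point and an upper bound for testing. This is a sketch of the heuristic only: the function name is hypothetical, and the values it returns are candidates to try under load, not settings to apply blindly.

```python
def thread_count_range(processors, simultaneous_updates, data_fits_in_cache):
    """Return (starting value, upper bound) for thread-count experiments."""
    # Twice the number of processors plus expected simultaneous updates.
    start = 2 * processors + simultaneous_updates
    if data_fits_in_cache:
        # All data cached: the starting value is the suggested setting.
        return start, start
    # Threads often wait on disk: try values up to 16 times the processors.
    return start, max(start, 16 * processors)


# Example: 8 processors, about 4 simultaneous updates expected.
print(thread_count_range(8, 4, data_fits_in_cache=True))
print(thread_count_range(8, 4, data_fits_in_cache=False))
```

Measure throughput at several values within the returned range, because the best setting depends on how often threads actually wait for disk reads.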
Directory Server uses file descriptors to hold data related to open client application connections. By default Directory Server uses a maximum of 1024 file descriptors. The value is set using the dsconf command to adjust the server property file-descriptor-count. If you see a message in the errors log stating too many fds open, you may observe better performance by increasing file-descriptor-count, presuming your system allows Directory Server to open additional file descriptors.
The file-descriptor-count property does not apply on Windows.
Once in deployment Directory Server use is likely to grow. Planning for growth is key for a successful deployment, in which you continue to provide a consistently high level of service. Plan for larger, more powerful systems than you need today, basing your requirements in part on the growth you expect tomorrow.
Sometimes directory services must grow rapidly, even suddenly. This is the case for example when a directory service sized for one organization is merged with that of another organization. By preparing for growth in advance and by explicitly identifying your expectations, you are better equipped to deal with rapid and sudden growth, because you know in advance whether the expected increase outstrips the capacity you planned.
Basic recommendations follow. These recommendations apply in most situations. Although the recommendations presented here are in general valid, avoid the temptation to apply the recommendations without understanding the impact on the deployment at hand. This section is intended as a checklist, not a cheat sheet.
Adjust cache sizes.
Ideally, the server has enough available physical memory to hold all caches used by Directory Server. Furthermore, an appropriate amount of extra physical memory is available to account for future growth. When plenty of physical memory is available, set the entry cache size large enough to hold all entries in the directory. Use the entry-cache-size suffix property. Set the database cache size large enough to hold all indexes with the db-cache-size property. Use the dn-cache-size or dn-cache-count properties to control the size of the DN cache.
Remove unnecessary indexes. Add additional indexes to support expected requests.
From time to time, you can add additional indexes that support requests from new applications. You can add, remove, or modify indexes while Directory Server is running. For example, use the dsconf create-index and dsconf delete-index commands.
Be careful not to remove system indexes. For a list of system indexes, see System Indexes and Default Indexes in Oracle Directory Server Enterprise Edition Reference.
Directory Server gradually indexes data after you make changes to the indexes. You can also force Directory Server to rebuild indexes with the dsconf reindex command.
Allow only indexed searches.
Unindexed searches can have a strong negative impact on server performance. Unindexed searches can also consume significant server resources.
Consider forcing the server to reject unindexed searches by setting the require-index-enabled suffix property to on.
Adjust the maximum number of values per index key with the all-ids-threshold property.
Tune the underlying operating system according to recommendations made by the idsktune command. For more information, see idsktune(1M).
Adjust operational limits.
Adjustable operational limits prevent Directory Server from devoting inordinate resources to any single operation. Consider assigning unique bind DNs to client applications requiring increased capabilities, then setting resource limits specifically for these unique bind DNs.
Distribute disk activity.
Especially for deployments that support large numbers of updates, Directory Server can be extremely disk I/O intensive. If possible, consider spreading the load across multiple disks with separate controllers.
Disable unnecessary logging.
Disk access is slower than memory access. Heavy logging can therefore have a negative impact on performance. Reduce disk load by leaving audit logging off when not required, such as on a read-only server instance. Leave error logging at a minimal level when not using the error log to troubleshoot problems. You can also reduce the impact of logging by putting log files on a dedicated disk, or on a lesser used disk, such as the disk used for the replication changelog.
When replicating large numbers of updates, consider adjusting the appropriate replication agreement properties.
The properties are transport-compression, transport-group-size, and transport-window-size.
On Solaris systems, move the database home directory to a tmpfs file system.
The database home directory, specified by the db-env-path property, indicates where Directory Server locates database cache backing files. Data files continue to reside by default under instance-path/db.
With the database cache backing files on a tmpfs file system, the system does not repeatedly flush the database cache backing files to disk. You therefore avoid a performance bottleneck for updates. In some cases, you also avoid the performance bottleneck for searches. The database cache memory is mapped to the Directory Server process space. The system essentially shares cache memory and memory used to hold the backing files in the tmpfs file system. You therefore gain performance at essentially no cost in terms of memory space needed.
The primary cost associated with this optimization is that the database cache must be rebuilt after a restart of the host machine. If you expect a restart to happen only after a software or hardware failure, however, you probably cannot avoid this cost in any case, because the database cache must be rebuilt after such a failure anyway.
Enable transaction batches if you can afford to lose updates during a software or hardware failure.
You enable transaction batches by setting the server property db-batched-transaction-count.
Each update to the transaction log is followed by a sync operation to ensure that update data is not lost. When you enable transaction batches, updates are grouped together before being written to the transaction log. Sync operations take place only when the whole batch is written to the transaction log. Transaction batches can therefore significantly increase update performance. The improvement comes with a trade-off: during a crash, you lose update data not yet written to the transaction log.
Note - With transaction batches enabled, you lose up to db-batched-transaction-count - 1 updates during a software or hardware failure. The loss happens because Directory Server waits for the batch to fill, or for 1 second, whichever is sooner, before flushing content to the transaction log and thus to disk.
Do not use this optimization if you cannot afford to lose updates.
Configure the referential integrity plug-in to delay integrity checks.
The referential integrity plug-in ensures that when entries are modified, or deleted from the directory, all references to those entries are updated. By default, the processing is performed synchronously, before the response for the delete operation is returned to the client. You can configure the plug-in to have the updates performed asynchronously. Use the ref-integrity-check-delay server property.
To measure Directory Server performance, you prepare the server, then subject it to the kind of client application traffic you expect in production. The better you reproduce the access patterns of client applications in production, the better you can size the hardware and configure Directory Server appropriately.
Directory Server Resource Kit provides the authrate(1), modrate(1), and searchrate(1) commands you can use for basic tests. These commands let you measure the rate of binds, modifications, and searches your directory service can support.
You can also simulate, measure, and graph complex, realistic client access using SLAMD. The SLAMD Distributed Load Generation Engine (SLAMD) is a Java application that is designed to stress test and analyze the performance of network-based applications. It was originally developed by Sun Microsystems, Inc. to benchmark and analyze the performance of LDAP Directory Servers. SLAMD is available as an open source application under the Sun Public License, an OSI-approved open source license. To obtain information about SLAMD, go to http://www.slamd.com/. SLAMD is also available as a java.net project. See https://slamd.dev.java.net/.
As a multithreaded process built to work on systems with multiple processors, Directory Server performance scales linearly in most cases as you devote more processors to it. When running Directory Server on a system with many processors, consider using the dsconf command to adjust the server property thread-count, which is the number of threads Directory Server starts to process server operations.
In specific directory deployments, however, adding more processors might not significantly impact performance. When handling demanding performance requirements for searching, indexing, and replication, consider load balancing and directory proxy technologies as part of the solution.
The following factors significantly affect the amount of memory needed:
Directory Server database cache, entry cache, and import cache settings
Peak replication load
Peak client application load
Server settings for file-descriptor-count and thread-count
Overhead for the operating system, other applications running on the system, and system administration activity
To estimate the memory size required to run Directory Server, estimate the memory needed for a specific Directory Server configuration on a system loaded as in production, including application load generated for example using the Directory Server Resource Kit commands or SLAMD.
Before you measure Directory Server process size, give the server some time after startup to fill entry caches as during normal or peak operation. If you have space to put everything in cache memory, you can speed this warm up period for Directory Server by reading every entry in the directory to fill entry caches. If you do not have space to put everything in cache memory, simulate client access for some time until the cache fills as it would with a pattern of normal or peak operation.
With the server in an equilibrium state, you can use utilities such as pmap on Solaris or Linux, or the Windows Task Manager, to measure memory used by the Directory Server process (ns-slapd on UNIX systems, slapd.exe on Windows systems). For more information, see the pmap(1) man page. Measure process size both during normal operation and peak operation before deciding how much memory to use.
Make sure to add to your estimates the amount of memory needed for system administration, and for the system itself. Operating system memory requirements can vary widely depending on the system configuration. Therefore, estimating the memory needed to run the underlying operating system must be done empirically. After tuning the system, monitor memory use to verify your estimate. You can use utilities such as the Solaris vmstat and sar commands, or the Task Manager on Windows to measure memory use.
At a minimum, provide enough memory so that running Directory Server does not cause constant page swapping, which negatively affects performance. Utilities such as MemTool, unsupported and available separately for Solaris systems, can be useful in monitoring how memory is used by and allocated to running applications.
If the system cannot accommodate additional memory, yet you continue to observe constant page swapping, reduce the size of the database and entry caches. Although you can throttle memory use with the heap-high-threshold-size and heap-low-threshold-size server settings, consider the heap threshold mechanism as a last resort. Performance suffers when Directory Server must delay other operations to free heap memory.
On Red Hat Linux systems, you can adjust the /proc/sys/vm/swappiness parameter to tune how aggressively the kernel swaps out memory. High swappiness means that the kernel will swap out a large amount and low swappiness means that the kernel will try not to use swap space at all. Decreasing the swappiness setting may therefore result in improved Directory performance as the kernel holds more of the server process in memory longer before swapping it out. If the system is dedicated to a single Directory Server instance, set the swappiness to zero. If the system runs several heavy processes or multiple concurrent instances of Directory Server, consider testing the Directory performance with various swappiness settings.
Disk use and I/O capabilities can have great impact on performance. The disk subsystem can become an I/O bottleneck, especially for a deployment that supports large numbers of modifications. This section recommends ways to estimate overall disk capacity for a Directory Server instance.
Note - Do not install Directory Server or any data it accesses on network disks.
Directory Server software does not support the use of network-attached storage through NFS, AFS, or SMB. All configuration, database, and index files must reside on local storage at all times, even after installation. Log files can be stored on network disks.
The following factors significantly affect the amount of local disk space needed:
Number of directory entries
Average sizes of entries
Server database page size setting when directory data is imported
To adjust the database page size, set the nsslapd-db-page-size attribute. For more information, see Directory Server Database Page Size.
Number of indexes maintained on directory data
Size of stored LDIF, backups, logs, and core files
When you have set up indexes, adjusted the database page size, and imported directory data, you can estimate the disk capacity required for the instance by reading the size of the instance-path/ contents, and adding the size of expected LDIF, backups, logs, and core files. Also estimate how much the sizes you measure are expected to grow, particularly during peak operation. Make sure you leave a couple of gigabytes of extra space for the errors log in case you need to increase the log level and size for debugging purposes.
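The capacity estimate described above amounts to adding up the measured sizes, applying an allowance for growth, and reserving extra space for the errors log. The sketch below shows that arithmetic; the function name, the growth factor, and every sample figure are assumptions for illustration, and the 2 gigabyte headroom stands in for "a couple of gigabytes".

```python
GB = 10**9


def estimate_disk_bytes(instance_bytes, ldif_bytes, backup_bytes,
                        log_bytes, core_bytes, growth_factor=1.0,
                        errors_log_headroom_bytes=2 * GB):
    """Sum measured sizes, scale for expected growth, add errors log headroom."""
    measured = (instance_bytes + ldif_bytes + backup_bytes
                + log_bytes + core_bytes)
    return round(measured * growth_factor) + errors_log_headroom_bytes


# Hypothetical figures: 40 GB instance, 20 GB LDIF, 40 GB backup,
# 1.2 GB of logs, 8 GB reserved for core files, 50% growth expected.
total = estimate_disk_bytes(40 * GB, 20 * GB, 40 * GB,
                            1_200_000_000, 8 * GB, growth_factor=1.5)
print(f"Estimated local disk capacity: {total / GB:.1f} GB")
```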
Getting an estimation of the disk required for directory data can be done in some cases by extrapolation. If it is not practical to load Directory Server with as much data as you expect in production, extrapolate from smaller sets of sample data as suggested in Making Sample Directory Data. When the amount of directory data you use is smaller than in production, you must extrapolate for other measurements, too.
The following factors determine how fast the local disk must be:
Level of updates sustained, including the volume of replication traffic
Whether directory data are mainly in cache or on disk
Log levels used for access and error logging, and whether the audit log is enabled
Whether directory data, logs, and the transaction log (for updates) can be placed on separate disk subsystems
Whether backups are performed with Directory Server online or offline
Disks used should not be saturated under normal operating circumstances. You can use tools such as the Solaris iostat command to isolate potential I/O bottlenecks.
To increase disk throughput distribute files across disk subsystems. Consider providing dedicated disk subsystems for transaction logs (dsconf set-server-prop db-log-path:/transaction/log/path), databases (dsconf create-suffix --db-path /suffix/database/path suffix-name), and log files (dsconf set-log-prop path:/log/file/path). In addition consider putting database cache files on a memory-based file system such as a Solaris tmpfs file system, where files are swapped to disk only if available memory is exhausted (for example, dsconf set-server-prop db-env-path:/tmp). If you put database cache files on a memory-based file system, make sure the system does not run out of space to keep that entire file system in memory.
To further increase throughput use multiple disks in RAID configuration. Large, non volatile I/O buffers and high-performance disk subsystems such as those offered in Sun StorEdge products can greatly enhance Directory Server performance and uptime. On Solaris 10 systems, using ZFS can also improve performance.
Directory Server is a network-intensive application. You can estimate theoretical maximum throughput using the following formula. Notice that this formula does not account for replication traffic.
max. throughput = max. entries returned/second x average entry size
Imagine that a Directory Server must respond to a peak of 5000 searches per second and that the server returns one entry per search. The entries have an average size of 2000 bytes. The theoretical maximum throughput would be 10 megabytes per second, or 80 megabits per second, not counting replication. 80 megabits per second is likely to be more than a single 100-megabit Ethernet adapter can provide in practice. To improve network availability for a Directory Server instance, equip the system with a faster connection, or with multiple network interfaces. Directory Server can listen on multiple network interfaces within the same process.
Note - The preceding example assumes that the client application requests all attributes when reading or searching the directory. Generally, you should design client applications so that they request only the required attributes.
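The throughput formula and the worked example above can be reproduced with a few lines of arithmetic. The function names below are hypothetical, and the calculation ignores replication traffic and protocol overhead, as the formula does.

```python
def max_throughput_bits_per_second(entries_per_second, avg_entry_bytes):
    """Theoretical maximum throughput: entries returned/second x entry size."""
    return entries_per_second * avg_entry_bytes * 8


def link_utilization(throughput_bps, link_capacity_bps):
    """Fraction of the nominal link capacity the throughput would consume."""
    return throughput_bps / link_capacity_bps


# 5000 searches per second, one 2000-byte entry returned per search.
peak = max_throughput_bits_per_second(5000, 2000)
print(f"Peak throughput: {peak / 1e6:.0f} megabits per second")
print(f"Share of a 100-megabit link: {link_utilization(peak, 100e6):.0%}")
```

A result close to the nominal capacity of the link suggests planning for a faster connection or additional network interfaces.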
If you plan multi-master replication over a wide area network, test your configuration to make sure the connection provides sufficient throughput with minimum latency and near-zero packet loss. High latency and packet loss both slow replication. In addition, avoid a topology where replication traffic goes through a load balancer.
The default configuration of Directory Server can allow client applications to use more Directory Server resources than are required.
The following uses of resources can hurt directory performance:
Opening many connections then leaving them idle or unused
Launching costly and unnecessary unindexed searches
Storing enormous, unplanned binary attribute values
In some deployment situations, you should not modify the default configuration. For deployments where you cannot tune Directory Server, use Directory Proxy Server to limit resources, and to protect against denial of service attacks.
In some deployment situations, one instance of Directory Server must support client applications, such as messaging servers, and directory clients such as user mail applications. In such situations, consider using bind DN based resource limits to raise individual limits for directory intensive applications. The limits for an individual account can be adjusted by setting the attributes nsSizeLimit, nsTimeLimit, nsLookThroughLimit, and nsIdleTimeout on the individual entry. For information about how to control resource limits for individual accounts, see Setting Resource Limits For Each Client Account in Oracle Directory Server Enterprise Edition Administration Guide.
Table 6-1 describes the parameters that set the global values for resource limits. The limits in Table 6-1 do not apply to the Directory Manager user. Therefore, ensure that client applications do not connect as the Directory Manager user.
Table 6-1 Tuning Recommendations For Resources Devoted to Client Applications
Table 6-2 describes the parameters that can be used to tune how a Directory Server instance uses system and network resources.
Table 6-2 Tuning Recommendations For System Resources