Sun Java System Directory Server Enterprise Edition 6.2 Reference

Chapter 5 Directory Server Data Caching

For fast response time to client requests, Directory Server caches directory information in memory. If you must have top Directory Server performance, but cannot fit all directory data in available memory, you can tune cache settings to optimize performance.

This chapter describes the caches that Directory Server uses and provides recommendations for tuning cache settings.

Caches and How Directory Server Uses Them

This section describes the types of cache whose settings you can tune. It also describes how Directory Server uses those types of cache.

Types of Cache

This section describes the types of cache used by Directory Server.

Figure 5–1 shows the caches for an instance of Directory Server with three suffixes, each with its own entry cache.

Directory Server also uses a file system cache. The file system cache is managed by the underlying operating system, and by I/O buffers in disk subsystems.

Figure 5–1 Entry and Database Caches in Context

Figure shows caches for an instance of Directory Server with three suffixes, each with its own entry cache.

Database Cache

Each instance of Directory Server has one database cache. The database cache holds pages from the database that contain indexes and entries. Each page is not an entry, but a slice of memory that contains a portion of the database.

Directory Server moves pages between the database files and the database cache to maintain the maximum database cache size you specify. The amount of memory used by Directory Server for the database cache can be larger than the specified size. This is because Directory Server requires additional memory to manage the database cache.

For very large database caches, it is important that the memory used by Directory Server does not exceed the size of available physical memory. If the available physical memory is exceeded, the system pages repeatedly and performance is degraded.

The memory can be monitored by empirical testing and by the use of tools such as pmap(1) on Solaris systems. The ps(1) utility can also be used with the -p pid and -o format options to view current memory used by a particular process such as Directory Server ns-slapd. For more information, refer to the operating system documentation.
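As a rough illustration of the ps(1) approach, the following Python sketch asks ps for the resident and virtual sizes of a single process. The helper names parse_ps_sizes and process_sizes are hypothetical. The sketch assumes that the local ps implementation supports the rss and vsz format specifiers and reports them in kilobytes, so check the operating system documentation before relying on it.

```python
import subprocess


def parse_ps_sizes(output):
    """Parse 'rss vsz' from ps output that was requested without headers."""
    fields = output.split()
    if len(fields) < 2:
        raise ValueError("unexpected ps output: %r" % output)
    # Assumption: ps reports both values in kilobytes.
    return {"rss_kb": int(fields[0]), "vsz_kb": int(fields[1])}


def process_sizes(pid):
    """Run ps for one process ID, for example that of ns-slapd."""
    # An empty header ('rss=') suppresses the header line for that column.
    result = subprocess.run(
        ["ps", "-p", str(pid), "-o", "rss=", "-o", "vsz="],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps_sizes(result.stdout)
```

Calling process_sizes with the process ID of ns-slapd at intervals during a load test gives a simple view of how the process size grows as the caches fill.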

For 32-bit servers, the database cache size must be limited so that the total Directory Server ns-slapd process size is less than the maximum process size allowed by the operating system. Usually, this limit is in the 2-3 GB range.

Entry Cache

The entry cache holds recently accessed entries that are formatted for delivery to client applications. The entry cache is allocated as required until it reaches a size that is based on, but larger than, the maximum entry cache size you specify.

As entries stored in the entry cache are already formatted, Directory Server returns entries from an entry cache efficiently. Entries in the database must be formatted and stored in the entry cache before they are delivered to client applications.

The maximum size you specify indicates how much memory Directory Server requests from the underlying memory allocation library. Depending on how the memory allocation library handles requests for memory, the actual memory used may be much larger than the amount of memory available to Directory Server for the entry cache.

The memory used by the Directory Server process depends on the memory allocation library that is used, and depends on the entries cached. Entries with many small attribute values usually require more overhead than entries with few large attribute values.

For 32-bit servers, the entry cache size must be limited so that the total Directory Server ns-slapd process size is less than the maximum process size allowed by the operating system. In practice, this limit is generally in the 2-3 GB range.

Import Cache

The import cache is created and used when a suffix is initialized. If the deployment involves offline suffix initialization only, import cache and database cache are not used together. In this case, the import cache and database cache do not need to be added together when the cache size is aggregated. See Total Aggregate Cache Size. When the import cache size is changed, the change takes effect the next time the suffix is reset and initialized. The import cache is allocated for the initialization, then released after the initialization.

Directory Server handles import cache in the same way as it handles database cache. Sufficient physical memory must be available to prevent swapping. The benefits of having a larger import cache diminish for cache sizes larger than 2 GB.

File System Cache

The operating system allocates available memory not used by Directory Server caches and other applications to the file system cache. The file system cache holds data that was recently read from the disk, making it possible for subsequent requests to obtain data from cache rather than having to read it again from the disk. Because memory access is many times faster than disk access, leaving some physical memory available for the file system cache can boost performance.

For 32-bit servers, a file system cache can be used as a replacement for some of the database cache. Database cache is more efficient for Directory Server use than file system cache, but file system cache is not directly associated with the Directory Server ns-slapd process. Potentially, a larger total cache can be made available to Directory Server than would be available by using database cache alone.

64-bit servers do not have the same process size limit issue as 32-bit servers. Use database cache instead of file system cache with 64-bit servers.

Refer to the operating system documentation for information about file system cache.

Total Aggregate Cache Size

The sum of all caches used simultaneously must remain smaller than the total size of available physical memory, minus the memory intended for file system cache, minus the memory intended for other processes such as Directory Server itself.

For 32-bit servers, the total aggregate cache size must be limited so that the total Directory Server ns-slapd process size is less than the maximum process size allowed by the operating system. In practice, this limit is generally in the 2-3 GB range.

If suffixes are initialized while Directory Server is online, the sum of the database cache, the entry cache, and the import cache sizes should remain smaller than the total size of available physical memory.

Table 5–1 Import Operations and Cache Use


Cache Type	Offline Import	Online Import
Database	no	yes
Entry	yes	yes
Import	yes	yes

If all suffixes are initialized while Directory Server is offline, the import cache does not coexist with the database cache, so the same memory can be allocated to the import cache for offline suffix initialization and to the database cache for online use. If you opt to implement this special case, however, ensure that no online bulk loads are performed on a production server. The sum of the caches used simultaneously must remain smaller than the total size of available physical memory.
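The sizing rule can be checked with simple arithmetic. The following Python sketch is one way to express it, assuming the cache usage shown in Table 5-1. The function names and all of the sizes in the example are hypothetical, and the entry value stands for the total of the entry caches of all suffixes.

```python
GB = 1024 ** 3

# Caches in use for each kind of import, following Table 5-1.
CACHES_IN_USE = {
    "offline": ("entry", "import"),
    "online": ("database", "entry", "import"),
}


def aggregate_cache_size(sizes, mode):
    """Sum the sizes, in bytes, of the caches that are used simultaneously."""
    return sum(sizes[name] for name in CACHES_IN_USE[mode])


def fits_in_memory(sizes, mode, physical, fs_cache_reserve, other_processes):
    """Return True if the aggregate stays below the memory left for caches."""
    budget = physical - fs_cache_reserve - other_processes
    return aggregate_cache_size(sizes, mode) < budget


# Hypothetical sizes for an 8 GB machine.
sizes = {"database": 2 * GB, "entry": 1 * GB, "import": 2 * GB}
offline_ok = fits_in_memory(sizes, "offline", 8 * GB, 2 * GB, 1 * GB)
online_ok = fits_in_memory(sizes, "online", 8 * GB, 2 * GB, 1 * GB)
print(offline_ok, online_ok)
```

With these example values, the offline aggregate is 3 GB and the online aggregate is 5 GB against a budget of 5 GB, so only the offline case fits.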

How Directory Server Performs Searches by Using Cache

In Figure 5–2, individual lines represent threads that access different levels of memory. Broken lines represent probable bottlenecks to minimize through effective tuning of Directory Server.

Figure 5–2 How Directory Server Performs Searches

Figure illustrates how Directory Server performs searches that specify a base DN and searches that use filters.

The following sections describe how Directory Server performs searches by using the cache. By processing subtree searches as described in the following sections, Directory Server returns results without loading the whole set of results into memory.

How Directory Server Performs Base Searches

Base searches specify a base DN and are the simplest type of searches for Directory Server to manage. Directory Server processes base searches in the following stages.

1. Directory Server attempts to retrieve the entry from the entry cache.

   If the entry is found in the entry cache, Directory Server checks whether the candidate entry matches the filter provided for the search.

   If the entry matches the filter provided for the search, Directory Server returns the formatted, cached entry to the client application.

2. Directory Server attempts to retrieve the entry from the database cache.

   If the entry is found in the database cache, Directory Server copies the entry to the entry cache for the suffix. Directory Server proceeds as if the entry had been found in the entry cache.

3. Directory Server attempts to retrieve the entry from the database itself.

   If the entry is found in the database, Directory Server copies the entry to the database cache. Directory Server proceeds as if the entry had been found in the database cache.
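The stages above amount to a lookup that falls through three levels. The following Python sketch models that flow with plain dictionaries. It is a simplification and not the server's implementation: the real database cache holds pages rather than entries, the real entry cache holds formatted entries, and the name base_search and its parameters are hypothetical.

```python
def base_search(dn, matches_filter, entry_cache, db_cache, database):
    """Return the entry for dn if it exists and matches the filter, else None."""
    if dn not in entry_cache:
        if dn not in db_cache:
            if dn not in database:
                return None  # no such entry anywhere
            # Database files -> database cache (this is the disk read).
            db_cache[dn] = database[dn]
        # Database cache -> entry cache for the suffix.
        entry_cache[dn] = db_cache[dn]
    entry = entry_cache[dn]
    return entry if matches_filter(entry) else None
```

A second search for the same DN is answered from the entry cache alone, which is why a warm entry cache matters for search rate.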

How Directory Server Performs Subtree and One-Level Searches

Searches on a subtree or a level of a tree involve additional processing to handle multiple entries. Directory Server processes subtree searches and one-level searches in the following stages.

1. Directory Server attempts to define a set of candidate entries that match the filter from indexes in the database cache.

   If no appropriate index is present, the set of candidate entries must be found directly in the database itself.

2. For each candidate entry, Directory Server performs the following tasks.
   1. Performs a base search to retrieve the entry.
   2. Checks whether the entry matches the filter provided for the search.
   3. Returns the entry to the client application if the entry matches the filter.
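Handling candidates one at a time is what lets the server return results without holding the whole result set in memory. The following Python sketch shows the idea with a generator. The names subtree_search and fetch_entry are hypothetical, and fetch_entry stands in for the base search performed for each candidate.

```python
def subtree_search(candidate_dns, fetch_entry, matches_filter):
    """Yield matching entries one at a time instead of building a full list."""
    for dn in candidate_dns:
        entry = fetch_entry(dn)  # stands in for the per-candidate base search
        if entry is not None and matches_filter(entry):
            yield entry  # handed to the client before the next candidate is read
```

Because the function is a generator, each matching entry is produced as soon as it is found, and candidates that do not match are never accumulated.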

How Directory Server Performs Updates by Using the Cache

In Figure 5–3, individual lines represent threads that access different levels of memory. Broken lines represent probable bottlenecks to minimize through effective tuning of Directory Server.

Figure 5–3 How Directory Server Performs Updates

Figure illustrates how Directory Server manages updates.

The figure does not show the impact of the internal base search performed to get the entry for update.

Directory Server processes updates in the following stages.

1. Directory Server performs a base DN search to retrieve the entry to update or, in the case of an add operation, to verify that the entry does not already exist.

2. Directory Server updates the database cache and any indexes affected.

   If data affected by the change have not been loaded into the database cache, this step can result in disk activity while the relevant data are loaded into the cache.

3. Directory Server writes information about the changes to the transaction log and waits for the information to be flushed to disk, which happens periodically, at each checkpoint. Directory Server database files are thus updated during the checkpoint operation, not for each write.

4. Directory Server formats and copies the updated entry to the entry cache for the suffix.

5. Directory Server returns an acknowledgement of successful update to the client application.
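The stages above can be modeled in a few lines. The following Python sketch is a toy model under stated assumptions, not the server's implementation: dictionaries stand in for the database files and caches, a list stands in for the transaction log, indexes are omitted, and the class name UpdateSketch and its methods are hypothetical.

```python
class UpdateSketch:
    """Toy model of a modify operation and a later checkpoint."""

    def __init__(self, database):
        self.database = database  # stands in for the database files on disk
        self.db_cache = {}
        self.entry_cache = {}
        self.transaction_log = []

    def modify(self, dn, changes):
        # Stage 1: retrieve the entry, loading it into the database cache.
        if dn not in self.db_cache:
            if dn not in self.database:
                raise KeyError(dn)
            self.db_cache[dn] = dict(self.database[dn])
        # Stage 2: update the database cache.
        self.db_cache[dn].update(changes)
        # Stage 3: record the change in the transaction log.
        self.transaction_log.append((dn, dict(changes)))
        # Stage 4: copy the updated entry to the entry cache.
        self.entry_cache[dn] = dict(self.db_cache[dn])
        # Stage 5: acknowledge the update.
        return "success"

    def checkpoint(self):
        # Database files are updated here, not on each write.
        for dn, entry in self.db_cache.items():
            self.database[dn] = dict(entry)
```

In this model, a modified entry is visible in both caches immediately, while the dictionary that stands in for the database files changes only when checkpoint is called.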

How Directory Server Initializes a Suffix by Using the Cache

The following figure illustrates how Directory Server initializes a suffix by using the cache. Individual lines represent threads that access different levels of memory. Broken lines represent probable bottlenecks to minimize through effective tuning of Directory Server.

Figure 5–4 How Directory Server Initializes a Suffix

Directory Server initializes a suffix in the following stages:

1. Starts a thread to feed an entry cache, used as a buffer, from LDIF.
2. Starts a thread for each index affected and a thread to create entries in the import cache. These threads consume entries fed into the entry cache.
3. Reads from and writes to the database files when import cache runs out.

Directory Server can also write log messages during suffix initialization, but does not write to the transaction log.

Tools for suffix initialization delivered with Directory Server provide feedback on the cache hit rate and import throughput. If cache hit rate and import throughput drop together, it is possible that the import cache is too small.

Tuning Cache Settings

This section provides recommendations for setting database and entry cache sizes. It does not cover import cache sizes. The recommendations here pertain to maximizing either search rate or modify rate, not both at once.


Basic Tuning Recommendations

Here you find the basic recommendations for maximizing search rates or maximizing modification rates achieved by Directory Server. Set cache sizes according to the following recommendations:

For Maximum Search Rate (Searches Only)

If the directory data do not fit into available physical memory, or only just fit with no extra room to spare, set cache sizes to their minimum values (500k for db-cache-size and 200k for entry-cache-size), and allow the server to use as much of the operating system's file system cache as possible.

If the directory data fit into available physical memory with physical memory to spare, allocate memory to the entry cache until either the entry cache is full or, on a 32-bit system, the entry cache reaches maximum size. Then allocate memory to the database cache until it is full or reaches maximum size.

See Configuring Memory in Sun Java System Directory Server Enterprise Edition 6.2 Administration Guide for instructions on setting cache sizes.
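The allocation order for the case with memory to spare can be written as simple arithmetic. The following Python sketch assumes that you already know roughly how much memory each cache needs to hold the data. The function name, its parameters, and the example figures in the usage note are hypothetical, and the optional maximum arguments stand for the limits that apply on 32-bit systems.

```python
def allocate_spare_memory(spare, entry_needed, db_needed,
                          entry_max=None, db_max=None):
    """Split spare memory, in bytes: entry cache first, then database cache."""
    # The entry cache grows until it is full or reaches its maximum size.
    entry_limit = entry_needed if entry_max is None else min(entry_needed, entry_max)
    entry = min(spare, entry_limit)
    # The database cache receives what is left, up to its own limit.
    db_limit = db_needed if db_max is None else min(db_needed, db_max)
    db = min(spare - entry, db_limit)
    return {"entry-cache-size": entry, "db-cache-size": db}
```

For example, with 6 GB to spare, an entry cache that needs 4 GB, and a database cache that needs 3 GB, the sketch assigns 4 GB to the entry cache and the remaining 2 GB to the database cache.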

For Maximum Modification Rate (Modifications Only)

If the directory data do not fit into available physical memory, or only just fit with no extra room to spare, set the entry cache size to its minimum value (200k for entry-cache-size), set the database cache size to a value in the 100M to 1G range, and allow the server to use as much of the operating system's file system cache as possible. Keeping some database cache available ensures that modifications remain cached between each database checkpoint.

See Configuring Memory in Sun Java System Directory Server Enterprise Edition 6.2 Administration Guide for instructions on setting cache sizes.

Small, Medium, and Large Data Sets

A working set refers to the data actually pulled into memory so that the server can respond to client applications. In this section, the data set is the set of entries in the directory that are in use because of client traffic. The data set may include every entry in the directory, or may be composed of some smaller number of entries, such as entries corresponding to people in a time zone where users are active.

We define three data set sizes, based on how much of the directory data set fits into available physical memory:

Small: The data set fits entirely into physical memory with fully-loaded database and entry caches.
Medium: The data set fits in physical memory, and extra physical memory can be dedicated to entry cache.
Large: The data set is too large to fit completely in available physical memory.

The ideal case is of course the small data set. If your data set is small, set database cache size and entry cache size such that all entries fit in both the database cache and the entry cache.
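The three sizes can be expressed as a rough comparison. The following Python sketch is one possible reading of the definitions, not a rule taken from the product. It assumes that data_size is the memory needed to hold the data set once, in database cache or file system cache, and that entry_cache_size is the memory needed to hold every entry in the entry cache as well. The function name and both parameters are hypothetical.

```python
def classify_data_set(data_size, entry_cache_size, available_memory):
    """Classify a data set as small, medium, or large (all sizes in bytes)."""
    # Small: everything fits with both caches fully loaded.
    if data_size + entry_cache_size <= available_memory:
        return "small"
    # Medium: the data fits once, with some memory left over for entry cache.
    if data_size < available_memory:
        return "medium"
    # Large: the data does not fit completely in available memory.
    return "large"
```

The memory needed by the entry cache depends on the entries and on the memory allocation library, so treat any estimate passed to this function as approximate.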

The following sections provide recommendations for medium and large data sets where the server performs either all searches or all modify operations.

Optimum Search Performance (Searches Only)

Figure 5–5 shows search performance on a hypothetical system. As expected, Directory Server offers top search performance for a given system configuration when the whole data set fits into memory.

Figure 5–5 Search Performance

Performance improves as more of the data set fits into memory.

For large data sets, better performance has been observed when database cache and entry cache are set to their minimum sizes and available memory is left to the operating system for use in allocating file system cache. As shown, performance improves when more of the data set fits into the file system cache.

For medium data sets, better performance has been observed when the file system cache holds the whole data set and the extra physical memory available is devoted to entry cache. As shown, performance improves when more of the medium data set fits in entry cache.

Optimum Modify Performance (Modifications Only)

Figure 5–6 shows modify performance on a hypothetical system. As expected, Directory Server offers top modify performance for a given system configuration when the whole data set fits into memory.

Figure 5–6 Modify Performance

For medium data sets, modify performance reaches its maximum when all entries fit into file system cache. As suggested in Basic Tuning Recommendations, keeping some database cache available ensures that modifications remain cached between each database checkpoint.