30 Tuning Performance

Oracle Unified Directory aims to be high-performing and highly-scalable. Although the server can achieve impressive results with the "out-of-the-box" server configuration and default JVM settings, performance can often be improved significantly through some basic tuning.

The default settings of Oracle Unified Directory are targeted at evaluators and developers who are running equipment with limited resources. When you deploy Oracle Unified Directory in a production environment, it is useful to do some initial tuning of the Java Virtual Machine (JVM) and of the server configuration to improve scalability and performance (particularly for write operations).

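This chapter covers the following topics:

  • Section 30.1, "Assessing Performance Problems"

  • Section 30.2, "General Performance Tuning"

  • Section 30.3, "Tuning Java Virtual Machine Settings"

  • Section 30.4, "Determining the Database Cache Size"

  • Section 30.5, "Tuning the Server Configuration"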

30.1 Assessing Performance Problems

You can obtain a quick idea of whether performance issues are related to problems with the server or with the client by examining the access log at INSTANCE_DIR/OUD/logs/access. This log contains entries of the form:

[09/Sep/2009:15:36:18 +0200] SEARCH RES conn=1 op=16 msgID=17 
  result=0 nentries=1 etime=1

The value of the etime field is the time (in milliseconds) that the server spent processing the request. Large etimes generally indicate an issue on the server side, which can usually be resolved by appropriate performance tuning or indexing. If you are experiencing performance problems but the etimes are small, the issue is more likely to be with your client application.
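
As a rough way to spot large etimes, the following shell sketch prints the access log records whose etime exceeds a given threshold. It assumes that each record occupies a single line and carries an etime=<n> token, as in the sample entry above. The function name slow_ops and the 100-millisecond threshold are illustrative and are not part of the product.

```shell
# slow_ops THRESHOLD_MS
# Reads access log records on standard input and prints those whose
# etime value (in milliseconds) is greater than THRESHOLD_MS.
# Assumption: one record per line, with a token of the form etime=<n>.
slow_ops() {
  threshold="$1"
  awk -v t="$threshold" '
    {
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^etime=[0-9]+$/) {
          split($i, kv, "=")
          if (kv[2] + 0 > t + 0) { print; next }
        }
      }
    }'
}

# Possible usage (path as given in this section):
#   slow_ops 100 < INSTANCE_DIR/OUD/logs/access
```

If this prints few or no records while clients still report slow responses, the client application is the more likely place to look.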

Comprehensive monitoring information is available under the cn=monitor entry. For more information, see Chapter 29, "Monitoring Oracle Unified Directory." Oracle Unified Directory performance can also be monitored by using the Enterprise Manager Grid Control plugin. For more information, see the System Monitoring Plug-in for Oracle Unified Directory User's Guide.

30.2 General Performance Tuning

Note that performance tuning strategies differ depending on whether you are running a directory server or a proxy server.

The following items can improve performance in specific deployment scenarios.

  • Java Version. Use the most recent Java Runtime Environment (JRE) release available. Oracle Unified Directory is designed to work with Java SE 6 and 7.

  • Environment Variables. The server uses the OPENDS_JAVA_HOME environment variable to point to your installed JRE. If you have multiple versions of Java installed on a system, set the OPENDS_JAVA_HOME environment variable to point to the root of the desired installation. In this way, the version of the JRE specified by the JAVA_HOME variable can be used by other applications but not by Oracle Unified Directory.

    To specify a JRE installation for the server, do one of the following:

    • Use the dsjavaproperties command to set the appropriate environment variables.

      For more information, see dsjavaproperties.

    • Set the OPENDS_JAVA_BIN environment variable to the path of the Java binary.

    • Set the OPENDS_JAVA_HOME environment variable to the path of the Java installation.
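
For instance, on a UNIX system the last option might look like the following sketch. The installation path is hypothetical; substitute the root of the JRE that the server should use.

```shell
# Hypothetical JRE location; replace it with the real installation root.
OPENDS_JAVA_HOME=/usr/java/jdk1.7.0
export OPENDS_JAVA_HOME
```

Because only OPENDS_JAVA_HOME is set here, other applications continue to use the JRE that JAVA_HOME points to.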

30.3 Tuning Java Virtual Machine Settings

You can use the JAVA_ARGS environment variable to provide global configuration arguments that are passed to the JVM, or you can use the java.properties file. Any argument that can be used with the java command can be used with either method.
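
A minimal sketch of the environment variable method follows. The option values are illustrative only and are taken from the options table later in this section; choose values that match your own heap and garbage collection requirements.

```shell
# Illustrative JVM arguments; the sizes shown are examples, not recommendations.
JAVA_ARGS="-server -Xms2g -Xmx2g -XX:NewSize=512M -XX:+UseConcMarkSweepGC"
export JAVA_ARGS
```

The variable must be set in the environment from which the server is started in order to take effect.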

It is recommended that you tune the JVM for optimal performance and to ensure that Oracle Unified Directory applications are robust and responsive. You can tune the JVM by tuning the heap size. The heap is divided into the following:

  • Young generation: Holds short-lived data such as operation PDUs and local variables.

  • Old generation: Includes Oracle Unified Directory caches like the JE database cache and the entry cache.

  • Permanent generation: Includes constants and classes.

When Oracle Unified Directory is in Directory Server mode, you can choose one of the following database caching options:

  • Cache the entire database in the database cache. This gives optimal performance but leads to a long cache warmup and a larger heap size. It is recommended for small and medium deployments.

  • Cache only the internal nodes of the database Btree (upper and inner nodes) in the database cache and keep the remaining RAM for the file system cache. This gives good performance, a short cache warmup, and a smaller heap size, and is recommended for very large deployments (more than 50 million entries).

For more information, see Section 30.4, "Determining the Database Cache Size".

Note:

In proxy mode, use a large old generation for distribution with a global index.

For more information, see dsjavaproperties.

For additional information about tuning the JVM, see the Java Performance Documentation (http://java.sun.com/docs/performance/). The Java Tuning White Paper (http://java.sun.com/performance/reference/whitepapers/tuning.html) and the Garbage Collection Tuning (http://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html) documents are particularly useful.

The following table describes the main JVM tunable options.

Parameter Description

-server

Always use the server JVM instead of the client JVM. The client VM is better optimized for processes that run for a short period of time and need to start as quickly as possible. The server VM can take longer to warm up but is faster in the long run.

-d32 or -d64

Select the 32-bit or 64-bit version of the JVM as follows:

  • -d32 provides better performance for JVM heaps smaller than 3.5 Gbytes.

  • -d64 with -XX:+UseCompressedOops should be used for JVM heaps between 3.5 Gbytes and 32 Gbytes.

  • -d64 should be used for JVM heaps over 32 Gbytes.

-XX:+UseCompressedOops

Use this option if you use the 64-bit JVM and if the heap size is less than 32 Gbytes.

-Xms2g and -Xmx2g

These parameters set the initial and maximum heap size available to the JVM. Increasing the heap size can improve performance, but setting it too high can have a detrimental effect in the form of longer pauses for full garbage collection runs. The initial and maximum sizes should generally be set to the same value.

For maximum performance, size the heap so that the entire DB can be cached in memory. In general, you should allocate enough heap for the server runtime and the rest to the DB cache.

For example, suppose that you want to modify the heap size of an Oracle Unified Directory instance with only one JE backend, named userRoot. You must decide how much space is needed for the new generation, the old generation, and the permanent generation. To size the different generations, consider the following:

  • The size of the database, which affects the old generation.

  • Whether an entry cache is used, which affects the old generation.

  • The type of garbage collector used, which affects the old generation.

  • The type of usage, which affects the new generation.

If you use CMS as the garbage collector for the old generation, you must take the -XX:CMSInitiatingOccupancyFraction property into account when you calculate the heap size, so that it is coherent with the size (or percentage of the heap) occupied by the DB cache.

For example, if you set CMSInitiatingOccupancyFraction to 55, set db-cache-percent to 50. If the database on disk is then 10 Gbytes, you need a heap of at least 22 Gbytes for the entire database to fit into the DB cache.

-XX:NewSize=512M

The total heap space is divided into the old generation and the young generation. This parameter sets the size of the young generation. The remaining memory (old generation) must be sufficient to hold the DB cache plus some overhead.

-XX:+UseConcMarkSweepGC

Use the Concurrent Mark Sweep (CMS) garbage collector. This option allows the JVM to minimize the response time of LDAP operations, but it can have a small impact on the overall performance (throughput) of the server. Use this option if long pause times cannot be tolerated.

-XX:CMSInitiatingOccupancyFraction=<percentage>

Specify the level at which CMS garbage collection is started. The default value is approximately 68%. Use this option if you want to set the percentage to something other than the default value.

-XX:+UseBiasedLocking

Improve locking performance in the server in cases where there is not expected to be a high degree of contention.

-XX:LargePageSizeInBytes=256m

Specify that the JVM should use large pages for the information that it stores in memory. This argument applies primarily to systems using the UltraSPARC T1 processor.

-XX:+UseParallelGC

Specify that the system should use parallel garbage collection, which is particularly useful on systems with a large number of CPUs.

-XX:+UseParallelOldGC

Specify that the JVM should use parallel garbage collection for the old (tenured) generation.

-XX:ParallelGCThreads=8

Specify that the JVM should use 8 threads when performing parallel garbage collection. The default is to use a number of threads equal to the number of CPUs, but this can be inappropriate on systems with a very large number of CPUs or on CMT-based systems like those using the UltraSPARC T1 processor.
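
The heap sizing example given for -Xms and -Xmx can be approximated with simple arithmetic. The following sketch divides the on-disk database size by the db-cache-percent value to obtain a lower bound for the heap. With a 10-Gbyte database and a value of 50, this lower bound is 20 Gbytes; the 22 Gbytes quoted in the table is higher, and the sketch models that difference as an optional headroom percentage. The headroom parameter and the function name min_heap_gb are assumptions made for illustration, not a documented formula.

```shell
# min_heap_gb DB_SIZE_GB DB_CACHE_PERCENT [HEADROOM_PERCENT]
# Prints the heap size (in Gbytes) needed for the whole database to fit
# into the DB cache, optionally increased by a headroom percentage.
# Assumption: the on-disk size approximates the size needed in the cache.
min_heap_gb() {
  db_gb="$1"
  cache_pct="$2"
  headroom_pct="${3:-0}"
  awk -v d="$db_gb" -v p="$cache_pct" -v h="$headroom_pct" \
    'BEGIN { printf "%.1f\n", d * 100 / p * (1 + h / 100) }'
}

min_heap_gb 10 50       # lower bound without headroom
min_heap_gb 10 50 10    # with an assumed 10 percent headroom
```

Treat the result as a starting point and confirm it by monitoring the DB cache after the server is loaded.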


30.4 Determining the Database Cache Size

If an Oracle Unified Directory instance is already configured and initialized, you can determine the database cache size requirements by measuring the size of the <OUD_INSTANCE_DIR>/OUD/db/userRoot directory (assuming that the instance has only one database, named userRoot).

If an Oracle Unified Directory instance is not configured or initialized, you can determine the memory required to store the internal nodes for one index file, or for the file containing user data, by running the DbCacheSize utility (in the com.sleepycat.je.util package).

For more information on using the DbCacheSize utility, see this Javadoc page: http://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/util/DbCacheSize.html.

For example, the output for 10 million entries of 4 Kbytes, with an index and an average key size of 10 bytes, is as follows:

[oud@oel5 bin]$ java -jar -XX:+UseCompressedOops /space/Middleware/Oracle_OUD1/lib/je.jar DbCacheSize -records 10000000 -key 10 -data 4000
 
=== Database Cache Size ===
 Minimum Bytes    Maximum Bytes   Description
---------------  ---------------  -----------
    259,725,752      317,907,896  Internal nodes only
 40,721,011,192   40,779,193,336  Internal nodes and leaf nodes
=== Internal Node Usage by Btree Level ===
 Minimum Bytes    Maximum Bytes      Nodes    Level
---------------  ---------------  ----------  -----
    256,180,800      313,709,120     112,360    1
      3,503,312        4,149,456       1,262    2
         38,864           46,032          14    3
          2,776            3,288           1    4

A deployment of 10 million 4-Kbyte entries requires 37 Gbytes to store the full user data in the database cache (the 4-Kbyte entries and the internal nodes of the database Btree). If you want to store only the internal nodes in the database cache, 303 Mbytes are required per index (3 Gbytes for 10 indexes).
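
The figures quoted above can be derived from the byte counts in the DbCacheSize output. The following sketch converts those counts to binary Mbytes and Gbytes; the helper names to_mbytes and to_gbytes are illustrative. The results carry one decimal place, so they differ slightly from the whole-number figures in the text.

```shell
# Convert a byte count to Mbytes or Gbytes (binary units, one decimal place).
to_mbytes() { awk -v b="$1" 'BEGIN { printf "%.1f\n", b / (1024 * 1024) }'; }
to_gbytes() { awk -v b="$1" 'BEGIN { printf "%.1f\n", b / (1024 * 1024 * 1024) }'; }

to_gbytes 40721011192             # internal nodes and leaf nodes (minimum)
to_mbytes 317907896               # internal nodes only (maximum), one index
to_gbytes $(( 317907896 * 10 ))   # internal nodes only, 10 indexes
```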

30.5 Tuning the Server Configuration

Various components of the server can be tuned to provide performance improvements in specific scenarios. Most performance tuning recommendations depend on several variables, including the anticipated workload, the types of data that are stored, and the hardware and resources available.

The following general tuning recommendations can improve performance in specific deployments.

30.5.1 Back End Tuning Parameters

The following Berkeley DB JE tuning parameters can be used to tune performance:

Parameter Description

je.checkpointer.highPriority

If true, the checkpointer uses more resources in order to complete the checkpoint in a shorter time interval. Btree latches are held and other threads are blocked for a longer period. Log cleaner record migration is performed by cleaner threads instead of lazily during eviction and checkpoints (see CLEANER_LAZY_MIGRATION). When set to true, application response time may be longer during a checkpoint, and more cleaner threads may be required to maintain the configured log utilization.

Setting this property to false can yield better throughput and lower response times.

preload-time-limit

You can configure the server to preload some of the database contents into memory on startup. For large databases, preloading the database cache avoids a long warmup period after server startup. For more information, see "Local DB Backend Configuration" in the Oracle Unified Directory Configuration Reference.

db-cache-percent and db-cache-size

Use these properties to configure the amount of memory that the database cache uses. For best performance, consider configuring the server so that the whole database fits into the database cache.

Determine the approximate size of the database after an import. For example, after doing an import into the userRoot back end, run the following command (on UNIX systems) to determine the size of the database:

$ cd INSTANCE_DIR/OUD/db
$ du -sk userRoot/
910616 userRoot/

On Windows systems, use an equivalent procedure to determine the database size. Remember that the database size is not static and can increase after an initial import when modifications are made.

Setting the JVM heap to 2 Gbytes (-Xms2g -Xmx2g), and the db-cache-percent to 50, will cause the DB cache to use 1 Gbyte of memory. To monitor the DB cache size, observe the following properties under the "dn:cn=userRoot Database Environment,cn=monitor" entry through Jtrace and JMX:

  • Check that EnvironmentCacheDataBytes has a value that is consistent with the expected size of the DB cache.

  • Check that EnvironmentNCacheMiss does not have unexpected growth when loading the server.

The database can grow very large over time because of replication metadata, users, and applications, and this growth can affect performance after the import. It is recommended that you tune the Oracle Unified Directory JVM heap size (primarily the old generation) accordingly.

db-directory

Ensure that the database is held on a fast file system with adequate storage. The file system should be separate from the location of the access logs. By default, the database will grow to twice its original size. For example, if the database is 1 Gbyte after an import, the file system should have at least 2 Gbytes available.

db-evictor-lru-only

Use this property to control how the database cache retains information. Setting this value to false ensures that the internal nodes are maintained in cache, which provides better performance when the JE cache holds only a small percentage of the database contents.

db-txn-durability

Use this property to configure durability for write operations. Reducing durability can increase write performance, but it can also increase the chance of data loss in the event of a JVM crash or a system crash. This property takes the following values:

  • write-to-disk. All data are written synchronously to disk.

  • write-to-fs. Data are written to the file system immediately but might stay in the file system before being flushed to disk.

  • write-to-cache. Data are written to an internal buffer and flushed to the file system, then to disk when necessary.

db-log-file-max

Use this property to control the size of JE log files. Increasing the file size can improve write performance, but it can also make it harder to maintain the desired utilization percentage.

db-num-cleaner-threads and db-cleaner-min-utilization

These properties control how the cleaner works. Tuning them helps the cleaner keep the database size down and keep up with high write throughput.

db-num-lock-tables

On systems with a large number of CPUs, this property can improve concurrency within the database lock manager.
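
The db-cache-percent guidance in the table above can be checked with simple arithmetic. The following sketch compares the database size reported by du -sk with the DB cache size that results from a given heap size and db-cache-percent value. The function name db_fits_in_cache is illustrative, and the check assumes that the on-disk size is a reasonable approximation of the space needed in the cache; it does not account for later growth of the database.

```shell
# db_fits_in_cache DB_SIZE_KB HEAP_MB DB_CACHE_PERCENT
# Succeeds (exit status 0) if the database size is no larger than the
# DB cache size implied by the heap size and db-cache-percent.
db_fits_in_cache() {
  db_kb="$1"
  heap_mb="$2"
  cache_pct="$3"
  cache_kb=$(( heap_mb * 1024 * cache_pct / 100 ))
  [ "$db_kb" -le "$cache_kb" ]
}

# Values from the table: du -sk reported 910616 Kbytes, the heap is
# 2 Gbytes (2048 Mbytes), and db-cache-percent is 50.
if db_fits_in_cache 910616 2048 50; then
  echo "database fits in the DB cache"
else
  echo "database is larger than the DB cache"
fi
```

On a live system, the first argument could be taken from the du output directly, for example with du -sk userRoot/ | awk '{print $1}'.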


30.5.2 Core Server Tuning Parameters

The following core server tuning parameters can be used to tune performance:

  • num-request-handlers

    This property can be configured so that the LDAP connection handler (and the LDAPS connection handler, if it is enabled) uses multiple threads for decoding client requests. Increasing the number of threads on systems with a larger number of CPUs can improve performance. As a rule of thumb, set this property to a quarter of the number of CPUs, with a maximum of twelve.

    In some cases disabling the keep-stats property can help reduce lock contention in the connection handlers. For more information, see "LDAP Connection Handler Configuration" in the Oracle Unified Directory Configuration Reference.

  • num-worker-threads

    The default value of this property is two times the number of CPUs. This value is sufficient in most deployments.

  • log-file

    Ensure that the access log publisher is on a fast file system, or turn it off altogether by setting the enabled property to false. For more information see "File Based Access Log Publisher Configuration" in the Oracle Unified Directory Configuration Reference.
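
The rules of thumb in this list can be expressed as a short calculation. The following sketch returns a candidate num-request-handlers value (a quarter of the number of CPUs, capped at twelve) and the default num-worker-threads value (twice the number of CPUs). The function names are illustrative, and the floor of one request handler on small systems is an assumption that the text does not state.

```shell
# recommended_request_handlers CPU_COUNT
# A quarter of the number of CPUs, with a maximum of twelve.
recommended_request_handlers() {
  cpus="$1"
  n=$(( cpus / 4 ))
  if [ "$n" -lt 1 ]; then n=1; fi     # assumed floor, not stated in the text
  if [ "$n" -gt 12 ]; then n=12; fi   # maximum of twelve, from the text
  echo "$n"
}

# default_worker_threads CPU_COUNT
# The documented default: two times the number of CPUs.
default_worker_threads() {
  echo $(( $1 * 2 ))
}

recommended_request_handlers 8
default_worker_threads 8
```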

30.5.3 Additional Tuning Recommendations

The following additional recommendations can improve performance in specific scenarios.

  • Enable an Entry Cache. In some cases, particularly those involving relatively small directories (for example, up to a few hundred thousand entries), it can be useful to enable an entry cache. In general the FIFO entry cache provides better results than the soft reference entry cache. For more information, see "Entry Cache Configuration" in the Oracle Unified Directory Configuration Reference.

    For large databases, it is recommended that you store only a specific set of the data in the cache by using the include-filter property. Storing static groups in the entry cache can greatly improve the overall performance of the server. This reduces the time required to perform group membership lookup, which is necessary in evaluating ACIs, for example.

  • Disable Unused Virtual Attributes. If the functionality needed by one or more of the virtual attributes is not required, they can be disabled for a slight performance improvement when decoding entries. For more information, see "Virtual Attribute Configuration" in the Oracle Unified Directory Configuration Reference.

  • Disable Unused Access Logging. If access logging is not necessary, disabling the server access logger can help improve performance. For more information, see "Log Publisher Configuration" in the Oracle Unified Directory Configuration Reference.

  • Disable Unused Access Control Handlers. If you do not need access control processing in the server, then you can disable it by setting the enabled configuration property to false for the Access Control Handler. You can set the property by using dsconfig.

  • Reduce Lock Contention. On systems with large numbers of CPUs (for example, chip multi-threading (CMT) systems with several hardware threads per core), you can reduce lock contention by setting the org.opends.server.LockManagerConcurrencyLevel system property to be equal to the number of worker threads you intend to use.

    Note:

    This property must be set as a JVM system property, because it can be required very early in the server startup process, even before accessing the server configuration.
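
As a sketch of how such a system property might be passed, the following appends a standard -D argument to the JAVA_ARGS environment variable described in Section 30.3. The value of 16 worker threads is hypothetical; use the number of worker threads that you intend to configure.

```shell
# Hypothetical number of worker threads; adjust to match your configuration.
worker_threads=16

# Append the system property to any JVM arguments that are already set.
JAVA_ARGS="${JAVA_ARGS:+$JAVA_ARGS }-Dorg.opends.server.LockManagerConcurrencyLevel=${worker_threads}"
export JAVA_ARGS
```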