7 Tuning for Better Application Throughput

This chapter describes how to tune the JRockit JVM for improved application throughput.

Every application has unique behavior and has unique requirements on the JVM for gaining maximum application throughput. The standard behavior of the Oracle JRockit JVM gives good performance for most applications. However, you can tune the JVM further to increase application throughput.

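This chapter includes information about the following topics:

  • Measuring Application Throughput

  • Selecting the Garbage Collector for Maximum Throughput

  • Tuning the Heap Size

  • Manually Tuning the Nursery Size

  • Manually Tuning Compaction

  • Tuning the Thread-Local Area Size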

7.1 Measuring Application Throughput

In this document, application throughput denotes the speed at which a Java application runs. If your application is a transaction-based system, high throughput means that more transactions are executed in a given amount of time. You can also measure throughput by timing how long it takes to perform a specific task or calculation.

To measure the throughput of your application, you need a benchmark. The benchmark should simulate several realistic use cases of the application and run long enough to allow the JVM to warm up and perform several garbage collections. You also need a way to measure the results, either by timing the entire run of a specific set of actions or by counting the number of transactions that can be performed in a specific amount of time. For an optimal throughput assessment, the benchmark should run under high load and not depend on any external input such as database connections.
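
The measurement approach described above can be sketched as a small harness that warms up first and then counts operations in a fixed time window. This is only an illustration: the class name ThroughputBenchmark, the method measureOpsPerSecond, and the placeholder workload are assumptions for this example and are not part of the JRockit JVM or its tools. A real benchmark should replace the placeholder with realistic use cases of your application.

```java
// Minimal throughput harness (illustrative sketch, not a JRockit API).
final class ThroughputBenchmark {

    /** Warms up, then runs the task for roughly durationMillis and returns operations per second. */
    static double measureOpsPerSecond(Runnable task, long warmupMillis, long durationMillis) {
        // Warm-up phase: results are discarded so that JVM warm-up does not skew the measurement.
        runFor(task, warmupMillis);

        long start = System.nanoTime();
        long operations = runFor(task, durationMillis);
        long elapsedNanos = System.nanoTime() - start;
        return operations / (elapsedNanos / 1e9);
    }

    /** Runs the task repeatedly until the given time has passed; returns the number of runs. */
    private static long runFor(Runnable task, long millis) {
        long deadline = System.nanoTime() + millis * 1000000L;
        long count = 0;
        while (System.nanoTime() < deadline) {
            task.run();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical workload that allocates short-lived objects.
        Runnable transaction = new Runnable() {
            public void run() {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < 100; i++) {
                    sb.append(i);
                }
                if (sb.length() == 0) {
                    throw new IllegalStateException("unexpected empty result");
                }
            }
        };
        double opsPerSecond = measureOpsPerSecond(transaction, 2000, 10000);
        System.out.println("Throughput: " + opsPerSecond + " operations/s");
    }
}
```

Run the same harness with each set of JVM options that you want to compare, and make the measurement window long enough to include several garbage collections.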

When you have a benchmark set up, you can monitor the behavior of the JVM using one of the following methods:

  • Create a run-time analysis with the JRockit Flight Recorder in Oracle JRockit Mission Control. In the Flight Recorder tool, you can see the frequency of the garbage collections and why the garbage collections are launched. This information provides clues for memory management tuning. For information about creating and analyzing a Flight Recorder report, see the Oracle JRockit Mission Control Online Help.

  • Create verbose outputs by using the -Xverbose option. For example, the -Xverbose:memdbg,gcpause,gcreport option causes memory management data such as garbage collection frequency and duration to be displayed. The -Xverbose:memdbg option also causes the reason for garbage collection to be displayed. This helps you study the garbage collection behavior.
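
As a complement to the two methods above, a benchmark can also read garbage collection counts and accumulated collection times from inside the application by using the standard java.lang.management API. The following sketch is not JRockit-specific and does not replace Flight Recorder or -Xverbose; the class name GcStats and the method snapshot are assumptions for this example, and the level of detail reported depends on the JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sums garbage collection statistics over all collectors reported by the JVM (illustrative sketch).
final class GcStats {

    /** Returns { total collection count, total collection time in milliseconds }. */
    static long[] snapshot() {
        long count = 0;
        long timeMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined for a collector.
            count += Math.max(0, gc.getCollectionCount());
            timeMillis += Math.max(0, gc.getCollectionTime());
        }
        return new long[] { count, timeMillis };
    }

    public static void main(String[] args) {
        long[] before = snapshot();
        // ... run the benchmark workload here ...
        long[] after = snapshot();
        System.out.println("Collections during run: " + (after[0] - before[0]));
        System.out.println("Collection time during run: " + (after[1] - before[1]) + " ms");
    }
}
```

Taking one snapshot before and one after the measured run gives the number of collections and the time spent in them for that run only.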

Now you have the tools for measuring the throughput of your Java application and can start to tune the JVM for better application throughput.

7.2 Selecting the Garbage Collector for Maximum Throughput

The first step of tuning the JRockit JVM for maximum application throughput is to select an appropriate garbage collection mode or strategy. The options available are:

  • Dynamic Garbage Collection Mode Optimized for Throughput

    This is the default garbage collection mode for the JRockit JVM. This mode selects the optimal garbage collection strategy for maximum application throughput.

  • Static Generational Parallel Garbage Collection

    This static garbage collector is a good alternative if you do not want to use a dynamic garbage collection mode. The generational parallel garbage collector provides high throughput for applications that allocate a lot of temporary objects.

  • Static Single-Spaced Parallel Garbage Collection

    This is another alternative if you do not want to use a dynamic garbage collection mode. The single-spaced parallel garbage collector provides high throughput for applications that allocate mostly large objects.

For more information about garbage collector options, see Section 4.2, "Selecting and Tuning a Garbage Collector."

7.2.1 Dynamic Garbage Collection Mode Optimized for Throughput

The default garbage collection mode in the JRockit JVM (assuming that you run in server mode, which is also the default) tunes the memory management for maximum application throughput. By default, it selects a generational garbage collection strategy. If the garbage collection strategy is generational, it also tunes the nursery size.

Note that if you use the dynamic garbage collection mode optimized for throughput, the garbage collection pauses do not have any strict time limits. If your application is sensitive to long latencies, you should tune for low latencies rather than for maximum throughput, or find a middle path that gives you acceptable latencies.

Because this mode is the default, you do not need to specify it. You can also turn it on explicitly as follows:

java -Xgc:throughput myApplication

7.2.2 Static Single-Spaced Parallel Garbage Collection

If you want to use a static garbage collector, use a parallel garbage collector to maximize application throughput. If the ratio of large to small object allocations is high, use a single-spaced garbage collector (-Xgc:singlepar). You can see this ratio in a Flight Recorder recording of your application.

7.2.3 Static Generational Parallel Garbage Collection

If you want to maximize application throughput and the ratio of large to small object allocations is low, use a generational parallel garbage collector (-Xgc:genpar). A generational parallel garbage collector might be the right choice even when this ratio is high, if you are using a very small nursery. You can see the ratio in a Flight Recorder recording of your application.

7.3 Tuning the Heap Size

The default heap size starts at 64 MB and can increase up to 3 GB (for 64-bit systems) or 1 GB (for 32-bit systems). Most server applications need a large heap, larger than 1 GB, to optimize throughput. For such applications, you should set the heap size manually by using the -Xms (initial heap size) and -Xmx (maximum heap size) command-line options. Setting -Xms to the same size as -Xmx has regularly been shown to be the best configuration for improving throughput.

Example:

java -Xms:2g -Xmx:2g myApplication
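
To confirm that the heap settings took effect, the application can print the heap sizes that the running JVM reports. The following sketch uses only the standard java.lang.Runtime API; the class name HeapReport is an assumption for this example, and the reported values can differ slightly from the values given on the command line.

```java
// Prints the heap sizes reported by the running JVM (illustrative sketch).
final class HeapReport {

    private static final long MB = 1024L * 1024L;

    /** Memory currently reserved for the heap, in MB. */
    static long currentHeapMb() {
        return Runtime.getRuntime().totalMemory() / MB;
    }

    /** Maximum memory the JVM will attempt to use for the heap, in MB. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    public static void main(String[] args) {
        System.out.println("Current heap: " + currentHeapMb() + " MB");
        System.out.println("Maximum heap: " + maxHeapMb() + " MB");
    }
}
```

When -Xms and -Xmx are set to the same value, the two numbers are expected to be close to each other from startup.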

For more information about setting the initial and maximum heap sizes, including guidelines for setting these values, please see Section 4.4, "Optimizing Memory Allocation Performance."

7.4 Manually Tuning the Nursery Size

The nursery, or young generation, is the area of free chunks in the heap where objects are allocated when running a generational garbage collector (-Xgc:throughput, -Xgc:genpar or -Xgc:gencon). A nursery is valuable because most objects in a Java application die young. Collecting garbage from the young space is preferable to collecting the entire heap, because it is a less expensive process and most objects in the young space are already dead when the garbage collection starts.

If you are using a generational garbage collector you might need to change the nursery setting to accommodate more young objects.

  • -Xgc:throughput and -Xgc:genpar change the nursery size dynamically at run time. In some cases manual tuning can result in a more efficient nursery size.

  • -Xgc:gencon has a fairly low and static nursery size setting. For many applications, you might want to tune the nursery size manually when using this garbage collector.

An efficient nursery size is one that maximizes the amount of memory freed by young collections (garbage collections of the nursery) rather than by old collections (garbage collections of the entire heap). To achieve this, set the nursery size close to half of the heap space that is free after an old collection.

To set the nursery size manually, use the -Xns option.

Example:

java -Xgc:gencon -Xms:2g -Xmx:2g -Xns:512m myApplication
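
The rule of thumb above (a nursery of about half the heap that is free after an old collection) can be worked through as a small calculation. The class and method names below are assumptions for this example, and the figure of 1 GB of live data after an old collection is hypothetical; take the real value from a Flight Recorder recording or from -Xverbose output.

```java
// Works through the nursery sizing rule of thumb (illustrative sketch).
final class NurserySizing {

    private static final long MB = 1024L * 1024L;

    /** Suggests a nursery size: half of the heap that is free after an old collection. */
    static long suggestNurseryBytes(long heapBytes, long liveBytesAfterOldCollection) {
        if (heapBytes <= 0 || liveBytesAfterOldCollection < 0
                || liveBytesAfterOldCollection > heapBytes) {
            throw new IllegalArgumentException("live data must be between 0 and the heap size");
        }
        long freeBytes = heapBytes - liveBytesAfterOldCollection;
        return freeBytes / 2;
    }

    /** Formats a size in bytes as an -Xns option, rounded down to whole megabytes. */
    static String toXnsOption(long nurseryBytes) {
        return "-Xns:" + (nurseryBytes / MB) + "m";
    }

    public static void main(String[] args) {
        long heap = 2048 * MB;  // matches -Xms:2g -Xmx:2g
        long live = 1024 * MB;  // hypothetical live data after an old collection
        System.out.println(toXnsOption(suggestNurseryBytes(heap, live)));
    }
}
```

With these hypothetical inputs, 1 GB of the 2 GB heap is free after an old collection, so the suggested nursery is 512 MB, which corresponds to the -Xns:512m setting in the example command line.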

7.5 Manually Tuning Compaction

Compaction is the process of moving chunks of allocated space toward the lower end of the heap, helping to create contiguous free memory at the other end. The JRockit JVM does partial compaction of the heap at each old collection.

The default compaction setting for garbage collectors (-Xgc) uses a dynamic compaction scheme that tries to avoid peaks in the compaction times. This is a compromise between keeping garbage collection pauses even and maintaining a good throughput, so it does not necessarily give the best possible throughput. Tuning the compaction can pay off well, depending on the characteristics of the application.

There are two ways to tune the compaction for better throughput: increasing the size of the compaction area and increasing the maximum references. Increasing the size of the compaction area helps to reduce the fragmentation on the heap. Increasing the maximum references implicitly allows larger areas to be compacted at each garbage collection. This reduces the garbage collection frequency and makes allocation of large objects faster, thus improving the throughput.

For information about tuning these compaction options, see Section 4.3, "Tuning Compaction."

7.6 Tuning the Thread-Local Area Size

Increasing the preferred TLA size speeds up allocation of small objects when each Java thread allocates a lot of small objects, because the threads do not have to synchronize to get a new TLA as often.

In the Oracle JRockit JVM, the preferred TLA size also determines the size limit for objects allocated in the nursery. Increasing the TLA size therefore allows larger objects to be allocated in the nursery, which could be beneficial for applications that allocate a lot of large objects. A JRockit Flight Recorder recording provides you with statistics on the sizes of large objects allocated by your application. For good performance, you can try setting the preferred TLA size at least as large as the largest object allocated by your application.

For information about how to set the TLA size, see Section 4.4.1, "Setting the Thread Local Area Size."