Diagnostics Guide


Tuning For Better Application Throughput

Every application behaves differently and places its own requirements on the JVM for achieving maximum application throughput. The “out of the box” behavior of the Oracle JRockit JVM gives good performance for most applications. However, you can often tune the JVM further to gain extra application throughput, which means that the application will run faster.

This chapter describes how to tune the JRockit JVM for improved application throughput. It includes information on the following subjects:

Measuring Your Application’s Throughput
Select Garbage Collector
Tune the Heap Size
Manually Tune the Nursery Size
Manually Tune Compaction
Tune the Thread-Local Area Size

 


Measuring Your Application’s Throughput

In this document “application throughput” denotes the speed at which a Java application runs. If your application is a transaction-based system, high throughput means that more transactions are executed during a given amount of time. You can also measure the throughput by measuring how long it takes to perform a specific task or calculation.

To measure the throughput of your application you need a benchmark. The benchmark should simulate several realistic use cases of the application and run long enough to allow the JVM to warm up and perform several garbage collections. You also need a way to measure the results, either by timing the entire run of a specific set of actions or by counting the number of transactions performed during a specific amount of time. For an optimal throughput assessment, the benchmark should run under high load and not depend on external input such as database connections.
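
As a rough illustration, the following minimal Java skeleton shows one way to structure such a benchmark: a warm-up phase followed by a timed phase that counts how many iterations of a workload complete. The class name, the doTransaction placeholder, and the interval lengths are assumptions made for this example only; substitute your application's real use cases and make the intervals long enough that several garbage collections occur during the run.

// Minimal throughput-benchmark sketch; all names and intervals are illustrative.
public class ThroughputBenchmark {

    // Placeholder for a realistic use case of your application.
    static void doTransaction() {
        // ...
    }

    // Runs transactions for the given number of milliseconds and returns the count.
    static long runFor(long millis) {
        long count = 0;
        long end = System.currentTimeMillis() + millis;
        while (System.currentTimeMillis() < end) {
            doTransaction();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        runFor(60000L);                       // warm-up: let the JVM optimize code and run several garbage collections
        long transactions = runFor(300000L);  // timed run under full load
        System.out.println("Throughput: " + (transactions / 300.0) + " transactions per second");
    }
}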

When you have a benchmark set up, you can monitor the behavior of the JVM while the benchmark runs, for example by creating a JRA recording.

Now you have the tools for measuring the throughput of your Java application and can start to tune the JVM for better application throughput.
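
For example, one way to follow the JVM's memory behavior while the benchmark runs is to enable verbose garbage collection logging. The command line below is a sketch: the heap settings are placeholders, and you should check the JRockit command-line reference for the verbose modules available in your release.

java -Xverbose:gc -Xms:2g -Xmx:2g myApplication

The output typically shows when garbage collections occur and how much memory they free, which you can correlate with the measured throughput.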

 


Select Garbage Collector

The first step of tuning the JRockit JVM for maximum application throughput is to select an appropriate garbage collection mode or strategy.

For more information about different garbage collector options, see Selecting and Tuning a Garbage Collector.

Dynamic Garbage Collection Mode Optimized for Throughput

The default garbage collection mode in the JRockit JVM (assuming that you run in server mode, which is also default) tunes the memory management for maximum application throughput. Depending on the behavior of your application, it will select either a generational or non-generational parallel garbage collection strategy. It will also tune the nursery size, if the garbage collection strategy is generational.

Be aware that if you use the dynamic garbage collection mode optimized for throughput, the garbage collection pauses will not have any strict time limits. If your application is sensitive to long latencies, you should tune for low latencies rather than for maximum throughput, or find a middle path that gives you acceptable latencies.

The dynamic garbage collection mode optimized for throughput is the default garbage collector for the JRockit JVM. You can also turn it on explicitly like this:

java -XgcPrio:throughput myApplication

Static Single-Spaced Parallel Garbage Collection

If you want to use a static garbage collector, then you should use a parallel garbage collector in order to maximize application throughput. If the large/small object allocation ratio is high, then use a single-spaced garbage collector (-Xgc:singlepar). You can see the ratio between large and small object allocation if you do a JRA recording of your application.

To get the best throughput out of a static garbage collector, you may also need to adjust other -X or -XX options, for example the heap size.
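
For example, the following command line (with placeholder heap sizes) selects the single-spaced parallel collector together with a fixed heap:

java -Xgc:singlepar -Xms:2g -Xmx:2g myApp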

Static Generational Parallel Garbage Collection

If you want to maximize application throughput and the large/small object allocation ratio is low, then use a generational parallel garbage collector (-Xgc:genpar). A generational parallel garbage collector might be the right choice even if the large/small object allocation ratio is high when you are using a very small nursery. You can see the ratio between large and small object allocation if you do a JRA recording of your application.

To get the best throughput out of a static garbage collector, you may also need to adjust other -X or -XX options, for example the heap and nursery sizes.
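
For example, the following command line (with placeholder heap and nursery sizes) selects the generational parallel collector:

java -Xgc:genpar -Xms:2g -Xmx:2g -Xns:512m myApp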

 


Tune the Heap Size

The default heap size starts at 64 MB and can increase up to 1 GB. Most server applications need a large heap—at least larger than 1 GB—to optimize throughput. For such applications, you need to set the heap size manually by using the -Xms (initial heap size) and -Xmx (maximum heap size) command-line options. Setting -Xms to the same size as -Xmx has regularly been shown to be the best configuration for improving throughput; for example:

java -Xms:2g -Xmx:2g myApp

For more information on setting the initial and maximum heap sizes, including guidelines for setting these values, please see Optimizing Memory Allocation Performance.

 


Manually Tune the Nursery Size

The nursery—or young generation—is the area of free chunks in the heap where objects are allocated when running a generational garbage collector (-XgcPrio:throughput, -Xgc:genpar or -Xgc:gencon). A nursery is valuable because most objects in a Java application die young. Collecting garbage from the young space is preferable to collecting the entire heap, as it is a less expensive process and most objects in the young space will already be dead when the garbage collection is started.

If you are using a generational garbage collector you might need to change the nursery setting to accommodate more young objects.

An efficient nursery size is one where as much memory as possible is freed by young collections (garbage collections of the nursery) rather than by old collections (garbage collections of the entire heap). To achieve this, set the nursery size to approximately half the size of the free heap after an old collection.
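
For example, with a 2 GB heap where roughly 1 GB is typically free after an old collection (illustrative figures), a nursery of about 512 MB is a reasonable starting point.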

To set the nursery size manually, use the -Xns command-line option; for example:

java -Xgc:gencon -Xms:2g -Xmx:2g -Xns:512m myApp

 


Manually Tune Compaction

Compaction is the process of moving chunks of allocated space toward the lower end of the heap, helping to create contiguous free memory at the other end. The JRockit JVM does partial compaction of the heap at each old collection.

The default compaction setting for static garbage collectors (-Xgc or -XXsetGC) uses a dynamic compaction scheme that tries to avoid “peaks” in compaction times. This is a compromise between keeping garbage collection pauses even and maintaining good throughput, so it does not necessarily give the best possible throughput. Depending on the application's characteristics, tuning the compaction can pay off well.

There are two ways to tune compaction for better throughput: increasing the size of the compaction area and increasing the compact set limit. Increasing the size of the compaction area helps reduce fragmentation on the heap. Increasing the compact set limit implicitly allows larger areas to be compacted at each garbage collection. This reduces the garbage collection frequency and makes allocation of large objects faster, thus improving throughput.
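
As a sketch only, a command line combining these tunings might look like the following; the -XXcompactRatio and -XXcompactSetLimit option names and the values shown are assumptions for illustration, so verify them against the compaction documentation for your JRockit release:

java -Xgc:genpar -Xms:2g -Xmx:2g -XXcompactRatio:20 -XXcompactSetLimit:20000 myApp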

For information on tuning these compaction options, please refer to Tuning the Compaction of Memory.

 


Tune the Thread-Local Area Size

Thread Local Areas (TLAs) are chunks of free memory used for object allocation. The TLAs are reserved from the heap and given to the Java threads on demand, so that the Java threads can allocate objects without having to synchronize with the other Java threads for each object allocation.

Increasing the preferred TLA size speeds up allocation of small objects when each Java thread allocates a lot of small objects, as the threads won’t have to synchronize to get a new TLA as often.

In Oracle JRockit JVM R27.3 and later releases, the preferred TLA size also determines the size limit for objects allocated in the nursery. Increasing the TLA size thus also allows larger objects to be allocated in the nursery, which is beneficial for applications that allocate a lot of large objects. In older versions you need to set both the TLA size and the Large Object Limit to allow larger objects to be allocated in the nursery. A JRA recording shows you statistics on the sizes of large objects allocated by your application. For good performance, you can try setting the preferred TLA size at least as large as the largest object allocated by your application.
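
As an illustration, a command line setting the TLA size might look like the following; the option syntax and the sizes shown are assumptions for this example, so check the documentation referenced below for the exact form supported by your JRockit release:

java -XXtlaSize:min=16k,preferred=256k myApp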

For more information on how to set the TLA size, see Setting the Thread Local Area Size.

