Oracle Java SE Embedded: Developer's Guide

15 Codecache Tuning

This chapter describes techniques for reducing the just-in-time (JIT) compiler's consumption of memory in the codecache, where it stores compiled methods.

Introduction

The Java Virtual Machine (JVM) generates native code and stores it in a memory area called the codecache. The JVM generates native code for a variety of reasons, including the dynamically generated interpreter loop, Java Native Interface (JNI) stubs, and Java methods that are compiled into native code by the just-in-time (JIT) compiler. The JIT is by far the biggest user of the codecache. This chapter describes techniques for reducing the JIT compiler's codecache usage while still maintaining good performance.

This chapter describes three ways to reduce the JIT's use of the codecache:

  • Constrain the amount of codecache available to the JIT.

  • Tune the JIT to compile fewer methods.

  • Tune the JIT to generate less code per method.

java Launcher Codecache Option Summary

The JVM options listed in the tables in this section can be passed to the java launcher to reduce the amount of codecache used by the JIT. The table descriptions are summaries; most of the options are described in more detail in the sections that follow.

How to Use the Codecache Options of the java Command

The options listed in the following sections share the following characteristics.

  • All options are -XX options, for example, -XX:InitialCodeCacheSize=32m. Options that have true/false values are specified using + for true and - for false. For example, -XX:+PrintCodeCache sets this option to true.

  • For any option that has "varies" listed as the default value, run the launcher with -XX:+PrintFlagsFinal to see your platform's default value.

  • If the default value for an option differs depending on which JVM is being used (client or server), then both defaults are listed, separated by a '/'. The client JVM default is listed first. The minimal JVM uses the same JIT as the client JVM, and therefore has the same defaults.
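To illustrate these conventions, the following hypothetical launcher invocations show both option styles (MyApp is a placeholder class name, not part of this guide):

```shell
# Boolean -XX option: '+' enables, '-' disables
java -XX:+PrintCodeCache -XX:-UseCodeCacheFlushing MyApp

# Valued -XX option: name=value, with k/m suffixes for KB/MB
java -XX:ReservedCodeCacheSize=32m MyApp

# Print the final value of every option to see your platform's defaults
java -XX:+PrintFlagsFinal -version
```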

Codecache Size Options

Table 15-1 summarizes the codecache size options. See also Constraining the Codecache Size.

Table 15-1 Codecache Size Options

Option                  Default        Description
----------------------  -------------  ----------------------------------------------
InitialCodeCacheSize    160K (varies)  Initial code cache size (in bytes)
ReservedCodeCacheSize   32M/48M        Reserved code cache size (in bytes); this is
                                       the maximum code cache size
CodeCacheExpansionSize  32K/64K        Code cache expansion size (in bytes)


Codecache Flush Options

Table 15-2 summarizes the codecache flush options.

Table 15-2 Codecache Flush Options

Option                        Default  Description
----------------------------  -------  ------------------------------------------------
ExitOnFullCodeCache           false    Exit the JVM if the codecache fills
UseCodeCacheFlushing          false    Attempt to sweep the codecache before shutting
                                       off the compiler
MinCodeCacheFlushingInterval  30       Minimum number of seconds between codecache
                                       sweeping sessions
CodeCacheMinimumFreeSpace     500K     Stop compiling when less than this amount of
                                       space remains; the space is reserved for code
                                       that is not compiled methods, such as native
                                       adapters


Compilation Policy Options

Table 15-3 summarizes the compilation policy (when to compile) options.

Table 15-3 Compilation Policy Options

Option                    Default             Description
------------------------  ------------------  ------------------------------------------
CompileThreshold          1000 or 1500/10000  Number of interpreted method invocations
                                              before (re-)compiling
OnStackReplacePercentage  140 to 933          Number of method invocations/branches
                                              (expressed as a percentage of
                                              CompileThreshold) before (re-)compiling
                                              OSR code (non-tiered compilation)


Compilation Limit Options

Table 15-4 summarizes the compilation limit options, which determine how much code is compiled.

Table 15-4 Compilation Limit Options

Option                     Default  Description
-------------------------  -------  --------------------------------------------------
MaxInlineLevel             9        Maximum number of nested calls that are inlined
MaxInlineSize              35       Maximum bytecode size of a method to be inlined
MinInliningThreshold       250      Minimum invocation count a method must have to be
                                    inlined
InlineSynchronizedMethods  true     Inline synchronized methods


Diagnostic Options

Table 15-5 summarizes the diagnostic options.

Table 15-5 Diagnostic Options

Option                       Default  Description
---------------------------  -------  ------------------------------------------------
PrintFlagsFinal              false    Print all JVM options after argument and
                                      ergonomic processing
PrintCodeCache               false    Print the code cache memory usage when exiting
PrintCodeCacheOnCompilation  false    Print the code cache memory usage each time a
                                      method is compiled


Measuring Codecache Usage

To measure the success of a codecache usage reduction effort, you must measure the codecache usage and the effect on performance. This section explains how to measure the codecache usage. It is up to you to decide the best way to measure performance for your application.

Start with a baseline (the amount of codecache used when no codecache reduction techniques are applied), and then monitor the effect of your codecache reduction techniques on both codecache size and performance relative to the baseline.

Keep in mind that the codecache starts relatively small and then grows as needed as new methods are compiled. Sometimes compiled methods are freed from the codecache, especially when the maximum size of the codecache is constrained. The memory used by free methods can be reused for newly compiled methods, allowing additional methods to be compiled without growing the codecache further.

You can get information on codecache usage by specifying -XX:+PrintCodeCache on the java launcher command line. When your application exits, you will see output similar to the following:

CodeCache: size=32768Kb used=542Kb max_used=542Kb free=32226Kb
 bounds [0xb414a000, 0xb41d2000, 0xb614a000] 
 total_blobs=131 nmethods=5 adapters=63 
 compilation: enabled

The most useful part of the output for codecache reduction efforts is the first line. The following describes each of the values printed:

  • size: The maximum size of the codecache. It should be equivalent to what was specified by -XX:ReservedCodeCacheSize. Note that this is not the actual amount of physical memory (RAM) used by the codecache. This is just the amount of virtual address space set aside for it.

  • used: The amount of codecache memory actually in use. This is usually the amount of RAM the codecache occupies. However, due to fragmentation and the intermixing of free and allocated blocks of memory within the codecache, it is possible that the codecache occupies more RAM than is indicated by this value, because blocks that were used then freed are likely still in RAM.

  • max_used: This is the high water mark for codecache usage; the maximum size that the codecache has grown to and used. This generally is considered to be the amount of RAM occupied by the codecache, and will include any free memory in the codecache that was at some point in use. For this reason, it is the number you will likely want to use when determining how much codecache your application is using.

  • free: This is size minus used.

The -XX:+PrintCodeCacheOnCompilation option produces the same output as the first line printed by -XX:+PrintCodeCache, but does so each time a method is compiled. It can be useful for measuring applications that do not terminate. It can also be useful if you are interested in the codecache usage at a certain point in the application's execution, such as after application startup has completed.

Because max_used generally represents the amount of RAM used by the codecache, this is the value you will want to note when you take your baseline measurement. The sections that follow describe how you can reduce max_used.
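A baseline measurement might therefore look like the following (MyApp is a placeholder for your application's main class):

```shell
# Run unconstrained and record max_used from the summary printed at exit
java -XX:+PrintCodeCache MyApp

# For an application that never exits, print the summary on each compilation instead
java -XX:+PrintCodeCacheOnCompilation MyApp
```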

Constraining the Codecache Size

Constraining the codecache size means limiting the codecache to a size that is less than what an unconstrained codecache would use. The ReservedCodeCacheSize option determines the maximum size of the codecache. It defaults to 32MB for the client JVM and 48MB for the server JVM. For most Java applications, this size is so large that the application will never fill the entire codecache. Thus the codecache is viewed as being unconstrained, meaning the JIT will continue to compile any code that it thinks should be compiled.

When is Constraining the Codecache Size Useful?

Applications that make state changes that result in a new set of methods being "hot" can benefit greatly from a constrained codecache.

A common state change is from startup to regular execution. The application might trigger a lot of compilation during startup, but very little of this compiled code is needed after startup. By constraining the codecache, you will trigger codecache flushing to throw away the code compiled during startup to make room for the code needed during application execution.

Some applications make state changes during execution, and tend to stay in the new state for an extended period of time. For these applications, the codecache only needs to be big enough to hold the compiled code needed during any given state. Thus if your application has five distinct states, each needing about 1MB of codecache to perform well, then you can constrain the codecache to 1MB, which will be an 80% reduction over the normal 5MB codecache usage for the application. Note, however, that each time the application makes a state change, there will be some performance degradation while the JIT compiles the methods needed for the new state.

How to Constrain the Codecache Size

When the codecache is constrained (its usage approaches or reaches ReservedCodeCacheSize), the JIT must throw out some already compiled methods before it can compile more. Discarding compiled methods is known as codecache flushing. The UseCodeCacheFlushing option turns codecache flushing on and off. By default it is on; you can disable it by specifying -XX:-UseCodeCacheFlushing. When enabled, codecache flushing is triggered when the memory available in the codecache is low. It is critical to enable codecache flushing if you constrain the codecache. If flushing is disabled, the JIT does not compile methods after the codecache fills up.
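A constrained run might therefore combine the two options like this (the 2m value is purely illustrative; choose yours from your baseline measurement):

```shell
# Constrain the codecache; flushing is on by default, but stating it
# explicitly guards against it having been disabled elsewhere
java -XX:ReservedCodeCacheSize=2m -XX:+UseCodeCacheFlushing MyApp
```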

To determine an appropriate ReservedCodeCacheSize value for your application, you must first see how much codecache the application uses when the codecache is unconstrained. Use the -XX:+PrintCodeCache option described in Measuring Codecache Usage, and examine the max_used value, which is how much codecache your application uses. You can then try setting ReservedCodeCacheSize to smaller values and see how well your application performs.

If you are trying to use a small (less than 5MB) codecache, you must consider CodeCacheMinimumFreeSpace. For larger codecaches, leave the default value alone. Generally, the JIT keeps enough space free in the codecache to honor this option. For a small code cache, add CodeCacheMinimumFreeSpace to your new ReservedCodeCacheSize. As an example, suppose:

max_used = 3M
CodeCacheMinimumFreeSpace = 500k

To reduce the codecache size from 3MB to 2MB, set ReservedCodeCacheSize to 2500k (2M + 500K). After making the change, verify that max_used changes to 2M.
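The arithmetic can be checked with shell arithmetic, using the chapter's round figures (2M treated as 2000k):

```shell
# ReservedCodeCacheSize (KB) = desired codecache + CodeCacheMinimumFreeSpace
target_kb=2000      # the codecache you want available to the JIT
min_free_kb=500     # default CodeCacheMinimumFreeSpace
echo "$(( target_kb + min_free_kb ))k"   # prints 2500k
```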

When constraining the codecache, usually CodeCacheMinimumFreeSpace can be set to a lower value. However, CodeCacheMinimumFreeSpace should be at least 100KB. If free space is exhausted, the JVM throws VirtualMachineError and exits, or in rare cases, crashes. For the 3MB to 2MB example, the following settings are appropriate:

-XX:ReservedCodeCacheSize=2100k
-XX:CodeCacheMinimumFreeSpace=100k

Finding the optimal ReservedCodeCacheSize for your needs is an iterative process. You can repeatedly use smaller and smaller values for ReservedCodeCacheSize until your application's performance degrades unacceptably, and then increase until you get acceptable performance again. You should also gauge the incremental return you are achieving. You might find that you can decrease max_used by 50% with only a 5% performance degradation, and decrease max_used by 60% with a 10% performance degradation. In this example, the second 10% codecache reduction cost as much performance as the initial 50% codecache reduction. You might conclude in this case that the 50% reduction is a good balance between codecache usage and performance.

Reducing Compilations

Reducing the number of compiled methods, or the rate at which they are compiled, is another effective way of reducing the amount of codecache that is used. Two main command-line options affect how aggressively methods are compiled: CompileThreshold and OnStackReplacePercentage. CompileThreshold is the number of interpreted method invocations needed before the method is compiled. OnStackReplacePercentage relates to the number of backward branches taken in a method before it gets compiled, and is specified as a percentage of CompileThreshold. When a method's combined number of backward branches and invocations reaches or exceeds CompileThreshold * OnStackReplacePercentage / 100, the method is compiled. Note that there is also an option called BackEdgeThreshold, but it currently does nothing. Use OnStackReplacePercentage instead.
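The trigger point for OSR compilation can be computed directly from the formula above; for example, with the server JVM's default CompileThreshold of 10000 and an OnStackReplacePercentage of 140:

```shell
# OSR trigger = CompileThreshold * OnStackReplacePercentage / 100
compile_threshold=10000
osr_percentage=140
echo $(( compile_threshold * osr_percentage / 100 ))   # prints 14000
```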

Larger values for these options decrease compilations. Setting the options larger than their defaults defers when a method gets compiled (or recompiled), possibly even preventing a method from ever getting compiled. Usually, setting these options to larger values also reduces performance (methods spend longer being interpreted), so it is important to monitor both performance and codecache usage when you adjust them. For the client JVM, tripling the default values of these options is a good starting point. For the server JVM, CompileThreshold is already set fairly high, so it probably does not need to be adjusted further.
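As a hypothetical starting point, tripling client JVM defaults of 1500 and 933 would look like the following (verify your platform's actual defaults with -XX:+PrintFlagsFinal first; MyApp is a placeholder):

```shell
# Triple the client defaults to defer compilation (illustrative values)
java -XX:CompileThreshold=4500 -XX:OnStackReplacePercentage=2799 MyApp
```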

Reducing Compiled Method Sizes

There are a number of command-line options that reduce the size of compiled methods, but generally at some performance cost. Like the codecache reduction methods described in other sections, the key is finding a setting that gives good code cache usage reduction, without much performance loss.

The options described in this section all relate to reducing the amount of inlining the compiler does. Inlining is when the compiler includes the code for a called method into the compiled code for the method being compiled. Inlining can be done many levels deep, so if method a() calls b() which in turn calls c(), then when compiling a(), b() can be inlined in a(), which in turn can trigger the inlining of c() into a().
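A minimal sketch of such a call chain (InlineDemo is a hypothetical class, not from this guide):

```java
public class InlineDemo {
    static int c(int x) { return x + 1; }     // trivial method; prime inlining candidate
    static int b(int x) { return c(x) * 2; }  // inlining b() into a() can pull c() in too
    static int a(int x) { return b(x) + 3; }  // compiled a() may contain b() and c() bodies

    public static void main(String[] args) {
        System.out.println(a(1)); // a(1) = b(1) + 3 = (c(1) * 2) + 3 = 7
    }
}
```

Whether b() and c() actually end up inside the compiled code for a() depends on the heuristics controlled by the options below.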

The JIT compiler uses multiple heuristics to determine if a method should be inlined. In general, the heuristics are tuned for optimal performance. However, you can adjust some of them to sacrifice some performance for less codecache usage. The more useful options to tune are described below:

InlineSmallCode

The value of this option determines how small an already compiled method must be for it to be inlined when called from a method being compiled. If the compiled version of this method is bigger than the setting for InlineSmallCode, then it is not inlined. Instead, a call to the compiled version of the method is generated.

MaxInlineLevel

This option represents the maximum nesting level of the inlining call chain. Inlining becomes much less useful at deeper levels, and can eventually be harmful to performance due to code bloat. The default value is 9. Setting MaxInlineLevel as low as 1 or 2 ensures that trivial methods, such as getters and setters, are inlined.

MaxInlineSize

This option represents the maximum bytecode size of an inlined method. It defaults to 35, but is automatically reduced at each inlining level. Setting it very small (around 6) ensures that only trivial methods are inlined.

MinInliningThreshold

The interpreter tracks invocation counts at method call sites. This count is used by the JIT to help determine if a called method should be inlined. If a method's number of invocations is less than MinInliningThreshold, the method is not inlined. Raising this threshold reduces the number of method call sites that are inlined. Trivial methods are always inlined and are not subject to the setting of MinInliningThreshold.

InlineSynchronizedMethods

This option can be used to disable the inlining of methods that are declared as synchronized. Because synchronization is a fairly expensive operation, especially on multi-core devices, the benefits of inlining even small synchronized methods are greatly diminished. You might find you can disable the inlining of synchronized methods with little or no perceived performance degradation, but with a noticeable reduction in codecache usage.
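Combining the options described in this section into one hypothetical invocation (the values are illustrative starting points, not recommendations; MyApp is a placeholder):

```shell
# Reduce inlining aggressiveness to shrink compiled method sizes
java -XX:MaxInlineLevel=2 -XX:MaxInlineSize=6 \
     -XX:MinInliningThreshold=500 -XX:-InlineSynchronizedMethods MyApp
```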
