Java Performance Examples
The GraalVM compiler achieves excellent performance, especially for highly abstracted programs, due to its versatile optimization techniques. Code that uses more abstraction and modern Java features such as Streams or Lambdas tends to see greater speedups. The examples below demonstrate this.
Running Examples
Streams API Example
A simple example based on the Streams API is used here to demonstrate performance gains when using the GraalVM compiler. This example counts the number of uppercase characters in a body of text. To simulate a large load, the same sentence is processed 10 million times:
1. Save the following code snippet to a file named CountUppercase.java:
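The listing itself does not appear in this text. A minimal sketch consistent with the description above and with the output shown in step 2 might look like the following. The helper name countUppercase, the loop structure, and the use of the iterations system property are assumptions made for illustration, not the original listing.

```java
class CountUppercase {
    // Assumption: the number of full runs comes from the "iterations" system
    // property (set with -Diterations=N) and defaults to 1.
    static final int ITERATIONS = Math.max(Integer.getInteger("iterations", 1), 1);

    // Hypothetical helper: counts the uppercase characters in one sentence
    // using the Streams API.
    static long countUppercase(String sentence) {
        return sentence.chars().filter(Character::isUpperCase).count();
    }

    public static void main(String[] args) {
        // The sentence to process is taken from the command-line arguments.
        String sentence = String.join(" ", args);
        for (int iter = 0; iter < ITERATIONS; iter++) {
            if (ITERATIONS != 1) {
                System.out.println("-- iteration " + (iter + 1) + " --");
            }
            long total = 0;
            long start = System.currentTimeMillis();
            long last = start;
            // The loop starts at 1, so the sentence is processed 9,999,999 times.
            for (int i = 1; i < 10_000_000; i++) {
                total += countUppercase(sentence);
                // Report the elapsed time after every million sentences.
                if (i % 1_000_000 == 0) {
                    long now = System.currentTimeMillis();
                    System.out.printf("%d (%d ms)%n", i / 1_000_000, now - last);
                    last = now;
                }
            }
            System.out.printf("total: %d (%d ms)%n", total, System.currentTimeMillis() - start);
        }
    }
}
```

The sample sentence contains 7 uppercase characters, so 9,999,999 passes give the total of 69999993 shown in the output of step 2. The timings will differ from machine to machine.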
2. Compile it and run as follows:
javac CountUppercase.java
java CountUppercase In 2020 I would like to run ALL languages in one VM.
1 (297 ms)
2 (452 ms)
3 (136 ms)
4 (88 ms)
5 (107 ms)
6 (135 ms)
7 (88 ms)
8 (87 ms)
9 (78 ms)
total: 69999993 (1550 ms)
The warmup time depends on numerous factors like the source code or how many cores a machine has. If the performance profile of CountUppercase on your machine does not match the above, run it for more iterations by adding -Diterations=N just after java for some N greater than 1.
3. Add the -Dgraal.PrintCompilation=true option to see statistics for the compilations:
java -Dgraal.PrintCompilation=true CountUppercase In 2020 I would like to run ALL languages in one VM.
This option prints a line after each compilation that shows the method compiled, time taken, bytecodes processed (including inlined methods), size of machine code produced, and amount of memory allocated during compilation.
4. Use the -XX:-UseJVMCICompiler option to disable the GraalVM compiler and use the native top-tier compiler in the VM to compare performance:
java -XX:-UseJVMCICompiler CountUppercase In 2020 I would like to run ALL languages in one VM.
1 (747 ms)
2 (806 ms)
3 (640 ms)
4 (771 ms)
5 (606 ms)
6 (582 ms)
7 (623 ms)
8 (564 ms)
9 (682 ms)
total: 69999993 (6713 ms)
The preceding example demonstrates the benefits of partial escape analysis (PEA) and advanced inlining, which combine to significantly reduce heap allocation. The results were obtained using Oracle GraalVM Enterprise Edition.
The GraalVM Community Edition still performs well compared to the native top-tier compiler. You can simulate the Community Edition on the Enterprise Edition by adding the option -Dgraal.CompilerConfiguration=community.
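To make the earlier point about inlining and partial escape analysis more concrete: a stream pipeline builds short-lived objects on every call, whereas a plain loop computing the same count allocates nothing. The sketch below places the two forms side by side. It is a hand-written illustration, not compiler output, and the class and method names are hypothetical.

```java
class UppercaseEquivalence {
    // Stream version: each call sets up a short-lived pipeline of objects.
    static long countWithStream(String s) {
        return s.chars().filter(Character::isUpperCase).count();
    }

    // Hand-written loop that computes the same count without a stream pipeline.
    static long countWithLoop(String s) {
        long count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (Character.isUpperCase(s.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sentence = "In 2020 I would like to run ALL languages in one VM.";
        System.out.println(countWithStream(sentence)); // prints 7
        System.out.println(countWithLoop(sentence));   // prints 7
    }
}
```

When the compiler can inline the whole pipeline and prove that the intermediate objects never escape, the compiled stream version can approach the cost of the loop, which is one plausible reading of the gap between the two sets of timings above.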
Sunflow Example
Sunflow is an open source rendering engine. The following example is a simplified version of the Sunflow engine core code. It performs calculations to blend various values for a point of light in a rendered scene.
1. Save the following code snippet to a file named Blender.java:
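The listing itself does not appear in this text. The sketch below is an approximation built only from what this section states: a color object is allocated in initialize, it is stored into colors on some paths only, and the program prints ten timings in milliseconds. The grid size, the blend values, the condition on the sum, and the number of rounds are all assumptions, so timings from this sketch should not be compared with the figures in step 2.

```java
class Blender {

    static class Color {
        double r, g, b;

        Color(double r, double g, double b) {
            this.r = r;
            this.g = g;
            this.b = b;
        }

        // Blend another color into this one by adding its components.
        void add(Color other) {
            r += other.r;
            g += other.g;
            b += other.b;
        }
    }

    // Assumed scene size: a 100 x 100 x 100 grid of points.
    static final Color[][][] colors = new Color[100][100][100];

    public static void main(String[] args) {
        // Ten timed rounds, one output line per round.
        for (int j = 0; j < 10; j++) {
            long start = System.nanoTime();
            for (int i = 0; i < 100; i++) {
                initialize(new Color(0, 0, 1));
            }
            long elapsed = System.nanoTime() - start;
            System.out.println(elapsed / 1_000_000 + " ms");
        }
    }

    static void initialize(Color id) {
        for (int x = 0; x < colors.length; x++) {
            Color[][] plane = colors[x];
            for (int y = 0; y < plane.length; y++) {
                Color[] row = plane[y];
                for (int z = 0; z < row.length; z++) {
                    Color color = new Color(x, y, z);
                    color.add(id);
                    if ((color.r + color.g + color.b) % 42 == 0) {
                        // The object escapes only here, when it is stored into colors.
                        row[z] = color;
                    }
                    // On every other path the object never leaves initialize.
                }
            }
        }
    }
}
```

In this sketch only a small fraction of the color objects is ever stored, which is the situation in which deferring the allocation pays off.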
2. Compile it and run as follows:
javac Blender.java
java Blender
1156 ms
916 ms
925 ms
980 ms
913 ms
904 ms
862 ms
863 ms
919 ms
868 ms
To see how the example behaves with GraalVM Community Edition, add the following configuration flag:
java -Dgraal.CompilerConfiguration=community Blender
3. Use the -XX:-UseJVMCICompiler option to disable the GraalVM compiler and run with the default HotSpot JIT compiler:
java -XX:-UseJVMCICompiler Blender
2546 ms
2522 ms
1710 ms
1741 ms
1724 ms
1722 ms
1763 ms
1742 ms
1714 ms
1733 ms
The performance improvement comes from the partial escape analysis moving the allocation of color in initialize down to the point where it is stored into colors (i.e., the point at which it escapes).
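The effect of moving an allocation down to its escape point can be imitated by hand. In the sketch below, fillEager allocates an object for every cell, while fillDeferred keeps the fields in local variables and creates the object only on the branch that stores it. This is a hand-written approximation of the idea, not the code the compiler generates, and the class name, method names, and allocation counters are hypothetical.

```java
class EscapeSketch {
    static class Color {
        double r, g, b;

        Color(double r, double g, double b) {
            this.r = r;
            this.g = g;
            this.b = b;
        }
    }

    // As written in source: one Color per cell, even when it is discarded.
    static int fillEager(Color[] row, int x, int y) {
        int allocations = 0;
        for (int z = 0; z < row.length; z++) {
            Color color = new Color(x, y, z + 1);
            allocations++;
            if ((color.r + color.g + color.b) % 42 == 0) {
                row[z] = color; // escape point
            }
        }
        return allocations;
    }

    // Deferred form: fields live in locals, and the object is created
    // only on the branch where it escapes into the array.
    static int fillDeferred(Color[] row, int x, int y) {
        int allocations = 0;
        for (int z = 0; z < row.length; z++) {
            double r = x, g = y, b = z + 1;
            if ((r + g + b) % 42 == 0) {
                row[z] = new Color(r, g, b);
                allocations++;
            }
        }
        return allocations;
    }

    public static void main(String[] args) {
        Color[] eager = new Color[100];
        Color[] deferred = new Color[100];
        System.out.println("eager allocations:    " + fillEager(eager, 0, 0));
        System.out.println("deferred allocations: " + fillDeferred(deferred, 0, 0));
    }
}
```

For a row of 100 cells with x and y set to 0, the eager form allocates 100 objects and the deferred form allocates 2, while both leave the same cells filled.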
Check the Compiler Configuration on JVM reference for other performance tuning options.