Auxiliary Engine Caching

This document describes how the auxiliary engine cache of GraalVM works.

This feature is only available in GraalVM Enterprise Edition; the options described below are not available in the Community Edition.

Introduction

Warmup of Truffle guest language programs can take a significant amount of time. Warmup consists of work that is repeated every time a program is executed until peak performance is reached. This includes:

  1. Loading and parsing the guest application into Truffle AST data structures.
  2. Execution and profiling of the guest application in the interpreter.
  3. Compilation of the AST to machine code.

Within a single OS process, the work performed during warmup can be shared by specifying an explicit engine. This requires language implementations to disable context-related optimizations to avoid deoptimizations between contexts that share code. Auxiliary engine caching builds upon the mechanism for disabling context-related optimizations and adds the capability to persist an engine with ASTs and optimized machine code to disk. This way, the work performed during warmup can be significantly reduced in the first application context of a new process.
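
For illustration, a minimal sketch of in-process sharing with an explicit engine through the polyglot embedding API is shown below; the class name and guest snippet are placeholders, not part of the feature itself.

import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;

public class SharedEngineExample {
    public static void main(String[] args) {
        // One explicit engine holds the shared ASTs and compiled code.
        try (Engine engine = Engine.newBuilder().build()) {
            String code = "6 * 7"; // placeholder guest code
            // Both contexts attach to the same engine, so the warmup work
            // (parsing, profiling, compilation) done by the first context
            // can be reused by the second one.
            try (Context c1 = Context.newBuilder("js").engine(engine).build()) {
                c1.eval("js", code);
            }
            try (Context c2 = Context.newBuilder("js").engine(engine).build()) {
                c2.eval("js", code);
            }
        }
    }
}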

We use the SVM auxiliary image feature to persist the necessary data structures to disk and load them back. Persisting the image can take significant time because compilation needs to be performed. Loading, however, is designed to be as fast as possible, typically almost instantaneous. This reduces the warmup time of an application significantly.

Getting Started

Starting from a GraalVM EE installation, you first need to (re)build an image with auxiliary engine caching capabilities. For example, you can rebuild the JavaScript image by adding the auxiliary engine cache feature:

graalvm/bin/native-image --macro:js-launcher -H:+AuxiliaryEngineCache -H:ReservedAuxiliaryImageBytes=1073741824

The --macro argument value depends on the guest language. By default, auxiliary images of up to 1 GB are possible. The maximum size can be increased or decreased as needed. The number of reserved bytes does not impact the memory actually consumed by the application. In future versions, the auxiliary engine cache will be enabled by default when the --macro:js-launcher macro is used.

After rebuilding the JavaScript launcher, the feature is used as follows:

Create a new file fib.js:

function fib(n) {
    if (n == 1 || n == 2) {
        return 1;
    }
    return fib(n - 1) + fib(n - 2);
}
console.log(fib(32));

To persist the engine of a profiling run to disk, use the following command line:

graalvm/bin/js --experimental-options --engine.TraceCache=true --engine.CacheStore=fib.image fib.js

The `--engine.TraceCache=true` option is optional and lets you see what the engine cache is doing.

The output is as follows:

[engine] [cache] No load engine cache configured.
2178309
[engine] [cache] Preparing engine for store (compile policy hot)...
[engine] [cache] Force compile targets mode: hot
[engine] [cache] Prepared engine in 1 ms.
[engine] [cache] Persisting engine for store ...
[engine] [cache] Persisted engine in 20 ms.
[engine] [cache] Detecting changes (update policy always)...
[engine] [cache]     New image contains         1 sources and  82 function roots.
[engine] [cache]     Always persist policy.
[engine] [cache] Writing image to fib.image...
[engine] [cache] Finished writing 1,871,872 bytes in 4 ms.

The engine can now be loaded from disk using the following command:

graalvm/bin/js --experimental-options --engine.TraceCache --engine.CacheLoad=fib.image fib.js

which prints:

[engine] [cache] Try loading image './fib.image'...
[engine] [cache] Loaded image in 0 ms. 1,871,872 bytes   1 sources  82 roots
[engine] [cache] Engine from image successfully patched with new options.
2178309
[engine] [cache] No store engine cache configured.

Since the application no longer needs to warm up, its execution time should improve significantly.
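
The same store and load workflow can also be driven by an embedder. The sketch below assumes that the engine.CacheStore and engine.CacheLoad options shown above can be passed as experimental engine options through the polyglot embedding API; the class name and guest snippet are illustrative.

import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;

public class EngineCacheStoreExample {
    public static void main(String[] args) {
        // Profiling run: the engine image is persisted to fib.image when the engine is closed.
        try (Engine engine = Engine.newBuilder()
                .allowExperimentalOptions(true)
                .option("engine.TraceCache", "true")      // optional tracing, as above
                .option("engine.CacheStore", "fib.image") // where to write the engine image
                .build();
             Context context = Context.newBuilder("js").engine(engine).build()) {
            context.eval("js", "function fib(n) { return n <= 2 ? 1 : fib(n - 1) + fib(n - 2); } fib(32);");
        }
        // A later process would use engine.CacheLoad=fib.image instead to start
        // from the persisted ASTs and machine code.
    }
}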

Usage

The cache store and load operations can be controlled using the following options:

When an image is stored, the compilation of function roots can be forced using the --engine.CacheCompile=<policy> option. The supported policies are:

By default, all started compilations in the compile queue will be completed and then persisted. Whether a function root is AOT compilable is determined by the language. A language supports AOT by implementing RootNode.prepareForAOT().
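
For the language-implementation side, the following hypothetical root node sketches what opting into AOT looks like; the ExecutionSignature return type and its GENERIC constant are assumptions based on recent Truffle versions and may differ in yours, so treat this as a sketch rather than a definitive implementation.

import com.oracle.truffle.api.TruffleLanguage;
import com.oracle.truffle.api.frame.VirtualFrame;
import com.oracle.truffle.api.nodes.ExecutionSignature;
import com.oracle.truffle.api.nodes.RootNode;

// Hypothetical root node of a guest language that opts into AOT compilation.
final class AOTCapableRootNode extends RootNode {

    AOTCapableRootNode(TruffleLanguage<?> language) {
        super(language);
    }

    @Override
    public Object execute(VirtualFrame frame) {
        // Guest language semantics would go here.
        return 42;
    }

    @Override
    protected ExecutionSignature prepareForAOT() {
        // Initialize profiles to generic values so the root can be compiled
        // without prior execution. Returning a signature (instead of the
        // default null) marks this root as AOT compilable.
        return ExecutionSignature.GENERIC;
    }
}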

If both load and store operations are set, an update policy can be specified using the --engine.UpdatePolicy=<policy> option. Available policies are:

Known Restrictions

Security Considerations

All data that is persisted to disk represents code only; no application context-specific data, such as global variables, is stored. However, profiled ASTs and code may contain artifacts of the optimizations performed on a Truffle AST. For example, it is possible that runtime strings are used for optimizations and are therefore persisted into an engine image.

Development and Debugging on Native Image

There are several options useful for debugging auxiliary engine caching when running on Native Image:

Development and Debugging on HotSpot

It can be useful to debug language implementation issues related to the auxiliary image on HotSpot. In JVM mode, GraalVM EE provides additional options that help debug issues with this feature. Since storing partial heaps is not supported on HotSpot, these debug options only simulate the store in memory; no image is written to disk.

For example:

js --jvm --experimental-options --engine.TraceCompilation --engine.DebugCacheTrace --engine.DebugCacheStore --engine.DebugCacheCompile=executed fib.js

This prints the following output:

[engine] opt done         fib                                                         |ASTSize            32 |Time   231( 147+84  )ms |Tier             Last |DirectCallNodes I    6/D    8 |GraalNodes   980/ 1857 |CodeSize         7611 |CodeAddress 0x10e20e650 |Source       fib.js:2
2178309
[engine] [cache] Preparing debug engine for storage...
[engine] [cache] Force compile targets mode: executed
[engine] [cache] Force compiling 4 roots for engine caching.
[engine] opt done         @72fa3b00                                                   |ASTSize             3 |Time   211( 166+45  )ms |Tier             Last |DirectCallNodes I    2/D    1 |GraalNodes   500/ 1435 |CodeSize         4658 |CodeAddress 0x10e26c8d0 |Source            n/a
[engine] opt done         :program                                                    |ASTSize            25 |Time   162( 123+39  )ms |Tier             Last |DirectCallNodes I    1/D    1 |GraalNodes   396/ 1344 |CodeSize         4407 |CodeAddress 0x10e27fd50 |Source       fib.js:1
[engine] opt done         Console.log                                                 |ASTSize             3 |Time    26(  11+15  )ms |Tier             Last |DirectCallNodes I    0/D    0 |GraalNodes    98/  766 |CodeSize         2438 |CodeAddress 0x10e285710 |Source    <builtin>:1
[engine] [cache] Stored debug engine in memory.

This allows you to iterate rapidly on compilation-related problems and to attach a Java debugger. A Java debugger can be attached using --vm.Xdebug --vm.Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000.

Debugging the loading of persisted engines is more difficult, as writing an engine to disk is not supported on HotSpot. However, it is possible to use the polyglot embedding API to simulate this use-case in a unit test. See the com.oracle.truffle.enterprise.test.DebugEngineCacheTest class as an example.
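
For reference, a minimal embedding-based sketch along these lines could look as follows; it uses only the debug options shown in the command above, and the class name and guest snippet are illustrative rather than the actual contents of DebugEngineCacheTest.

import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;

public class DebugEngineCacheSmokeTest {
    public static void main(String[] args) {
        // Simulate the engine store in memory on HotSpot (nothing is written to disk).
        try (Engine engine = Engine.newBuilder()
                .allowExperimentalOptions(true)
                .option("engine.DebugCacheTrace", "true")       // trace what the debug cache does
                .option("engine.DebugCacheStore", "true")       // store the debug engine in memory
                .option("engine.DebugCacheCompile", "executed") // force-compile executed roots
                .build();
             Context context = Context.newBuilder("js").engine(engine).build()) {
            context.eval("js", "function fib(n) { return n <= 2 ? 1 : fib(n - 1) + fib(n - 2); } fib(32);");
        } // closing the engine triggers the simulated store
    }
}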