1 About Java Flight Recorder

Note:

Java Flight Recorder requires a commercial license for use in production. To learn more about commercial features and how to enable them, please visit http://www.oracle.com/technetwork/java/javaseproducts/.

Java Flight Recorder (JFR) is a tool for collecting diagnostic and profiling data about a running Java application. It is integrated into the Java Virtual Machine (JVM) and causes almost no performance overhead, so it can be used even in heavily loaded production environments. With the default settings, the performance impact is less than one percent, and for some applications it can be significantly lower. However, for short-running applications (which are not typical of production environments), startup and warmup times make up a larger share of the total run time, so the impact might exceed one percent. JFR collects data about the JVM as well as the Java application running on it.

Compared to other similar tools, JFR has the following benefits:

  • Provides Better Data: JFR captures data from various parts of the runtime, and significant effort has been made to ensure that the captured data represents the true state of the system. Examples of this effort include minimizing the observer effect, and being able to capture samples outside safe points.

  • Provides a Better Data Model: The data model is self-describing. A recording, no matter the size, contains everything required to understand the data.

  • Provides Better Performance: The flight recorder engine itself is optimized for performance. Care has been taken to ensure that data capture will not undo optimizations or otherwise negatively affect performance. Some data can be obtained practically for free, because it is already captured by the runtime.

  • Allows for Third-Party Event Providers: A set of APIs makes it possible for JFR to capture data from third-party applications, including WebLogic Server and other Oracle products.

  • Reduces Total Cost of Ownership: JFR enables you to spend less time diagnosing and troubleshooting problems, reduces operating costs and business interruptions, provides faster resolution when problems occur, and improves system efficiency.

JFR is primarily used for:

  • Profiling

    JFR continuously captures information about the running system. This profiling information includes execution profiling (which shows where the program spends its time), thread stall/latency profiling (which shows why the threads are not running), allocation profiling (which shows where the allocation pressure is), garbage collection details and more.

  • Black Box Analysis

    JFR continuously saves information to a circular buffer. Because the overhead is so low, the flight recorder can be always on. The information can be accessed later, when looking for the cause of a particular anomaly.

  • Support and Debugging

    Data collected by JFR can be essential when contacting Oracle support to help diagnose issues with your Java application.

1.1 Understanding Events

Java Flight Recorder collects data about events. Events occur in the JVM or the Java application at a specific point in time. Each event has a name, a time stamp, and an optional payload. The payload is the data associated with an event, for example, the CPU usage, the Java heap size before and after the event, the thread ID of the lock holder, and so on.

Most events also have information about the thread in which the event occurred, the stack trace at the time of the event, and the duration of the event. Using the information available in events, you can reconstruct the runtime details for the JVM and the Java application.

JFR collects information about four types of events:

  • An instant event occurs instantly, and is logged right away.

  • A duration event has a start and an end time, and is logged when it completes.

  • A timed event is a duration event that has an optional user-defined threshold, so that only events lasting longer than the specified period of time are recorded. This is not possible for other types of events.

  • A sample event (also called a requestable event) is logged at a regular interval to provide a sample of system activity. You can configure how often sampling occurs.

JFR monitors the running system at an extremely high level of detail, which produces an enormous amount of data. To keep the overhead as low as possible, limit the types of recorded events to those you actually need. In most cases, very short duration events are of no interest, so limit the recording to events whose duration exceeds a meaningful threshold.
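The threshold behavior described above can be illustrated with a minimal sketch. It is a plain Java model, not the JFR API: the class TimedEventSketch, its commit method, and the RecordedEvent fields are hypothetical names chosen for illustration, and the 10 ms threshold is an arbitrary example value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of timed events with a threshold; this is NOT the JFR API.
public class TimedEventSketch {

    // A recorded event: a name, a start time stamp, a duration, and a payload.
    public static final class RecordedEvent {
        public final String name;
        public final long startNanos;
        public final long durationNanos;
        public final String payload;

        RecordedEvent(String name, long startNanos, long durationNanos, String payload) {
            this.name = name;
            this.startNanos = startNanos;
            this.durationNanos = durationNanos;
            this.payload = payload;
        }
    }

    private final long thresholdNanos;
    private final List<RecordedEvent> recorded = new ArrayList<>();

    public TimedEventSketch(long thresholdNanos) {
        this.thresholdNanos = thresholdNanos;
    }

    // Records the event only if it lasted longer than the threshold.
    // Returns true if the event was recorded, false if it was discarded.
    public boolean commit(String name, long startNanos, long endNanos, String payload) {
        long duration = endNanos - startNanos;
        if (duration <= thresholdNanos) {
            return false; // too short to be of interest
        }
        recorded.add(new RecordedEvent(name, startNanos, duration, payload));
        return true;
    }

    public List<RecordedEvent> recorded() {
        return recorded;
    }

    public static void main(String[] args) {
        TimedEventSketch recorder = new TimedEventSketch(10_000_000L); // 10 ms (example value)
        recorder.commit("FileRead", 0L, 2_000_000L, "bytes=512");      // 2 ms: discarded
        recorder.commit("FileRead", 0L, 25_000_000L, "bytes=4096");    // 25 ms: recorded
        System.out.println(recorder.recorded().size());                // prints 1
    }
}
```

Raising the threshold in such a model reduces the number of stored events, which mirrors the advice to record only events that exceed a meaningful duration.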

1.2 Understanding Data Flow

JFR collects data from the JVM (through internal APIs) and from the Java application (through the JFR APIs). This data is stored in small thread-local buffers that are flushed to a global in-memory buffer. Data in the global in-memory buffer is then written to disk. Disk write operations are expensive, so you should try to minimize them by carefully selecting the event data you enable for recording. The format of the binary recording files is very compact and efficient for applications to read and write.

There is no information overlap between the various buffers. A particular chunk of data is available either in memory or on disk, but never in both places. This has the following implications:

  • Data not yet flushed to a disk buffer will not be available in the event of a power failure.

  • A JVM crash can result in some data being available in the core file (that is, the in-memory buffer) and some in the disk buffer. JFR does not provide the capability to merge such buffers.

  • There may be a small delay before data collected by JFR is available to you (for example, when it has to be moved to a different buffer before it can be made visible).

  • The data in the recording file may not be in time sequential order as the data is collected in chunks from several thread buffers.

In some cases, the JVM drops the event order to ensure that it does not crash. Any data that cannot be written fast enough to disk is discarded. When this happens, the recording file will include information on which time period was affected. This information will also be logged to the logging facility of the JVM.

You can configure JFR to not write any data to disk. In this mode, the global buffer acts as a circular buffer, and the oldest data is dropped when the buffer is full. This very low-overhead operating mode still collects all the vital data necessary for root-cause problem analysis. Because the most recent data is always available in the global buffer, it can be written to disk on demand whenever operations or surveillance systems detect a problem. However, in this mode only the last few minutes of data are retained, so a recording contains only the most recent events. If you need the full history of operation over a long period of time, use the default mode, in which events are written to disk regularly.
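The in-memory mode can be pictured with the following sketch. It is a simplified model under stated assumptions, not how the JVM implements its buffers: the class CircularBufferSketch and its methods are hypothetical, events are represented as plain strings, and capacity is counted in events rather than bytes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a global circular buffer fed by thread-local buffers.
public class CircularBufferSketch {
    private final int capacity;
    private final Deque<String> global = new ArrayDeque<>();

    public CircularBufferSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Moves the contents of a thread-local buffer into the global buffer.
    // The local buffer is cleared afterwards, so no chunk exists in both places.
    public synchronized void flush(List<String> threadLocalBuffer) {
        for (String event : threadLocalBuffer) {
            if (global.size() == capacity) {
                global.removeFirst(); // buffer full: drop the oldest data
            }
            global.addLast(event);
        }
        threadLocalBuffer.clear();
    }

    // Returns the retained events, oldest first, for example for an on-demand dump.
    public synchronized List<String> snapshot() {
        return new ArrayList<>(global);
    }

    public static void main(String[] args) {
        CircularBufferSketch buffer = new CircularBufferSketch(3);
        List<String> local = new ArrayList<>(Arrays.asList("e1", "e2", "e3", "e4", "e5"));
        buffer.flush(local);
        System.out.println(buffer.snapshot()); // prints [e3, e4, e5]
    }
}
```

In this model only the three most recent events survive, which corresponds to the trade-off described above: low overhead, but a limited history.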

1.3 Java Flight Recorder Architecture

JFR consists of the following components:

  • JFR runtime is the recording engine inside the JVM that produces the recordings. The runtime engine itself consists of the following components:

    • The agent controls buffers, disk I/O, MBeans, and so on. This component provides a dynamic library written in C and Java code, and also provides a JVM-independent pure Java implementation.

    • The producers insert data into the buffers. They can collect events from the JVM and the Java application, and (through a Java API) from third-party applications.

  • Flight Recorder plugin for Java Mission Control (JMC) enables you to work with JFR from the JMC client, using a graphical user interface (GUI) to start, stop, and configure recordings, as well as view recording files.

1.4 Enabling Java Flight Recorder

By default, JFR is disabled in the JVM. To enable JFR, you must launch your Java application with the -XX:+FlightRecorder option. Because JFR is a commercial feature, available only in the commercial packages based on Java Platform, Standard Edition (Oracle Java SE Advanced and Oracle Java SE Suite), you also have to enable commercial features using the -XX:+UnlockCommercialFeatures option.

For example, to enable JFR when launching a Java application named MyApp, use the following command:

java -XX:+UnlockCommercialFeatures -XX:+FlightRecorder MyApp

Alternatively, if you are using JDK 8u40 or later, you can enable JFR at runtime from within JMC. When you start a new flight recording, a dialog box appears with the following message:

Commercial Features are not enabled in the JVM. To start a Flight Recording, you need to enable Commercial Features. Do you want to do that now?

Click "Yes" to enable these features.

You can also enable Java Flight Recorder in a running JVM by using the appropriate jcmd diagnostic commands. For examples, see Section 2.2, "Using Diagnostic Commands".

Note that when running alternative languages that rely on lambda forms on the JVM, such as the JavaScript implementation Nashorn, stack traces can get quite deep. To ensure that deep stack traces are sampled properly, you may need to increase the Flight Recorder stack depth. Setting its value to 1024 is usually enough:

java -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:FlightRecorderOptions=stackdepth=1024 MyApp
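As a rough way to confirm from inside an application that the launch options shown above were passed, the following sketch inspects the JVM input arguments through the standard RuntimeMXBean. The class JfrFlagCheck and the method jfrEnabledAtLaunch are hypothetical names used for illustration. The check looks only at launch options, so it does not detect JFR being enabled later from JMC or with jcmd.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: checks whether the JFR launch options were passed to the JVM.
public class JfrFlagCheck {

    // Returns true only if both options from the launch command are present.
    public static boolean jfrEnabledAtLaunch(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+UnlockCommercialFeatures")
                && jvmArgs.contains("-XX:+FlightRecorder");
    }

    public static void main(String[] args) {
        // getInputArguments() returns the options passed to the JVM,
        // not the arguments passed to the main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JFR enabled at launch: " + jfrEnabledAtLaunch(jvmArgs));
    }
}
```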

1.4.1 Improving the Fidelity of the JFR Method Profiler

One useful property of the JFR method profiler is that it does not require threads to be at safe points in order for stacks to be sampled. However, because stacks are normally walked only at safe points, HotSpot by default does not generate metadata for the parts of the code that are not at safe points. As a result, such samples are not resolved to the correct line number and bytecode index (BCI). To have them resolved correctly, specify the following options:

-XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints

With DebugNonSafepoints, the compiler will generate the necessary metadata for the parts of the code not at safe points as well.