4 Using WLDF with Java Flight Recorder

The integration of the WebLogic Diagnostics Framework (WLDF) with Java Flight Recorder enables WebLogic Server events to be propagated to the Java Flight Recorder for inclusion in a common data set for runtime or post-incident analysis. The Flight Recording data is also included in WLDF diagnostic image captures, which enables you to capture flight recording snapshots based on WLDF policies. You can use this capability to capture and analyze, in a single view, the runtime system information for both the JVM and the Fusion Middleware components running on it.

This chapter also explains common usage scenarios that show how this integration can provide a comprehensive performance analysis and diagnostic foundation for production systems based on WebLogic Server.

About Java Flight Recorder

Java Flight Recorder is a performance monitoring and profiling tool that records diagnostic information on a continuous basis. The recorded data remains available even after a catastrophic failure such as a system crash.

Java Flight Recorder is available in Oracle HotSpot. When WebLogic Server is configured with HotSpot, Java Flight Recorder is not enabled by default. See Using Java Flight Recorder with Oracle HotSpot for information about how to enable Java Flight Recorder with WebLogic Server.

Note:

For the most current information about configurations supported in this release of WebLogic Server, see Oracle Fusion Middleware Supported System Configurations on the Oracle Technology Network.

Java Flight Recorder maintains a buffer of diagnostics and profiling data, called a flight recording or a JFR file, that you can access whenever you need it. The flight recording functions in a manner similar to an aircraft "black box" in which new data is continuously added and older data is stripped out, as shown in Figure 4-1.

Figure 4-1 Circular Flight Recording Buffer
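
The circular behavior can be illustrated with a minimal Python sketch. It is not how Java Flight Recorder is implemented; the function name record_events and the event labels are hypothetical, and the sketch only shows how a fixed-capacity buffer discards its oldest entries as new ones arrive.

```python
from collections import deque


def record_events(events, capacity):
    """Return the events that survive in a buffer of the given capacity."""
    # Hypothetical illustration only: a bounded buffer, not the JFR buffer itself.
    buffer = deque(maxlen=capacity)
    for event in events:
        buffer.append(event)  # when full, the oldest entry is dropped automatically
    return list(buffer)


recent = record_events(["e1", "e2", "e3", "e4", "e5"], capacity=3)
print(recent)  # ['e3', 'e4', 'e5']
```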

The data contained in the JFR file includes events from the JVM and from any other event producer, such as WebLogic Server and Oracle Dynamic Monitoring System (DMS). The JFR file can be analyzed at any time, using Java Mission Control, to examine the details of system execution flow that occurred leading up to an event.

The additional processing overhead that results when you enable Java Flight Recorder and configure WLDF to generate WebLogic Server diagnostics for capture by Java Flight Recorder is minimal. This makes the combination well suited to full-time use, especially in production environments, where it adds the greatest value.

Java Flight Recorder provides the following key benefits:

  • Designed to run continuously — When Java Flight Recorder is configured to run full-time, with both JVM and WLDF events captured in the flight recording, diagnostic data is always available at the time an event occurs, including a system crash. This ensures that a record of diagnostic data leading up to the event is available, allowing you to diagnose the event without having to recreate it.

  • Comprehensive data — Java Flight Recorder combines data generated by tools such as the Runtime Analyzer and the Latency Analysis Tool and presents it in one place.

  • Integration with event providers — HotSpot includes a set of APIs that allow Java Flight Recorder to monitor additional system components, including WebLogic Server, Oracle Dynamic Monitoring System (DMS), and other Oracle products.

For more information about Java Flight Recorder, see Java Flight Recorder Runtime Guide at the following location:

http://docs.oracle.com/javacomponents/index.html

Using Java Flight Recorder with Oracle HotSpot

Java Flight Recorder is available with Oracle HotSpot. If WebLogic Server is configured with Oracle HotSpot, Java Flight Recorder is disabled by default. Enable Java Flight Recorder to capture the WLDF diagnostic data.

To enable Java Flight Recorder, specify the following JVM options when you start the JVM in which the WebLogic Server instance runs:

-XX:+UnlockCommercialFeatures -XX:+FlightRecorder

Note:

The sequence in which you specify JVM options to HotSpot is significant. The options are processed from left to right, and option values are overwritten if there are duplicates. Therefore, note the following:

  • HotSpot does not recognize the FlightRecorder option unless it is preceded by the UnlockCommercialFeatures option.

  • If you specify only the FlightRecorder option, or you specify FlightRecorder before specifying UnlockCommercialFeatures, the HotSpot JVM does not start.
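
As a rough illustration of the ordering rule in the preceding note, the following Python sketch checks a list of JVM arguments before they are passed to a start script. The helper name flight_recorder_order_ok is hypothetical, and the check is a simplification: it only confirms that UnlockCommercialFeatures appears before FlightRecorder.

```python
UNLOCK = "-XX:+UnlockCommercialFeatures"
FLIGHT_RECORDER = "-XX:+FlightRecorder"


def flight_recorder_order_ok(jvm_args):
    """Return True if FlightRecorder is absent or is preceded by the unlock option."""
    if FLIGHT_RECORDER not in jvm_args:
        return True  # nothing to check
    if UNLOCK not in jvm_args:
        return False  # FlightRecorder was specified on its own
    # list.index returns the position of the first occurrence of each option
    return jvm_args.index(UNLOCK) < jvm_args.index(FLIGHT_RECORDER)


print(flight_recorder_order_ok([UNLOCK, FLIGHT_RECORDER]))  # True
print(flight_recorder_order_ok([FLIGHT_RECORDER, UNLOCK]))  # False
```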

Key Features of WLDF Integration with Java Flight Recorder

WLDF integration with Java Flight Recorder provides several useful features, including having WebLogic Server events captured in the flight recording, the ability to throttle the volume of data captured, tools for downloading diagnostic image captures, and more.

The key features provided by WLDF to leverage integration with Java Flight Recorder include the following:

  • WLDF diagnostic data captured in a flight recording

    WLDF can be configured to generate diagnostic data about WebLogic Server events that is captured in the flight recording. Captured events include those from components such as: web applications; EJBs; JDBC, JTA, and JMS resources; resource adapters; and WebLogic web services.

  • WLDF diagnostic volume control

    The ability to generate WebLogic Server event data for the Flight Recording is controlled by the WLDF diagnostic volume configuration. This control also determines the amount of WebLogic Server event data that is captured by Java Flight Recorder, and can be adjusted to include more, or less, data for each WebLogic Server event that is generated. See Configuring WLDF Diagnostic Volume.

    Note:

    • By default, the WLDF diagnostic volume is set to Low.

    • The WLDF diagnostic volume setting does not affect explicitly configured diagnostic modules or the built-in diagnostic modules.

  • Automatic throttling of generated events under load

    As processing load rises on a given WebLogic Server instance, WLDF automatically begins throttling the number of incoming WebLogic Server requests that are selected for event generation and recording into the JFR file. The degree of throttling is adjusted continuously as system load rises and falls.

    Throttling provides three key benefits:

    • The overhead of capturing events generated by WLDF for Java Flight Recorder remains minimized, which is especially important when systems are under load.

    • The time interval encompassed in the flight recording buffer is maximized, giving you a better historical record of data.

    • Throttling has the effect of sampling incoming WebLogic Server requests, maintaining high performance while still providing an accurate overall view of system activity under load.

    Note:

    Throttling affects only the Flight Recording data that is captured by WLDF. It does not affect data captured by other event producers, such as the JVM.

  • WLDF diagnostic image capture support for JFR files

    WLDF diagnostic image capture automatically includes the JFR file, if one has been generated by Java Flight Recorder. The JFR file includes data generated by all active event producers, including WebLogic Server. An image captured using the Policies and Actions component may contain the JFR file, if available.

  • WLST commands for downloading the contents of diagnostic image captures

    WLST includes a set of commands for downloading the contents of diagnostic image captures, described in WLST Online Commands for Downloading Diagnostics Image Captures. Although these commands are generally useful for listing, copying, and downloading all entries contained in the diagnostic image capture, they can also be used for obtaining the JFR file, if available. Once obtained from the diagnostic image capture, the JFR file can be viewed in Java Mission Control.

Java Flight Recorder Use Cases

Java Flight Recorder helps to resolve important diagnostic issues, such as diagnosing a critical failure and examining and reporting runtime data. When a critical failure occurs, the data captured by Java Flight Recorder is useful for failure analysis. Likewise, capturing data from a specific point in time, or at run time when an event occurs, helps you analyze the activity that followed or preceded that event.

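This section summarizes three common use cases for Java Flight Recorder in resolving diagnostic issues: diagnosing a critical failure, profiling during performance testing or in production, and real-time application diagnostics and reporting.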

For more information about scenarios using Java Flight Recorder, see also About Java Flight Recorder in Java Flight Recorder Runtime Guide, available at the following URL:

http://docs.oracle.com/javacomponents/index.html

Diagnosing a Critical Failure — The "Black Box"

When a "catastrophic" failure occurs, the content of the Java Flight Recorder buffer can be made available for post-failure analysis in a manner analogous to the use of an aircraft's black box. Examples of such failures include a JVM crash or an out-of-memory error (OOME) resulting in an application terminating.

When these situations arise, the flight recording contains the following information, which can be helpful in determining the cause of the failure:

  • JVM core dump, including metadata about the Java Flight Recorder configuration at the time of the crash. Furthermore, depending on the disk storage parameters that are set, the Java Flight Recorder data buffer might contain a certain amount of data.

  • WebLogic Server events, captured by WLDF, that preceded the failure.

Java Flight Recorder uses a combination of memory and disk to store its buffer. The most recent data is stored in memory and is flushed out to disk as it "ages". In this way, the on-disk data can be available even after a power failure or similar catastrophic event; only the most recent data (that is, the data that had not yet been flushed to disk) will be unavailable. The text dump file will contain metadata about the Java Flight Recorder configuration at the time of the crash, including the path to the data buffer file when applicable. For more information about using Java Flight Recorder, see the Java Flight Recorder Runtime Guide at the following location:

http://docs.oracle.com/javacomponents/index.html

Profiling During Performance Testing or in Production

Profiling involves capturing data beginning at a specific point in time so that, later, you can analyze the events that were generated after that point. In contrast to real-time diagnostics reporting, described in the following section, profiling involves analyzing the diagnostic data generated after a particular event occurs, as opposed to the data that precedes it.

Profiling with Java Flight Recorder optimizes the ability to perform deep analysis of lock contention and causes of latency.

Real-Time Application Diagnostics and Reporting

It is particularly useful to examine diagnostic data generated during run time when a particular event occurs for the purposes of understanding the system activity that preceded the event; for example, system activity occurring moments before a serious error message is generated. By using the diagnostic capabilities available in WLDF in conjunction with Java Flight Recorder, you can capture a large amount of system-wide diagnostic data the moment a problem occurs. You can then leverage the capabilities of Java Mission Control to quickly correlate that event with other system activity and process execution data within the "snapshot in time" that the JFR file provides, enabling you to quickly isolate likely causes of the problem.

One WLDF feature that is particularly useful in conjunction with Java Flight Recorder is the image action. An image action generates a diagnostic image capture in response to the triggering of a policy that is configured in a diagnostic system module. The policy monitors the server environment for one or more specific conditions, and when those conditions occur, the policy can automatically execute an image action. When Flight Recorder is enabled, the diagnostic image capture automatically includes the JFR file. The JFR file can then be extracted from the diagnostic image capture and examined immediately in Java Mission Control or stored for later analysis. An image action, used when WLDF data is captured by Java Flight Recorder, is particularly well suited for real-time diagnosis of intermittent problems.

Image action is part of the Policies and Actions system in WLDF. To set up an image action, you create one or more individual policies. A policy includes a Java EL expression to specify the event for the policy to detect. For example, the following log policy expression detects the server log message with severity level Critical and ID BEA-149618:

log.severityString == 'Critical'  &&  log.messageId == 'BEA-149618'

Policies can monitor any of the following:

  • Runtime MBean instances in the local runtime MBean server

    A scheduled policy can execute an image action if runtime MBean attributes detect a performance issue, such as high memory utilization rates or problems with open socket connections to the server.

  • Messages published to the server log

    A log policy can execute an image action if a specific message, severity level, or string is issued.

  • Events generated by the WLDF Instrumentation component

    An event policy can execute an image action if an instrumentation service generates a particular event.

The following sections explain how to obtain the JFR file from the diagnostic image capture and provide an example of using Java Mission Control to examine the WebLogic Server events contained in the JFR file.

Obtaining the Flight Recording File

The diagnostic image capture is a single ZIP file that contains individual images produced by different server subsystems. When a flight recording is available, the JFR file is included in the diagnostic image as FlightRecording.jfr.

A diagnostic image capture can be generated on demand (for example, from the WebLogic Server Administration Console, Fusion Middleware Control, WLST, or a JMX application), or it can be generated as the result of an image action. For information about how to generate diagnostic image captures and configure the location in which they are created, see Configure and capture diagnostic images in Oracle WebLogic Server Administration Console Online Help.

To view the contents of the JFR file, you first need to extract it from the diagnostic image capture as described in Configuring and Capturing Diagnostic Images. Once you have extracted the JFR file, you can view its contents in Java Mission Control.

For an example WLST script that retrieves the JFR file from a diagnostic image file and saves it to a local directory, see Example: Retrieving a JFR File from a Diagnostic Image Capture.
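
If a diagnostic image capture has already been copied to a local directory, the JFR entry can also be pulled out with a general-purpose archive library. The following Python sketch assumes that the image capture is a ZIP archive and that the entry is named FlightRecording.jfr; the function name extract_jfr and any file paths are hypothetical, and the WLST commands remain the documented way to retrieve the file.

```python
import shutil
import zipfile

# Entry name under which the flight recording is stored in the image capture.
JFR_ENTRY = "FlightRecording.jfr"


def extract_jfr(image_path, output_path):
    """Copy the JFR entry out of a diagnostic image archive.

    Assumes the image capture is a ZIP archive.
    Returns True if the entry was found and written, False otherwise.
    """
    with zipfile.ZipFile(image_path) as image:
        if JFR_ENTRY not in image.namelist():
            return False  # the image contains no flight recording
        with image.open(JFR_ENTRY) as src, open(output_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return True
```

Because the entry always has the same name, passing a descriptive output_path (for example, one that includes the server name and the date) makes the extracted recording easier to identify later.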

Analyzing Java Flight Recorder Data

You can extract the JFR file from the diagnostic image capture and use Java Mission Control to examine the contents of the JFR file. Java Mission Control provides a graphical user interface that gives a view of all the event information recorded in the JFR file.

The following sections highlight some of the capabilities of Java Mission Control's graphical user interface, which provides tools for drilling down into the diagnostic data generated not only by WLDF for WebLogic Server events, but also by all other available event producers, including HotSpot.

For complete details about the Java Mission Control interface, see Java Mission Control User's Guide at the following location:

http://docs.oracle.com/javacomponents/index.html

Note:

Flight Recorder data may include partition-id and partition-name in captured JFR events, but only the partition user may have access to the JFR data containing the information corresponding to the partition for that user. See Monitoring and Debugging Partitions in Using Oracle WebLogic Server Multitenant.

Java Flight Recorder Graphical User Interface

Java Mission Control includes the Java Flight Recorder graphical user interface, which allows users who are running a Java Flight Recorder-compliant version of Oracle HotSpot to view JVM recordings, current recording settings, and runtime parameters. The JFR interface includes the Event Types View, which gives you direct access to event information that has been recorded in the JFR file, such as event producers and types, event logging and graphing, events by thread, event stack traces, and event histograms.

The Overview tab in the Java Flight Recorder interface is useful for analyzing a system's general health because it can reveal behavior that might indicate bottlenecks or other sources of poor system performance. Figure 4-2 shows an example of the Overview tab in the Event Types View.

Note the following regarding the information shown in Figure 4-2:

  • The Event Types View is available by selecting the Events tab group icon.

  • The name of the JFR file appears at the top of the Overview tab. Because the JFR file in a diagnostic image capture is always named FlightRecording.jfr, it is useful to rename it descriptively after downloading it from the diagnostic image capture.

  • The Event Types Browser, on the left side, is a tree that shows the available event types in a recording. It works in conjunction with the Events tab group to provide a means to select events or groups of events in a recording that might be of interest to you and to obtain more granular information about them.

    As you select and deselect entries in the Event Types Browser, the information displayed in the Overview tab is filtered dynamically. For example, by selecting only WebLogic Server, event data from all non-WebLogic event producers is filtered out.

  • The range navigator, which is the graph displayed below the Overview tab title, is a timeline that shows all events in a recording that pertain to the data displayed on the selected tab. A set of buttons is available for adjusting the range of data that is displayed, which can simplify the process of drilling down into the details of Java Flight Recorder data.

  • The Producers section identifies each event producer that generated the data that is displayed. Metrics are included for each producer, indicating the volume of event activity generated by each as a proportion of the total set of event data displayed.

  • The Event Types section lists all events represented in the Overview tab, along with key metric data about each event.

Figure 4-2 Example Overview Page of Java Flight Recorder File in Java Mission Control

Analyzing Execution Flow — A Sample Walkthrough

This section shows an example of the steps that a developer or support engineer might use to identify the event activity associated with a particular request in a Web application hosted on WebLogic Server. This example is not meant to recommend a specific way to diagnose performance problems, but simply shows how the Java Flight Recorder graphical user interface can be used to greatly simplify the process of locating and analyzing performance issues.

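This section shows the following examples: displaying event data for a product subcomponent, viewing the event log to display details, tracking execution flow by analyzing an operative set, and expanding the operative set to view correlated diagnostic data.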

Displaying Event Data for a Product Subcomponent

When you start Java Mission Control and open a JFR file, you can use the Event Types View to quickly select the specific events you want to analyze. As you select and deselect items in the Event Types Browser (which is available in the Event Types View), the information displayed in the Java Flight Recorder graphical user interface is updated instantly to show information about only the selected event types.

Figure 4-3 shows the Event Types Browser with only servlet event types selected.

Figure 4-3 Event Types Browser

Viewing the Event Log to Display Details

To view details about the events logged by one or more event types, select the Log tab, which is available at the bottom of the Java Flight Recorder graphical user interface. An example of the Log tab for servlet event types is shown in Figure 4-4.

Figure 4-4 Servlet Event Log

When using the Log tab, you can view details about events as follows:

  • You can click on individual column heads in the Event Log table to modify the sort order of the events. For example, by clicking the Duration column, you can quickly identify the events that took the longest time to execute.

  • When you select an event in the Event Log table, details about that event are displayed in the Event Attributes table. For example, Figure 4-4 shows the following attributes:

    • Event start, end, and duration times

    • User ID of person who issued the request on the servlet

    • Method, class name, and URI of invoked servlet

    • Partition ID and name — Note that events generated on behalf of a server or domain scope resource are tagged with a partition-id of 0, and the partition-name of DOMAIN.

    • Relationship ID (RID), which distinguishes the work done in one thread on one process from the work done by any other threads, on this and other processes, on behalf of the same request. See Understanding ECIDs and RIDs in Correlating Messages in Administering Oracle Fusion Middleware.

    • Execution context ID (ECID)

Different event types have different attributes. For example, if this were a JDBC event, you could scroll among the attributes to see the SQL statement, the JDBC connection pool used, and the stack from which it was called. The interface makes it easy to scan for unexpected behavior that can be analyzed in deeper detail.

Note:

The value of the ECID is a unique identifier that can be used to correlate individual events as being part of the same request execution flow. For example, events that are identified as being related to a particular request typically have the same ECID value, as shown in Tracking Execution Flow by Analyzing an Operative Set. However, the format of the ECID string itself is determined by an internal mechanism that is subject to change; therefore, do not write code or procedures that depend on that format.

Tracking Execution Flow by Analyzing an Operative Set

The Java Flight Recorder graphical user interface in Java Mission Control allows you to analyze the run-time trail of system activity that occurs as the result of a particular event. In this example, the run-time trail is analyzed by first defining an operative set. An operative set is any set of events that you choose to work with in Java Mission Control.

In the example shown in this section, an operative set is created for the events that have the same execution context ID (ECID) attribute as the servlet invocation event selected in the Event Log table, shown in Figure 4-5. The operative set is then analyzed to see the execution flow that resulted from that servlet invocation. (Note that this operative set could be expanded to include events that match on different attributes as well; for example, events containing a specific SQL statement but not necessarily the same ECID.)

Figure 4-5 Operative Set Defined by Execution Context ID (ECID)

This operative set is defined by right-clicking the desired event in the Event Log, and then selecting Operative Set > Add matching ECID > ecid. See Figure 4-6.

Figure 4-6 Defining an Operative Set by Matching ECID

The operative set is then displayed by selecting Show Only Operative Set above the event log table, shown in Figure 4-7. Note how the operative set is indicated in the range navigator.

Figure 4-7 Displaying an Operative Set

The runtime trail of execution flow that results from the request that generated the servlet invocation event can be viewed by including additional event types. For example, Figure 4-8 shows the operative set when all WebLogic Server event types are added, using the Event Types Browser, and the events are listed in chronological order. (You can sort the events chronologically by selecting the Start Time column head.)

Figure 4-8 Adding all WebLogic Server Events to Operative Set

In this example, note a portion of the execution flow shown in the Event Log:

  1. The servlet URI is invoked.

  2. The servlet uses an EJB, which requires access to the database.

  3. A JDBC connection is obtained and a transaction is started.
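
The idea behind an operative set defined by ECID can be sketched in a few lines of Python. This is only a conceptual illustration, not how Java Mission Control works internally: the function name, the dictionary keys, and the sample values are hypothetical. Note that the ECID values are compared only for equality, because their format should be treated as opaque.

```python
def operative_set(events, selected_event):
    """Return the events that share the selected event's ECID, oldest first."""
    ecid = selected_event["ecid"]
    matching = [e for e in events if e.get("ecid") == ecid]
    return sorted(matching, key=lambda e: e["start"])


# Hypothetical sample data; real events carry many more attributes.
events = [
    {"type": "JDBC", "ecid": "A", "start": 30},
    {"type": "Servlet", "ecid": "A", "start": 10},
    {"type": "Servlet", "ecid": "B", "start": 12},
    {"type": "EJB", "ecid": "A", "start": 20},
    {"type": "JVM", "start": 15},  # JVM events have no ECID attribute
]

flow = operative_set(events, events[1])
print([e["type"] for e in flow])  # ['Servlet', 'EJB', 'JDBC']
```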

Expanding the Operative Set and Viewing Correlated Diagnostic Data

The operative set can be further analyzed by constraining the time interval of the execution flow and adding correlated events from additional producers. By constraining the time interval for displayed events, you can add events to the Event Log that occurred simultaneously with the operative set. This allows you to see additional details about the execution context that can help diagnose performance issues.

The time interval can be constrained by using the range selection bars in the range navigator. You can grab these bars with your pointer and drag them inward or outward to change the range of events displayed in the Event Log. The range selection bars are activated when you hover your pointer over either end of the navigator, as shown in Figure 4-9.

Figure 4-9 Range Navigator Selection Bars

Events from additional producers, such as HotSpot, can be selected in the Event Types Browser. Note that JVM events do not have ECID attributes, so they cannot be included among the WLDF events in the operative set. Therefore, to view the JVM events, you need to deselect Show Only Operative Set.

At this point, the events that are displayed in the Event Log are those that occurred during the selected time interval but are not otherwise correlated. Figure 4-10 shows drilling down into JDBC activity by selecting only JDBC events and JVM events. The Event Log is updated and listed in chronological order to show the JVM activity that occurred at the same time as the flow of the JDBC events in the selected time interval.
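
Selecting by time interval rather than by ECID can be pictured with the following Python sketch. As before, this is only a conceptual illustration with hypothetical names and sample values: events are kept if they belong to one of the chosen types and started within the selected range, whether or not they carry an ECID.

```python
def events_in_interval(events, start, end, types):
    """Return events of the given types that started within [start, end], oldest first."""
    selected = [
        e for e in events
        if e["type"] in types and start <= e["start"] <= end
    ]
    return sorted(selected, key=lambda e: e["start"])


# Hypothetical sample data; real events carry many more attributes.
events = [
    {"type": "JDBC", "ecid": "A", "start": 30},
    {"type": "JVM", "start": 32},  # no ECID, correlated by time only
    {"type": "Servlet", "ecid": "A", "start": 10},
    {"type": "JDBC", "ecid": "B", "start": 35},
    {"type": "JVM", "start": 90},
]

shown = events_in_interval(events, start=25, end=40, types={"JDBC", "JVM"})
print([(e["type"], e["start"]) for e in shown])
# [('JDBC', 30), ('JVM', 32), ('JDBC', 35)]
```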

Figure 4-10 Adding JVM Events to JDBC Event Log

Changing the Location of Temporary JFR Files

The temporary JFR files created in the operating system's temp directory are managed directly by the JVM. WLDF does not control these files. (By default, WLDF temporary files related to Java Flight Recorder are placed in the DOMAIN_HOME/servers/SERVER_NAME/server/logs/diagnostic_images directory.)

However, you can change the location in which the JVM places its temporary files by using the following command-line option when starting Java Flight Recorder, where path represents the preferred location:

-XX:FlightRecorderOptions=repository=path
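
The following Python sketch shows one way the repository option might be combined with the options that enable Java Flight Recorder when assembling JVM arguments for a start script. The function name and the example path are hypothetical; only the option strings themselves come from this chapter.

```python
def flight_recorder_options(repository=None):
    """Build the JVM options that enable Java Flight Recorder.

    repository is an optional directory for the JVM's temporary JFR files.
    """
    # UnlockCommercialFeatures must come before FlightRecorder.
    options = ["-XX:+UnlockCommercialFeatures", "-XX:+FlightRecorder"]
    if repository:
        options.append("-XX:FlightRecorderOptions=repository=" + repository)
    return " ".join(options)


# /u01/jfr_repository is a hypothetical path used only for illustration.
print(flight_recorder_options("/u01/jfr_repository"))
# -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:FlightRecorderOptions=repository=/u01/jfr_repository
```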

For more information about Java Flight Recorder configuration settings, see Java Flight Recorder Runtime Guide at the following location:

http://docs.oracle.com/javacomponents/index.html