Diagnostics Guide


The System is Crashing

A Java application can stop running for several reasons. The most common reason is, of course, that the application finished running or was halted normally. Other reasons include Java application errors, unhandled exceptions, and irrecoverable Java errors such as OutOfMemoryError. Occasionally you may encounter a JVM crash, which means that the JVM itself has encountered a problem from which it could not recover gracefully. You can identify a JVM crash by the dump information that the Oracle JRockit JVM prints out when it crashes.

This document describes how to diagnose and resolve JVM crashes. It includes information on the following subjects:

- Notifying Oracle Support
- Classify the Crash
- Out Of Virtual Memory Crash
- Stack Overflow Crash
- Unsupported Linux Configuration Crash
- JVM Crash

 


Notifying Oracle Support

Note that even if you do not have a Support contract with Oracle, you should notify Oracle Support if you have encountered a problem in the JRockit JVM. This way, Oracle can make sure that the problem is fixed in the next release. For information on communication with Oracle Support, please refer to Submitting Problems to Oracle Support.

 


Classify the Crash

The first step in diagnosing and resolving a JVM crash is to classify the crash; that is, to determine where and why the crash occurred.

Using a Crash File

Whenever the JRockit JVM crashes, it creates a snapshot of the state of the computer and the JVM process at the time of the crash and writes this information into one or both of these “crash files”: a text dump file (.dump) and a binary crash file (a core file on Linux and Solaris or an mdmp file on Windows).

Binary crash files (core files and mdmp files) are very helpful to the Oracle support organization when solving JRockit JVM problems; however, if you don’t have a service agreement with Oracle, these files will not be of much help to you.

Determine the Crash Type

You can sometimes determine where and why the crash occurred by retrieving the text dump file and reviewing it for information that points to the crash type. Checking the size of the binary dump file may also help in some cases, as may checking the setup of the operating system. Table 30-1 lists symptoms you can look for and the probable crash types corresponding to these symptoms.

Table 30-1 Crash Symptoms and Crash Types

Symptom: The dump file indicates that the JVM process has run out of virtual memory. See Understanding Crash Files for details.
Probable crash type: Out of virtual memory crash

Symptom: The core file or mdmp file size is close to the maximum virtual memory size of the process on the OS. See Understanding Crash Files for details.
Probable crash type: Out of virtual memory crash

Symptom: The dump file indicates that stack overflow errors have occurred. See Understanding Crash Files for details.
Probable crash type: Stack overflow crash

Symptom (Linux only): The dump file indicates that LD_ASSUME_KERNEL is set. See Understanding Crash Files for details.
Probable crash type: Unsupported Linux configuration crash

Symptom (Linux only): You are using a non-standard or unsupported Linux configuration.
Probable crash type: Unsupported Linux configuration crash

Symptom: None of the above apply or help solve the problem.
Probable crash type: JVM crash

 


Out Of Virtual Memory Crash

The JVM reserves virtual memory for many purposes: for example, the Java heap, Java methods, thread stacks, and JVM-internal data structures. In addition, native (JNI) code can also allocate memory. The process size consists of all the memory reserved by the JVM and is limited by the operating system. If the virtual memory allocation of the JVM process exceeds these limits, the JVM runs out of virtual memory, which may cause it to crash. This section discusses the following topics:

- Verify the Out Of Virtual Memory Error
- Troubleshoot the Out Of Virtual Memory Error

Verify the Out Of Virtual Memory Error

Before you start debugging an out of virtual memory error, you should first verify that the error is indeed due to the JVM process running out of virtual memory. This section contains information on:

- Virtual Memory Maximums
- Checking the Text dump File
- Checking the Binary mdmp or core File

Virtual Memory Maximums

Table 30-2 shows the maximum virtual memory available to a single process on the various 32-bit operating systems. Virtual memory is practically unlimited on 64-bit platforms.

Table 30-2 Approximate Maximum Virtual Memory Available to IA32 Architectures

OS                                      Max Process Virtual Memory
Windows                                 2 GB
Windows with the /3GB startup option    3 GB
Linux (normally)                        3 GB

Checking the Text dump File

The text dump file, if one has been created by the JVM, may indicate that memory allocations have failed; see Understanding Crash Files for details. Failed memory allocations are a strong indication that the JVM process has run out of virtual memory.

Checking the Binary mdmp or core File

When the JRockit JVM crashes, it generates a binary crash file. By default this file contains a copy of the entire JVM process. Check the size of this file to determine that the JVM process has indeed run out of virtual memory.

  1. Verify that the binary crash file size has not been limited with the command line option -XXdumpSize or with the operating system command ulimit (Linux and Solaris only). Use the command ulimit -a to verify that the crash file size is unlimited on Linux and Solaris (see the example after this list). If the size of the binary crash file has been limited, you cannot use it to verify that the JVM process has run out of virtual memory.
  2. Compare the size of the mdmp or core file with your heap size and ensure that the file is larger than the heap. This is a sanity check to verify that the binary crash file has not been truncated, for example due to limited disk space.
  3. Determine whether the size of the mdmp or core file is close to the maximum process size allowed by the particular OS.
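For example, on Linux or Solaris you can check the current limits and, if necessary, remove the core file size limit for the current shell before reproducing the crash (a minimal sketch; the core file name is system-dependent):

ulimit -a
ulimit -c unlimited
ls -lh core*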

Troubleshoot the Out Of Virtual Memory Error

When you have verified that the JVM process has run out of virtual memory, you can start troubleshooting to fix the problem. This section covers the following topics:

- Upgrade to the Latest JRockit JVM Version Available
- Reduce the Java Heap Size
- Use the Windows /3GB Startup Option
- Check for Memory Leaks in JNI Code
- Record Virtual Memory Usage
- If All Else Fails, Open a Case with Oracle Support

Upgrade to the Latest JRockit JVM Version Available

Make sure you are running the latest available JRockit JVM version. Many fixes that reduce the JVM’s memory usage have been made over time, so using the latest JVM version for a given major JDK ensures that you are running the most memory-efficient one.

Reduce the Java Heap Size

The Java heap is only a part of the JVM’s total memory usage. If the Java heap is too large, the JVM may fail to start or run out of virtual memory when Java methods are compiled and optimized or native libraries are loaded. If this happens, you should try lowering the maximum heap size setting.
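For example, to cap the maximum Java heap at 256 MB (myApp is a placeholder for your application’s main class):

java -Xmx256m myApp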

Use the Windows /3GB Startup Option

On Windows 2000 Advanced Server and Datacenter, Windows 2003, and Windows XP, you have the option of starting the operating system with the /3GB option by specifying it in BOOT.INI. This option changes the maximum virtual memory process size from 2GB to 3GB.
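For illustration, a BOOT.INI entry with the option added might look like the following (the disk, partition, and OS name are examples and will differ on your system):

multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003" /fastdetect /3GB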

Check for Memory Leaks in JNI Code

Check any JNI code you are using for memory leaks. Incorrectly written or misused JNI code may leak memory, which causes the Java process to grow until it reaches the maximum virtual memory size on the platform.

Record Virtual Memory Usage

Recording virtual memory usage shows memory usage growth, which will help Oracle Support identify and diagnose problems with running out of virtual memory. This section describes how you can collect virtual memory usage statistics on Windows and Linux.

Windows

Use the Windows tool perfmon to record the Private Bytes process counter, which shows the amount of virtual memory reserved by the JVM process (a command-line alternative follows these steps). To do this:

  1. Open Performance Monitor, which you can find in the Administrative tools.
  2. Click + to open the Add Counters dialog box.
  3. Open the Performance Object drop-down list and select Process.
  4. Select the counter Private Bytes in the Process list.
  5. Select the process that you want to monitor and click Add.
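Alternatively, you can record the same counter from the command line with the typeperf tool; in this sketch, the process instance name java and the ten-second sampling interval are assumptions you will need to adapt:

typeperf "\Process(java)\Private Bytes" -si 10 -o privatebytes.csv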
Linux

Run top in batch mode to record virtual memory usage at a regular interval; for example:

top -b -d 10 -n 360 > virtualmemory.log

This command takes a snapshot every ten seconds (-d 10) for one hour (-n 360) and writes the data to a file called virtualmemory.log. The virtual memory usage of all running processes appears in the VIRT column in that file. To see just the current status, run top interactively and press Shift-M to sort the output by memory usage; this usually puts the JVM process(es) at the top of the output.

Creating a recording like virtualmemory.log is useful because it lets you confirm that the JRockit JVM process is actually growing and provides evidence of that growth to Oracle Support.
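If you prefer to track a single process, a minimal script along these lines records the virtual memory size of one JVM process every ten seconds (replace <pid> with the process ID of your JVM):

while true; do
  date >> virtualmemory.log
  ps -o vsz= -p <pid> >> virtualmemory.log
  sleep 10
done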

If All Else Fails, Open a Case with Oracle Support

If none of these solutions works, you will need to open a case with Oracle Support. Please refer to Submitting Problems to Oracle Support for details on what kind of information you need to provide and how to submit that information.

 


Stack Overflow Crash

A stack overflow crash occurs when the JRockit JVM cannot gracefully handle a stack overflow error. A gracefully handled stack overflow results in a java.lang.StackOverflowError being thrown; StackOverflowError is a subclass of java.lang.VirtualMachineError, which the J2SE Javadoc describes as being thrown “to indicate that the JVM is broken or has run out of resources necessary for it to continue operating.” For more information, please refer to these J2SE java.lang Javadocs:

- java.lang.StackOverflowError
- java.lang.VirtualMachineError

JRockit JVM R26 (and higher) dump files include information on the number of stack overflow errors thrown.

Verify the Stack Overflow Crash

A stack overflow crash is easy to identify: the text dump file says Error Message: Stack overflow somewhere near the top of the file. Other indications might be an extremely long stack trace in the crash file or, paradoxically, no stack trace at all. If the dump file says something like StackOverFlow: 2 StackOverFlowErrors occurred, the crash might have been triggered by an earlier stack overflow problem.

Troubleshoot a Stack Overflow Crash

This section describes some possible solutions to stack overflow errors.

Application Level Changes

Often, a stack overflow error is caused by the application being coded to require more stack space than the JRockit JVM’s limits allow. Examine the stack trace in the .dump file to determine whether the Java code can be changed to use less stack space, as in the sketch below.
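As an illustration of the kind of change this can mean (the class and method names here are invented for the example), a deeply recursive method can often be rewritten iteratively so that its working state lives on the Java heap instead of the thread stack:

// Hypothetical example: counting the nodes of a long linked list.
class Node {
    Node next;
}

public class CountNodes {
    // Recursive version: uses one stack frame per node, so a long
    // list can throw java.lang.StackOverflowError.
    static int countRecursive(Node n) {
        return (n == null) ? 0 : 1 + countRecursive(n.next);
    }

    // Iterative version: constant stack usage regardless of length.
    static int countIterative(Node n) {
        int count = 0;
        for (Node cur = n; cur != null; cur = cur.next) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Build a list long enough to exhaust a small thread stack.
        Node head = null;
        for (int i = 0; i < 1000000; i++) {
            Node n = new Node();
            n.next = head;
            head = n;
        }
        System.out.println(countIterative(head));   // safe
        // countRecursive(head) may throw StackOverflowError here
    }
}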

Increase the Default Stack Size

If changing the stack requirements of the application is not possible, you can change the thread stack size by using the -Xss option at JVM startup; for example:

-Xss:<value>[k|m]
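For instance, to give each thread a 512 KB stack (myApp is a placeholder for your application’s main class):

java -Xss:512k myApp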

Make the JRockit JVM More Robust Against Stack Overflow Errors

-XcheckedStacks makes the JRockit JVM more robust against stack overflow errors: it usually prevents the JVM from crashing and dumping, letting it throw a java.lang.StackOverflowError instead. There is a slight performance penalty when using this option because the JVM touches pages on the stack.
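For example (myApp is again a placeholder):

java -XcheckedStacks myApp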

 


Unsupported Linux Configuration Crash

If your application crashes while running the JRockit JVM on Linux, you should ensure that you are running on a supported Linux configuration, even if the stack trace suggests another reason for the crash, because an unsupported configuration might be contributing to the problem. You should do the following:

Verify that the OS Version is Supported

The JRockit JVM is supported only on generally available products from OS vendors; it does not support custom-built kernels. To verify that your version of Linux is supported, please refer to the section for your version of the JVM in Oracle JRockit JDK Supported Configurations.

Verify that You Have Installed the Correct glibc Binary

Linux on IA32 must be configured to use a glibc compiled for the i686 architecture; otherwise you will see hangs and crashes with the JRockit JVM.

You can check what glibc is installed by running:

rpm -q --queryformat '\n%{NAME} %{VERSION} %{RELEASE} %{ARCH}\n' glibc

If the output mentions “i386”, as in the following example:

glibc 2.3.4 2.25 i386

you are using an unsupported glibc and need to upgrade to a version built for i686. Output from a supported system will say something like:

glibc 2.3.4 2.25 i686

Examine the Thread Library

If you have opened a core file in gdb, you can get a hint about which thread library you are using by running:

info shared

Look at the path of the loaded libpthread<x>.so.
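For example, to open the core file and list the shared libraries (the paths to the java binary and the core file are illustrative):

gdb /opt/jrockit/bin/java core
(gdb) info shared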

If it is in /lib/, run the rpm command shown in the previous section and check the architecture in its output. If the output says “i386”, you are using an unsupported glibc and need to upgrade to a version built for i686.

 


JVM Crash

A JVM crash is caused by a programming error in the JRockit JVM. Identifying and troubleshooting a JVM crash can help you find a temporary workaround until the problem is solved in the JRockit JVM. It may also help Oracle Support to identify and fix the problem faster.

Code Generation Crash

This section describes how to identify and troubleshoot a code generation crash. It contains the following subjects:

- Identify a Code Generation Crash
- Troubleshoot the Code Generation Crash

Identify a Code Generation Crash

The most common cause of a code generation crash is a mis-compiled method. If the JRockit JVM mis-compiles a method, either the JVM crashes or the method does something other than what the source code says. If the JVM crashes while generating code, the text dump file should identify which method was being compiled at the time of the crash; look for a line near the top starting with Method.

Knowing which method was causing the problem is the first step in resolving the problem.

Troubleshoot the Code Generation Crash

If the JRockit JVM mis-compiles a method, the fault is likely to be with the JVM’s optimizing compiler. To determine whether or not optimization itself is responsible, you can disable it by restarting the application with the -XnoOpt command-line option specified; for example:

java -XnoOpt myApp

If the JRockit JVM executes your program as expected, the problem is with code optimization.

Exclude the Offending Method

If disabling optimization stopped the JRockit JVM from crashing, you should next try excluding the offending method from optimization; you might be able to run your application with almost full optimization if you can prevent just that method from being optimized. Alternatively, try using the -XXpreOpt option at startup to use the optimizing compiler for everything (be aware that using the optimizing compiler all the time can slow down JVM startup). If neither approach works, contact Oracle Support.

You can exclude a method from optimization by using an optfile. If your application can run successfully without the offending method being optimized, this workaround should solve your problem.

Creating and Using an Optfile

An optfile is nothing more than a text file that contains directives: single-character codes that tell the optimizer that certain methods should either not be optimized or should be force-optimized. Once you’ve created the file, use the -Djrockit.optfile=<filename> property (where <filename> is the optfile) to indicate the name and location of the optfile.
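For example (the file name myopt.txt is a placeholder):

java -Djrockit.optfile=myopt.txt myApp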

The structure of the file is illustrated in Listing 30-1.

Listing 30-1 Sample optfile
- java/lang/FloatingDecimal.dtoa
- java/lang/Object
- sun/awt/windows/WComponentPeer.set*

In Listing 30-1, the “-” at the beginning of each line tells the optimizing compiler never to optimize that method. Thus, in this example, the “-” directives tell the optimizer never to optimize the following:

- the method dtoa in java/lang/FloatingDecimal
- any method in java/lang/Object
- any method in sun/awt/windows/WComponentPeer whose name starts with set

Note: If you are using a version of the JRockit JVM earlier than R26.4, using “-” will not disable regeneration of the method completely. If a method m is marked with a “-” and the hotspot detector thinks it is a hotspot, it will regenerate that method but not optimize it further.

“-” is the only directive useful in this workaround. The other directives are h (allow the method to be optimized by the hotspot detector, but do not preoptimize it), p (preoptimize the method and do not allow the hotspot detector to optimize it), and + (preoptimize the method and allow the hotspot detector to optimize it); however, they are not useful in this workaround.

Verifying optfile Response

If you want to make sure your optfile does as you expect, use -Xverbose:opt and check the output. You should not see the method you’re excluding.
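For example, combining verbose output with the optfile from above (placeholder names again):

java -Xverbose:opt -Djrockit.optfile=myopt.txt myApp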

Setting an optfile with a Ctrl-Break Handler

You can also use a Ctrl-Break Handler to set the optfile. The handler is called run_optfile and takes a <filename> argument that is a regular optfile; for example:

run_optfile optfile=<filename>

When Ctrl-Break is pressed, any method matching a “-” directive in the optfile will not be optimized.

Rules for Directives

When you create an optfile, the following rules apply:

The Problem Might Lie With an External Instrumentation Tool

If you have eliminated a mis-compiled method as the cause of the crash and you are using an external instrumentation tool (for example, JProbe or OptimizeIt), you should investigate whether this tool is causing the problem. These tools can alter bytecode, which can cause unexpected behavior. In some instances, the problem lies directly with the tool; in others, the JRockit JVM might have issues with the tool that are causing the crash. To eliminate tools as a cause of the crash, disable the tool(s) and rerun the application. If the crash happens again, your problem is not with the instrumentation tool. If the application runs as expected, you should consider using a different tool or running without the tool.

If All Else Fails, Open a Support Case

If the optfile workaround doesn’t alleviate the problem or if you cannot run the application successfully without the problematic method optimized, you will need to open a case with Oracle Support. You can find instructions on how to report a problem to Oracle, including the sort of information to include, in Submitting Problems to Oracle Support.

For code generation crashes, you will need to provide the following data to Oracle Support:

Garbage Collection Crash

This section describes how to identify and troubleshoot crashes in garbage collection. It contains the following information:

- Identify a Garbage Collection Crash
- Consider Upgrading to the Latest Version of the JRockit JVM
- Try One of These Workarounds
- If All Else Fails, Open a Case with Oracle Support

Identify a Garbage Collection Crash

You can identify a garbage collection crash by looking at the stack trace in the text dump file. If garbage collection functions appear in the stack trace, or if the thread that caused the crash is one of the garbage collection threads, the crash is most likely to have occurred during garbage collection. Garbage collection functions in the stack trace are identified by prefixes like mm, gc, yc and oc.

Consider Upgrading to the Latest Version of the JRockit JVM

If you are experiencing garbage collection crashes, the simplest and most strongly recommended solution is to upgrade to the latest available version of the JRockit JVM. Further diagnosis of the problem can be a very complex and time-consuming exercise, and upgrading may make it unnecessary because the problem might already have been fixed in the latest version of the JVM.

Try One of These Workarounds

If you do not (or cannot) upgrade to the latest version of the JVM or if you are already using the latest version, try using any of the following workarounds to prevent garbage collection crashes:

Change the Garbage Collector

It is possible that the garbage collector you are using has bugs that you can avoid by changing to another garbage collector. Be aware, though, that if you change collectors, you will not get the same performance profile from the Oracle JRockit JVM.

If you are using deterministic garbage collection, you cannot change to another garbage collector and retain the deterministic garbage collection guarantees. Instead of changing your garbage collector, you should open a case with Oracle Support.
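For example, to switch to the generational concurrent collector, one of several collectors JRockit provides (check the documentation for your release for the collectors available):

java -Xgc:gencon myApp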

Disable Compaction

Bugs in heap compaction can sometimes lead to crashes during garbage collection. You can disable compaction by setting -XXnoCompaction at startup. Be aware that this option can lead to heap fragmentation and should be used for troubleshooting purposes only; if the heap becomes too fragmented, you might encounter out of memory errors.

Disable Inlining

Erroneous inlining may cause broken code, which makes the garbage collector lose track of live objects. You can disable inlining by using the command-line option combination -XXnoJITInline -XnoOpt. You must use both options because -XXnoJITInline only disables inlining the first time a method is compiled. Unless you set -XnoOpt as well, methods can still be inlined when code is optimized.
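For example (myApp is a placeholder):

java -XXnoJITInline -XnoOpt myApp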

Note: If -XnoOpt (without -XXnoJITInline) resolves the problem, your issue might be with code optimization. Conversely, if -XXnoJITInline without -XnoOpt resolves the problem, you should notify Oracle Support about this.

Use the Optimizing Compiler

You might be experiencing garbage collection crashes because the non-optimizing JIT compiler is generating broken code that makes the garbage collector lose track of live objects. Use the -XXpreOpt command at startup to use the optimizing compiler for everything. Be aware that using the optimizing compiler can slow down the JVM startup.
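For example (myApp is a placeholder):

java -XXpreOpt myApp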

If All Else Fails, Open a Case with Oracle Support

If none of the above workarounds resolve the crash issue, you will need to open a case with Oracle Support. You must include the following information:

For information on communication with Oracle Support, please refer to Submitting Problems to Oracle Support.

