5 Debugging in Coherence

This chapter provides instructions for configuring logging and includes general advice for debugging and troubleshooting Coherence applications.

This chapter includes the following sections:

5.1 Overview of Debugging in Coherence

Coherence applications are typically developed on a single computer. The cache server and application are started within the IDE and the application is debugged as required. This type of development environment is easy to setup, performs well, and is easy to debug. A majority of applications can be created and tested this way. See "Enabling Single-Server Mode" for details on configuring Coherence to run on a single server.

Ideally, most errors can be detected during development using logging, enabling JVM debug options, and capturing thread and heap dumps as required. Moreover, IDEs and profiling tools, such as Oracle's Java VisualVM and JConsole, provide features for diagnosing problems. However, Coherence applications must eventually be tested in a more distributed environment. Debugging and troubleshooting in the testing environment is more difficult since data and processes are fully distributed across the cluster and because the network affects the application. Remote debugging with Java Debug Wire Protocol (JDWP) together with Coherence's JMX management and reporting capabilities facilitate debugging and troubleshooting in a distributed environment.

Using Oracle Support

Oracle support can help debug issues and is available through https://support.oracle.com. When sending support an issue, always include the following items in a compressed file:

  • application code

  • configuration files

  • log files for all cluster members

  • Thread and heap dumps are required under certain circumstances. Thread dumps should be sent if the application is running slow and/or appears to be hung. Heap dumps should be sent if the application runs out of memory or is consuming more memory than expected.

5.2 Configuring Logging

Coherence has its own logging framework and also supports the use of log4j, slf4j, and Java logging to provide a common logging environment for an application. Logging in Coherence occurs on a dedicated and low-priority thread to reduce the impact of logging on the critical portions of the system. Logging is pre-configured and the default settings should be changed as required.

This section includes the following topics:

5.2.1 Changing the Log Level

The logger's log level determines which log messages are emitted. The default log level emits error, warning, informational, and some debug messages. During development, the log level should be raised to its maximum setting to ensure all debug messages are logged. The following log levels are available:

  • 0 – This level includes messages that are not associated with a logging level.

  • 1 – This level includes the previous level's messages plus error messages.

  • 2 – This level includes the previous levels' messages plus warning messages.

  • 3 – This level includes the previous levels' messages plus informational messages.

  • 4-9 – These levels include the previous levels' messages plus internal debugging messages. More log messages are emitted as the log level is increased. The default log level is 5.

  • -1 – No log messages are emitted.

To change the log level, edit the operational override file and add a <severity-level> element, within the <logging-config> element, that includes the level number. For example:

...
<logging-config>
   ...
   <severity-level system-property="tangosol.coherence.log.level">9
   </severity-level>
   ...
</logging-config>
...

The tangosol.coherence.log.level system property is used to specify the log level instead of using the operational override file. For example:

-Dtangosol.coherence.log.level=9

5.2.2 Changing the Log Destination

The logger can be configured to emit log messages to several destinations. For standard output to the console, both stdout and stderr (the default) can be used. The logger can also emit messages to a specified file.

Coherence also supports the use of JDK, log4j, and SLF4J to allow an application and Coherence to share a common logging framework. See "Using JDK Logging for Coherence Logs", "Using Log4J Logging for Coherence Logs", and "Using SLF4J for Coherence Logs" for detailed instructions.

To change the log destination, edit the operational override file and add a <destination> element, within the <logging-config> element, that includes the destination. For example:

...
<logging-config>
   <destination system-property="tangosol.coherence.log">stdout</destination>
   ...
</logging-config>
...

The tangosol.coherence.log system property is used to specify the log destination instead of using the operational override file. For example:

-Dtangosol.coherence.log=stdout

5.2.2.1 Sending Log Messages to a File

The logger can be configured to emit log messages to a file by providing a path and file name in the <destination> element. The specified path must already exist. Make sure the specified directory can be accessed and has write permissions. Output is appended to the file and there is no size limit. Process cannot share a log file and the log file is replaced when a process is restarted. Sending log messages to a file is typically used during development and testing and is useful if the log messages need to be sent to Oracle support.

The following example demonstrates specifying a log file named coherence.log that is written to the /tmp directory:

...
<logging-config>
   <destination system-property="tangosol.coherence.log">/tmp/coherence.log
   </destination>
   ...
</logging-config>
...

5.2.3 Changing the Log Message Format

The default format of log messages can be changed depending on the amount of detail that is required. A log message can include static text as well as any of the following parameters that are replaced at run time.

Note:

Changing the log message format must be done with caution as critical information (such as member or thread) can be lost which makes issues harder to debug.
Parameter Description
{date} This parameter shows the date/time (to a millisecond) when the message was logged.
{uptime} This parameter shows the amount of time that the cluster members has been operational.
{product} This parameter shows the product name and license type.
{version} This parameter shows Coherence version and build details.
{level} This parameter shows the logging severity level of the message.
{thread} This parameter shows the thread name that logged the message.
{member} This parameter shows the cluster member id (if the cluster is currently running).
{location} This parameter shows the fully cluster member identification: cluster-name, site-name, rack-name, machine-name, process-name and member-name (if the cluster is currently running).
{role} This parameter shows the specified role of the cluster member.
{text} This parameter shows the text of the message.
{ecid} This parameter shows the Execution Context ID (ECID). The ECID is a globally unique ID that is attached to requests between Oracle components. The ECID is an Oracle-specific diagnostic feature and is used to correlate log messages across log files from Oracle components and products and is also used to track log messages pertaining to the same request within a single component when multiple requests are processed in parallel. Coherence clients that want to include the ECID in their logs must have an activated Dynamic Monitoring Service (DMS) execution context when invoking Coherence.

Note: If JDK logging is used with an Oracle Diagnostic Logging (ODL) handler, then the {ecid} parameter does not apply because the ECID automatically becomes part of the ODL record form.


To change the log message format, edit the operational override file and add a <message-format> element, within the <logging-config> element, that includes the format. For example:

...
<logging-config>
   ...
   <message-format>[{date}] &lt;{level}&gt; (thread={thread})  -->{text}
   </message-format>
   ...
</logging-config>
...

5.2.4 Setting the Logging Character Limit

The logging character limit specifies the maximum number of characters that the logger daemon processes from the message queue before discarding all remaining messages in the queue. The messages that are discarded are summarized by the logging system with a single log entry that details the number of messages that were discarded and their total size. For example:

Asynchronous logging character limit exceeded; discarding 5 log messages (lines=14, chars=968)

The truncation is only temporary; when the queue is processed (emptied), the logger is reset so that subsequent messages are logged.

Note:

The message that caused the total number of characters to exceed the maximum is never truncated.

The character limit is used to avoid situations where logging prevents recovery from a failing condition. For example, logging can increase already tight timings, which causes additional failures, which produces more logging. This cycle may continue until recovery is not possible. A limit on logging prevents the cycle from occurring.

To set the log character limit, edit the operational override file and add a <character-limit> element, within the <logging-config> element. The character limit is entered as 0 (Integer.MAX_VALUE) or a positive integer. For example:

...
<logging-config>
   ...
   <character-limit system-property="tangosol.coherence.log.limit">12288
   </character-limit>
</logging-config>
...

The tangosol.coherence.log.limit system property is used to specify the log character limit instead of using the operational override file. For example:

-Dtangosol.coherence.log.limit=12288

5.2.5 Using JDK Logging for Coherence Logs

Applications that use the JDK logging framework can configure Coherence to use JDK logging as well. Detailed information about JDK logging is beyond the scope of this documentation. For details on JDK logging, see http://download.oracle.com/javase/7/docs/technotes/guides/logging/overview.html.

To use JDK logging for Coherence logs:

  1. Create a logging.properties file. The following example configures the JDK logger to emit messages to both the console and to a file. Output to the console and file is configured to use the FINEST log level. For the file handler pattern, the specified path must already exist. Also, ensure that the specified directory can be accessed and has write permissions.

    handlers=java.util.logging.FileHandler, java.util.logging.ConsoleHandler
    
    .level=INFO
    Coherence.level=FINEST
    
    java.util.logging.FileHandler.pattern=/tmp/coherence%u.log
    java.util.logging.FileHandler.limit=50000
    java.util.logging.FileHandler.count=1
    java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter
    
    java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter
    

    Note:

    • In the above example, Coherence is used as the logger object name and is the default name that is used by the Coherence logging framework. A different name can be used by specifying the name within the <logger-name> element in the operational override file or by specifying the name as the value of the tangosol.coherence.log.logger system property.

    • Set the JDK logging level to FINEST and allow the Coherence logging settings to determine which log messages to construct for JDK logging.

  2. Configure Coherence to use JDK logging by specifying jdk as the value of the <destination> element in an operational override file. For example:

    ...
    <logging-config>
       <destination system-property="tangosol.coherence.log">jdk</destination>
       ...
    </logging-config>
    ...
    
  3. Make sure the logging.properties file is specified using the java.util.logging.config.file system property. For example:

    -Djava.util.logging.config.file=myfile
    

Mapping JDK Log Levels with Coherence Log Levels

The following table provides a mapping of how JDK log levels are mapped to Coherence log levels.

JDK Log Level Coherence Log Level
OFF NONE
FINEST INTERNAL
SEVERE ERROR
WARNING WARNING
INFO INFO
FINE LEVEL_D4
FINER LEVEL_D5
FINEST LEVEL_D6
FINEST LEVEL_D7
FINEST LEVEL_D8
FINEST LEVEL_D9
ALL ALL

5.2.6 Using Log4J Logging for Coherence Logs

Applications that use the log4j logging framework can configure Coherence to use log4j logging as well. Detailed information about log4j logging is beyond the scope of this documentation. For details on log4j logging, see http://logging.apache.org/log4j/1.2/manual.html.

To use log4j logging for Coherence logs:

  1. Create a log4j.properties file. The following example configures the log4j logger to emit messages to both the console and to a file. Output to the console and file are configured to use the FATAL log level. For the file appender, make sure the specified directory can be accessed and has write permissions.

    log4j.logger.Coherence=FATAL, stdout, file
    
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=%m%n
    
    log4j.appender.file=org.apache.log4j.RollingFileAppender
    log4j.appender.file.File=/tmp/coherence.log
    log4j.appender.file.MaxFileSize=10MB
    log4j.appender.file.layout=org.apache.log4j.PatternLayout
    log4j.appender.file.layout.ConversionPattern=%m%n
    

    Note:

    • In the above example, Coherence is used as the logger object name and is the default name that is used by the Coherence logging framework. A different name can be used by specifying the name within the <logger-name> element in the operational override file or by specifying the name as the value of the tangosol.coherence.log.logger system property.

    • Set the log4j logging level to FATAL and allow the Coherence logging settings to determine which log messages to construct for log4j logging.

  2. Configure Coherence to use log4j logging by specifying log4j as the value of the <destination> element in an operational override file. For example:

    ...
    <logging-config>
       <destination system-property="tangosol.coherence.log">log4j</destination>
       ...
    </logging-config>
    ...
    
  3. Make sure both the log4j.jar file and the log4j.properties file are found on the classpath at run time.

Mapping Log4J Log Levels with Coherence Log Levels

The following table provides a mapping of how log4j log levels are mapped to Coherence log levels.

Log4J Log Level Coherence Log Level
OFF NONE
DEBUG INTERNAL
ERROR ERROR
WARN WARNING
INFO INFO
DEBUG LEVEL_D4
DEBUG LEVEL_D5
DEBUG LEVEL_D6
DEBUG LEVEL_D7
DEBUG LEVEL_D8
DEBUG LEVEL_D9
ALL ALL

5.2.7 Using SLF4J for Coherence Logs

Applications that use SLF4J logging can configure Coherence to use SLF4J logging as well. Detailed information about SLF4J logging is beyond the scope of this documentation. For details on SLF4J logging, see http://www.slf4j.org/.

To use SLF4J logging:

  1. specify slf4j as the value of the <destination> element in an operational override file. For example:

    ...
    <logging-config>
       <destination system-property="tangosol.coherence.log">slf4j</destination>
       ...
    </logging-config>
    ...
    
  2. Make sure both the slf4j-api.jar file and the appropriate logging framework SLF4J binding JAR is located on the classpath.

5.3 Performing Remote Debugging

Java Debug Wire Protocol (JDWP) provides the ability to debug a JVM remotely. Most IDE tools support JDWP and are used to connect to a remote JVM that has remote debugging enabled. See your IDE's documentation for instructions on how to connect to a remote JVM.

To enable remote debugging on a cache server, start the cache server with the following JVM options. Once the cache server has been started, use the IDE's debugger to connect to the JVM using the port specified (5005 in the example).

-Xdebug
-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005

Remote debugging a Coherence application can be difficult when the application is no longer on a single node cluster because data is distributed across the members of the cluster. For example, when performing parallel grid operations, the operations are performed on the cluster members where the data is located. Since there are no guarantees on which members the data is located, it is best to constrain a test to use a singe cache server.

In addition, the guardian and packet timeout can make cluster debugging difficult. If the debugger pauses the packet publishing, cluster, and service threads, it will cause disruptions across the cluster. In such scenarios, disable the guardian and increase the packet timeout during the debugging session. See "service-guardian" for details on configuring these settings.

5.4 Troubleshooting Coherence-Based Applications

The topics in this section provide general troubleshooting advice. Troubleshooting Coherence-based applications is, for the most part, no different than troubleshooting other Java application. Most IDEs provide features that facilitate the process. In addition, many tools, such as: Java VisualVM, JConsole, and third party tools provide easy ways to monitor and troubleshoot Java applications. See the Troubleshooting Java SE section on OTN for detailed information on troubleshooting Java:

http://www.oracle.com/technetwork/java/javase/index-138283.html

Troubleshooting a Coherence application on a single server cluster is typically straight forward. Most Coherence development work is done in such an environment because it facilitates debugging. Troubleshooting an application that is deployed on a distributed cluster can become more challenging.

This section includes the following topics:

5.4.1 Using Coherence Logs

Log messages provide information that is used to monitor and troubleshoot Coherence. Most log messages are explained in the Log Glossary within the Administering Oracle Coherence. The glossary provides additional details as well as specific actions that can be taken when a message is encountered.

Configuring logging beyond the default out-of-box configuration is very important when developing and debugging an application. Specifically, use the highest log level (level 9 or ALL when using JDK or log4j logging) to ensure that all log messages are emitted. Also, consider using either JDK or log4j logging. Both of these frameworks support the use of rolling files and console output simultaneously. Lastly, consider placing all log files in a common directory. A common directory makes it easier to review the log files and package them for the Coherence support team. See "Configuring Logging" for detailed information on configuring all aspects of logging.

5.4.2 Using JMX Management and Coherence Reports

Coherence management is implemented using Java Management Extensions (JMX). Many MBeans are provided that detail the health and stability of Coherence. The MBeans provide valuable insight and should always be used when moving an application from a development environment to a fully distributed environment. MBeans are accessible using JConsole and VisualVM or any management tool that supports JMX. In addition, Coherence includes reports that gather information from the MBeans over time and provide a historical context that is not possible simply by monitoring the MBeans. The reports are most often used to identify trends that are valuable for troubleshooting. Management and reporting are not enabled by default and must be enabled. See Managing Oracle Coherence for detailed instructions on using the management features included with Coherence.

5.4.3 Using JVM Options to Help Debug

Most JVMs include options that facilitate debugging and troubleshooting. These options should be used to get as much information as possible. Consult your JVM vendor's documentation for their available options. The JVM options discussed in this section are Java HotSpot specific. See the Java HotSpot VM Options Web page for detailed information and usage instructions for all JVM options:

>http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html">>http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html

>The following JVM options (standard and non standard) can help when debugging and troubleshooting applications:

  • >
  • >

    >>-verbose:gc or >-Xloggc:>file – These options are used to enable additional logs for each garbage collection event. In a distributed system, a GC pause on a single JVM can affect the performance of many JVMs, so it is essential to monitor garbage collection very closely. The >-Xloggc option is similar to verbose GC but includes timestamps.

  • >

    >>-Xprof and >-Xrunhprof – These options are used to view JVM profile data and are not intended for production systems.

  • >

    >>-XX:-PrintGC, >-XX:-PrintGCDetails, and >-XX:-PrintGCTimeStamps – These options are also used print messages at garbage collection.

  • >

    >>-XX:-HeapDumpOnOutOfMemoryError and >-XX:HeapDumpPath=./java_pid><pid>>.hprof – These options are used to initiate a heap dump when a >java.lang.OutOfMemoryError is thrown.

  • >

    >>-XX:ErrorFile=./hs_err_pid><pid>>.log – This option saves error data to a file.

5.4.4 Capturing Thread Dumps

Thread dumps are used to see detailed thread information, such as thread state, for each thread in the JVM. A thread dump also includes information on each deadlocked thread (if applicable). Thread dumps are useful because of Coherence's multi-threaded and distributed architecture. Thread dumps are often used to troubleshoot an application that is operating slowly or is deadlocked. Make sure to always collect several dumps over a period of time since a thread dump is only snapshot in time. Always include a set of thread dumps when submitting a support issue.

Coherence provides a native logClusterState JMX operation that is located on the ClusterMBean MBean and a native logNodeState JMX operation that is located on the ClusterNodeMBean MBean. These operations initiate a thread dump (including outstanding polls) on multiple cluster members or on a single cluster member, respectively. For a detailed reference of Coherence MBeans, see See Managing Oracle Coherence.

To perform a thread dump locally on Unix or Linux operating systems, press Ctrl+\ at the application console. To perform a thread dump on Windows, press Ctrl+Break (or Pause). Both methods include a heap summary with the thread dump.

Most IDEs provide a thread dump feature that can be used to capture thread dumps while working in the IDE. In addition, Unix and Linux operating systems can use the kill -3 pid to cause a remote thread dump in the IDE. On Windows, use a third party tool (such as SendSignal) to send a ctrl+break signal to a remote Java process and cause a dump in the IDE.

Profiling tools, such as Oracle's Java VirtualVM (jvisualvm) and JConsole (jconsole) are able to perform thread dumps. These tools are very useful because they provide a single tool for troubleshooting and debugging and display many different types of information in addition to just thread details.

Lastly, the jstack tool can be used to capture a thread dump for any process. For example, use jps to find a Java process ID and then execute the following from the command line:

jstack <pid>

The jstack tool is unsupported and may or may not be available in future versions of the JDK.

5.4.5 Capturing Heap Dumps

Heap dumps are used to see detailed information for all the objects in a JVM heap. The information includes how many instance of an object are loaded and how much memory is allocated to the objects. Heap information is typically used to find parts of an application that may potentially be wasting resources and causing poor performance. In a fully distributed Coherence environment, heap dumps can be tricky because application processing is occurring across the cluster and problematic objects may not necessarily be local to a JVM. Make sure to always collect several dumps over a period of time since a heap dump is only a snapshot in time. Always include heap dumps when submitting a support issue.

The easiest way to capture a heap dump is to use a profiling tool. Oracle's Java VirtualVM (jvisualvm) and JConsole (jconsole) provide heap dump features. In addition, most IDEs provide a heap dump feature that can be used to capture heap dumps while working in the IDE.

As an alternative, the jmap tool can be used to capture heap dumps, and the jhat tool can be used to view heap dumps. For example, use jps to find a Java process ID and then execute the following from the command line:

jmap -dump:format=b,file=/coherence.bin pid

To view the heap dump in a browser, execute the following from the command line and then browse to the returned address. The file can also be loaded into VisualVM for viewing.

jhat /coherence.bin

The jmap and jhat tools are unsupported and may or may not be available in future versions of the JDK.

5.4.6 Monitoring the Operating System

Always monitor a cluster member's operating system when troubleshooting and debugging Coherence-based applications. Poorly tuned operating systems can affect the overall performance of the cluster and may have adverse effects on an application. See Administering Oracle Coherence for details on performance tuning.

In particular, the following areas are important to monitor:

  • CPU – Is the processor running at 100% for extended periods of time?

  • Memory/Swapping – Is the available RAM memory being exhausted and causing swap space to be used?

  • Network – Is buffer size, the datagram size, and the Maximum Transmission Unit (MTU) size affecting performance and success rates?

To monitor the overall health of the operating system, use tools such as vmstat and top for Unix/Linux; for Windows, use perfmon and tools available from Windows Sysinternals (for example procexp and procmon). See Administering Oracle Coherence for detailed instructions on how to test network performance.