19 Using the Diagnostic Frameworks to Diagnose Problems

This chapter describes how to identify Service Bus problems and take the proper corrective actions with the assistance of the WebLogic Diagnostic Framework (WLDF) and the Oracle Fusion Middleware Diagnostic Framework (DFW).

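This chapter includes the following sections:

  • Understanding Diagnostics for Oracle Service Bus

  • Working with Oracle Service Bus Diagnostic Dumps

  • Generating Diagnostic Dumps Using RDA

  • Viewing Incident Packages with ADR Tools

  • Querying Problems and Incidents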

19.1 Understanding Diagnostics for Oracle Service Bus

Service Bus leverages the Oracle Fusion Middleware Diagnostic Framework along with WebLogic Diagnostic Framework (WLDF) to help you detect, diagnose, and resolve problems.

WLDF lets you monitor diagnostic scenarios by watching specific logs and metrics for specified conditions and sending a notification when a condition is met. The Diagnostic Framework lets you gather diagnostic data specific to Service Bus into dumps that are formatted for viewing and analysis.

WebLogic and SOA Suite both provide several predefined diagnostic dumps to help you with diagnostics. In addition, Service Bus supports the following diagnostic dumps:

  • Derived Resource Caches

  • JMS Request/Response Correlation Table

  • MQ Request/Response Correlation Table

For information about the diagnostic frameworks, watches, and notifications, see "Diagnosing Problems" in Administering Oracle Fusion Middleware. For information about using the diagnostic frameworks with SOA Suite (including generated dumps, setting up watches and notifications, and predefined diagnostic dumps), see "Diagnosing Problems with SOA Composite Applications" in Administering Oracle SOA Suite and Oracle Business Process Management Suite.

19.1.1 Oracle WebLogic Diagnostic Framework

WLDF is a monitoring and diagnostics framework included with Oracle WebLogic Server that defines and implements a set of services that run within WebLogic Server processes and that participate in the standard server life cycle. Using WLDF, you can capture the diagnostic data generated by a running server, and set watches and notifications when certain conditions are met. Defining watches and notifications helps you collect the diagnostic data to identify problems, enabling you to isolate and diagnose faults when they occur.

For more information about WLDF, see Configuring and Using the Diagnostics Framework for Oracle WebLogic Server.

19.1.1.1 Watches and Notifications

A watch monitors server and application states and sends notifications based on criteria that you define. Watches and notifications are configured as part of a diagnostic module targeted to one or more server instances in a domain. When you create a watch in the Oracle WebLogic Server Administration Console, you build rule expressions for monitoring from the attributes of Service Bus and Oracle WebLogic Server MBeans. For example, you could set up a watch that notifies you when the percentage of free heap memory falls below 25%. You can also configure watches and notifications using Service Bus message IDs.

For information about creating watches and notifications, see Configuring the Diagnostic Framework in Administering Oracle Fusion Middleware.

19.1.1.2 Diagnostic Scenarios and MBeans

The Diagnostic Framework provides MBeans you can use to configure how data is collected. The watch rule expressions that you create use the attributes of Service Bus and Oracle WebLogic Server MBeans to collect data and perform monitoring. You diagnose a scenario by using the available MBeans to provide statistics about that scenario or to log messages. The attributes of the following MBeans are available:

  • Oracle WebLogic Server MBeans

  • Diagnostic Service Bus MBeans

  • DMS metrics exposed as MBeans

Service Bus provides several MBeans so you can monitor the following with watches and notifications:

  • Configuration Framework

  • Proxy and Business Services

  • Pipelines and Split-Joins

  • Sessions

For more information about Oracle WebLogic Server MBeans, see MBean Reference for Oracle WebLogic Server.

19.1.2 Oracle Fusion Middleware Diagnostic Framework

The Diagnostic Framework aids in detecting, diagnosing, and resolving problems by targeting critical errors, such as those caused by code bugs, metadata corruption, customer data corruption, deadlocked threads, and inconsistent state. The Diagnostic Framework detects critical failures and captures dumps of relevant diagnostic information. WLDF watches and notifications trigger the events for which the Diagnostic Framework listens; the Diagnostic Framework then generates the appropriate data dumps.

For information about how the Diagnostic Framework processes events, see How the Diagnostic Framework Works in Administering Oracle Fusion Middleware.

19.1.2.1 Diagnostic Dumps

A diagnostic dump captures and dumps specific diagnostic information automatically when an incident is created or manually on the request of an administrator. When executed as part of incident creation, the dump is included with the set of incident diagnostics data. Examples of diagnostic dumps include JVM thread dumps, JVM class histogram dumps, and DMS metric dumps.

The Diagnostic Framework provides several predefined dumps. For more information, see Investigating, Reporting, and Solving a Problem in Administering Oracle Fusion Middleware. In addition to the dumps provided by the Diagnostic Framework, Service Bus includes dumps to provide diagnostics specific to Service Bus. For more information, see Working with Oracle Service Bus Diagnostic Dumps.

19.1.3 About the Automatic Diagnostic Repository

The Automatic Diagnostic Repository (ADR) is a file-based hierarchical repository for diagnostic data, such as traces and dumps. Oracle Fusion Middleware components store all incident data in the ADR, and each Oracle WebLogic Server stores diagnostic data in subdirectories of its own home directory within the ADR. For more information about the ADR, see Automatic Diagnostic Repository in Administering Oracle Fusion Middleware.

19.1.4 Predefined Incident Processing Rules

When you create a watch in the Oracle WebLogic Server Administration Console, you also define a notification. Oracle Fusion Middleware defines a default notification named FMWDFW notification. While you can create your own notifications, selecting FMWDFW notification creates the Service Bus dumps described in Working with Oracle Service Bus Diagnostic Dumps.

For information about creating custom notifications, see Configuring Custom Diagnostic Rules in Administering Oracle Fusion Middleware.

19.1.5 Dynamic Monitoring Service Metrics

Using the Oracle Dynamic Monitoring Service (DMS), Oracle Fusion Middleware components can provide administration tools, such as Fusion Middleware Control, with data regarding the component's performance, state, and on-going behavior. DMS measures and reports metrics, trace events, and system performance and provides a context correlation service for these components.

DMS metrics with noun types are exposed as Service Bus MBeans that you can use to diagnose problems. You can use DMS nouns to create watches in the Oracle WebLogic Server Administration Console. Service Bus uses DMS to capture the response time for a Service Bus proxy service.

Service Bus defines one phase event sensor, response, whose parent noun is the service path. Table 19-1 shows the supported Service Bus DMS nouns. It also includes the parent nouns to illustrate the noun hierarchy.

Table 19-1 Service Bus Sensors

| Noun Path | Noun | Sensor | Type | Parent Noun |
|---|---|---|---|---|
| /domain_name/server_name/project_name | Context | NA | osb_context | None |
| PROXY or BIZ | Service Type | NA | osb_service_type | Context |
| Full path to the service, including folders and service name (replacing the slash or backslash with a hyphen). | Service Path | response | osb_service_path | Service Type |

Given the following Service Bus environment, the examples provided below illustrate Context and Service Path names.

Environment

  • Domain name: servicebus

  • Server name: osb_server1

  • Service Bus project name: TravelPoints

  • Proxy services folder name (in the TravelPoints project): TravelProxyServices

  • Proxy service name: CalculatePoints

Examples

  • Context: /servicebus/osb_server1/TravelPoints

  • Service Path: TravelProxyServices-CalculatePoints

    DMS allows each noun to be referenced using a path delimited by '/'. The delimiter (/) in the path is used to identify the parent nouns. For example, the Service Path noun in the above example can be directly referenced by the following:

    /servicebus/osb_server1/TravelPoints/PROXY/TravelProxyServices-CalculatePoints
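
The naming rules above can be sketched in a few lines of standard Python. This is a minimal illustration only: the function names are hypothetical, and it assumes the service path is given relative to the project, as in the example environment.

```python
def service_path_noun(service_path):
    """Flatten a project-relative service path into a Service Path noun name."""
    # Drop leading/trailing separators, then turn each slash or backslash into a hyphen
    return service_path.strip("/\\").replace("\\", "/").replace("/", "-")


def dms_noun_path(domain, server, project, service_type, service_path):
    """Join the noun hierarchy Context -> Service Type -> Service Path with '/'."""
    if service_type not in ("PROXY", "BIZ"):
        raise ValueError("service_type must be PROXY or BIZ")
    context = "/%s/%s/%s" % (domain, server, project)
    return "%s/%s/%s" % (context, service_type, service_path_noun(service_path))


# Values taken from the example environment in this section
print(dms_noun_path("servicebus", "osb_server1", "TravelPoints",
                    "PROXY", "TravelProxyServices/CalculatePoints"))
```

With the example environment, the script prints the same full noun path shown above.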
    

The response sensor captures the following information:

| Metric | Description |
|---|---|
| time | The total response time across all activations. |
| completed | The number of completed activations. |
| minTime | The shortest completed activation. |
| maxTime | The longest completed activation. |
| avg | The average time to complete an activation. |
| active | The number of current incomplete activations. |
| maxActive | The maximum number of concurrent open activations. |
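
Because time and completed are totals across all activations, the average over a recent interval can be estimated from two readings of the sensor. The following standard Python sketch assumes both values only ever increase between readings and that time uses the same unit in both; the function name and the sample readings are hypothetical.

```python
def interval_average(earlier, later):
    """Average response time for activations completed between two readings.

    Each reading is a dict with the 'time' and 'completed' values
    taken from the response sensor at one point in time.
    """
    completed = later["completed"] - earlier["completed"]
    if completed <= 0:
        # Nothing completed in the interval, or the counters were reset
        return None
    return (later["time"] - earlier["time"]) / float(completed)


# Hypothetical readings, not taken from a real server
first = {"time": 1200, "completed": 40}
second = {"time": 2100, "completed": 55}
print(interval_average(first, second))  # 900 time units over 15 activations -> 60.0
```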

For additional information about DMS, see Using the Oracle Dynamic Monitoring Service in Tuning Performance.

19.2 Working with Oracle Service Bus Diagnostic Dumps

In addition to the diagnostic dumps available with Oracle WebLogic Server and Oracle SOA Suite, Service Bus supports the diagnostic dumps listed in Table 19-2.

Table 19-2 Service Bus Diagnostic Dumps

| Dump | Description |
|---|---|
| OSB.derived-caches | A collection of statistics about all Service Bus derived resource caches on the server |
| OSB.jms-async-table | Service Bus JMS request/response correlation table |
| OSB.mq-async-table | Service Bus MQ request/response correlation table |

19.2.1 Listing the Available Diagnostic Dumps

This section describes how to use WebLogic Scripting Tool commands to work with diagnostic dumps. For more information about these commands, see Diagnostic Commands in WLST Command Reference for WebLogic Server. For more information about Diagnostic Framework dumps, see Diagnosing Problems in Administering Oracle Fusion Middleware.

To list the available diagnostic dumps:

  1. Navigate to MW_HOME/oracle_common/common/bin, and run the following command to start WLST:
    ./wlst.sh
    

    Note:

    You must start WLST from MW_HOME/oracle_common/common/bin. Otherwise, the ODF functions are missing.

  2. To connect to the server on which Service Bus is installed, run the following command:
    connect('user_name', 'password','t3://hostname:port_number')
    

    A message appears indicating whether the connection succeeded.

  3. To list the available Diagnostic Framework dumps, run the following command:
    listDumps()
    

    A list of available dumps appears on the console.

    Use the command describeDump(name=dumpName) for help with a specific dump.

  4. To list the available dumps for Service Bus, run the following command:
    listDumps(appName='OSB')
    

    A list of Service Bus dumps appears on the console.
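
The commands in the preceding steps can also be collected into a script file and passed to wlst.sh instead of being typed interactively. The sketch below is standard Python that only generates the script text. The function name is hypothetical, the connection values are the placeholders from step 2, and the closing disconnect() call is an addition that is not part of the steps above.

```python
def build_wlst_script(user, password, url, app_name="OSB", dump_names=()):
    """Return WLST script text that lists, then describes, diagnostic dumps."""
    lines = [
        "connect(%r, %r, %r)" % (user, password, url),   # step 2
        "listDumps()",                                   # step 3
        "listDumps(appName=%r)" % app_name,              # step 4
    ]
    for name in dump_names:
        # Help for one specific dump
        lines.append("describeDump(name=%r, appName=%r)" % (name, app_name))
    lines.append("disconnect()")  # addition: close the session when done
    return "\n".join(lines) + "\n"


# Placeholder credentials and URL; replace them before use
print(build_wlst_script("user_name", "password", "t3://hostname:port_number",
                        dump_names=["OSB.derived-caches"]))
```

Save the generated text to a file and pass that file to wlst.sh, started from MW_HOME/oracle_common/common/bin as described in the note in step 1. Avoid storing a real password in the file.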

19.2.2 Derived Resource Caches Diagnostic Dumps (OSB.derived-caches)

The following table describes the Service Bus derived resource caches diagnostic dumps. The information captured includes the name of each cache type, statistical information for each cache, and information about each cached entry.

Table 19-3 Derived Resource Caches Diagnostic Dumps

Dump Name: OSB.derived-caches

Dump Parameters/Dump Mode: None

Information Captured:

For each derived resource cache managed in the Service Bus runtime, the following information is provided:

  • Derived resource cache type

  • Product version

  • Total number of configured cache entries

  • Cache entries in use

  • Total hits to entries in the cache since the server was last started

  • Total misses while trying to access cached information since the server was last started

  • Hit ratio of the cache since the server was last started

For each cache entry, the following information is provided:

  • Ref that is being cached

  • Create date and time

  • Amount of time spent computing the cache entry. This is the time taken to create the cached information, in milliseconds.

19.2.2.1 Oracle Service Bus Derived Resource Caches

The following table lists each Service Bus cache included in the diagnostic information.

Table 19-4 Oracle Service Bus Derived Resource Caches

| Cache | Description |
|---|---|
| Archive ClassLoader | Dependency-aware archive class loaders. |
| Archive Summary | Archive summaries. |
| CodecFactory | Codec factories. |
| EffectiveWSDL | Effective WSDL objects that are derived from the service or WSDL resources of business or proxy services. |
| Flow_Info | Message flow information objects. |
| LightweightEffectiveWSDL | Effective WSDL objects that are derived from the service or WSDL resources of business or proxy services. |
| MflExecutor | MFL executors. |
| RouterRuntime | Compiled router run times for proxy services. |
| RuntimeEffectiveWSDL | Session-valid effective WSDL objects derived from the service or WSDL resources of business or proxy services. |
| RuntimeEffectiveWSPolicy | WS policies for business or proxy services. |
| SchemaTypeSystem | Type system information for MFL, XS, and WSDL documents. |
| ServiceAlertsStatisticInfo | Service alert statistics for business or proxy services. |
| ServiceInfo | Compiled service information for business or proxy services and for WSDL documents. |
| Wsdl_Info | WSDL information objects. |
| WsPolicyMetadata | Compiled WS-Policy metadata. |
| XMLSchema_Info | XML schema information for XML schema objects. |
| XqueryExecutors | XQuery executors. |
| XsltExecutor | XSLT executors. |
| alsb.transports.ejb.bindingtype | EJB binding information for EJB business services. |
| alsb.transports.jejb.business.bindingtype | JEJB binding information for JEJB business services. |
| alsb.transports.jejb.proxy.bindingtype | JEJB binding information for JEJB proxy services. |

19.2.2.2 Viewing a Description of the Derived Resource Caches Dump

To view a description of the derived resource caches dump:

  • Run the following WLST command:

    describeDump(name='OSB.derived-caches',appName='OSB')
    

    The name, description, and arguments for the dump appear on the console.

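19.2.2.3 Running the Derived Resource Caches Dump

To run the derived resource caches dump:

  • Run the WLST executeDump command with the same name and appName arguments that describeDump takes for this dump (name='OSB.derived-caches', appName='OSB'). For the optional arguments, see Diagnostic Commands in WLST Command Reference for WebLogic Server.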

19.2.2.4 Sample Output of the Derived Resource Cache Dump

Information similar to the following example appears after you run the derived resource caches dump, as described in the preceding section. Parts of this dump have been truncated for readability.

<derivedCaches xmlns="http://www.bea.com/wli/config/xmltypes">
  <derivedCache cacheType="RuntimeEffectiveWSDL">
    <configuredEntries>2147483647</configuredEntries>
    <cacheEntriesInUse>0</cacheEntriesInUse>
    <totalHits>0</totalHits>
    <totalMisses>0</totalMisses>
    <hitRatio>0.0</hitRatio>
    <cacheEntries/>
  </derivedCache>
 ...
 <derivedCache cacheType="ServiceAlertsStatisticInfo">
    <configuredEntries>2147483647</configuredEntries>
    <cacheEntriesInUse>9</cacheEntriesInUse>
    <totalHits>0</totalHits>
    <totalMisses>51</totalMisses>
    <hitRatio>0.0</hitRatio>
    <cacheEntries>
        <cacheEntry>
            <ref>services/bs_dq_uri4.BusinessService</ref>
            <creationTime>2012-03-22T23:44:53.737-07:00</creationTime>
            <computeTimeMSecs>0</computeTimeMSecs>
        </cacheEntry>
        <cacheEntry>
            <ref>services/bs_dq_nopooling.BusinessService</ref>
            <creationTime>2012-03-22T23:44:53.736-07:00</creationTime>
            <computeTimeMSecs>0</computeTimeMSecs>
        </cacheEntry>
        <cacheEntry>
            <ref>services/bs_dq_uri1.BusinessService</ref>
            <creationTime>2012-03-22T23:44:53.738-07:00</creationTime>
            <computeTimeMSecs>0</computeTimeMSecs>
        </cacheEntry>
        <cacheEntry>
            <ref>services/proxy_dq_uri.ProxyService</ref>
            <creationTime>2012-03-22T23:44:53.736-07:00</creationTime>
            <computeTimeMSecs>0</computeTimeMSecs>
        </cacheEntry>
        <cacheEntry>
            <ref>services/bs_dq_conn_pooling.BusinessService</ref>
            <creationTime>2012-03-22T23:44:53.736-07:00</creationTime>
            <computeTimeMSecs>0</computeTimeMSecs>
        </cacheEntry>
        <cacheEntry>
            <ref>services/bs_dq_conn_nopooling.BusinessService</ref>
            <creationTime>2012-03-22T23:44:53.737-07:00</creationTime>
            <computeTimeMSecs>0</computeTimeMSecs>
        </cacheEntry>
        <cacheEntry>
            <ref>services/bs_dq_uri2.BusinessService</ref>
            <creationTime>2012-03-22T23:44:53.737-07:00</creationTime>
            <computeTimeMSecs>0</computeTimeMSecs>
        </cacheEntry>
        <cacheEntry>
            <ref>services/bs_dq_pooling.BusinessService</ref>
            <creationTime>2012-03-22T23:44:53.736-07:00</creationTime>
            <computeTimeMSecs>0</computeTimeMSecs>
        </cacheEntry>
        <cacheEntry>
            <ref>services/bs_dq_uri3.BusinessService</ref>
            <creationTime>2012-03-22T23:44:53.737-07:00</creationTime>
            <computeTimeMSecs>0</computeTimeMSecs>
        </cacheEntry>
    </cacheEntries>
 </derivedCache>
 ...
</derivedCaches>
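
Dump output in this form can be summarized with a short script. The following standard Python sketch assumes the dump has been saved as well-formed XML with the namespace and element names shown in the sample. The function name is hypothetical, and the embedded sample is a trimmed, adapted copy of the output above with a single cache entry.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the sample output
NS = {"c": "http://www.bea.com/wli/config/xmltypes"}

SAMPLE = """<derivedCaches xmlns="http://www.bea.com/wli/config/xmltypes">
  <derivedCache cacheType="ServiceAlertsStatisticInfo">
    <configuredEntries>2147483647</configuredEntries>
    <cacheEntriesInUse>1</cacheEntriesInUse>
    <totalHits>0</totalHits>
    <totalMisses>51</totalMisses>
    <hitRatio>0.0</hitRatio>
    <cacheEntries>
      <cacheEntry>
        <ref>services/proxy_dq_uri.ProxyService</ref>
        <creationTime>2012-03-22T23:44:53.736-07:00</creationTime>
        <computeTimeMSecs>0</computeTimeMSecs>
      </cacheEntry>
    </cacheEntries>
  </derivedCache>
</derivedCaches>"""


def summarize_caches(xml_text):
    """Return one dict per derivedCache element, using the values as reported."""
    rows = []
    for cache in ET.fromstring(xml_text).findall("c:derivedCache", NS):
        rows.append({
            "cache_type": cache.get("cacheType"),
            "in_use": int(cache.findtext("c:cacheEntriesInUse", namespaces=NS)),
            "hits": int(cache.findtext("c:totalHits", namespaces=NS)),
            "misses": int(cache.findtext("c:totalMisses", namespaces=NS)),
            "hit_ratio": float(cache.findtext("c:hitRatio", namespaces=NS)),
            # Refs of the entries currently held in this cache
            "refs": [entry.findtext("c:ref", namespaces=NS)
                     for entry in cache.findall("c:cacheEntries/c:cacheEntry", NS)],
        })
    return rows


for row in summarize_caches(SAMPLE):
    print(row["cache_type"], row["in_use"], row["hits"], row["misses"], row["hit_ratio"])
```

The script reads the hit ratio from the dump rather than recomputing it, because this chapter does not state how Service Bus derives that value.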

19.2.3 Running a JMS Correlation Table Diagnostic Dump (OSB.jms-async-table)

Table 19-5 provides details about Service Bus JMS request/response correlation table diagnostic dumps. The information captured includes the correlation ID, expiration, and destination for each message.

Table 19-5 JMS Correlation Table Diagnostic Dumps

Dump Name: OSB.jms-async-table

Dump Parameters/Dump Mode: None

Information Captured:

In addition to the Service Bus version, the following information is provided for each pending message in each service reference:

  • Correlation ID (could be the actual correlation ID or a message ID)

  • Expiration date and time

  • Message destination

19.2.3.1 Viewing a Description of the JMS Correlation Table Dump

To view a description of the JMS correlation table dump:

  • Run the following WLST command:

    describeDump(name='OSB.jms-async-table',appName='OSB')
    

    The name, description, and arguments for the dump appear on the console.

19.2.3.2 Running the JMS Correlation Table Dump

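To run the JMS correlation table dump:

  • Run the WLST executeDump command with the same name and appName arguments that describeDump takes for this dump (name='OSB.jms-async-table', appName='OSB'). For the optional arguments, see Diagnostic Commands in WLST Command Reference for WebLogic Server.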

19.2.3.3 Sample Output of the JMS Correlation Table Dump

The following example is a sample of the output of a JMS correlation table dump.

<transportDiagnosticsContents xmlns="http://www.bea.com/wli/sb/transportdiags">
 <version>11.1.1.7</version>
 <transportDiagnostics transportType="jms">
   <correlationTable>
     <services>
       <service>
         <ref>default/testJmsResponseRollback_out</ref>
         <message>
           <correlationMsgId responsePattern="JMSCorrelationID">
             ID:42454153155cc06b7f5ab312000001363d5bd59effff8d4
           </correlationMsgId>
           <expirationTime>2012-03-22T19:53:43.621-07:00</expirationTime>
           <msgDestination>testJmsResponseRollback_outRequest</msgDestination>
         </message>
       </service>
     </services>
   </correlationTable>
 </transportDiagnostics>
</transportDiagnosticsContents>
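
The pending messages in a correlation table dump can be listed with a short script. The following standard Python sketch assumes the dump is well-formed XML with the namespace and element names shown in the sample; the function name is hypothetical. The MQ correlation table dump uses the same element names, so the same function should apply to it under that assumption.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the sample output
NS = {"t": "http://www.bea.com/wli/sb/transportdiags"}

SAMPLE = """<transportDiagnosticsContents xmlns="http://www.bea.com/wli/sb/transportdiags">
 <version>11.1.1.7</version>
 <transportDiagnostics transportType="jms">
   <correlationTable>
     <services>
       <service>
         <ref>default/testJmsResponseRollback_out</ref>
         <message>
           <correlationMsgId responsePattern="JMSCorrelationID">
             ID:42454153155cc06b7f5ab312000001363d5bd59effff8d4
           </correlationMsgId>
           <expirationTime>2012-03-22T19:53:43.621-07:00</expirationTime>
           <msgDestination>testJmsResponseRollback_outRequest</msgDestination>
         </message>
       </service>
     </services>
   </correlationTable>
 </transportDiagnostics>
</transportDiagnosticsContents>"""


def pending_messages(xml_text):
    """Return one dict per pending message found in the correlation table."""
    rows = []
    root = ET.fromstring(xml_text)
    for diag in root.findall("t:transportDiagnostics", NS):
        for service in diag.findall("t:correlationTable/t:services/t:service", NS):
            for msg in service.findall("t:message", NS):
                corr = msg.find("t:correlationMsgId", NS)
                rows.append({
                    "transport": diag.get("transportType"),
                    "service": service.findtext("t:ref", namespaces=NS),
                    "pattern": corr.get("responsePattern"),
                    # The ID sits on its own line in the dump, so trim whitespace
                    "correlation_id": corr.text.strip(),
                    "expires": msg.findtext("t:expirationTime", namespaces=NS),
                    "destination": msg.findtext("t:msgDestination", namespaces=NS),
                })
    return rows


for row in pending_messages(SAMPLE):
    print(row["service"], row["correlation_id"], row["expires"], row["destination"])
```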

19.2.4 Running an MQ Correlation Table Diagnostic Dump (OSB.mq-async-table)

Table 19-6 provides details about Service Bus MQ request/response correlation table diagnostic dumps. The information captured includes the correlation ID, expiration, and destination for each message.

Table 19-6 MQ Correlation Table Diagnostic Dumps

Dump Name: OSB.mq-async-table

Dump Parameters/Dump Mode: None

Information Captured:

In addition to the Service Bus version, the following information is provided for each pending message in each service reference:

  • Correlation ID (could be the actual correlation ID or a message ID)

  • Expiration date and time

  • Message destination

19.2.4.1 Viewing a Description of the MQ Correlation Table Dump

To view a description of the MQ correlation table dump:

  • Run the following WLST command:

    describeDump(name='OSB.mq-async-table',appName='OSB')
    

    The name, description, and arguments for the dump appear on the console.

19.2.4.2 Running the MQ Correlation Table Dump

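To run the MQ correlation table dump:

  • Run the WLST executeDump command with the same name and appName arguments that describeDump takes for this dump (name='OSB.mq-async-table', appName='OSB'). For the optional arguments, see Diagnostic Commands in WLST Command Reference for WebLogic Server.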

19.2.4.3 Sample Output of the MQ Correlation Table Dump

The following example is a sample of the output of an MQ correlation table dump.

Example - Sample Output of the MQ Correlation Table Dump

<transportDiagnosticsContents xmlns="http://www.bea.com/wli/sb/transportdiags">
 <version>11.1.1.7</version>
 <transportDiagnostics transportType="mq">
   <correlationTable>
     <services>
       <service>
         <ref>services/mq_Biz_cached</ref>
         <message>
           <correlationMsgId responsePattern="MQCorrelationID">
             000000000000000000000000000000000000000000000000
           </correlationMsgId>
           <expirationTime>2012-03-22T23:48:09.085-07:00</expirationTime>
           <msgDestination>rc_req</msgDestination>
         </message>
       </service>
     </services>
   </correlationTable>
 </transportDiagnostics>
</transportDiagnosticsContents>

19.3 Generating Diagnostic Dumps Using RDA

In addition to generating Service Bus diagnostic dumps using WLST, you can also use Oracle Remote Diagnostic Agent (RDA). Before performing the following steps, make sure RDA is installed on your system.

For more information and full instructions on using RDA for Service Bus, refer to the knowledge base article "How to Run Remote Diagnostic Agent (RDA) Against SOA Products" (document ID 1571554.2) on Oracle Support. The article describes an additional command to run RDA with minimal prompts. Additional information and instructions are also provided in the README files in the oracle_common directory in your Fusion Middleware home directory.

To generate a diagnostic dump using RDA:

  1. Set the environment variables by running the following command:
    <DOMAIN_HOME>/bin/setDomainEnv
    
  2. From a command line, run the following command:

    For Windows:

    rda.cmd -vSCRP OSB
    

    For UNIX or Linux:

    rda.sh -vSCRP OSB
    
  3. Enter information as prompted on the command line. When asked whether you want RDA to collect Service Bus information, accept the default (Y).
  4. You can display the results in your web browser. Access the file from the output directory you specified.

    The name of the file is prefix__start.htm, where prefix is the prefix you specified.

19.4 Viewing Incident Packages with ADR Tools

ADRCI is a command-line utility that enables you to investigate problems and package and upload first-failure diagnostic data to Oracle Support Services.

ADRCI also enables you to view the names of dump files in the ADR, and to view the alert log with XML tags stripped, with and without content filtering.

For more information about ADRCI, see ADRCI: ADR Command Interpreter in Oracle Database Utilities.

19.5 Querying Problems and Incidents

The Diagnostic Framework provides WLST commands that let you view information about problems and incidents.

This includes the following:

  • Querying problems across Oracle WebLogic Servers

  • Querying incidents across Oracle WebLogic Servers

  • Viewing dump files associated with an incident on an Oracle WebLogic Server

For more information about these WLST commands, see Understanding the Diagnostic Framework in Administering Oracle Fusion Middleware and Diagnostic Commands in WLST Command Reference for WebLogic Server.