Skip Headers
Oracle® Fusion Applications Performance and Tuning Guide
11g Release 1 (11.1.2)

Part Number E16686-01
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Feedback page
Contact Us

Go to previous page
Previous
Go to next page
Next
PDF · Mobi · ePub

2 Troubleshooting Process

This chapter discusses the basic process to use to troubleshoot Oracle Fusion Applications, and presents specific steps for the most commonly-seen problems.

This chapter includes these sections:

2.1 Introduction to Troubleshooting

This section provides guidelines and a process for using the information in this chapter. Using the following guidelines and process will focus and minimize the time you spend resolving problems.

Guidelines

When using the information in this chapter, Oracle recommends:

Process

Follow the process outlined in Table 2-1 when using the information in this guide. If the information in a particular section does not resolve your problem, proceed to the next step in this process.

Table 2-1 Process for Resolving Performance Issues

Step Section to Use Purpose

1

Chapter 1, "Monitoring and Tuning Oracle Fusion Applications"

Oracle Fusion Applications Administrator's Guide:

Chapter 9 and Chapter 10 for locating key metrics

Chapter 11 for diagnosing Java applications in the middle tier

Chapter 12 for monitoring and tuning the Oracle database

Collect symptoms about the performance problem to determine if the problem is related the following:

  • Response time or throughput

  • Widespread or limited to specific users and flows

Determine what changed since the system was last performing well.

2

Section 2.1.1 through Section 2.1.6

Use Section 2.1.1 if the problem is widespread. Otherwise, review the problem description in Section 2.1.2 through Section 2.1.6 to if there is a match

These sections describe:

  • Possible causes of the problems

  • Solution procedures corresponding to each of the possible causes

4

Section 2.3

Use My Oracle Support to get additional troubleshooting information about Oracle Fusion Applications or performance, scalability, and reliability. My Oracle Support provides access to several useful troubleshooting resources, including Knowledge Base articles and Community Forums and Discussions.

5

Section 2.3

Log a service request if the information in this chapter and My Oracle Support does not resolve your problem. You can log a service request using My Oracle Support at https://support.oracle.com.


In addition to this process, for more information about determining if database cache sizes need to be increased, see "Automatic Database Performance Monitoring" chapter in the Oracle Database 2 Day + Performance Tuning Guide to use Automatic Database Diagnostic Monitor (ADDM) reports

2.1.1 How to Troubleshoot Overall System Slowness

To troubleshoot overall system slowness:

  • Navigate to the Fusion Applications target in Enterprise Manager.

  • Select the application that is having the problem, such as Payables.

  • Navigate to the product home page, such as the Payables product home page.

  • Select Monitoring > System Performance to open the System Performance page.

  • Click the name of the target metric name for Request Processing Time. A popup displays.

  • Click Problem Analysis to see the related metrics in the problem analysis page.

    You should be able to see the Source Metric and Related Metrics in the Problem Analysis page.

  • Check if any key metrics are beyond threshold, or look for sudden changes.

    In particular, check the heap usage metrics.

    If the heap is constantly close to 100 percent, search using the string OutOfMemoryErrors in the Oracle WebLogic Server server_name.out in the following directories:

    (UNIX) DOMAIN_HOME/servers/server_name/logs
    (Windows) DOMAIN_HOME\servers\server_name\logs
    

    If there are OutOfMemoryErrors, a heap dump would have been generated in the directory specified by the -XX:HeapDumpPath parameter from the Oracle WebLogic Server startup JVM option. Submit the heap dump to Oracle Support for further analysis of what is retaining memory. In many cases a bounce is needed to resolve the issue.

  • Scroll down and you will be able to see the Related Targets at the bottom of the page.

  • Select the Related Targets Topology to view the Topology of the target on which the metric is being analyzed.

  • Review incidents and logs by clicking the View Related Log Messages link at the top right side of the page. The target context, the chart time duration and related targets are chosen to do log search.

  • Follow the appropriate triage process depending on which metric is in question.

2.1.2 How to Find the Slowest JSPX/JSFF Pages

An administrator would want this information to understand which part of the application has poor performance.

To find the slowest JSPX/JSFF pages:

  • Navigate to Middleware Targets in Oracle Enterprise Manager Fusion Applications Control.

  • Select Application Dependency and Performance from the drop-down menu.

  • Expand the ADF node in the tree on the left hand side of the screen.

  • Click JSF pages and sort by response time.

  • Look for the slowest page name, and find that page name in the tree on the left. Click to drill down.

  • If applicable, it will show the managed beans for that page.

2.1.3 How to Troubleshoot Slow UI Request

A sluggish UI response directly affects each user's experience. Administrators will want to use the suggested troubleshooting methods in this section to track down and fix slow UI response.

2.1.3.1 Troubleshooting Historical Requests

These steps are applicable to requests that have completed.

  • Find Execution Context Identifiers (ECIDs) submitted by user and find the slowest requests using server logs. Compare response time as recorded in Oracle HTML Server (OHS) and WebLogic Server access logs. If there is a big gap, check OHS health. Otherwise, look at request details in WebLogic Server.

    Find the nodes that serviced the request by searching for the ECID using Request Monitor in Enterprise Manager.

    Drill down into the JVM Diagnostics screen.

    The data displayed should be in the context of that ECID. Observe the thread state transitions and check the call stacks to find the top methods and top SQL statements.

  • Check if slowness is caused by customized code.

  • If slowness is caused by lock contention, check the thread that is holding the lock and what it is doing.

  • If the request is slow enough to cause a STUCK thread, an incident should have been generated. Search the log file for the incident number The incident directory would have a WebLogic Server diagnostic image which includes a JRockit Flight Recorder (JFR) recording from running WebLogic Server.

  • If there are not enough details in the JVM Diagnostic data, follow instructions in Section 2.2, "View Detailed Timing Of a Request Using a JRockit Flight Recorder (JFR) File" to extract and view JFR recordings.

2.1.3.2 Troubleshooting Live Requests

To troubleshoot slow UI requests that are still running, extract and view a JFR recording following instructions in Section 2.2, "View Detailed Timing Of a Request Using a JRockit Flight Recorder (JFR) File" against the server on which the user session is running.

2.1.3.3 Troubleshooting StuckThreads

Problem

Stuck threads may result if the server is nearing out of memory. If the server is close to out of memory, all requests should slow down. To resolve an out-of-memory issue, see Section 2.2, "View Detailed Timing Of a Request Using a JRockit Flight Recorder (JFR) File."

If a request is taking longer than 10 minutes, the stuck thread is reported to Oracle WebLogic Server server_name.out in the following directories:

(UNIX) DOMAIN_HOME/servers/server_name/logs
(Windows) DOMAIN_HOME\servers\server_name\logs

For example:

<Mar 4, 2011 7:44:08 AM PST> <Error> <WebLogicServer> <BEA-000337> <[STUCK] ExecuteThread: '19' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "600" seconds working on the request "weblogic.servlet.internal.ServletRequestImpl@18986012[
GET /productManagement/faces/PimDashboardUiShellPage?_afrLoop=1398820150000&_afrWindowMode=0&_adf.ctrl-state=a44e7uxcc_13 HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/x-ms-application, application/x-ms-xbap, application/vnd.ms-xpsdocument, application/xaml+xml, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
Accept-Language: fr
UA-CPU: x86
...
]", which is more than the configured time (StuckThreadMaxTime) of "600" seconds
. Stack trace:
Thread-164 "[STUCK] ExecuteThread: '19' for queue: 'weblogic.kernel.Default (self-tuning)'" <alive, in native, suspended, priority=1, DAEMON> {
    jrockit.net.SocketNativeIO.readBytesPinned(SocketNativeIO.java:???)
    jrockit.net.SocketNativeIO.socketRead(SocketNativeIO.java:24)
    java.net.SocketInputStream.socketRead0(SocketInputStream.java:???)
    java.net.SocketInputStream.read(SocketInputStream.java:107)
...

In this example, the request has been running longer than the configured 600 seconds. Here is the associated stack trace showing the thread is stuck:

Thread-164 "[STUCK] ExecuteThread: '19' for queue: 'weblogic.kernel.Default (self-tuning)'" <alive, in native, suspended, priority=1, DAEMON> {
jrockit.net.SocketNativeIO.readBytesPinned(SocketNativeIO.java:???)
jrockit.net.SocketNativeIO.socketRead(SocketNativeIO.java:24)
java.net.SocketInputStream.socketRead0(SocketInputStream.java:???)
java.net.SocketInputStream.read(SocketInputStream.java:107)
...

Solution

If the stack shows the thread is waiting for a response from another server, check the status of the other server and see it has performance problems before proceeding with this solution.

To determine what the stuck thread was doing prior to becoming stuck, perform the following steps:

  1. Look at the next few log messages in server_name.outfor a message indicating an incident has been created. For example:

    <Mar 4, 2011 7:44:10 AM PST> <Alert> <Diagnostics> <BEA-320016> 
    
    <Creating diagnostic image in DOMAIN_HOME/servers
    
    /ProductManagementServer_1/adr/diag/ofm/SCMDomain/
    
    ProductManagementServer_1/incident/incdir_394 with a lockout minute 
    
    period of 1.>
    

    The above message may not always appear after each stuck thread reported. It is printed at most four times an hour. If the message does not appear, manually look for the incident directory by checking the readme file in the subdirectories under the following directories:

    (UNIX) DOMAIN_HOME/servers/server_name/adr/diag/ofm/domain_name/server_name/incident
    (Windows) DOMAIN_HOME\servers\server_name\adr\diag\ofm\domain_name\server_name\incident
    

    The incident directory contains a WLDF diagnostic image which contains the JFR recording, and a file containing the thread dump

    For more information about diagnosing incidents, see the "Diagnosing Problems" chapter in the Oracle Application Server Administrator's Guide.

  2. Review thread dump to see call stack of the thread. If thread is blocked waiting for lock, check what the thread holding the lock is doing.

  3. If call stack involves executing JDBC calls, you can go to Grid Control and check the top activity around that time window, and see if there is a session with a matching module and action. See "Finding the Top SQL Queries" in the Oracle Fusion Applications Administrator's Guide.

  4. Review the JRockit flight recording file JRockitFlightRecorder.jfr for more details. You will also need the ECID of the request which is recorded in the readme.txt file of the incident directory, and also the Oracle WebLogic Server log.

  5. Perform the tasks in Section 2.2, "View Detailed Timing Of a Request Using a JRockit Flight Recorder (JFR) File."

Since the ECID of the request that caused the stuck thread is recorded in the error message, you also can follow the steps for troubleshooting slow requests that already have completed as documented in Section 2.1.3.1, "Troubleshooting Historical Requests."

2.1.3.4 Troubleshooting Slow Requests Using JFR Recording

See Section 2.2, "View Detailed Timing Of a Request Using a JRockit Flight Recorder (JFR) File."

2.1.3.5 Troubleshooting Memory Leaks and Heap Usage Pressure

Problem

Application performance degrades over time, heap usage and garbage collection activity increases overtime, sometimes OutOfMemoryErrors are seen.There could be memory leaks in the application, which causes the amount of free memory in the JVM to continuously decrease.

Solution

To solve this problem, perform the following:

  1. Review the server_name.out file for OutOfMemoryErrors errors, which indicate a heap dump file has been written. The server_name.out file is located in the following directories:

    (UNIX) DOMAIN_HOME/servers/server_name/logs
    (Windows) DOMAIN_HOME\servers\server_name\logs
    
  2. Restart the Managed Server.

    See the following documentation resources to learn more about other methods for starting and stopping the Managed Servers:

    If the problem persists, proceed to Step 3.

  3. Open the file with a heap-dump analysis tool that can handle binary HPROF format, such as Eclipse Memory Analyzer.

  4. Review what objects and classes are retaining most memory. Send the heap dump file to Oracle Support for further analysis.

  5. Sometimes it may be necessary to take several heap dumps to see what objects or classes are consuming and increasing the amount of memory.

    To take heap dumps on demand, use the jrcmd command-line tool. See the "Running Diagnostic Commands" chapter in the Oracle JRockit JDK Tools Guide. Many heap dump analysis tools, such as Eclipse Memory Analyzer, enable you to compare two heap dumps to identify memory growth areas.

    Heap dumps provide information on why memory is retained. Sometimes it is necessary to know how memory is allocated to further resolve the issue. For these cases, proceed to Step 6.

  6. Use the JRockit Memory Leak Detector tool that is part of JRockit Mission Control Client to understand how memory is allocated.

    For more information, see the JRockit Mission Control online help.

2.1.4 How to Troubleshoot Slow Batch Jobs

Batch jobs are often critical parts of key business processes. If throughput or execution time is beyond expectation, these steps may help diagnose the root cause.

2.1.4.1 Troubleshooting Jobs That Are in Wait/Ready/Blocked State for a Long Time

  • Launch Oracle Enterprise Manager Cloud Control and look up the WebLogic Server domain target, or launch Fusion Applications Control for the domain.

  • Expand Scheduling Service and select the one to which the job is submitted.

  • Select Job Request > Search Job Request.

  • Enter the appropriate search criteria.

  • Expand the display to the request details. The top of the page will have an information block explaining why the job is not yet run.

  • If many jobs are in a wait state, check Performance > Historical Reports > View: Requests Metrics By Work Assignment.

  • Inspect Wait Time versus Processing Time.

    • If Wait Time is high, check several jobs and see why they are in a wait state.

    • If there is spare CPU capacity on the servers where the jobs are running, consider adding more threads to the work assignment.

2.1.4.2 Troubleshooting Jobs in Running State for a Long Time

See the Oracle Fusion Applications Administrator's Guide.

  • Sometimes a job could be in a running state even if it had completed. See the ESS Troubleshooting Guide to first determine if job is still running.

  • Otherwise, open the request details page and click View Log Message to get the ECID. Go to the Java Diagnostics for the cluster to which the ESS job sends the request, such as for service calls, search for the ECID and see what, if any, Java Diagnostics data is recorded.

2.1.4.3 Troubleshooting Slow BI Publisher Jobs

See "Troubleshooting Oracle Enterprise Scheduler" in the Oracle Fusion Applications Administrator's Guide.

  • Open the home page of the Oracle Fusion instance / product family / product in Cloud Control. The Top Long Running Job Requests and Recently Completed Job Requests region shows the health of ESS jobs. Since BI Publisher jobs are scheduled as ESS jobs, the health of those jobs will be shown in this region.

  • Click the request ID link to show the Request Detail page.

  • Select Actions > Request Log in the right hand side drop down and look for the log entries specific to this job request.

  • Select Actions > JVM Diagnostics in the right hand side drop down and view the health of this job request from the JVM perspective.

    • The JVM page shows the data pertaining to the ECID corresponding to this job request.

    • Inspect thread, cpu, memory and database diagnostic data for this specific job.

2.1.4.4 Troubleshooting Slow SOA Jobs

  • Open the home page of the Oracle Fusion product family in Cloud Control. The Recently Completed SOA Instances region shows the health of recently-completed SOA jobs in the product family context.

  • Click Composite Name to show the detailed information about the composite in the home page.

  • Click Dehydration Diagnostics to view the overall database activity associated with the dehydration store and any abnormal bottlenecks. To view database diagnostics, click individual SQL IDs.

  • Click Faults and Rejected Message to view more details about faulted SOA instances.

  • Select SOA Composites > Logs > View Log Messages to view the related log entries.

  • Select SOA Composites > Trace Instance to trace a particular SOA instance. To display all SOA instances, click Search without specifying a filter.

  • Click the instance id to view the complete trace details for this instance.

  • Click JVM Diagnostics to view the health of this SOA instance from a JVM perspective.

    • The JVM page shows the data pertaining to the ECID corresponding to this job request.

    • Inspect thread, cpu, memory and database diagnostic data for this specific job.

2.1.4.5 Troubleshooting Slow SQL Jobs

  • From Cloud Control, look up the domain target, or go to Fusion Applications Control.

  • Expand Scheduling Service and select the appropriate scheduling service group.

  • On the right side, from the drop-down menu, select Job Requests > Search Job Requests and search for the job of interest.

  • On the job details page, click the eye-glasses icon next to the Execution Type field. This will display a pop-up with the database session id that was used to process this job.

  • From Enterprise Manager, follow these steps to get an ASH report for the session within the relevant time window to inspect top SQL statements and top wait events.

    • On the Performance page in the Average Active Sessions section, click Run ASH Report.

    • On the resulting Run ASH Report page, enter the date and time for the start and end of the time period for the report. (Presumably when some transient performance problem occurred.)

    • Click Generate Report and wait for the resulting ASH report that will appear under Report Results on the Run ASH Report page.

    • Click Save to File if you want to save the report in HTML format for future analysis.

2.1.4.6 Troubleshooting Slow Java Jobs

  • Open the request details page and click View Log Message to get the ECID.

  • Open the Java Diagnostics for the cluster to which the ESS job sends the request, such as for service calls, or the ESS server itself if most of the logic executes there.

  • Search for the ECID to view any Java Diagnostics data that is recorded.

2.1.5 How to Troubleshoot a Slow BPEL Instance

  • From Fusion Applications Control, select the SOA instance under SOA.

  • Select Instances and search for the instance in question.

  • Click the Instance link to display the flow trace to obtain the ECID.

  • Click the BPEL component to see the audit trail that records the completion time for each step.

  • Look for big gaps in the timing.

    If a gap is due to a service call, use the request monitor and search for the ECID, check JVMD or JRockit Flight Recorder (JFR) to determine why it took so long.

2.1.5.1 Troubleshooting High Connection Usage

Problem

The connection usage on the Oracle Database is high, or there is an Oracle process on the database host consuming high amount of CPU.

Solution

To find out the source of the connection causing the high CPU on To adjust the reference pool size from Fusion Applications Control:

  1. Oracle Fusion Applications set values on a number of v$session attributes to indicate how the connection is being used. When looking at a connection consuming high CPU on the database, or when trying to understand what connections are used for what processes, inspect the value of these attributes as follows:

    Attribute in v$session Value Being Set
    Process Data Source Name (for example, ApplicationDB)
    Program Oracle WebLogic Server Domain plus the Managed Server name, prefixed by DS (for example, DS/FinancialDomain/AccountsReceivableServer_1)
    Module Oracle Application Development Framework: ADF BC application module name

    Oracle Enterprise Scheduler:

    • Java job type: Class name, except oracle.apps

    • PLSQL: the package and procedure name (for example, mypkg.myproc)

    • Other jobs: Static: Executable name

    Oracle BI Publisher: Name of the report

    Action Oracle Application Development Framework: jspx name

    Oracle Enterprise Scheduler: Job definition name

    Oracle BI Publisher, if request is submitted:

    • Oracle Enterprise Scheduler: Oracle Enterprise Scheduler job definition name

    • Oracle BI Publisher Scheduler Job: Oracle BI Publisher job name submitted by the user

    • Oracle BI Publisher online: Static string BIP:Online

    • Oracle BI Publisher Web services: Name of the web services

    Client_Identifier Application User Name

  2. If the error messages related to connection pool capacity being reached are also seen in Oracle WebLogic Server logs, use the solution for connection leaks described in Section 2.1.5.2.

2.1.5.2 Troubleshooting Connection Leaks

Problem

When there are errors in the log, and the error message indicates connection pool size has been reached

Solution

To resolve this problem:

  1. When data source is at maximum capacity and there are errors during connection reservation requests, then there may be connection leaks in the code

  2. Enable JDBC profiling from the Oracle WebLogic Server Administration Console:

    1. In the Domain Structure, expand Services and then Data Sources.

    2. Click on the data source that needs to profiled, for example, ApplicationDB.

    3. In the Settings page, click on the Configuration tab, then click on Diagnostics subtab.

    4. Check the profiles that need to be collected (PROFILE_TYPE_CONN_USAGE_STR).

    5. Click Save.

  3. Configure the diagnostic archive where the profiling data is saved from the Oracle WebLogic Server Administration Console:

    1. In the Domain Structure, expand Services, Diagnostics, and then Archives.

    2. Click on the server where you want to make changes (archives are stored for each server)

    3. In the Settings page, you can change archive location, size and how to retire data.

    4. Check the profiles that need to be collected (PROFILE_TYPE_CONN_USAGE_STR).

    5. Click Save.

  4. To retrieve profiling data, use the sample code (http://download.oracle.com/docs/cd/E15051_01/wls/docs103/wldf_configuring/access_diag_data.html#wp1100898), with changes to the URL, username and password in the initialize method.

  5. Run the sample code as a standalone program.

  6. The program will capture the stack trace for each request for a connection from that data source. Inspect the callers to see the suspicious stack. This sample program requires connecting to a live Oracle WebLogic Server instance.

    The diagnostic archive file under the archive location can also be provided to Oracle Support for further analysis.

    Oracle WebLogic Server will not report a leak unless inactive connection timeout connection pool setting is set to a positive value. This cannot be done for Oracle Fusion Applications, as it will break functionality.

2.1.5.3 Troubleshooting Slow Requests Using SQL Trace

When a user reports that a specific operation is slow, and the slowness is reproducible and that slow database operations are suspected but the top activity reports did not provide sufficient information for resolving the problem.

To resolve this problem:

  1. Enable SQL trace for the user session. See "Using SQL Tracing" in the Oracle Fusion Applications Administrator's Guide.

  2. Ask user to re-run the problematic flow and collect the SQL trace files and review

2.1.5.4 Troubleshooting Slow Oracle Enterprise Scheduler Jobs of SQL Type

When the user submits a SQL job type, the job remains in a RUNNING state for too long.

Solution

To resolve this problem, perform the following steps:

  1. Use Fusion Applications Control to find the database session ID that was used to process the job:

    1. Search for the request, as described in "Searching for Oracle Enterprise Scheduler Job Requests" in the Oracle Fusion Applications Administrator's Guide.

    2. On the Request Details page, in the Request Properties section, next to the Execution Type field, click the eye glasses icon.

      The Spawned Process Details dialog displays. This will bring up a pop-up with the database session id that was used to process this job

    3. Take note of the value in the Session Id field, and then click OK.

  2. Use Grid Control to ensure the request processor and request dispatcher are running:

    1. Run an Active Session History (ASH) report for the session within the relevant time window to inspect top SQL statements and top wait events. See the "Resolving Transient Performance Problems" section in the Oracle Database 2 Day + Performance Tuning Guide.

    2. Identify time consuming SQL statements and tune following normal SQL tuning procedures. See "Finding the Top SQL Queries" in the Oracle Fusion Applications Administrator's Guide.

2.1.5.5 Troubleshooting Excessive Activation

Problem

When response time suddenly increases with rising user count, even though there is no memory pressure, it is possible that the reference pool size for key application modules needs to be increased. If there is a JFR recording to review, and you observe many events containing callstacks containing the activateState method, you should also try adjusting the reference pool size.

Solution

To adjust the reference pool size from Fusion Applications Control:

  1. Review the number of web sessions from Performance Summary pages:

    For Fusion Applications Control:

    1. From the navigation pane, expand the farm, Application Deployments.

    2. From the Applications Deployments page, select the application.

    3. From the Application Deployments menu, choose ADF > ADF Performance.

      The ADF Performance page displays.

    4. Click the Application Module Pools tab.

    5. Sort the request by descending order.

    6. For the top 10 or so application modules, click the application module name to view the Activations count.

    For Grid Control:

    1. Click the Targets tab.

    2. Click the Middleware secondary tab.

    3. From the Search list, select Oracle WebLogic Server Domain, and then click Go.

    4. Click on a domain.

      The WebLogic Server Domain home page displays.

    5. In the table on the right-hand side of the page, expand the Application Deployments node.

    6. Click the target application.

      The Application Deployment page displays.

    7. From the Application Deployments menu, choose ADF > ADF Performance.

      The ADF Performance page displays.

    8. In the Application Module Pools table, from the View list, select Total Requests, and once selected, from the Total Requests column, click Sort Descending.

    9. For the top 10 or so application modules, click to see the details of each one.

    10. After selecting an application module, on the Requests graph, from the Select metric to display in chart list, select Passivation and Activation to add to the graph.

      If activation count is close to passivation and is constantly above 0, then following Step 2 to adjust

  2. If the activation count constantly increases, increase the application module reference pool size from Fusion Applications Control:

    1. From the Application Deployments menu, choose ADF > Configure ADF Business Components.

      The ADF Configuration BC Configurations page displays.

    2. From the Application Modules section, click the application module of interest. From the left hand side, select the local by selecting a name that ends in Local.

    3. Click the Pooling and Scalability tab, and change the Reference Pool Size parameter.

2.1.6 How to Troubleshoot Oracle Business Intelligence

The usual indication that you should troubleshoot Oracle BI will be sluggish performance of BI components embedded across various applications. BI components can be in the form of such things as charts, tables, dashboards, and queries. Many of the configuration issues can be detected from the nqquery.log and nqserver.log log files.

A configuration problem generally will produce an error message. You would examine the same two log files for either sluggish response or an error.

Oracle BI Query Logging

To debug a query issue, you need to enable query logging. Query logging can be enabled by setting the LOGLEVEL variable to a value from 0 through 7. 0 denotes no logging, and 7 will generate detailed logging.

It is sufficient to use loglevel 2 to obtain logical BI server queries and the corresponding physical database queries in the log file.

Figure 2-1 shows how to use the Variable Manager to set the log level to 2.

Figure 2-1 Setting the Log Level for Oracle BI

Setting the Log Level for Oracle BI

Once you have enabled logs, you can obtain log files from $ORACLE_INSTANCE/diagnostics/logs/OracleBIServerComponent/coreapplication_obis1/

Relevant files are:

  • nqquery.log - Will contain logical SQL, physical SQL and an execution plan chosen by OBIS for the logical SQL.

  • nqserver.log - Will contain server-related data, such as initialization block-related errors.

Note that a single logical SQL can spawn multiple physical SQL statements.

If you run large queries and logs are being rotated quickly, you can modify parameters in logconfig.xml to control the frequency for log rotation. logconfig.xml is located at $ORACLE_INSTANCE/config/OracleBIServerComponent/coreapplication_obis1/logconfig.xml where ORACLE_INSTANCE is a path similar to /u47/st99/instance/BIInstance.

Once you have logging enabled, you can search for response time and identify where time is being spent by the query. In this example of a useful search string, logical query execution took 1 second, and physical query execution also took 1 second.

Physical query response time 1 (seconds)
Logical Query Summary Stats: Elapsed time 1, Response time 1, Compilation time 0 (seconds)

BI Connection Pool Settings

If you anticipate a higher load on a system, you can change the number of Maximum connections for various data sources to make resource use more efficient, as shown in Figure 2-2. This change must be propagated if the location of the Business Intelligence metadata is replaced.

Figure 2-2 Changing the Maximum Number of Connections

Changing the Maximum Number of Connections

2.2 View Detailed Timing Of a Request Using a JRockit Flight Recorder (JFR) File

Problem

Certain requests are slow and there is a need to find out where time is spent

Solution

The JRockit Flight Recorder (JFR) file contains a record of various events that consume time, and can be used to help understand why a request is taking time

To resolve this problem, create a JFR file:

  1. Extract a JFR file from an Oracle WebLogic Server server by running the following command:

    (UNIX) JROCKIT_HOME/bin/jrcmd jrockit_pid dump_flightrecording recording=1 copy_to_file=path compress_copy=true
    (Windows) JROCKIT_HOME\bin\jrcmd.exe jrockit_pid dump_flightrecording recording=1 copy_to_file=path compress_copy=true
    

    See the "Running Diagnostic Commands" chapter in the Oracle JRockit JDK Tools Guide for more information about the jrcmd command-line tool.

  2. To view the file, start the JRockit Mission Control Client from the following directories:

    (UNIX) JAVA_HOME/bin/bin/jrmc
    (Windows) JAVA_HOME\bin\jrmc.exe
    
  3. Choose File > Open File to select the JFR file.

  4. Locate the slowest requests or investigate a specific request:

    To locate the slowest requests: To investigate a specific request:
    1. In the JRockitFlightRecorder.jfr page, click the Events icon.
    2. Click the Log tab at the bottom of the page.

    3. In the Event Type navigation pane on the left, locate Dynamic Monitoring System and then HttpRequest.

    4. Click HTTP request; de-select all the other event types.

    5. In the Log tab, in the Event Log section, click the Duration column to sort the duration in descending order.

      Each row corresponds to a HTTP Request and the duration column shows the response time for that request.

    6. Click the row in the table to view the attributes of the requests.

    7. In the Event Attributes sections, note the start time and the thread that serviced the request.

    1. Find the Execution Context Identifier (ECID) of that request.

      If the request is related to an incident triggered by a STUCK thread, the incident readme.txt file will contain the ECID.

      Alternatively, you can search the Oracle WebLogic Server HTTP access.log for requests from specific users. See the "Viewing and Searching Log Files" section in the Oracle Application Server Administrator's Guide.

    2. In the JRockit Mission Control Client, in the JRockitFlightRecorder.jfr page, choose the WebLogic icon, and then

      If the Weblogic icon is not available, choose Help > Install Plugins to download the Oracle WebLogic Server plug-in.

    3. Click the ECIDs tab at the bottom of the age.

    4. In the ECIDs section, from Filter Column list, select ECID.

    5. Enter the ECID in the search box and choose <Enter>.

    6. In the results table, highlight the row with the matching ECID and right-click to bring up the menu.

    7. Choose Operative Set > Clear, and then Operative Set > Add matching ECID > ECID to add the ECID to the operative set.

      This enables users to view only events associated with the operative set.

    8. Click the Events icon.

    9. In the Event Type navigation pane on the left, locate Dynamic Monitoring System and then HttpRequest.

    10. Click HTTP request; de-select all the other event types. ** In the Event Log section, click Show Only Operative Set.

      Each row corresponds to the request with the matching ECID

    11. Click the row in the table to view the attributes of the requests.

    12. Note the start time and the thread that serviced the request


  5. Once the start time and the thread that serviced the request are identified, in the Logs tab, drag the time selector at the top of the screen to include only the time window for the duration of the request.

  6. In the Event Log section, perform the following search:

    1. Deselect Show Only Operative Set.

    2. Enter the thread name in the search box.

    3. From the Filter Column list, select Thread.

    4. Choose <Enter>.

  7. In the Event Type navigation pane on the left, click the events of interest. Typically, these events are located under nodes Dynamic Monitoring System, Java Application, and WebLogic > JDBC.

    The selected events appear in the table in the Event Log section.

  8. Click the Start Time column to sort y the time when these events occur, or click the Duration column to view the events that took longest

    The JDBC Statement Execute events corresponds to SQL execution. If there are slow SQL statements, the event details give the SQL text. These events do not have callstacks.

  9. To see to callstack for slow SQL statements, view the Socket Read event that happens right after the JDBC Statement Execute event.

    This event corresponds to Oracle WebLogic Server waiting for the SQL results to return, and it has callstack in the event details.

  10. Review the callstacks for long Java Blocked and Java Wait events to see if the cause can be identified. See the "Analyzing Flight Recorder Data in JRockit Mission Control" section in the Oracle Fusion Middleware Configuring and Using the Diagnostics Framework for Oracle WebLogic Server.

  11. If more details are needed to compare with what is captured in the default recording, and the user can reproduce the slowness, start an explicit recording. See the "Starting an Explicit Recording" section in the Oracle JRockit Flight Recorder Run Time Guide.

2.3 Using My Oracle Support for Additional Troubleshooting Information

You can use My Oracle Support (formerly MetaLink) to help resolve Oracle Fusion Applications problems. My Oracle Support contains several useful troubleshooting resources, such as:

Note:

You can also use My Oracle Support to log a service request.

You can access My Oracle Support at https://support.oracle.com.