Skip Headers
Oracle® Database 2 Day + Performance Tuning Guide
11g Release 1 (11.1)

B28275-02
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Master Index
Master Index
Go to Feedback page
Contact Us

Go to previous page
Previous
Go to next page
Next
PDF · Mobi · ePub

7 Resolving Transient Performance Problems

Transient performance problems are short-lived and do not appear in the Automatic Database Diagnostic Monitor (ADDM) analysis. ADDM tries to report the most significant performance problems during an analysis period in terms of their effect on DB time. If a problem lasts for a brief time, then its severity might be averaged out or minimized by other performance problems in the entire analysis period. Therefore, the problem may not appear in the ADDM findings. Whether or not a performance problem is captured by ADDM depends on its duration compared to the interval between the Automatic Workload Repository (AWR) snapshots.

If a performance problem lasts for a significant portion of the time between snapshots, then it will be captured by ADDM. For example, if the snapshot interval is one hour, a performance problem that lasts 30 minutes should not be considered a transient performance problem because its duration represents a significant portion of the snapshot interval and will likely be captured by ADDM.

On the other hand, a performance problem that lasts 2 minutes could be a transient performance problem because its duration represents a small portion of the snapshot interval and will likely not show up in the ADDM findings. For example, if the system was slow between 10:00 p.m. and 10:10 p.m., but the ADDM analysis for the time period between 10:00 p.m. and 11:00 p.m. does not show a performance problem, a transient performance problem may have occurred for only a few minutes of the 10-minute interval reported by the user.

This chapter contains the following sections:

Overview of Active Session History

To capture a detailed history of database activity, Oracle Database samples active sessions each second with the Active Session History (ASH) sampler. The Automatic Workload Repository (AWR) snapshot processing collects the sampled data into memory and writes it to persistent storage. ASH is an integral part of the Oracle Database self-management framework and is extremely useful for diagnosing performance problems.

Unlike instance-level statistics gathered by AWR, sampled data is gathered at the session level by ASH. By capturing statistics for only active sessions, a manageable set of data is represented. The size of this data is directly related to the work being performed, rather than to the entire database instance.

Sampled data captured by ASH can be aggregated based on the various dimensions that it captures, including the following:

You can run ASH reports to analyze transient performance problems with the database that only occur during specific times. This technique is especially useful when you are trying to do either of the following:

Running Active Session History Reports

This section describes how to generate ASH reports using Oracle Enterprise Manager (Enterprise Manager).

To run ASH reports:

  1. On the Performance page, under Average Active Sessions, click Run ASH Report.

    The Run ASH Report page appears.

  2. Enter the date and time for the start and end of the time period when the transient performance problem occurred.

    In this example, database activity increased between 1:45 p.m. and 2:00 p.m., so an ASH report needs to be created for that time period.

    Description of run_ash.gif follows
    Description of the illustration run_ash.gif

  3. Click Generate Report.

    The Processing: View Report page appears while the report is being generated.

    After the report is generated, the ASH report appears under Report Results on the Run ASH Report page.

    Description of ash_report.gif follows
    Description of the illustration ash_report.gif

  4. Optionally, click Save to File to save the report in HTML for future analysis.

Active Session History Reports

You can use an ASH report to identify the source of transient performance problems. The report is divided into titled sections. The following sections of the ASH report are a useful place to begin the investigation:

See Also:

Top Events

The Top Events section of the report describes the top wait events of the sampled session activity categorized by user, background, and priority. Use this information to identify the wait events that may be the cause of the transient performance problem.

The Top Events section of the report contains the following subsections:

Top User Events

The Top User Events subsection of the report lists the top wait events from user processes that accounted for the highest percentages of sampled session activity.

The example in Figure 7-1 shows that 84 percent of database activity is consumed by the CPU + Wait for CPU event. In this example, the Load Profile section should be examined next to determine the type of activity that is causing this wait event.

Figure 7-1 Top User Events

Description of Figure 7-1 follows
Description of "Figure 7-1 Top User Events"

Top Background Events

The Top Background Events subsection of the report lists the top wait events from the background events that accounted for the highest percentages of sampled session activity.

The example in Figure 7-2 shows that 17.65 percent of sampled session activity is consumed by the CPU + Wait for CPU event.

Figure 7-2 Top Background Events

Description of Figure 7-2 follows
Description of "Figure 7-2 Top Background Events"

Load Profile

The Load Profile section of the report describes the load analyzed in the sampled session activity. Use the information in this section to identify the service, client, or SQL command type that may be the cause of the transient performance problem. The Top Service/Module subsection lists the services and modules that accounted for the highest percentages of sampled session activity.

The example in Figure 7-3 shows that 81 percent of database activity is consumed by the SYS$USERS service running the SQL*Plus module. In this example, it appears that the user is running a high-load SQL statement that is causing the performance problem indicated in Figure 7-1. The Top SQL section of the report should be analyzed next to determine whether a particular type of SQL statement makes up the load.

Figure 7-3 Top Service/Module

Description of Figure 7-3 follows
Description of "Figure 7-3 Top Service/Module"

Top SQL

The Top SQL section of the report describes the top SQL statements of the sampled session activity. Use this information to identify high-load SQL statements that may be the cause of the transient performance problem. One useful subsection is Top SQL with Top Events, which lists the SQL statements that accounted for the highest percentages of sampled session activity. The Sampled # of Executions column shows how many distinct executions of a particular SQL statement were sampled. To view the text of the SQL statements, click the SQL ID link.

The example in Figure 7-4 shows that 75 percent of database activity is consumed by a particular SELECT statement. This statement was executed in the SQL*Plus module shown in Figure 7-3. It appears that this high-load SQL statement is causing the performance problem. The Top Sessions section should be analyzed to identify the session running this SQL statement.

Figure 7-4 Top SQL with Top Events

Description of Figure 7-4 follows
Description of "Figure 7-4 Top SQL with Top Events"

Top Sessions

The Top Sessions section lists the sessions that were waiting for the wait event that accounted for the highest percentages of sampled session activity. Use this information to identify the sessions that accounted for the highest percentages of sampled session activity, which may be the cause of the performance problem.

The example in Figure 7-5 shows that 81 percent of database activity is used by the user SH with the session ID of 147. Thus, it appears that this user was running the high-load SQL statement identified in Figure 7-4. You should investigate this session to determine whether it is performing a legitimate operation and tune the SQL statement if possible. If tuning the SQL is not possible and the session is causing an unacceptable performance impact on the system, consider terminating the session.

Top DB Objects

The Top DB Objects subsection lists the database objects (such as tables and indexes) that accounted for the highest percentages of sampled session activity.

The example in Figure 7-6 shows that the objects accounting for the most session activity are in the SYSTEM and SYSAUX tablespaces. In each row of the table, the event is db file sequential read, which signifies that a user process is reading a buffer into the system global area (SGA) buffer cache and is waiting for a physical I/O call to return.

Figure 7-6 Top DB Objects

Description of Figure 7-6 follows
Description of "Figure 7-6 Top DB Objects"

Top DB Files

The Top DB Files subsection lists the database files that accounted for the highest percentages of sampled session activity.

The example in Figure 7-7 shows that most of the session activity involves the datafile in the SYSTEM tablespace. This information is consistent with Figure 7-6, which shows that the objects accounting for the most session activity are located in the SYSTEM and SYSAUX tablespaces.

Activity Over Time

The Activity Over Time section of the ASH report is particularly useful for longer time periods because it provides in-depth details about activities and workload profiles during the analysis period. The Activity Over Time section is divided into multiple time slots.

Figure 7-8 Activity Over Time

Description of Figure 7-8 follows
Description of "Figure 7-8 Activity Over Time"

Each of the time slots contains information regarding that particular time slot, as described in Table 7-1.

Table 7-1 Activity Over Time

Column Description

Slot Time (Duration)

Duration of the slot

Slot Count

Number of sampled sessions in the slot

Event

Top three wait events in the slot

Event Count

Number of ASH samples waiting for the wait event

% Event

Percentage of ASH samples waiting for wait events in the entire analysis period


All inner slots are 2 minutes each and can be compared to each other. The first and last slots, which are also called the outer slots, are odd-sized because they are the only slots that do not have a fixed slot time.

When comparing the inner slots, perform a skew analysis by identifying spikes in the Event Count and Slot Count columns. A spike in the Event Count column indicates an increase in the number of sampled sessions waiting for a particular event. A spike in the Slot Count column indicates an increase in active sessions, because ASH data is sampled from active sessions only and a relative increase in database workload. Typically, when the number of active session samples and the number of sessions associated with a wait event increase, the slot may be the cause of the transient performance problem.

The example in Figure 7-8 indicates that the number of sampled sessions rose sharply in the first inner slot and fell sharply in the last inner slot. The slot count and event count peaked in the 13:54 p.m. time slot.