Oracle8 Tuning
Release 8.0






Prev Next

Tuning CPU Resources

This chapter describes how to identify and solve problems with central processing unit (CPU) resources. Topics in this chapter include

Understanding CPU Problems

Establish appropriate expectations for the amount of CPU resources your system should be using. You can then distinguish whether or not sufficient CPU resources are available, and know when your system is consuming too much of those resources. Begin by determining the amount of CPU resources the Oracle instance utilizes in three cases:

Workload is a very important factor when evaluating your system's level of CPU utilization. During peak workload hours, 90% CPU utilization with 10% idle and waiting time may be understandable and acceptable; 30% utilization at a time of low workload may also be understandable. However, if your system shows high utilization at normal workload, there is no room for peak workload. For example, Figure 13-1 illustrates workload over time for an application which has peak periods at 10:00 AM and 2:00 PM.

Figure 13-1 Average Workload and Peak Workload

This example application has 100 users working 8 hours a day, for a total of 800 hours per day. If each user enters one transaction every 5 minutes, this would mean 9,600 transactions per day. Over the course of 8 hours, the system must support 1,200 transactions per hour, which is an average of 20 transactions per minute. If the demand rate were constant, you could build a system to meet this average workload.

However, usage patterns form peaks and valleys-and in this context 20 transactions per minute can be understood as merely a minimum requirement. If the peak rate you need to achieve is 120 transactions per minute, then you must configure a system that can support this peak workload.

For this example, assume that at peak workload Oracle can use 90% of the CPU resource. For a period of average workload, then, Oracle should be using no more than about 15% of the available CPU resource.

15% = 20 tpm/120 tpm * 90%

If the system requires 50% of the CPU resource to achieve 20 transactions per minute, then it is clear that a problem exists: the system cannot possibly achieve 120 transactions per minute using 90% of the CPU. However, if you could tune this system so that it does achieve 20 transactions per minute using only 15% of the CPU, then (assuming linear scalability) the system might indeed attain 120 transactions per minute using 90% of the CPU resources.

Note that as users are added to an application over time, the average workload can rise to what had previously been peak levels. No further CPU capacity is then available for the new peak rate, which is actually higher than before.

How to Detect and Solve CPU Problems

If you suspect a problem with CPU usage, you must evaluate two areas:

Checking System CPU Utilization

Oracle statistics report CPU utilization only of Oracle sessions, whereas every process running on your system affects the available CPU resources. Effort spent tuning non-Oracle factors can thus result in better Oracle performance.

Use operating system monitoring tools to see what processes are running on the system as a whole. If the system is too heavily loaded, check the memory, I/O, and process management areas described later in this section.

Tools such as sar -u on many UNIX-based systems enable you to examine the level of CPU utilization on your entire system. CPU utilization in UNIX is described in statistics that show user time, system time, idle time, and time waiting for I/O. A CPU problem exists if idle time and time waiting for I/O are both close to zero (less than 5%) at a normal or low workload.

Performance Monitor is used on NT systems to examine CPU utilization. It provides statistics on processor time, user time, privileged time, interrupt time, and DPC time. (NT Performance Monitor is not the same as Performance Manager, which is an Oracle Enterprise Manager tool.)

Attention: This section describes how to check system CPU utilization on most UNIX-based and NT systems. For other platforms, please check your operating system documentation.

Memory Management

Check the following memory management issues:

Paging and Swapping. Use the appropriate tools (such as sar or vmstat on UNIX or Performance Monitor on NT) to investigate the cause of paging and swapping, should they occur.

Oversize Page Tables. On UNIX systems, if the processing space becomes too large, it may result in the page tables becoming too large. This is not an issue on NT systems.

I/O Management

Check the following I/O management issues:

Thrashing. Make sure that your workloads fits in memory so that the machine is not thrashing (swapping and paging processes in and out of memory). The operating system allocates fixed slices of time during which CPU resources are available to your process. If the process squanders a large portion of each time slice checking to be sure that it can run, that all needed components are in the machine, it may be using only 50% of the time allotted to actually perform work.

Client/Server Round Trips. The latency of sending a message may result in CPU overload. An application often generates messages that need to be sent through the network over and over again. This results in a lot of overhead that must be completed before the message is actually sent. To alleviate this problem you can batch the messages and perform the overhead only once, or reduce the amount of work. For example, you can use array inserts, array fetches, and so on.

Process Management

Check the following process management issues:

Scheduling and Switching. The operating system may spend a lot of time in scheduling and switching processes. Examine the way in which you are using the operating system: you could be using too many processes. On NT systems, do not overload your server with a great deal of non-Oracle processes.

Context Switching. Due to operating system specific characteristics, your system could be spending a lot of time in context switches. This could be expensive, especially with a very large SGA. Note that context switching is not an issue on NT, which has only one process per instance; all threads share the same page table.

Programmers often create single-purpose processes on the fly; then they exit the process, and create a new one so that the process is re-created and destroyed all the time. This is very CPU intensive, especially with large SGAs, because you have to build up the page tables each time. The problem is aggravated when you nail or lock shared memory, because you have to touch every page.

For example, if you have a 1 gigabyte SGA, you may have page table entries for every 4K, and a page table entry may be 8 bytes. You could end up with
(1G /4K) * 8B entries. This becomes expensive, because you have to continually make sure that the page table is loaded.

Parallel query and multithreaded server are areas of concern here if MINSERVICE has been set too low (set to 10, for example, when you need 20).

For the user, doing small lookups may not be wise. In a situation like this, it becomes inefficient for the user and for the system as well.

Checking Oracle CPU Utilization

This section explains how to examine the processes running in Oracle. Two dynamic performance views provide information on Oracle processes:

For example, if you have 8 CPUs, then for any given minute in real time, you have 8 minutes of CPU time available. On NT and UNIX-based systems this can be either user time or time in system mode ("privileged" mode, in NT). If your process is not running, it is waiting. CPU time utilized by all systems may thus be greater than one minute per interval.

At any given moment you know how much time Oracle has utilized the system. So if 8 minutes are available and Oracle uses 4 minutes of that time, then you know that 50% of all CPU time is used by Oracle. If your process is not consuming that time, then some other process is. Go back to the system and find out what process is using up the CPU time. Identify the process, determine why it is using so much CPU time, and see if you can tune it.

The major areas to check for Oracle CPU utilization are:

This section describes each area, and indicates the corresponding Oracle statistics to check.

Reparsing SQL Statements

Ineffective SQL sharing can result in reparsing.

  1. Begin by checking V$SYSSTAT to see if parsing in general is a problem:
    ('parse time cpu', 'parse time elapsed', 'parse count (hard)');

In interpreting these statistics, remember

In this way you can detect the general response time on parsing. The more your application is parsing, the more contention exists and the more time you will spend waiting. Note that

  • Next, query V$SQLAREA to find frequently reparsed statements:
  • Now that you have identified problematic statements, you have the following three options for tuning them:
    • Rewrite the application so statements do not continually reparse.
    • If this is not possible, reduce parsing by using the initialization parameter SESSION_CACHED_CURSORS.
    • If the parse count is small, the execute count is small, and the SQL statements are very similar except for the WHERE clause, you may find that hard coded values are being used instead of bind variables. Change to bind variables in order to reduce parsing.
  • Inefficient SQL Statements

    Inefficient SQL statements can consume large amounts of CPU resource. To detect such statements, enter the following query. You may be able to reduce CPU usage by tuning SQL statements that have a high number of buffer gets.


    See Also: "Approaches to SQL Statement Tuning" on page 7-6

    Read Consistency

    Your system could spend a lot of time rolling back changes to blocks in order to maintain a consistent view.

    Scalability Limitations Within the Application

    In most of this CPU tuning discussion we assume linear scalability, but this is never actually the case. How flat or nonlinear the scalability is indicates how far away from the ideal you are. Problems in your application might be hurting scalability: examples include too many indexes, right-hand index problems, too much data in blocks, or not partitioning the data. Contention problems like these waste CPU cycles and prevent the application from attaining linear scalability.

    Latch Contention

    Latch contention is a symptom; it is not normally the cause of CPU problems. Your task is to translate the latch contention to an application area: track down the contention to determine which part of your application is poorly written.

    The spin count may be set too high. Some other process may be holding a latch that your process is attempting to get, and your process may be spinning and spinning in an effort to get the latch. After a while your process may go to sleep before waking up to repeat its ineffectual spinning.

    Solving CPU Problems by Changing System Architecture

    If you have reached the limit of CPU power available on your system, and have exhausted all means of tuning its CPU usage, then you must consider redesigning your system. Consider whether moving to a different architecture might result in adequate CPU power. This section describes various possibilities.

    Attention: If you are running a multi-tier system, check all levels for CPU utilization. For example, on a three-tier system you might learn that your server is mostly idle and your second tier is completely busy. The solution then would be clear: tune the second tier, rather than the server or the third tier. In a multi-tier situation, it is usually not the server that has a performance problem: it is usually the clients and the middle tier.

    Single Tier to Two-Tier

    Consider whether changing from clients and server all running on a single machine (single tier) to a two-tier client/server configuration could help to relieve CPU problems.

    Figure 13-2 Single Tier to Two-Tier

    Multi-Tier: Using Smaller Client Machines

    Consider whether CPU usage might be improved if you used smaller clients, rather than multiple clients on bigger machines. This strategy may be helpful with either two-tier or three-tier configurations.

    Figure 13-3 Multi-Tier Using Smaller Clients

    Two-Tier to Three-Tier: Using a Transaction Processing Monitor

    If your system currently runs with multiple layers, consider whether moving from a two-tier to three-tier configuration, introducing the use of a transaction processing monitor, might be a good solution.

    Figure 13-4 Two-Tier to Three-Tier

    Three-Tier: Using Multiple TP Monitors

    Consider whether using multiple transaction processing monitors might be a good solution.

    Figure 13-5 Three-Tier with Multiple TP Monitors

    Oracle Parallel Server

    Consider whether your CPU problems could be solved by incorporating Oracle Parallel Server.

    Figure 13-6 Oracle Parallel Server


    Copyright © 1997 Oracle Corporation.

    All Rights Reserved.