9
Optimizing SQL Statements

This chapter describes how Oracle optimizes Structured Query Language (SQL) using the cost-based optimizer (CBO).

This chapter contains the following sections:

Approaches to SQL Statement Tuning
Tuning Goals
Best Practices
SQL Tuning Tips
Using EXISTS versus IN
Trouble Shooting

Note:
Although some Oracle tools and applications mask the use of SQL, all database operations are performed using SQL. Any other data access method circumvents the security built into Oracle and potentially compromises data security and integrity.

Approaches to SQL Statement Tuning

This section describes five ways you can improve SQL statement efficiency:

Restructuring the Indexes
Restructuring the Statement
Modifying or Disabling Triggers
Restructuring the Data

Keeping Statistics Current and Using Plan Stability to Preserve Execution Plans

Note:
The guidelines described in this section are oriented to production SQL that will be executed frequently. Most of the techniques that are discouraged here can legitimately be employed in ad hoc statements or in applications run infrequently where performance is not critical.

Restructuring the Indexes

Restructuring the indexes is a good starting point, because it has more impact on the application than does restructuring the statement or the data.

Remove nonselective indexes to speed the DML.
Index performance-critical access paths.
Consider hash clusters, but watch uniqueness.
Consider index clusters only if the cluster keys are similarly sized.

Do not use indexes as a panacea. Application developers sometimes think that performance will improve if they write more indexes. If a single programmer creates an appropriate index, then this might indeed improve the application's performance. However, if 50 programmers each create an index, then application performance will probably be hampered!

Restructuring the Statement

After restructuring the indexes, you can try restructuring the statement. Rewriting an inefficient SQL statement is often easier than repairing it. If you understand the purpose of a given statement, then you may be able to quickly and easily write a new statement that meets the requirement.

Consider Alternative SQL Syntax

Because SQL is a flexible language, more than one SQL statement may meet the needs of your application. Although two SQL statements may produce the same result, Oracle may process one faster than the other. You can use the results of the EXPLAIN PLAN statement to compare the execution plans and costs of the two statements and determine which is more efficient.

This example shows the execution plans for two SQL statements that perform the same function. Both statements return all the departments in the dept table that have no employees in the emp table. Each statement searches the emp table with a subquery. Assume there is an index, deptno_index, on the deptno column of the emp table.

The first statement and its execution plan:

SELECT dname, deptno
   FROM dept
   WHERE deptno NOT IN
      (SELECT deptno FROM emp);

Figure 9-1 Execution Plan with Two Full Table Scans

Step 3 of the output indicates that Oracle executes this statement by performing a full table scan of the emp table despite the index on the deptno column. This full table scan can be a time-consuming operation. Oracle does not use the index, because the subquery that searches the emp table does not have a WHERE clause that makes the index available.

However, this SQL statement selects the same rows by accessing the index:

SELECT dname, deptno
FROM dept
WHERE NOT EXISTS

(SELECT deptno
FROM emp
WHERE dept.deptno = emp.deptno);

Figure 9-2 Execution Plan with a Full Table Scan and an Index Scan

The WHERE clause of the subquery refers to the deptno column of the emp table, so the index deptno_index is used. The use of the index is reflected in step 3 of the execution plan. The index range scan of deptno_index takes less time than the full scan of the emp table in the first statement. Furthermore, the first query performs one full scan of the emp table for every deptno in the dept table. For these reasons, the second SQL statement is faster than the first.

If you have statements in your applications that use the NOT IN operator, as the first query in this example does, then you should consider rewriting them so that they use the NOT EXISTS operator. This would allow such statements to use an index, if one exists.

Note:
Alternative SQL syntax is effective only with the rule-based optimizer.

Compose Predicates Using AND and =

Use equijoins whenever possible. Without exception, statements that perform equijoins on untransformed column values are the easiest to tune.

Choose an Advantageous Join Order

Join order can have a significant effect on performance. The main objective of SQL tuning is to avoid performing unnecessary work to access rows that do not affect the result. This leads to three general rules:

Avoid a full-table scan if it is more efficient to get the required rows through an index.
Avoid using an index that fetches 10,000 rows from the driving table if you could instead use another index that fetches 100 rows.
Choose the join order so as to join fewer rows to tables later in the join order.

The following example shows how to tune join order effectively:

SELECT info
FROM taba a, tabb b, tabc c
WHERE a.acol BETWEEN :alow AND :ahigh

AND b.bcol BETWEEN :blow AND :bhigh
AND c.ccol BETWEEN :clow AND :chigh
AND a.key1 = b.key1
AND a.key2 = c.key2;

Choose the driving table and the driving index (if any).

The first three conditions in the example above are filter conditions applying to only a single table each. The last two conditions are join conditions.

Filter conditions dominate the choice of driving table and index. In general, the driving table should be the one containing the filter condition that eliminates the highest percentage of the table. Thus, if the range of :alow to :ahigh is narrow compared with the range of acol, but the ranges of :b* and :c* are relatively large, then taba should be the driving table, all else being equal.
Choose the right indexes.

After you know your driving table, choose the most selective index available to drive into that table. Alternatively, choose a full table scan if that would be more efficient. From there, the joins should all happen through the join indexes, the indexes on the primary or foreign keys used to connect that table to an earlier table in the join tree. Rarely should you use the indexes on the non-join conditions, except for the driving table. Thus, after taba is chosen as the driving table, you should use the indexes on b.key1 and c.key2 to drive into tabb and tabc, respectively.
Choose the best join order, driving to the best unused filters earliest.

The work of the following join can be reduced by first joining to the table with the best still-unused filter. Thus, if "bcol BETWEEN ..." is more restrictive (rejects a higher percentage of the rows seen) than "ccol between ...", the last join can be made easier (with fewer rows) if tabb is joined before tabc.

Use Untransformed Column Values

Use untransformed column values. For example, use:

WHERE a.order_no = b.order_no

Rather than:

WHERE TO_NUMBER (SUBSTR(a.order_no, instr(b.order_no, '.') - 1))
= TO_NUMBER (SUBSTR(a.order_no, instr(b.order_no, '.') - 1))

Do not use SQL functions in predicate clauses or WHERE clauses. The use of an aggregate function, especially in a subquery, often indicates that you could have held a derived value on a master record.

Avoid Mixed-Type Expressions

Avoid mixed-mode expressions, and beware of implicit type conversions. When you want to use an index on the VARCHAR2 column charcol, but the WHERE clause looks like this:

AND charcol = <numexpr>

Where numexpr is an expression of number type (for example, 1, USERENV('SESSIONID'), numcol, numcol+0,...), Oracle translates that expression into:

AND TO_NUMBER(charcol) = numexpr

This has the following consequences:

Any expression using a column, such as a function having the column as its argument, causes the optimizer to ignore the possibility of using an index on that column, even a unique index.
If the system processes even a single row having charcol as a string of characters that does not translate to a number, then an error is returned.

You can avoid this problem by replacing the top expression with the explicit conversion:

AND charcol = TO_CHAR(<numexpr>)

Alternatively, make all type conversions explicit. The statement:

numcol = charexpr

allows use of an index on numcol, because the default conversion is always character-to-number. This behavior, however, is subject to change. Making type conversions explicit also makes it clear that charexpr should always translate to a number.

Write Separate SQL Statements for Specific Values

SQL is not a procedural language. Using one piece of SQL to do many different things is not a good idea: it usually results in a less-than-optimal result for each task. If you want SQL to accomplish different things, then write two different statements rather than writing one statement that will do different things depending on the parameters you give it.

Optimization (determining the execution plan) takes place before the database knows what values will be substituted into the query. An execution plan should not, therefore, depend on what those values are. For example:

SELECT info 
FROM tables
WHERE ... 

AND somecolumn BETWEEN DECODE(:loval, 'ALL', somecolumn, :loval)
AND DECODE(:hival, 'ALL', somecolumn, :hival);

Written as shown, the database cannot use an index on the somecolumn column, because the expression involving that column uses the same column on both sides of the BETWEEN.

This is not a problem if there is some other highly selective, indexable condition you can use to access the driving table. Often, however, this is not the case. Frequently, you may want to use an index on a condition like that shown, but need to know the values of :loval, and so on, in advance. With this information, you can rule out the ALL case, which should not use the index.

If you want to use the index whenever real values are given for :loval and :hival (that is, if you expect narrow ranges, even ranges where :loval often equals :hival), then you can rewrite the example in the following logically equivalent form:

SELECT /* change this half of union all if other half changes */ info
FROM tables 
WHERE ... 

AND somecolumn BETWEEN :loval AND :hival
AND (:hival != 'ALL' AND :loval != 'ALL') 

UNION ALL 

SELECT /* Change this half of union all if other half changes. */ info
FROM tables
WHERE ... 

AND (:hival = 'ALL' OR :loval = 'ALL');

If you run EXPLAIN PLAN on the new query, then you seem to get both a desirable and an undesirable execution plan. However, the first condition the database evaluates for either half of the UNION ALL is the combined condition on whether :hival and :loval are ALL. The database evaluates this condition before actually getting any rows from the execution plan for that part of the query.

When the condition comes back false for one part of the UNION ALL query, that part is not evaluated further. Only the part of the execution plan that is optimum for the values provided is actually carried out. Because the final conditions on :hival and :loval are guaranteed to be mutually exclusive, then only one half of the UNION ALL actually returns rows. (The ALL in UNION ALL is logically valid because of this exclusivity. It allows the plan to be carried out without an expensive sort to rule out duplicate rows for the two halves of the query.)

Use Hints to Control Access Paths

Use optimizer hints, such as /*+ORDERED */ to control access paths. This is a better approach than using traditional techniques or "tricks of the trade" such as CUST_NO + 0. For example, use

SELECT /*+ FULL(emp) */ e.ename
FROM emp e
WHERE e.job = 'CLERK';

rather than

SELECT e.ename FROM emp e
WHERE e.job || '' = 'CLERK';

See Also:
For more information on hints, see Chapter 7, "Using Optimizer Hints".

Use Care When Using IN and NOT IN with a Subquery

Remember that WHERE (NOT) EXISTS is a useful alternative.

Note:
(NOT) EXISTS is not always equivalent to NOT IN.

Use Care When Embedding Data Value Lists in Applications

Data value lists are generally a sign that an entity is missing. For example:

WHERE transport IN ('BMW', 'CITROEN', 'FORD', HONDA')

The real objective in the WHERE clause above is to determine whether the mode of transport is an automobile, and not to identify a particular make. A reference table should be available in which transport type = 'AUTOMOBILE'.

Minimize the use of DISTINCT. DISTINCT always creates a sort; all the data must be instantiated before your results can be returned.

Reduce the Number of Calls to the Database

When appropriate, use INSERT, UPDATE, or DELETE... RETURNING to select and modify data with a single call. This technique improves performance by reducing the number of calls to the database.

Use Care When Managing Views

Be careful when joining views, when performing outer joins to views, and when you consider recycling views.

Use Care When Joining Views

The shared SQL area in Oracle reduces the cost of parsing queries that reference views. In addition, optimizer improvements make the processing of predicates against views very efficient. Together, these factors make possible the use of views for ad hoc queries. Despite this, joins to views are not recommended, particularly joins from one complex view to another.

The following example shows a query upon a column which is the result of a GROUP BY. The entire view is first instantiated, and then the query is run against the view data.

CREATE VIEW dx(deptno, dname, totsal)
AS SELECT d.deptno, d.dname, e.sum(sal)
FROM emp e, dept d

WHERE e.deptno = d.deptno
GROUP BY deptno, dname


SELECT * 
FROM dx 
WHERE deptno=10;

Use Care When Unnesting Subqueries

Setting the UNNEST_SUBQUERY session parameter to TRUE enables subquery unnesting. Subquery unnesting unnests and merges the body of the subquery into the body of the statement that contains it, allowing the optimizer to consider them together when evaluating access paths and joins.

This parameter not cost based, and it is not set by default. UNNEST_SUBQUERY first verifies if the statement is valid. If the statement is not valid, then subquery unnesting cannot proceed. The statement must then must pass a heuristic test.

The UNNEST hint checks the subquery block for validity only. If it is valid, then subquery unnesting is enabled without Oracle checking the heuristics. If you enabled subquery unnesting with the UNNEST_SUBQUERY parameter, then the NO_UNNEST hint turns it off for specific subquery blocks.

See Also:
For more information on unnesting nested subqueries and the conditions that make a subquery block valid, see the Oracle8i SQL Reference. For more information on the UNNEST and NO_UNNEST hints, see Chapter 7, "Using Optimizer Hints".

Because subquery unnesting generates views, some views will be merged in the main query block with complex view merging. When the subquery contains an aggregate function, it is a good idea to have complex view merging enabled. This allows the inline view generated by unnesting to be merged in the main query block.

Use Care When Performing Outer Joins To Views

An outer join to a multi-table view can be problematic. For example, you may start with the usual emp and dept tables with indexes on e.empno, e.deptno, and d.deptno, and create the following view:

CREATE VIEW empdept (empno, deptno, ename, dname) 
AS SELECT e.empno, e.deptno, e.ename, d.dname 

FROM dept d, emp e 
WHERE e.deptno = d.deptno(+);

You may then construct the simplest possible query to do an outer join into this view on an indexed column (e.deptno) of a table underlying the view:

SELECT e.ename, d.loc 
FROM dept d, empdept e 

WHERE d.deptno = e.deptno(+) 
AND d.deptno = 20;

The following execution plan results:

QUERY_PLAN 
--------------------------------------------
MERGE JOIN OUTER
    TABLE ACCESS BY ROWID DEPT
     INDEX UNIQUE SCAN DEPT_U1: DEPTNO
    FILTER
     VIEW EMPDEPT
      NESTED LOOPS OUTER
       TABLE ACCESS FULL EMP
       TABLE ACCESS BY ROWID DEPT
         INDEX UNIQUE SCAN DEPT_U1: DEPTNO

Until both tables of the view are joined, the optimizer does not know whether the view will generate a matching row. The optimizer must therefore generate all the rows of the view and perform a MERGE JOIN OUTER with all the rows returned from the rest of the query. This approach would be extremely inefficient if all you want is a few rows from a multi-table view with at least one very large table.

Solving the problem in the preceding example is relatively easy. The second reference to dept is not needed, so you can do an outer join straight to emp. In other cases, the join need not be an outer join. You can still use the view simply by getting rid of the (+) on the join into the view.

Do Not Recycle Views

Beware of writing a view for one purpose and then using it for other purposes, to which it may be ill-suited. Consider this example:

SELECT dname 
FROM dx 
WHERE deptno=10;

You can obtain dname and deptno directly from the dept table. It would be inefficient to obtain this information by querying the DX view (which was declared earlier in the present example). To answer the query, the view would perform a join of the dept and emp tables, even though you do not need any data from the emp table.

Modifying or Disabling Triggers

Using triggers consumes system resources. If you use too many triggers, then you may find that performance is adversely affected and you may need to modify or disable them.

Restructuring the Data

After restructuring the indexes and the statement, you can consider restructuring the data.

Introduce derived values. Avoid GROUP BY in response-critical code.
Implement missing entities and intersection tables.
Reduce the network load. Migrate, replicate, partition data.

The overall purpose of any strategy for data distribution is to locate each data attribute such that its value makes the minimum number of network journeys. If the current number of journeys is excessive, then moving (migrating) the data is a natural solution.

Often, however, no single location of the data reduces the network load (or message transmission delays) to an acceptable level. In this case, consider either holding multiple copies (replicating the data) or holding different parts of the data in different places (partitioning the data).

Where distributed queries are necessary, it may be effective to code the required joins with procedures either in PL/SQL within a stored procedure, or within the user interface code.

When considering a cross-network join, you can either bring the data in from a remote node and perform the join locally, or you can perform the join remotely. The option you choose should be determined by the relative volume of data on the different nodes.

Keeping Statistics Current and Using Plan Stability to Preserve Execution Plans

After you have tuned your application's SQL statements, consider maintaining statistics with the useful procedures of the DBMS_STATS package. Also consider implementing plan stability to maintain application performance characteristics despite system changes.

See Also:
For more information on using statistics, see Chapter 8, "Gathering Statistics". For more information on using plan stability, see Chapter 10, "Using Plan Stability".

Tuning Goals

Structured Query Language (SQL) is used to perform all database operations, although some Oracle tools and applications simplify or mask its use. This chapter provides an overview of the issues involved in tuning database operations from the SQL point-of-view.

This section introduces:

Always approach the tuning of database operations from the standpoint of the particular goals of your application. Are you tuning serial SQL statements or parallel operations? Do you have an online transaction processing (OLTP) application or a data warehousing application?

Data warehousing operations process high volumes of data, and they have a high correlation with the goals of parallel operations.
OLTP applications have a large number of concurrent users, and they correlate more with serial operations.

As a result, these two divergent types of applications have contrasting goals for tuning as described in Table 9-1.

Table 9-1 Contrasting Goals for Tuning

Tuning Situation	Goal
Serial SQL Statement	Minimize resource use by the operation.
Parallel Operations	Maximize throughput for the hardware.

Tuning a Serial SQL Statement

The goal of tuning one SQL statement in isolation is: Minimize resource use by the operation being performed.

You can experiment with alternative SQL syntax without actually modifying your application. To do this, use the EXPLAIN PLAN statement with the alternative statement that you are considering, and compare the alternative statement's execution plan and cost with that of the existing one. The cost of a SQL statement appears in the POSITION column of the first row generated by EXPLAIN PLAN. You must run the application to see which statement can actually be executed more quickly.

Tuning Parallel Execution

The goal of tuning parallel execution is: Maximize throughput for the given hardware.

If you have a powerful system and a massive, high-priority SQL statement to run, then parallelize the statement so that it uses all available resources.

Note:
Parallel execution is only available with the Oracle8i Enterprise Edition.

Oracle can perform the following operations in parallel:

Parallel query
Parallel DML (includes INSERT, UPDATE, DELETE; APPEND hint, parallel index scans)
Parallel DDL
Parallel recovery
Parallel loading
Parallel propagation (for replication)

Look for opportunities to parallelize operations in the following situations:

Long elapsed time

Whenever an operation you are performing in the database takes a long time, whether it is a query or a batch job, you may be able to reduce the elapsed time by using parallel operations.

High number of rows processed

You can split rows so that they are not all accessed by a single process.

See Also:
For more information about parallel execution, see Oracle8i Concepts and your platform-specific Oracle documentation.
For information on using the following features, see Oracle8i Data Warehousing Guide:

Setting the degree of parallelism and enabling adaptive multi-user

Tuning parallel execution parameters

Creating indexes in parallel

Partitioned index scans

Using bulk inserts, updates, and deletes

You can also use parallel execution to access object types within an Oracle database. For example, you can use parallel execution to access Large Binary Objects (LOBs).

Parallel execution benefits systems if they have all of the following characteristics:

Symmetric multi-processors (SMP), clusters, or massively parallel systems.
Sufficient I/O bandwidth.
Under-utilized or intermittently used CPUs (for example, systems where CPU usage is typically less than 30%).
Sufficient memory to support additional memory-intensive processes, such as sorts, hashing, and I/O buffers.

If your system lacks any of these characteristics, then parallel execution may not significantly improve performance. In fact, parallel execution can reduce system performance on over-utilized systems or systems with small I/O bandwidth.

When to Implement Parallel Execution

Parallel execution provides the best performance improvements in decision support systems (DSS). However, online transaction processing (OLTP) systems also benefit from parallel execution; for example, parallel index creation greatly benefits ecommerce businesses where there is little scheduled downtime.

During the day, most OLTP systems should probably not use parallel execution. During off-hours, however, parallel execution can effectively process high-volume batch operations. For example, a bank might use parallelized batch programs to perform millions of updates to apply interest to accounts.

Tuning OLTP Applications

Tuning OLTP applications mostly involves tuning serial SQL statements. You should consider two design issues: use of SQL and shared PL/SQL, and use of different transaction modes.

SQL and Shared PL/SQL

To minimize parsing, use bind variables in SQL statements within OLTP applications. This way, all users can share the same SQL statements while using fewer resources for parsing.

Transaction Modes

Sophisticated users can use discrete transactions if performance is of the utmost importance, and if the users are willing to design the application accordingly.

Serializable transactions can be used if the application must be ANSI compatible. Because of the overhead inherent in serializable transactions, Oracle strongly recommends the use of read-committed transactions instead.

Triggers

If excessive use of triggers degrades system performance, then modify the conditions under which triggers fire by executing the CREATE TRIGGER or CREATE OR REPLACE TRIGGER statements. You can also turn off triggers with the ALTER TRIGGER statement.

Note:
Excessive use of triggers for frequent events such as logons, logoffs, and error events can degrade performance, because these events affect all users.

Best Practices

This section documents the best practices for developing and tuning SQL with the cost-based optimizer (CBO). This includes the following:

Avoiding Rule-Based Optimizer Techniques

The traditional RBO tuning techniques include:

Disabling indexes
- col+0 or col || ``
- Wrap function around column, such as NVL (col, -999) or TO_NUMBER
Because the CBO is cost based, it is not necessary to force or disable a particular index. The CBO chooses the access path with the best cost.
Working the table order in the FROM clause.

The CBO chooses the most efficient join order based on cost after permuting the possible join graphs. Hence, there is no need, or benefit, to ordering the FROM clause under the CBO.

Index Cost

In the following example, the CBO may choose a full table scan if the index probe on employee_num is too costly (e.g., the estimated cardinality for employees having employee numbers beginning with 20 is high).

SELECT employee_num, full_name NAME, employee_id 
FROM mtl_employees_current_view 
WHERE (employee_num LIKE '20%') AND 
      (organization_id = :1) 
ORDER BY employee_num;

Analyzing Object Statistics

The object statistics include the following:

Column statistics
Data skew
Table statistics
Index statistics
Partition statistics

The following example illustrates the cost model and selectivity of a query which, under the RBO, used an inefficient index. The CBO chooses a more efficient plan.

SELECT item.expenditure_item_id 
FROM pa_tasks t, 
     pa_expenditures exp, 
     pa_expenditure_types etype, 
     pa_expenditure_items item 
WHERE   
TRUNC(exp.expenditure_ending_date)<=TRUNC(NVL(TO_DATE(:b0),
exp.expenditure_ending_date)) 
    AND exp.expenditure_status_code||''='APPROVED' 
    AND exp.expenditure_group=NVL(:b1,exp.expenditure_group) 
    AND exp.expenditure_id=item.expenditure_id 
    AND (NVL(item.request_id,(:b2+1))<>:b2 OR item.cost_dist_rejection_code IS 
NULL ) 
    AND item.cost_distributed_flag='N' and t.task_id=item.task_id 
    AND t.project_id=DECODE(:b4,0,T.project_id,:b4) 
    AND item.expenditure_type=etype.expenditure_type 
    AND etype.system_linkage_function||''=:b6 
 ORDER BY item.expenditure_item_date;
COST DISTRIBUTED FLAG
C                          7
N                     80,251
Y                 16,534,822

Rule Plan

Cost= SELECT STATEMENT                          
COUNT(*)
Cost=   SORT ORDER BY                           
===================================
Cost=     NESTED LOOPS                              
Cost=       NESTED LOOPS                            
Cost=         NESTED LOOPS                          
Cost=           TABLE ACCESS BY INDEX ROWID PA_EXPENDITURE_ITEMS_ALL
Cost=             INDEX RANGE SCAN PA_EXPENDITURE_ITEMS_N3: COST_DISTRIBUTED_
FLAG
Cost=           TABLE ACCESS BY INDEX ROWID PA_EXPENDITURE_TYPES
Cost=             INDEX UNIQUE SCAN PA_EXPENDITURE_TYPES_U1: EXPENDITURE_TYPE
Cost=         TABLE ACCESS BY INDEX ROWID PA_EXPENDITURES_ALL
Cost=           INDEX UNIQUE SCAN PA_EXPENDITURES_U1: EXPENDITURE_ID
Cost=       TABLE ACCESS BY INDEX ROWID PA_TASKS
Cost=         INDEX UNIQUE SCAN PA_TASKS_U1: TASK_ID

CBO Plan (default)

Cost=6503 SELECT STATEMENT
Cost=6503   SORT ORDER BY 
Cost=6489     NESTED LOOPS 
Cost=6487       NESTED LOOPS 
Cost=6478         MERGE JOIN CARTESIAN 
Cost=6477           TABLE ACCESS FULL PA_EXPENDITURES_ALL
Cost=1           SORT JOIN
Cost=1             TABLE ACCESS FULL PA_EXPENDITURE_TYPES 
Cost=9         TABLE ACCESS BY INDEX ROWID PA_EXPENDITURE_ITEMS_ALL 
Cost=4           INDEX RANGE SCAN PA_EXPENDITURE_ITEMS_N1: EXPENDITURE_ID
Cost=2       TABLE ACCESS BY INDEX ROWID PA_TASKS
Cost=1         INDEX UNIQUE SCAN PA_TASKS_U1: TASK_ID

Force Rule Plan Using Hints

This illustrates that the cost of the RBO plan is significantly higher than that of the the default CBO generated plan.

Cost=592532 SELECT STATEMENT
Cost=592532   SORT ORDER BY
Cost=592518     NESTED LOOPS
Cost=592516       NESTED LOOPS
Cost=587506         NESTED LOOPS
Cost=504831           TABLE ACCESS BY INDEX ROWID PA_EXPENDITURE_ITEMS_ALL
Cost=32573             INDEX RANGE SCAN PA_EXPENDITURE_ITEMS_N3:
Cost=1           TABLE ACCESS BY INDEX ROWID PA_EXPENDITURE_TYPES
Cost=             INDEX UNIQUE SCAN PA_EXPENDITURE_TYPES_U1:
Cost=2         TABLE ACCESS BY INDEX ROWID PA_EXPENDITURES_ALL
Cost=1           INDEX UNIQUE SCAN PA_EXPENDITURES_U1:
Cost=2       TABLE ACCESS BY INDEX ROWID PA_TASKS
Cost=1         INDEX UNIQUE SCAN PA_TASKS_U1:

Rewrite SQL

In order to avoid the full table scan, the query can be rewritten in order to optimize by using a more selective filter. In this case, the expenditure group is rather selective, but the NVL() function prevented an index from being used.

SELECT item.expenditure_item_id 
FROM pa_tasks t, 
     pa_expenditures exp, 
     pa_expenditure_types etype, 
     pa_expenditure_items item 
WHERE   
TRUNC(exp.expenditure_ending_date)<=TRUNC(NVL(TO_DATE(:b0),
exp.expenditure_ending_date)) 
    AND exp.expenditure_status_code||''='APPROVED' 
    AND exp.expenditure_group=:b1 

AND exp.expenditure_id=item.expenditure_id 

    AND (NVL(item.request_id,(:b2+1))<>:b2 OR item.cost_dist_rejection_code IS 
NULL)
    AND item.cost_distributed_flag='N' and t.task_id=item.task_id 
    AND t.project_id=DECODE(:b4,0,t.project_id,:b4) 
    AND item.expenditure_type=etype.expenditure_type 
    AND etype.system_linkage_function||''=:b6 
 ORDER BY item.expenditure_item_date

New CBO Plan

Cost=32 SELECT STATEMENT  
Cost=32   SORT ORDER BY 
Cost=18     NESTED LOOPS 
Cost=16       NESTED LOOPS 
Cost=7         MERGE JOIN CARTESIAN 
Cost=1           TABLE ACCESS FULL PA_EXPENDITURE_TYPES
Cost=6           SORT JOIN 
Cost=6             TABLE ACCESS BY INDEX ROWID PA_EXPENDITURES_ALL 
Cost=2               INDEX RANGE SCAN PA_EXPENDITURES_N3: EXPENDITURE_GROUP
Cost=9         TABLE ACCESS BY INDEX ROWID PA_EXPENDITURE_ITEMS_ALL
Cost=4           INDEX RANGE SCAN PA_EXPENDITURE_ITEMS_N1: EXPENDITURE_ID
Cost=2       TABLE ACCESS BY INDEX ROWID PA_TASKS 
Cost=1         INDEX UNIQUE SCAN PA_TASKS_U1: TASK_ID

Note:
Although there is a full table scan on the pa_expenditure_types table, this is only a small lookup table.

Avoiding Complex Expressions

Avoid the following kind of complex expressions:

col1 = NVL (:b1,col1)
NVL (col1,-999) = ....
TO_DATE(), TO_NUMBER(), etc.

These expressions prevent the optimizer from assigning valid cardinality or selectivity estimates, and can in turn affect the overall plan and the join method.

Add the predicate versus using NVL() technique.

For example:

SELECT employee_num, full_name NAME, employee_id 
FROM mtl_employees_current_view 
WHERE (employee_num = NVL (:b1,employee_num)) AND (organization_id=:1) 
ORDER BY employee_num;

Also:

SELECT employee_num, full_name NAME, employee_id 
FROM mtl_employees_current_view 
WHERE (employee_num = :b1) AND (organization_id=:1) 
ORDER BY employee_num;

Avoiding Balloon Tactic for Coding SQL

The balloon tactic is when a developer chooses to write a single complex SQL statement which incorporates complex application and business logic, as opposed to writing a few simple queries to achieve the same results. Developing a very large complex SQL statement has performance implications in terms of sharable memory and optimization. Coding a few simple queries in place of a single complex query is a better approach, because the individual SQL statements are easier to optimize and maintain.

Oracle Forms and Reports are powerful development tools which allow application logic to be coded using PL/SQL (triggers or program units). This helps reduce the complexity of SQL by allowing complex logic to be handled in the Forms or Reports. In addition, you can also invoke a server side PL/SQL package which performs the few SQL statements in place of a single large complex SQL statement. Because the package is a server-side unit, there are no issues surrounding client to database round-trips and network traffic.

Handling Complex Logic in the Application

Complex logic should be handled in the application via Oracle Forms triggers, PL/SQL logic, or C-Code.

For example:

SELECT * 
FROM ar_addresses_v 
WHERE (customer_id=:1) 
==================================================
AR_ADDRESSES_V: 
SELECT *
FROM AR_LOOKUPS L_CAT,
 FND_TERRITORIES_VL TERR,
 FND_LANGUAGES_VL LANG,
 RA_SITE_USES SU_SHIP,
 RA_SITE_USES SU_STMT,
 RA_SITE_USES SU_DUN,
 RA_SITE_USES SU_LEGAL,
 RA_SITE_USES SU_BILL,
 RA_SITE_USES SU_MARKET,
 RA_ADDRESSES ADDR

The following steps were taken to improve the above query, which accessed a complex view with many outer joins:

Rewrote the SQL statement and eliminated 6 table joins.
Added a Forms post query trigger to populate address type fields.
Reduced the number of rows processed.

SQL Tuning Tips

Table 9-2 lists recommended tuning tips you should implement during your SQL statement design phase:

Table 9-2 SQL Tuning Tips

SQL Tuning Tip	Notes
Do the same work faster, or do less work. Tun by selectivity.	Aim to have the least rows selected. This leads to less work and less time taken by SQL execution. It also reduces parse times.
Decompose join layers.	Analyze the joins one by one and check that their use makes sense in each circumstance. See Chapter 4, "The Optimizer" .
Examine the underlying views.	If your query accesses a view, or joins with a view, then you should examine the view thoroughly to determine if the view is optimized, or if your query even needs all the complexity from the view.
Do not be afraid of full table scans, especially for small tables.	Full table scans may make sense and be cheaper than index scans in certain situations, like with smaller tables or non-selective indexes.
Examine the execution plan in detail.	Index access and NL joins may not be optimal. For example, the query could be returning too many rows for this particular join type.
Do the math for long-running queries: For example a query may need to run in 3 minutes The query joins so_lines and so_ headers table	Verify the following: selectivity of so_headers is 5% selectivity of so_lines is 15% so_headers = 1GB, so_lines = 25GB Data working set (resultant set)=3.04GB Throughput needed = 22MB/second In other words, your expectations of needing the query to run in 3 minutes could be too high, depending on the system configuration.
Monitor disk reads and buffer gets	For instructions on how to do this, see "Disk Reads and Buffer Gets".
Joins * Review the outer joins * Replace join with sub-query	For advice on how to do this, see "Choose an Advantageous Join Order".
Choosing `EXISTS` or `IN`	For advice on how to decide, see "Using EXISTS versus IN".
Predicate collapsing	See "Predicate Collapsing".
Tune for the typical case	See "Tuning for the Typical Case".

Using EXPLAIN PLAN on All Queries

It is important that you generate and review execution plans for all your SQL statements to ensure optimal performance.

See Also:
For more information on execution plans, see Chapter 5, "Using EXPLAIN PLAN".

Predicate Collapsing

Predicate collapsing occurs when a column predicate involves more than one bind variable. An expression of the form [col = DECODE (:b1,'',:b3,col)] is a an example of predicate collapsing. This implies that if the bind variable 1 is null, then the bind variable 3 should be used; otherwise, the expression will result in [ col = col]. This prevents the optimizer from utilizing the index on the "col" column due to the decode construct.

The following example demonstrates how predicate collapsing is used to collapse a name bind variable with the delivery_id bind variable in a single filter. As can be seen from the EXPLAIN PLAN, this results in a full table scan on the wsh_deliveries table because of the NVL() construct on the delivery_id column, as well as the DECODE() construct on the name column.

SELECT delivery_id, planned_departure_id, organization_id, status_code
FROM wsh_deliveries
WHERE delivery_id = NVL(:b1,delivery_id) AND name = DECODE(:b1,'',:b3, NAME)
ORDER BY UPPER(HRE.full_name)

PLAN:

Cost=2090 SELECT STATEMENT
Cost=2090   TABLE ACCESS FULL WSH_DELIVERIES

This query can be rewritten using a UNION to short-circuit one-side of the UNION based on the bind variable values. For example, if the delivery_id bind is supplied, only the first branch of the UNION is executed.

If a value for the name bind variable is supplied, then the second branch of the UNION is executed. In either case, both sides of the UNION use rather selective indexes on either the delivery_id column or the name column. This is much more efficient than the original query which performed a full table scan.

SELECT delivery_id, planned_departure_id, organization_id, status_code
FROM wsh_deliveries
WHERE delivery_id = :b1 AND (:b1 IS NOT NULL)
UNION
SELECT delivery_id, planned_departure_id, organization_id, status_code
FROM wsh_deliveries
WHERE name = :b2 AND (:b1 is null)

Cost=34 SELECT STATEMENT
Cost=34   SORT UNIQUE
Cost=     UNION-ALL
Cost=       FILTER
Cost=3         TABLE ACCESS BY INDEX ROWID WSH_DELIVERIES
Cost=2           INDEX UNIQUE SCAN WSH_DELIVERIES_U1: DELIVERY_ID
Cost=       FILTER
Cost=3         TABLE ACCESS BY INDEX ROWID WSH_DELIVERIES
Cost=2           INDEX UNIQUE SCAN WSH_DELIVERIES_U2: NAME

Tuning for the Typical Case

The following example illustrates how a query can be optimized for the general case. Specifically, this purchasing query determines the list of approvers which can approve a purchase order for a given organizational structure. However, in most cases, the end user provides the approver name via a name pattern, and, therefore, it is not necessary to scan all the approvers.

SELECT COUNT(*), COUNT(DISTINCT HR.employee_id ), HR.full_name,
  HR.employee_num, HR.employee_id
FROM hr_employees_current_v HR, 

(SELECT DISTINCT PEH.superior_id 
FROM po_employee_hierarchies PEH 
WHERE PEH.position_structure_id = :1 
AND PEH.employee_id > 0) PEHV WHERE PEHV.superior_id = HR.employee_id 
AND (:2 =  'Y' OR (:3 = 'N' AND HR.employee_id != :4)) 

GROUP BY full_name, employee_num, employee_id
ORDER BY full_name

call     count       cpu    elapsed       disk      query  current        ros
------- ------  -------- ---------- ---------- ---------- ---------- ---------
Parse        1      0.00       0.00          0          0   0            0 
Execute      1      0.00       0.00          0          0   0            0
Fetch       42     39.34      39.51       3756       7752   3           82
------- ------  -------- ---------- ---------- ---------- ----------   ---------
total       44     39.34      39.51       3756       7752   3           82

SELECT STATEMENT   GOAL: ALL_ROWS
         SORT (GROUP BY)
         FILTER
           NESTED LOOPS
            NESTED LOOPS
             VIEW
              SORT (UNIQUE)
               INDEX GOAL: ANALYZED (RANGE SCAN) OF
'PO_EMPLOYEE_HIERARCHIES_U1' (UNIQUE)
             TABLE ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF
'PER_ALL_PEOPLE_F'
              INDEX (RANGE SCAN) OF 'PER_PEOPLE_F_PK' (UNIQUE)
            TABLE ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF
'PER_ALL_ASSIGNMENTS_F'
             INDEX GOAL: ANALYZED (RANGE SCAN) OF
'PER_ASSIGNMENTS_F_N12' (NON-UNIQUE)
           SORT (AGGREGATE)
            TABLE ACCESS GOAL: ANALYZED (FULL) OF
'FINANCIALS_SYSTEM_PARAMS_ALL'

SELECT COUNT(*), COUNT(DISTINCT HR.employee_id ), HR.full_name,
 HR.employee_num, HR.employee_id
FROM hr_employees_current_v HR 
WHERE (full_name LIKE NVL(:1,'')||'%') 
AND (NVL(:2, 'N') = 'Y' OR (NVL(:3,'N') = 'N' 
AND HR.employee_id !=:4)) AND EXISTS

(SELECT PEH.superior_id 
FROM po_employee_hierarchies PEH 
WHERE PEH.position_structure_id = :5 
AND PEH.superior_id = HR.employee_id)

GROUP BY full_name, employee_num, employee_id
ORDER BY full_name

call     count       cpu    elapsed       disk      query   current        ros
------- ------  -------- ---------- ---------- ---------- ----------   ---------
Parse        0      0.00       0.00          0          0   0           0
Execute      1      0.00       0.01          0          0   0           0
Fetch        1      0.03       0.09         29         39   3           2
------- ------  -------- ---------- ---------- ---------- ----------   ---------
total        2      0.03       0.10         29         39   3           2

SELECT STATEMENT   GOAL: ALL_ROWS
  SORT (GROUP BY)
   FILTER
    NESTED LOOPS
     TABLE ACCESS   GOAL: ANALYZED (BY INDEX ROWID) OF'PER_ALL_PEOPLE_F'

      INDEX   GOAL: ANALYZED (RANGE SCAN) OF'PER_PEOPLE_F_N54'
(NON-UNIQUE)
     TABLE ACCESS   GOAL: ANALYZED (BY INDEX ROWID)
OF'PER_ALL_ASSIGNMENTS_F'
      INDEX   GOAL: ANALYZED (RANGE SCAN) OF'PER_ASSIGNMENTS_F_N12'
(NON-UNIQUE)
    TABLE ACCESS   GOAL: ANALYZED (BY INDEX ROWID) OF
'PO_EMPLOYEE_HIERARCHIES_ALL'
     INDEX   GOAL: ANALYZED (RANGE SCAN) OF 'PO_EMPLOYEE_HIERARCHIES_N2'
(NON-UNIQUE)
    SORT (AGGREGATE)
     TABLE ACCESS   GOAL: ANALYZED (FULL)
OF'FINANCIALS_SYSTEM_PARAMS_ALL'

Disk Reads and Buffer Gets

Monitor disk reads and buffer gets by executing the following statement:

SQL>  set autotrace on [explain] [stat]

Typical results returned are shown as follows:

Statistics
----------------------------------------------------------
         70  recursive calls
          0  db block gets
       591  consistent gets
      404  physical reads
          0  redo size
        315  bytes sent via SQL*Net to client
        850  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
          0  rows processed

If 'consistent gets' or 'physical reads' are high relative to the amount of data returned, then this a sign that the query is expensive and needs to be reviewed for optimization.

For example, if you are expecting less than 1,000 rows back and 'consistent gets' is 1,000,000 and 'physical reads' is 10,000, then this query needs to be further optimized.

Using EXISTS versus IN

This section describes when to use EXISTS and when to use the IN clause in sub-queries.

Using EXISTS in a SELECT Statement

SELECT COUNT(*)
FROM so_picking_lines_all pl
WHERE (EXISTS (SELECT pld.picking_line_id
               FROM so_picking_line_details pld
               WHERE (pld.picking_line_id=pl.picking_line_id AND
                      pld.delivery_id=:b1))
      AND nvl(PL.SHIPPED_QUANTITY,0)>0)

Plan:

Cost=97740 SELECT STATEMENT
Cost=   SORT AGGREGATE
Cost=     FILTER
Cost=97740       TABLE ACCESS FULL SO_PICKING_LINES_ALL
Cost=4       TABLE ACCESS BY INDEX ROWID SO_PICKING_LINE_DETAILS
Cost=3         INDEX RANGE SCAN SO_PICKING_LINE_DETAILS_N3:

In this example, the use of EXISTS results in a full table scan because there is no 
selective criteria on the outer query. In this case, an IN operator is more appropriate. 
The IN operator enables Oracle to drive off of the delivery_id index, which is 
rather selective.

Using IN in a SELECT Statement with Nested Loop Join

SELECT COUNT(*)
FROM so_picking_lines_all pl
WHERE pl.picking_line_id in (SELECT pld.picking_line_id
                             FROM so_picking_line_details pld
                             WHERE pld.delivery_id=:b1)
      AND PL.SHIPPED_QUANTITY>0

Plan:

Cost=265 SELECT STATEMENT
Cost=   SORT AGGREGATE
Cost=265     NESTED LOOPS
Cost=19       VIEW
Cost=19         SORT UNIQUE
Cost=4           TABLE ACCESS BY INDEX ROWID SO_PICKING_LINE_DETAILS
Cost=3             INDEX RANGE SCAN SO_PICKING_LINE_DETAILS_N3:
Cost=2       TABLE ACCESS BY INDEX ROWID SO_PICKING_LINES_ALL
Cost=1         INDEX UNIQUE SCAN SO_PICKING_LINES_U1:

This is another example where IN is more appropriate than EXISTS.

Using EXISTS in an UPDATE Statement

UPDATE so_sales_credits_interface sc
SET request_id=:b0
WHERE request_id IS NULL AND error_flag IS NULL AND
      interface_status IS NULL AND
      EXISTS (SELECT NULL
              FROM so_headers_interface i
              WHERE sc.original_system_reference=i.original_system_reference AND
                    sc.order_source_id=i.order_source_id AND i.request_id=:b0)

Plan:

Cost=1459 UPDATE STATEMENT
Cost=   UPDATE  SO_SALES_CREDITS_INTERFACE
Cost=     FILTER
Cost=1459       TABLE ACCESS FULL SO_SALES_CREDITS_INTERFACE
Cost=2       TABLE ACCESS BY INDEX ROWID SO_HEADERS_INTERFACE_ALL
Cost=1         INDEX UNIQUE SCAN SO_HEADERS_INTERFACE_U1:

In this example, the use of EXISTS results in a full table scan because there is no selective criteria on the outer query. In this case, an IN operator is more appropriate. The IN operator enables Oracle to drive off of the request_id index, which is rather selective.

Trouble Shooting

This section documents the steps and procedures involved with diagnosing a CBO execution plan for a given SQL statement:

Generate SQL trace
Review EXPLAIN PLAN
Verify statistics
Try hints to obtain correct plan

Tuning Distributed Queries

Oracle supports transparent distributed queries to access data from multiple databases. It also provides many other distributed features, such as transparent distributed transactions and a transparent, fully automatic two-phase commit. This section explains how the Oracle8i optimizer decomposes SQL statements and how this affects the performance of distributed queries. The section also provides guidelines on how to influence the optimizer and avoid performance bottlenecks.

This section contains the following sections:

Remote and Distributed Queries

If a SQL statement references one or more remote tables, then the optimizer first determines whether all remote tables are located at the same site. If all tables are located at the same remote site, then Oracle sends the entire query to the remote site for execution. The remote site sends the resulting rows back to the local site. This is called a remote SQL statement. If the tables are located at more than one site, then the optimizer decomposes the query into separate SQL statements to access each of the remote tables. This is called a distributed SQL statement. The site where the query is executed, called the driving site, is usually the local site.

This section describes:

Remote Data Dictionary Information

If a SQL statement references multiple tables, then the optimizer must determine which columns belong to which tables before it can decompose the SQL statement. For example:

SELECT dname, ename 
FROM dept, emp@remote 
WHERE dept.deptno = emp.deptno

The optimizer must first determine that the dname column belongs to the dept table and the ename column to the emp table. After the optimizer has the data dictionary information of all remote tables, it can build the decomposed SQL statements.

Column and table names in decomposed SQL statements appear between double quotes. You must enclose in double quotes any column and table names that contain special characters, reserved words, or spaces.

This mechanism also replaces an asterisk (*) in the select list with the actual column names. For example:

SELECT * 
FROM dept@remote;

Results in the decomposed SQL statement

SELECT a1."DEPTNO", a1."DNAME", a1."LOC" 
FROM "DEPT" a1;

Note:
For simplicity, double quotes are not used in the remainder of this chapter.

Remote SQL Statements

If the entire SQL statement is sent to the remote database, then the optimizer uses table aliases A1, A2, and so on, for all tables and columns in the query, in order to avoid possible naming conflicts. For example:

SELECT dname, ename 
FROM dept@remote, emp@remote 
WHERE dept.deptno = emp.deptno;

This is sent to the remote database as the following:

SELECT a2.dname, a1.ename 
FROM dept a2, emp a1 
WHERE a1.deptno = a2.deptno;

Distributed SQL Statements

When a query accesses data on one or more databases, one site drives the execution of the query. This is known as the driving site; it is here that the data is joined, grouped, and ordered. By default, the local Oracle server is the driving site. A hint called DRIVING_SITE enables you to manually specify the driving site.

The decomposition of SQL statements is important, because it determines the number of records or even tables that must be sent through the network. A knowledge of how the optimizer decomposes SQL statements can help you achieve optimum performance for distributed queries.

If a SQL statement references one or more remote tables, then the optimizer must decompose the SQL statement into separate queries to be executed on the different databases. For example:

SELECT dname, ename 
FROM dept, emp@remote 
WHERE dept.deptno = emp.deptno;

This could be decomposed into the following:

SELECT deptno, dname 
FROM dept;

Which is executed locally, and:

SELECT deptno, ename 
FROM emp;

Which is sent to the remote database. The data from both tables is joined locally. All this is done automatically and transparently for the user or application.

In some cases, however, it might be better to send the local table to the remote database and join the two tables on the remote database. This can be achieved either by creating a view or by using the DRIVING_SITE hint. If you decide to create a view on the remote database, then a database link from the remote database to the local database is also needed.

For example (on the remote database):

CREATE VIEW dept_emp AS 

SELECT dname, ename 
FROM dept@local, emp 
WHERE dept.deptno = emp.deptno;

Next, select from the remote view instead of the local and remote tables:

SELECT * 
FROM dept_emp@remote;

Now, the local dept table is sent through the network to the remote database, joined on the remote database with the emp table, and the result is sent back to the local database.

Rule-Based Optimization

The rule-based optimizer does not have information about indexes for remote tables. It never, therefore, generates a nested loops join between a local table and a remote table with the local table as the outer table in the join. It uses either a nested loops join with the remote table as the outer table or a sort merge join, depending on the indexes available for the local table.

Cost-Based Optimization

The cost-based optimizer can consider more execution plans than the rule-based optimizer. The cost-based optimizer knows whether indexes on remote tables are available, and in which cases it makes sense to use them. The cost-based optimizer considers index access of the remote tables as well as full table scans, whereas the rule-based optimizer considers only full table scans.

The particular execution plan and table access that the cost-based optimizer chooses depends on the table and index statistics. For example:

SELECT dname, ename 
FROM dept, emp@remote 
WHERE dept.deptno = emp.deptno

Here, the optimizer might choose the local dept table as the driving table, and access the remote emp table using an index; so the decomposed SQL statement becomes the following:

SELECT ename FROM emp 
WHERE deptno = :1

This decomposed SQL statement is used for a nested loops operation.

Using Views

If tables are on more than one remote site, then it can be more effective to create a view than to use the DRIVING_SITE hint. If not all tables are on the same remote database, then the optimizer accesses each remote table separately. For example:

SELECT d.dname, e1.ename, e2.job 
FROM dept d, emp@remote e1, emp@remote e2 
WHERE d.deptno = e1.deptno 

AND e1.mgr = e2.empno;

This results in the decomposed SQL statements:

SELECT empno, ename 
FROM emp;

and:

SELECT ename, mgr, deptno 
FROM emp;

To join the two emp tables remotely, create a view with the join of the remote tables on the remote database. For example (on the remote database):

CREATE VIEW emps AS 

SELECT e1.deptno, e1.ename, e2.job 
FROM emp e1, emp e2 
WHERE e1.mgr = e2.empno;

Now, select from the remote view, instead of the remote tables:

SELECT d.dname, e.ename, e.job 
FROM dept d, emps@remote e 
WHERE d.deptno = e.deptno;

This results in the decomposed SQL statement:

SELECT deptno, ename, job 
FROM emps;

Using Hints

In a distributed query, all hints are supported for local tables. For remote tables, however, you can use only join order and join operation hints. (Hints for access methods, parallel hints, and so on, have no effect.) For remote mapped queries, all hints are supported.

See Also:
For more information on hints for join orders and hints for join operations, see Chapter 7, "Using Optimizer Hints".

EXPLAIN PLAN and SQL Decomposition

EXPLAIN PLAN gives information not only about the overall execution plan of SQL statements, but also about the way in which the optimizer decomposes SQL statements. EXPLAIN PLAN stores information in the PLAN_TABLE table. If remote tables are used in a SQL statement, then the OPERATION column contains the value REMOTE to indicate that a remote table is referenced, and the OTHER column contains the decomposed SQL statement that will be sent to the remote database. For example:

EXPLAIN PLAN FOR SELECT DNAME FROM DEPT@REMOTE 
SELECT OPERATION, OTHER FROM PLAN_TABLE 

OPERATION OTHER 
--------- -------------------------------------
REMOTE    SELECT A1."DNAME" FROM "DEPT" A1

Note the table alias and the double quotes around the column and table names.

See Also:
For more information on EXPLAIN PLAN, see Chapter 5, "Using EXPLAIN PLAN".

Partition Views

Partition views coalesce tables that have the same structure, but that contain different partitions of data. Partition views are supported for distributed databases where each partition resides on a database, and the data in each partition has common geographical properties.

When a query is executed on a partition view, and when the query contains a predicate that contains the result set to a subset of the view's partitions, the optimizer chooses a plan which skips partitions that are not needed for the query. This partition elimination takes place at run time, when the execution plan references all partitions.

Partition views were the only form of partitioning available in Oracle7 Release 7.3. They are not recommended for new applications in Oracle8i. Partition views that were created for Oracle7 databases can be converted to partitioned tables by using the EXCHANGE PARTITION option of the ALTER TABLE statement.

Note:
Oracle8i supports partition views only for distributed queries and for backwards compatibility with Oracle7 Release 7.3. Future releases of Oracle will not support partition views.

See Also:

For instructions on converting partition views to partitioned tables, see Oracle8i Administrator's Guide.

For instructions on migrating from partition views to partitioned tables, see Oracle8i Migration.

For general information on partition views and partitioned tables, see Oracle8i Concepts.

Using UNION ALL to Skip Partitions

There are circumstances under which a UNION ALL view enables the optimizer to skip partitions. The Oracle server that contains the partition view must conform to the following rules:

The PARTITION_VIEW_ENABLED initialization parameter is set to true.

The cost-based optimizer is used.

Note:
To use the cost-based optimizer, you must analyze all tables used in the UNION ALL views. Alternatively, you can use a hint or set the parameter OPTIMIZER_MODE to ALL_ROWS or FIRST_ROW. To set OPTIMIZER_MODE or PARTITION_VIEW_ENABLED, you can also use the ALTER SESSION statement.

Within a UNION ALL view, there are multiple select statements, and each of these is called a branch. A UNION ALL view is a partition view if each select statement it defines conforms to the following rules:

The branch has exactly one table in the FROM clause.
The branch contains a WHERE clause that defines the subset of data from the partition that is contained in the view.
None of the following are used within the branch: WHERE clause with subquery, GROUP BY, aggregate functions, DISTINCT, ROWNUM, or CONNECT BY/START WITH.
The SELECT list of each branch is * or an explicit expansion of "*". The FROM clause should be either the base table or a view of the base table that contains all the columns in the base table.
The column names and column datatypes for all branches in the UNION ALL view are exactly the same.
All tables used in the branch must have indexes (if any) on the same columns and number of columns.

Partition elimination is based on column transitivity with constant predicates. The WHERE clause used in the query that accesses the partition view is pushed down to the WHERE clause of each of the branches in the UNION ALL view definition. For example:

SELECT * FROM emp_view 
WHERE deptno=30;

Where the view emp_view is defined as the following:

SELECT * FROM emp@d10 WHERE deptno=10 

UNION ALL 

SELECT * FROM emp@d20 WHERE deptno=20 

UNION ALL 

SELECT * FROM emp@d30 WHERE deptno=30 

UNION ALL 

SELECT * FROM emp@d40 WHERE deptno=40

The "WHERE deptno=30" predicate used in the query is pushed down to the queries in the UNION ALL view. For a WHERE clause such as "WHERE deptno=10 and deptno=30", the optimizer applies transitivity rules to generate an extra predicate of "10=30". This extra predicate is always false; thus, the table (emp@d10) need not be accessed.

Transitivity applies to predicates which conform to the following rules:

The predicates in the WHERE clause for each branch are of the form:
```
RELATION AND RELATION ... 
```
where relation is of the form
```
COLUMN_NAME RELOP CONSTANT_EXPRESSION 
```
and relop is one of =, !=, >, >=, <, <=

Note:
BETWEEN ... AND is allowed by these rules, but IN is not.

At least one predicate in the query referencing the view exists in the same form.

EXPLAIN PLAN Output

To confirm that the system recognizes a partition view, check the EXPLAIN PLAN output. The following operations appear in the OPERATIONS column of the EXPLAIN PLAN output, if a query was executed on a partition view:

VIEW

This should include the optimizer cost in the COST column.

UNION-ALL

This should specify PARTITION in the OPTION column.

FILTER

When an operation is a child of the UNION-ALL operation, this indicates that a constant predicate was generated that will always be false. The partition is eliminated.

If PARTITION does not appear in the option column of the UNION-ALL operation, then the partition view was not recognized, and no partitions were eliminated. Make sure that the UNION ALL view adheres to the rules defined in "Using UNION ALL to Skip Partitions" .

Partition View Example

The following example shows the partition view customer partitioned into two partitions: the east database contains the East Coast customers, and the west database contains the West Coast customers.

The west database contains the following table customer_west:

CREATE TABLE customer_west 
   ( cust_no   NUMBER CONSTRAINT CUSTOMER_WEST_PK PRIMARY KEY, 
     cname     VARCHAR2(10), 
     location  VARCHAR2(10) 
    );

The east database contains the database customer_east:

CREATE TABLE customer_east 
   ( cust_no   NUMBER CONSTRAINT CUSTOMER_EAST_PK PRIMARY KEY, 
     cname     VARCHAR2(10), 
     location  VARCHAR2(10) 
    );

The following partition view is created at the east database (you could create a similar view at the west database):

CREATE VIEW customer AS 

SELECT * 
FROM customer_east 
WHERE location='EAST'  
UNION ALL 
SELECT * 
FROM customer_west@west 
WHERE location='WEST';

If you execute the following statement, then notice that the customer_west table in the west database is not accessed:

EXPLAIN PLAN FOR SELECT * FROM customer WHERE location='EAST';

Note:
The east database still needs column name and column datatype information for the customer_west table; therefore, it still needs a connection to the WEST database. In addition, the cost-based optimizer must be used. You could do this by issuing the following statement:
ALTER SESSION SET OPTIMIZER_MODE=ALL_ROWS

As shown in the EXPLAIN PLAN output, the optimizer recognizes that the customer_west partition need not be accessed:

SELECT LPAD(' ',LEVEL*3-3)||OPERATION OPERATION,COST,OPTIONS,
OBJECT_NODE, OTHER 
FROM PLAN_TABLE 
CONNECT BY PARENT_ID = PRIOR ID 
START WITH PARENT_ID IS NULL 

OPERATION                 COST OPTIONS    OBJECT_NOD OTHER 
------------------------- ---- ---------- ---------- ------------------------- 
SELECT STATEMENT             1 
   VIEW                      1 
      UNION-ALL                PARTITION 
         TABLE ACCESS        1 FULL 
         FILTER 
            REMOTE           1            WEST.WORLD SELECT "CUST_NO","CNAME", 
                                                     "LOCATION" FROM "CUSTOMER 
                                                     _WEST" "CUSTOMER_WEST" WH 
                                                     ERE "LOCATION"='EAST' AND 
                                                      "LOCATION"='WEST'

Distributed Query Restrictions

Distributed queries within the same version of Oracle have the following restrictions:

The cost-based optimizer should be used for distributed queries. The rule-based optimizer does not generate nested loop joins between remote and local tables when the tables are joined with equijoins.
In the cost-based optimizer, no more than 20 indexes per remote table are considered when generating query plans. The order of the indexes varies; if the 20-index limitation is exceeded, then random variation in query plans may result.
Reverse indexes on remote tables are not visible to the optimizer. This can prevent nested-loop joins from being used for remote tables if there is an equijoin using a column with only a reverse index.
The cost-based optimizer cannot recognize that a remote object is partitioned. Thus, the optimizer may generate less than optimal plans for remote partitioned objects, particularly when partition pruning would have been possible, had the object been local.
Remote views are not merged, and the optimizer has no statistics for them. It is best to replicate all mergeable views at all sites to obtain good query plans. (See the next restriction.)
Neither the cost-based nor the rule-based optimizer can execute joins remotely. All joins are executed at the driving site. This can affect performance for CREATE TABLE ... AS SELECT if all the tables in the select list are remote. In this case, you should create a view for the SELECT statement at the remote site.

Transparent Gateways

The Transparent Gateways transparently access data from a non-Oracle system (relational databases, hierarchical databases, file systems, and so on), just as if it were another Oracle database.

Optimizing Heterogeneous Distributed SQL Statements

When a SQL statement accesses data from non-Oracle systems, it is said to be a heterogeneous distributed SQL statement. To optimize heterogeneous distributed SQL statements, follow the same guidelines as for optimizing distributed SQL statements that access Oracle databases only. However, you must consider that the non-Oracle system usually does not support all the functions and operators that Oracle8i supports.

The Transparent Gateways tell Oracle (at connect time) which functions and operators they do support. If the other data source does not support a function or operator, then Oracle performs that function or operator. In this case, Oracle obtains the data from the other data source and applies the function or operator locally. This affects the way in which the SQL statements are decomposed and can affect performance, especially if Oracle is not on the same machine as the other data source.

Gateways and Partition Views

You can use partition views with Oracle Transparent Gateways release 8 or higher. Make sure you adhere to the rules that are defined in "Using UNION ALL to Skip Partitions". In particular:

The cost-based optimizer must be used, by using hints or setting the parameter OPTIMIZER_MODE to ALL_ROWS or FIRST_ROWS.
Indexes used for each partition must be the same. Consult your gateway- specific documentation to find out whether the gateway sends index information of the non-Oracle system to the Oracle Server. If the gateway sends index information to the optimizer, then make sure that each partition uses the same number of indexes, and that you have indexed the same columns. If the gateway does not send index information, then the Oracle optimizer is not aware of the indexes on partitions. Indexes are, therefore, considered to be the same for each partition in the non-Oracle system. If one partition resides on an Oracle server, then you cannot have an index defined on that partition.
The column names and column datatypes for all branches in the UNION ALL view must be the same. Non-Oracle system datatypes are mapped onto Oracle datatypes. Make sure that the datatypes of each partition that reside in the different non-Oracle systems all map to the same Oracle datatype. To see how datatypes are mapped onto Oracle datatypes, execute a DESCRIBE statement in SQL*Plus.

Optimizing Performance of Distributed Queries

You can improve performance of distributed queries in several ways:

Choose the best SQL statement.

In many cases, there are several SQL statements which can achieve the same result. If all tables are on the same database, then the difference in performance between these SQL statements might be minimal; but, if the tables are located on different databases, then the difference in performance might be more significant.
Use the cost-based optimizer.

The cost-based optimizer uses indexes on remote tables, considers more execution plans than the rule-based optimizer, and generally gives better results. With the cost-based optimizer, performance of distributed queries is generally satisfactory. Only in rare occasions is it necessary to change SQL statements, create views, or use procedural code.
Use views.

In some situations, views can be used to improve performance of distributed queries. For example:
- Joining several remote tables on the remote database.
- Sending a different table through the network.
- Using procedural code.
In some rare occasions, it can be more efficient to replace a distributed query by procedural code, such as a PL/SQL procedure or a precompiler program. This option is mentioned here only for completeness, not because it is often needed.

VIEW	This should include the optimizer cost in the `COST` column.
UNION-ALL	This should specify `PARTITION` in the `OPTION` column.
FILTER	When an operation is a child of the `UNION`-`ALL` operation, this indicates that a constant predicate was generated that will always be `false`. The partition is eliminated.

9 Optimizing SQL Statements

Approaches to SQL Statement Tuning

Restructuring the Indexes

Restructuring the Statement

Consider Alternative SQL Syntax

Figure 9-1 Execution Plan with Two Full Table Scans

Figure 9-2 Execution Plan with a Full Table Scan and an Index Scan

Compose Predicates Using AND and =

Choose an Advantageous Join Order

Use Untransformed Column Values

Avoid Mixed-Type Expressions

Write Separate SQL Statements for Specific Values

Use Hints to Control Access Paths

Use Care When Using IN and NOT IN with a Subquery

Use Care When Embedding Data Value Lists in Applications

Reduce the Number of Calls to the Database

Use Care When Managing Views

Use Care When Joining Views

Use Care When Unnesting Subqueries

Use Care When Performing Outer Joins To Views

Do Not Recycle Views

Modifying or Disabling Triggers

Restructuring the Data

Keeping Statistics Current and Using Plan Stability to Preserve Execution Plans

Tuning Goals

Table 9-1 Contrasting Goals for Tuning

Tuning a Serial SQL Statement

Tuning Parallel Execution

When to Implement Parallel Execution

Tuning OLTP Applications

SQL and Shared PL/SQL

Transaction Modes

Triggers

Best Practices

Avoiding Rule-Based Optimizer Techniques

Index Cost

Analyzing Object Statistics

Rule Plan

CBO Plan (default)

Force Rule Plan Using Hints

Rewrite SQL

New CBO Plan

Avoiding Complex Expressions

Avoiding Balloon Tactic for Coding SQL

Handling Complex Logic in the Application

SQL Tuning Tips

Table 9-2 SQL Tuning Tips

Using EXPLAIN PLAN on All Queries

Predicate Collapsing

Tuning for the Typical Case

Disk Reads and Buffer Gets

Using EXISTS versus IN

Using EXISTS in a SELECT Statement

Using IN in a SELECT Statement with Nested Loop Join

Using EXISTS in an UPDATE Statement

Trouble Shooting

Tuning Distributed Queries

Remote and Distributed Queries

Remote Data Dictionary Information

Remote SQL Statements

Distributed SQL Statements

Rule-Based Optimization

Cost-Based Optimization

Using Views

Using Hints

EXPLAIN PLAN and SQL Decomposition

Partition Views

Using UNION ALL to Skip Partitions

EXPLAIN PLAN Output

Partition View Example

Distributed Query Restrictions

Transparent Gateways

Optimizing Heterogeneous Distributed SQL Statements

Gateways and Partition Views

Optimizing Performance of Distributed Queries

9
Optimizing SQL Statements