Oracle8i Concepts
Release 2 (8.1.6)

Part Number A76965-01


23 Parallel Execution of SQL Statements

This chapter describes the parallel execution of SQL statements. The topics in this chapter include:

Introduction to Parallel Execution of SQL Statements

When Oracle is not parallelizing the execution of SQL statements, each SQL statement is executed sequentially by a single process. With parallel execution, however, multiple processes work together simultaneously to execute a single SQL statement. By dividing the work necessary to execute a statement among multiple processes, Oracle can execute the statement more quickly than if only a single process executed it.

Parallel execution can dramatically improve performance for data-intensive operations associated with decision support applications or very large database environments. Symmetric multiprocessing (SMP), clustered systems, and massively parallel systems (MPP) gain the largest performance benefits from parallel execution because statement processing can be split up among many CPUs on a single Oracle system.

Parallel execution helps systems scale in performance by making optimal use of hardware resources. If your system's CPUs and disk controllers are already heavily loaded, you need to alleviate the system's load or increase these hardware resources before using parallel execution to improve performance.

See Also:

Oracle8i Designing and Tuning for Performance for specific information on tuning your parameter files and database to take full advantage of parallel execution 

Operations That Can Be Parallelized

The Oracle server can use parallel execution for any of these operations:

How Oracle Parallelizes Operations

A SELECT statement contains a query only. A DML or DDL statement usually contains a query portion and a DML or DDL portion, each of which can be parallelized.


Note:

Although data manipulation language (DML) includes queries, in this chapter "DML" refers only to INSERTs, UPDATEs, and DELETEs.  


Oracle primarily parallelizes SQL statements in the following ways:

  1. Parallelize by block ranges for scan operations (SELECTs and subqueries in DML and DDL statements).

  2. Parallelize by partitions for DDL and DML operations on partitioned tables and indexes.

  3. Parallelize by parallel execution servers for inserts into nonpartitioned tables only.

Parallelizing by Block Range

Oracle parallelizes a query dynamically at execution time. Dynamic parallelism divides the table or index into ranges of database blocks (rowid range) and executes the operation in parallel on different ranges. If the distribution or location of data changes, Oracle automatically adapts to optimize the parallelization for each execution of the query portion of a SQL statement.

Parallel scans by block range break the table or index into pieces delimited by high and low rowid values. The table or index can be nonpartitioned or partitioned.

For partitioned tables and indexes, no rowid range can span a partition although one partition can contain multiple rowid ranges. Oracle sends the partition numbers with the rowid ranges to avoid partition map lookup. Compile and run-time predicates on partitioning columns restrict the rowid ranges to relevant partitions, eliminating unnecessary partition scans (partition pruning).

This means that a parallel query that accesses a partitioned table by a table scan performs the same amount of work as, or less work than, the same query on a nonpartitioned table. The query on the partitioned table executes with equivalent parallelism, although the total number of disks accessed might be reduced by partition pruning.

Oracle can parallelize the following operations on tables and indexes by block range (rowid range):

Parallelizing by Partition

Partitions are a logical static division of tables and indexes which can be used to break some long-running operations into smaller operations executed in parallel on individual partitions. The granule of parallelism is a partition; there is no parallelism within a partition except for:

Operations on partitioned tables and indexes are performed in parallel by assigning different parallel execution servers to different partitions of the table or index. Compile and run-time predicates restrict the partitions when the operation references partitioning columns. The operation executes serially when compile or run-time predicates restrict the operation to a single partition.

The parallel operation may use fewer parallel execution servers than the number of accessed partitions (because of resource limits, hints, or table attributes), but each partition is accessed by a single parallel execution server. A parallel execution server, however, can access multiple partitions.

Operations on partitioned tables and indexes are performed in parallel only when more than one partition is accessed.

Oracle can parallelize the following operations on partitioned tables and indexes by partition:

Parallelizing by Parallel Execution Servers

For nonpartitioned tables only, Oracle parallelizes insert operations by dividing the work among parallel execution servers. Because new rows do not have rowids, the rows are distributed among the parallel execution servers to insert them into the free space.

Process Architecture for Parallel Execution

When parallel execution is not being used, a single server process performs all necessary processing for the sequential execution of a SQL statement. For example, to perform a full table scan (such as SELECT * FROM EMP), one process performs the entire operation, as illustrated in Figure 23-1.

Figure 23-1 Serial Full Table Scan


Parallel execution performs these operations in parallel using multiple parallel processes. One process, known as the parallel execution coordinator, dispatches the execution of a statement to several parallel execution servers and coordinates the results from all of the server processes to send the results back to the user.


Note:

Parallel execution server does not mean a process of an Oracle Parallel Server, but instead means a process that performs an operation in parallel with other processes. In an Oracle Parallel Server, the parallel execution servers may be spread across multiple instances. Parallel execution servers are also sometimes called slave processes.  


When an operation is divided into pieces for parallel execution in a massively parallel processing (MPP) configuration, Oracle assigns a particular piece of the operation to a parallel execution server by taking into account the affinity of the process for the piece of the table or index to be used for the operation. The physical layout of partitioned tables and indexes affects the affinity used to assign work to parallel execution servers.

Figure 23-2 illustrates several parallel execution servers simultaneously performing a partial scan of the EMP table, which is divided by block range dynamically (dynamic partitioning). The parallel execution servers send results back to the parallel execution coordinator, which assembles the pieces into the desired full table scan.

Figure 23-2 Parallel Full Table Scan


The parallel execution coordinator breaks down execution functions into parallel pieces and then integrates the partial results produced by the parallel execution servers. The number of parallel execution servers assigned to a single operation is the degree of parallelism (DOP) for an operation. Multiple operations within the same SQL statement all have the same degree of parallelism.


The Parallel Execution Server Pool

When an instance starts up, Oracle creates a pool of parallel execution servers which are available for any parallel operation. The initialization parameter PARALLEL_MIN_SERVERS specifies the number of parallel execution servers that Oracle creates at instance startup.

When executing a parallel operation, the parallel execution coordinator obtains parallel execution servers from the pool and assigns them to the operation. If necessary, Oracle can create additional parallel execution servers for the operation. These parallel execution servers remain with the operation throughout job execution, then become available for other operations. After the statement has been processed completely, the parallel execution servers return to the pool.


Note:

The parallel execution coordinator and the parallel execution servers can only service one statement at a time. A parallel execution coordinator cannot coordinate, for example, a parallel query and a parallel DML statement at the same time.  


When a user issues a SQL statement, the optimizer decides whether to execute the operations in parallel and determines the degree of parallelism for each operation. You can specify the number of parallel execution servers required for an operation in various ways.

If the optimizer targets the statement for parallel processing, the following sequence of events takes place:

  1. The SQL statement's foreground process becomes a parallel execution coordinator.

  2. The parallel execution coordinator obtains as many parallel execution servers as needed (determined by the degree of parallelism) from the server pool or creates new parallel execution servers as needed.

  3. Oracle executes the statement as a sequence of operations. Each operation is performed in parallel, if possible.

  4. When statement processing is completed, the coordinator returns any resulting data to the user process that issued the statement and returns the parallel execution servers to the server pool.

The parallel execution coordinator calls upon the parallel execution servers during the execution of the SQL statement, not during the parsing of the statement. Therefore, when parallel execution is used with the multi-threaded server, the server process that processes the EXECUTE call of a user's statement becomes the parallel execution coordinator for the statement.

See Also:

"Setting the Degree of Parallelism" 

Variations in the Number of Parallel Execution Servers

If the number of parallel operations processed concurrently by an instance changes significantly, Oracle automatically changes the number of parallel execution servers in the pool.

If the number of parallel operations increases, Oracle creates additional parallel execution servers to handle incoming requests. However, Oracle never creates more parallel execution servers for an instance than what is specified by the initialization parameter PARALLEL_MAX_SERVERS.

If the number of parallel operations decreases, Oracle terminates any parallel execution servers that have been idle for a threshold period of time. Oracle does not reduce the size of the pool below the value of PARALLEL_MIN_SERVERS no matter how long the parallel execution servers have been idle.
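
For example, a minimal initialization parameter file (init.ora) fragment that sizes the pool might look like the following; the values are illustrative only, not recommendations:

PARALLEL_MIN_SERVERS = 4      # parallel execution servers created at instance startup
PARALLEL_MAX_SERVERS = 16     # upper limit; Oracle never creates more servers than this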

Processing Without Enough Parallel Execution Servers

Oracle can process a parallel operation with fewer than the requested number of processes.

If all parallel execution servers in the pool are occupied and the maximum number of parallel execution servers has been started, the parallel execution coordinator switches to serial processing.


How Parallel Execution Servers Communicate

To execute a query in parallel, Oracle generally creates a set of producer parallel execution servers and a set of consumer parallel execution servers. The producer servers retrieve rows from tables and the consumer servers perform operations such as join, sort, DML, and DDL on these rows. Each server in the producer set has a connection to each server in the consumer set. This means that the number of virtual connections between parallel execution servers increases as the square of the degree of parallelism.

Each communication channel has at least 1, and sometimes up to 4 memory buffers. Multiple memory buffers facilitate asynchronous communication among the parallel execution servers.

A single-instance environment uses at most 3 buffers per communication channel. An Oracle Parallel Server environment uses at most 4 buffers per channel. Figure 23-3 illustrates message buffers and how producer parallel execution servers connect to consumer parallel execution servers.

Figure 23-3 Parallel Execution Server Connections and Buffers


When a connection is between two processes on the same instance, the servers communicate by passing the buffers back and forth. When the connection is between processes in different instances, the messages are sent using external high-speed network protocols. In Figure 23-3, the degree of parallelism is equal to the number of parallel execution servers, which in this case is "n". Figure 23-3 does not show the parallel execution coordinator: each parallel execution server actually has an additional connection to the parallel execution coordinator.

Parallelizing SQL Statements

Each SQL statement undergoes an optimization and parallelization process when it is parsed. Therefore, when the data changes, if a more optimal execution plan or parallelization plan becomes available, Oracle can automatically adapt to the new situation.

After the optimizer determines the execution plan of a statement, the parallel execution coordinator determines the parallelization method for each operation in the execution plan. For example, it might parallelize a full table scan by block range or an index range scan by partition. The coordinator must decide whether an operation can be performed in parallel and, if so, how many parallel execution servers to enlist. The number of parallel execution servers is the degree of parallelism.


Dividing Work Among Parallel Execution Servers

The parallel execution coordinator examines the redistribution requirements of each operation. An operation's redistribution requirement is the way in which the rows operated on by the operation must be divided, or redistributed, among the parallel execution servers.

After determining the redistribution requirement for each operation in the execution plan, the optimizer determines the order in which the operations in the execution plan must be performed. With this information, the optimizer determines the data flow of the statement.

Figure 23-4 illustrates the data flow of the following query:

SELECT dname, MAX(sal), AVG(sal) 
  FROM emp, dept 
    WHERE emp.deptno = dept.deptno 
    GROUP BY dname; 

Figure 23-4 Data Flow Diagram for a Join of the EMP and DEPT Tables


Parallelism Between Operations

Operations that require the output of other operations are known as parent operations. In Figure 23-4 the GROUP BY SORT operation is the parent of the MERGE JOIN operation because GROUP BY SORT requires the MERGE JOIN output.

Parent operations can begin consuming rows as soon as the child operations have produced rows. In the previous example, while the parallel execution servers are producing rows in the FULL SCAN DEPT operation, another set of parallel execution servers can begin to perform the MERGE JOIN operation to consume the rows.

Each of the two operations performed concurrently is given its own set of parallel execution servers. Therefore, both query operations and the data flow tree itself have parallelism. The parallelism of an individual operation is called intra-operation parallelism and the parallelism between operations in a data flow tree is called inter-operation parallelism.

Due to the producer/consumer nature of the Oracle server's operations, only two operations in a given tree need to be performed simultaneously to minimize execution time.

To illustrate intra-operation parallelism and inter-operation parallelism, consider the following statement:

SELECT * FROM emp ORDER BY ename; 

The execution plan implements a full scan of the EMP table followed by a sorting of the retrieved rows based on the value of the ENAME column. For the sake of this example, assume the ENAME column is not indexed. Also assume that the degree of parallelism for the query is set to four, which means that four parallel execution servers can be active for any given operation.

Figure 23-5 illustrates the parallel execution of our example query.

Figure 23-5 Inter-Operation Parallelism and Dynamic Partitioning


As you can see from Figure 23-5, there are actually eight parallel execution servers involved in the query even though the degree of parallelism is four. This is because a parent operation and a child operation can be performed at the same time (inter-operation parallelism).

Also note that all of the parallel execution servers involved in the scan operation send rows to the appropriate parallel execution server performing the sort operation. If a row scanned by a parallel execution server contains a value for the ENAME column between A and G, that row gets sent to the first ORDER BY parallel execution server. When the scan operation is complete, the sorting processes can return the sorted results to the coordinator, which in turn returns the complete query results to the user.


Note:

When a set of parallel execution servers completes its operation, it moves on to operations higher in the data flow. For example, in the previous diagram, if there were another ORDER BY operation after the first ORDER BY, the parallel execution servers performing the table scan would perform the second ORDER BY operation after completing the table scan.


Setting the Degree of Parallelism

The parallel execution coordinator may enlist two or more of the instance's parallel execution servers to process a SQL statement. The number of parallel execution servers associated with a single operation is known as the degree of parallelism.

The degree of parallelism is specified in the following ways:

The following example shows a statement that sets the degree of parallelism to 4 on a table:

ALTER TABLE emp PARALLEL 4;


This next example sets the degree of parallelism on an index:

ALTER INDEX iemp PARALLEL;


This last example uses a hint to set the degree of parallelism to 4 for a query:

SELECT /*+ PARALLEL(emp,4) */ COUNT(*) FROM emp;


Note that the degree of parallelism applies directly only to intra-operation parallelism. If inter-operation parallelism is possible, the total number of parallel execution servers for a statement can be twice the specified degree of parallelism. No more than two operations can be performed simultaneously.

Parallel execution is designed to effectively use multiple CPUs and disks to answer queries quickly. When multiple users use parallel execution at the same time, it is easy to quickly exhaust available CPU, memory, and disk resources. Oracle provides several ways to deal with resource utilization in conjunction with parallel execution, including:

How Oracle Determines the Degree of Parallelism for Operations

The parallel execution coordinator determines the degree of parallelism by considering several specifications. The coordinator:

  1. Checks for hints or a PARALLEL clause specified in the SQL statement itself.

  2. Looks at the table's or index's definition.

  3. Checks for the default degree of parallelism.

After a degree of parallelism is found in one of these specifications, it becomes the degree of parallelism for the operation.

Hints, PARALLEL clauses, table or index definitions, and default values only determine the number of parallel execution servers that the coordinator requests for a given operation. The actual number of parallel execution servers used depends upon how many processes are available in the parallel execution server pool and whether inter-operation parallelism is possible.


Hints

You can specify hints in a SQL statement to set the degree of parallelism for a table or index and the caching behavior of the operation.

Table and Index Definitions

You can specify the degree of parallelism within a table or index definition. Use one of the following SQL statements to set the degree of parallelism for a table or index: CREATE TABLE, ALTER TABLE, CREATE INDEX, or ALTER INDEX.
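
For example, the following statements (the objects are hypothetical) declare a default degree of parallelism in the object definitions themselves:

CREATE TABLE emp_history (empno NUMBER, ename VARCHAR2(30), hiredate DATE)
  PARALLEL 4;

CREATE INDEX i_emp_history ON emp_history (hiredate) PARALLEL 4;

ALTER TABLE emp_history PARALLEL 8;     -- change the default degree later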

See Also:

Oracle8i SQL Reference for the complete syntax of SQL statements 

Default Degree of Parallelism

The default degree of parallelism is used when you ask to parallelize an operation but you do not specify a degree of parallelism in a hint or within the definition of a table or index. The default degree of parallelism is appropriate for most applications.

The default degree of parallelism for a SQL statement is determined by the following factors:

The above factors determine the default number of parallel execution servers to use. However, the actual number of processes used is limited by their availability on the requested instances during run time. The initialization parameter PARALLEL_MAX_SERVERS sets an upper limit on the total number of parallel execution servers that an instance can have.

If a minimum fraction of the desired parallel execution servers is not available (specified by the initialization parameter PARALLEL_MIN_PERCENT), a user error is produced. The user can then retry the query with less parallelism.

See Also:

Oracle8i Designing and Tuning for Performance for information about adjusting the degree of parallelism 

Adaptive Multi-User Algorithm

When the adaptive multi-user algorithm is enabled, the parallel execution coordinator varies the degree of parallelism according to the system load. The Database Resource Manager determines the load by calculating the number of allocated threads. If the number of threads currently allocated is larger than the optimal number of threads, given the number of available CPUs, the algorithm reduces the degree of parallelism. This reduction improves throughput by avoiding overallocation of resources.

Minimum Number of Parallel Execution Servers

Oracle can perform an operation in parallel as long as at least two parallel execution servers are available. If too few parallel execution servers are available, your SQL statement may execute slower than expected. You can specify that a minimum percentage of requested parallel execution servers must be available in order for the operation to execute. This strategy ensures that your SQL statement executes with a minimum acceptable parallel performance. If the minimum percentage of requested parallel execution servers are not available, the SQL statement does not execute and returns an error.

The initialization parameter PARALLEL_MIN_PERCENT specifies the desired minimum percentage of requested parallel execution servers. This parameter affects DML and DDL operations as well as queries.

For example, if you specify 50 for this parameter, then at least 50% of the parallel execution servers requested for any parallel operation must be available in order for the operation to succeed. If 20 parallel execution servers are requested, then at least 10 must be available or an error is returned to the user. If PARALLEL_MIN_PERCENT is set to null, then all parallel operations will proceed as long as at least two parallel execution servers are available for processing.
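
For example, the following sketch (using the EMP table from earlier examples) shows this behavior:

ALTER SESSION SET PARALLEL_MIN_PERCENT = 50;

-- Requests 20 parallel execution servers; if fewer than 10 (50%) are
-- available, the statement returns an error rather than running with
-- reduced parallelism.
SELECT /*+ PARALLEL(emp,20) */ COUNT(*) FROM emp;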

Limiting the Number of Available Instances

In an Oracle Parallel Server, instance groups can be used to limit the number of instances that participate in a parallel operation. You can create any number of instance groups, each consisting of one or more instances. You can then specify which instance group is to be used for any or all parallel operations. Parallel execution servers will only be used on instances which are members of the specified instance group.

See Also:

Oracle8i Parallel Server Concepts for more information about instance groups 

Balancing the Work Load

To optimize performance, all parallel execution servers should have equal work loads. For SQL statements parallelized by block range or by parallel execution servers, the work load is dynamically divided among the parallel execution servers. This minimizes workload skewing, which occurs when some parallel execution servers perform significantly more work than the other processes.

For SQL statements parallelized by partitions, if the work load is evenly distributed among the partitions then you can optimize performance by matching the number of parallel execution servers to the number of partitions, or by choosing a degree of parallelism such that the number of partitions is a multiple of the number of processes.

For example, if a table has ten partitions and a parallel operation divides the work evenly among them, you can use ten parallel execution servers (degree of parallelism = 10) to do the work in approximately one-tenth the time that one process would take, or you can use five processes to do the work in one-fifth the time, or two processes to do the work in one-half the time.

If, however, you use nine processes to work on ten partitions, the first process to finish its work on one partition then begins work on the tenth partition; and as the other processes finish their work they become idle. This does not give good performance when the work is evenly divided among partitions. When the work is unevenly divided, the performance varies depending on whether the partition that is left for last has more or less work than the other partitions.

Similarly, if you use four processes to work on ten partitions and the work is evenly divided, then each process works on a second partition after finishing its first partition, but only two of the processes work on a third partition while the other two remain idle.

In general, you cannot assume that a parallel operation on N partitions with P parallel execution servers takes 1/P of the time that a single process would take (that is, N/P partition-times), because some processes might have to wait while others finish working on the last partitions. By choosing an appropriate degree of parallelism, however, you can minimize workload skewing and optimize performance.

See Also:

"Affinity and Parallel DML" for information about balancing the work load with disk affinity 

Parallelization Rules for SQL Statements

A SQL statement can be parallelized if it includes a parallel hint or if the table or index being operated on has been declared PARALLEL with a CREATE or ALTER statement. In addition, a data definition language (DDL) statement can be parallelized by using the PARALLEL clause. However, not all of these methods apply to all types of SQL statements.

Parallelization has two components: the decision to parallelize and the degree of parallelism. These components are determined differently for queries, DDL operations, and DML operations.

To determine the degree of parallelism, Oracle looks at the reference objects:

Rules for Parallelizing Queries

Decision to Parallelize

A SELECT statement can be parallelized only if the following conditions are satisfied:

  1. The query includes a parallel hint specification (PARALLEL or PARALLEL_INDEX) or the schema objects referred to in the query have a PARALLEL declaration associated with them.

  2. At least one of the tables specified in the query requires one of the following:

    • A full table scan

    • An index range scan spanning multiple partitions

Degree of Parallelism

The degree of parallelism for a query is determined by the following rules:

  1. The query uses the maximum degree of parallelism taken from all of the table declarations involved in the query and all of the potential indexes that are candidates to satisfy the query (the reference objects). That is, the table or index that has the greatest degree of parallelism determines the query's degree of parallelism (maximum query directive).

  2. If a table has both a parallel hint specification in the query and a parallel declaration in its table specification, the hint specification takes precedence over the parallel declaration specification.

Rules for Parallelizing UPDATE and DELETE

Update and delete operations are parallelized by partition or subpartition. Updates and deletes can only be parallelized on partitioned tables; update/delete parallelism is not possible within a partition, nor on a nonpartitioned table.

You have two ways to specify parallel directives for UPDATE and DELETE operations (assuming that PARALLEL DML mode is enabled):

  1. Parallel clause specified in the definition of the table being updated or deleted (the reference object).

  2. An update or delete parallel hint specified in the statement.

Parallel hints are placed immediately after the UPDATE or DELETE keywords in UPDATE and DELETE statements. The hint also applies to the underlying scan of the table being changed.

Parallel clauses in CREATE TABLE and ALTER TABLE statements specify table parallelism. If a parallel clause exists in a table definition, it determines the parallelism of DML statements as well as queries. If the DML statement contains explicit parallel hints for a table, however, then those hints override the effect of parallel clauses for that table.

You can use the ALTER SESSION FORCE PARALLEL DML statement to override parallel clauses for subsequent update and delete statements in a session. Parallel hints in update and delete statements override the ALTER SESSION FORCE PARALLEL DML statement.
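
For example, a sketch of these precedence rules (TBL_1 is the hypothetical partitioned table used in the examples later in this section):

ALTER SESSION ENABLE PARALLEL DML;
ALTER SESSION FORCE PARALLEL DML PARALLEL 4;

-- No hint in the statement: the forced degree of 4 applies.
UPDATE tbl_1 SET c1 = c1 + 1 WHERE c1 > 100;

-- A parallel hint in the statement overrides FORCE PARALLEL DML.
DELETE /*+ PARALLEL(tbl_1,8) */ FROM tbl_1 WHERE c1 > 500;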

See Also:

"Composite Partitioning" for information about updating and deleting when there are subpartitions 

Decision to Parallelize

The following rule determines whether the update/delete operation should be parallelized in an update/delete statement:

If the statement contains subqueries or updatable views, then they may have their own separate parallel hints or clauses, but these parallel directives do not affect the decision to parallelize the update or delete.

Although the parallel hint or clause on the tables is used by both query and update/delete portions to determine parallelism, the decision to parallelize the update/delete portion is made independently of the query portion, and vice versa.

Degree of Parallelism

The degree of parallelism is determined by the same rules as for the queries. Note that in the case of update and delete operations, only the target table to be modified (the only reference object) is involved.

The precedence rule to determine the degree of parallelism for the update/delete operation is that the update or delete parallel hint specification takes precedence over the parallel declaration specification of the target table.

The maximum degree of parallelism you can achieve is equal to the number of partitions (or subpartitions, in the case of composite partitioning) in the table. A parallel execution server can update or delete rows in multiple partitions, but each partition can only be updated or deleted by one parallel execution server.

If the degree of parallelism is less than the number of partitions, then the first process to finish work on one partition continues working on another partition, and so on until the work is finished on all partitions. If the degree of parallelism is greater than the number of partitions involved in the operation, then the excess parallel execution servers would have no work to do.

Example 1:
UPDATE tbl_1 SET c1=c1+1 WHERE c1>100; 

If TBL_1 is a partitioned table and its table definition has a parallel clause, then the update operation will be parallelized even if the scan on the table is serial (such as an index scan), assuming that the table has more than one partition with C1 greater than 100.

Example 2:
UPDATE /*+ PARALLEL(tbl_2,4) */ tbl_2 SET c1=c1+1; 

Both the scan and update operations on TBL_2 will be parallelized with degree 4.

Rules for Parallelizing INSERT ... SELECT

An INSERT ... SELECT statement parallelizes its INSERT and SELECT operations independently, except for the degree of parallelism.

You can specify a parallel hint after the INSERT keyword in an INSERT ... SELECT statement. Because the tables being queried are usually not the same as the table being inserted into, the hint allows you to specify parallel directives specifically for the insert operation.

You have the following ways to specify parallel directives for an INSERT... SELECT statement (assuming that PARALLEL DML mode is enabled):

You can use the ALTER SESSION FORCE PARALLEL DML statement to override parallel clauses for subsequent insert operations in a session. Parallel hints in insert operations override the ALTER SESSION FORCE PARALLEL DML statement.

Decision to Parallelize

The following rule determines whether the insert operation should be parallelized in an INSERT... SELECT statement:

Hence the decision to parallelize the insert operation is made independently of the select operation, and vice versa.

Degree of Parallelism

Once the decision to parallelize the select and/or insert operation is made, one parallel directive is picked to decide the degree of parallelism of the whole statement, using the following precedence rule:

where Maximum Query directive means that among multiple tables and indexes, the table or index that has the maximum degree of parallelism determines the parallelism for the query operation.

The chosen parallel directive is applied to both the select and insert operations.

Example:

In the following example, the degree of parallelism used will be 2, which is the degree specified in the Insert hint:

INSERT /*+ PARALLEL(tbl_ins,2) */ INTO tbl_ins 
  SELECT /*+ PARALLEL(tbl_sel,4) */ * FROM tbl_sel; 

Rules for Parallelizing DDL Statements

Decision to Parallelize

DDL operations can be parallelized if a PARALLEL clause (declaration) is specified in the syntax. In the case of CREATE INDEX and ALTER INDEX ... REBUILD or ALTER INDEX ... REBUILD PARTITION, the parallel declaration is stored in the data dictionary.

You can use the ALTER SESSION FORCE PARALLEL DDL statement to override the parallel clauses of subsequent DDL statements in a session.
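
For example:

-- Force subsequent DDL in this session to run with a degree of parallelism
-- of 4, overriding any PARALLEL clause in the statements themselves.
ALTER SESSION FORCE PARALLEL DDL PARALLEL 4;

-- Return to the default behavior, in which the PARALLEL clauses of the DDL
-- statements apply.
ALTER SESSION ENABLE PARALLEL DDL;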

Degree of Parallelism

The degree of parallelism is determined by the specification in the PARALLEL clause, unless it is overridden by an ALTER SESSION FORCE PARALLEL DDL statement. A rebuild of a partitioned index is never parallelized.

Rules for Parallelizing Create Index, Rebuild Index, Move/Split Partition

Parallel CREATE INDEX or ALTER INDEX ... REBUILD

The CREATE INDEX and ALTER INDEX ... REBUILD statements can be parallelized only by a PARALLEL clause or an ALTER SESSION FORCE PARALLEL DDL statement.

ALTER INDEX ... REBUILD can be parallelized only for a nonpartitioned index, but ALTER INDEX ... REBUILD PARTITION can be parallelized by a PARALLEL clause or an ALTER SESSION FORCE PARALLEL DDL statement.

The scan operation for ALTER INDEX ... REBUILD (nonpartitioned), ALTER INDEX ... REBUILD PARTITION, and CREATE INDEX is parallelized in the same way and uses the same degree of parallelism as the REBUILD or CREATE operation. If the degree of parallelism is not specified for REBUILD or CREATE, the default is the number of CPUs.
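
For example, using a hypothetical index on the EMP table:

CREATE INDEX i_emp_ename ON emp (ename) PARALLEL 4;

-- Rebuild a nonpartitioned index in parallel.
ALTER INDEX i_emp_ename REBUILD PARALLEL 4;

-- Rebuild one partition of a partitioned index in parallel
-- (I_EMP_PART and its partition P1 are hypothetical).
ALTER INDEX i_emp_part REBUILD PARTITION p1 PARALLEL 4;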

Parallel MOVE PARTITION or SPLIT PARTITION

The ALTER TABLE ... MOVE PARTITION and ALTER TABLE ... SPLIT PARTITION statements can be parallelized only by a PARALLEL clause or an ALTER SESSION FORCE PARALLEL DDL statement. Their scan operations have the same parallelism as the corresponding MOVE or SPLIT operations. If the degree of parallelism is not specified, the default is the number of CPUs.
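
For example, using a hypothetical SALES table partitioned by date:

ALTER TABLE sales MOVE PARTITION sales_q1 TABLESPACE ts_history PARALLEL 4;

ALTER TABLE sales SPLIT PARTITION sales_q4
  AT (TO_DATE('15-NOV-1999','DD-MON-YYYY'))
  INTO (PARTITION sales_q4a, PARTITION sales_q4b)
  PARALLEL 4;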

Rules for Parallelizing Create Table as Select

The CREATE TABLE ... AS SELECT statement contains two parts: a CREATE part (DDL) and a SELECT part (query). Oracle can parallelize both parts of the statement. The CREATE part follows the same rules as other DDL operations.

Decision to Parallelize (Query Part)

The query part of a CREATE TABLE ... AS SELECT statement can be parallelized only if the following conditions are satisfied:

  1. The query includes a parallel hint specification (PARALLEL or PARALLEL_INDEX) or the CREATE part of the statement has a PARALLEL clause specification or the schema objects referred to in the query have a PARALLEL declaration associated with them.

  2. At least one of the tables specified in the query requires one of the following:

    • A full table scan

    • An index range scan spanning multiple partitions

Degree of Parallelism (Query Part)

The degree of parallelism for the query part of a CREATE TABLE ... AS SELECT statement is determined by one of these rules:

Note that any values specified in a hint for parallelism are ignored.

See Also:

"Rules for Parallelizing Queries" 

Decision to Parallelize (Create Part)

The CREATE operation of CREATE TABLE ... AS SELECT can be parallelized only by a PARALLEL clause or an ALTER SESSION FORCE PARALLEL DDL statement.

When the CREATE operation of CREATE TABLE ... AS SELECT is parallelized, Oracle also parallelizes the scan operation if possible. The scan operation cannot be parallelized if, for example:

When the CREATE operation is not parallelized, the SELECT can be parallelized if it has a PARALLEL hint or if the selected table (or partitioned index) has a parallel declaration.

Degree of Parallelism (Create Part)

The degree of parallelism for the CREATE operation, and for the SELECT operation if it is parallelized, is specified by the PARALLEL clause of the CREATE statement, unless it is overridden by an ALTER SESSION FORCE PARALLEL DDL statement. If the PARALLEL clause does not specify the degree of parallelism, the default is the number of CPUs.

Summary of Parallelization Rules

Table 23-1 shows how various types of SQL statements can be parallelized, and indicates which methods of specifying parallelism take precedence.

Parallel Query

You can parallelize queries and subqueries in SELECT statements, as well as the query portions of DDL statements and DML statements (INSERT, UPDATE, and DELETE).

However, you cannot parallelize the query portion of a DDL or DML statement if it references a remote object. When you issue a parallel DML or DDL statement in which the query portion references a remote object, the operation is executed serially without notification.


Parallel Queries on Index-Organized Tables

The following parallel scan methods are supported on index-organized tables:

These scan methods can be used for index-organized tables with overflow areas and index-organized tables that contain LOBs.

Nonpartitioned Index-Organized Tables

Parallel query on a nonpartitioned index-organized table uses parallel fast full scan. The degree of parallelism is determined, in decreasing order of priority, by:

  1. The PARALLEL hint (if present)

  2. The parallel degree associated with the table if the parallel degree is specified in the CREATE TABLE or ALTER TABLE statement

The allocation of work is done by dividing the index segment into a sufficiently large number of block ranges and then assigning block ranges to parallel execution servers in a demand-driven manner. The overflow blocks corresponding to any row are accessed in a demand-driven manner only by the process which owns that row.

Partitioned Index-Organized Tables

Both index range scan and fast full scan can be performed in parallel. For parallel fast full scan, parallelization is exactly the same as for nonpartitioned index-organized tables. For a parallel index range scan on a partitioned index-organized table, the degree of parallelism is the minimum of the degree picked up from the above priority list (as for parallel fast full scan) and the number of partitions in the index-organized table. Depending on the degree of parallelism, each parallel execution server gets one or more partitions (assigned in a demand-driven manner), each of which contains the primary key index segment and the associated overflow segment, if any.

Parallel Queries on Object Types

Parallel queries can be performed on object type tables and tables containing object type columns. Parallel query for object types supports all of the features that are available for sequential queries on object types, including:

There are no limitations on the size of the object types for parallel queries.

The following restrictions apply to using parallel query for object types.

In all cases where the query cannot execute in parallel because of any of the above restrictions, the whole query executes serially without giving an error message.

Parallel DDL

This section includes the following topics on parallelism for data definition language (DDL) statements:

DDL Statements That Can Be Parallelized

You can parallelize DDL statements for tables and indexes that are nonpartitioned or partitioned. Table 23-1 summarizes the operations that can be parallelized in DDL statements.

The parallel DDL statements for nonpartitioned tables and indexes are:

The parallel DDL statements for partitioned tables and indexes are:

All of these DDL operations can be performed in no-logging mode for either parallel or serial execution.

CREATE TABLE for an index-organized table can be parallelized either with or without an AS SELECT clause.

Different parallelism is used for different operations (see Table 23-1). Parallel create (partitioned) table as select and parallel create (partitioned) index execute with a degree of parallelism equal to the number of partitions.

Parallel analysis of an entire partitioned table is made less necessary by the ANALYZE {TABLE, INDEX} PARTITION statements, since a parallel analyze of an entire partitioned table can be constructed with multiple user sessions, each analyzing a different partition.

Parallel DDL cannot occur on tables with object columns or LOB columns.


CREATE TABLE ... AS SELECT in Parallel

For performance reasons, decision support applications often require large amounts of data to be summarized or "rolled up" into smaller tables for use with ad hoc, decision support queries. Rollup occurs regularly (such as nightly or weekly) during a short period of system inactivity.

Parallel execution allows you to parallelize both the query and the create operations involved in creating a table from a subquery on another table or set of tables.
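
For example, a minimal sketch of such a rollup, built from the EMP and DEPT tables used earlier in this chapter (the summary table name is hypothetical):

CREATE TABLE dept_salary_summary
  NOLOGGING PARALLEL 5
  AS SELECT dname, MAX(sal) max_sal, AVG(sal) avg_sal
       FROM emp, dept
      WHERE emp.deptno = dept.deptno
      GROUP BY dname;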

Figure 23-6 illustrates creating a table from a subquery in parallel.


Note:

Clustered tables cannot be created and populated in parallel.  


Figure 23-6 Creating a Summary Table in Parallel


Recoverability and Parallel DDL

When summary table data is derived from other tables' data, the recoverability from media failure for the smaller summary table may not be important and can be turned off during creation of the summary table.

If you disable logging during parallel table creation (or any other parallel DDL operation), you should take a backup of the tablespace containing the table once the table is created to avoid loss of the table due to media failure.

Use the NOLOGGING clause of CREATE/ALTER TABLE/INDEX statements to disable undo and redo log generation.


Space Management for Parallel DDL

Creating a table or index in parallel has space management implications that affect both the storage space required during the parallel operation and the free space available after the table or index has been created.

Storage Space for CREATE TABLE ... AS SELECT and CREATE INDEX

When creating a table or index in parallel, each parallel execution server uses the values in the STORAGE clause of the CREATE statement to create temporary segments to store the rows. Therefore, a table created with an INITIAL of 5M and a PARALLEL DEGREE of 12 consumes at least 60 megabytes (MB) of storage during table creation, because each process starts with an extent of 5 MB. When the parallel execution coordinator combines the segments, some of the segments may be trimmed, and the resulting table may be smaller than the requested 60 MB.
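
A sketch of such a statement follows (BIG_SUMMARY and SALES are hypothetical):

-- Each of the 12 parallel execution servers starts with a 5 MB temporary
-- segment, so at least 60 MB of storage is consumed during creation.
CREATE TABLE big_summary
  STORAGE (INITIAL 5M NEXT 5M)
  PARALLEL 12
  AS SELECT * FROM sales;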

See Also:

Oracle8i SQL Reference for a discussion of the syntax of the CREATE TABLE statement 

Free Space and Parallel DDL

When you create indexes and tables in parallel, each parallel execution server allocates a new extent and fills the extent with the table or index's data. Thus, if you create an index with a degree of parallelism of 3, there will be at least three extents for that index initially. This discussion also applies to rebuilding indexes in parallel and moving, splitting, or rebuilding partitions in parallel.

Serial operations require the schema object to have at least one extent. Parallel creations require that tables or indexes have at least as many extents as there are parallel execution servers creating the schema object.

When you create a table or index in parallel, it is possible to create pockets of free space--either external or internal fragmentation. This occurs when the temporary segments used by the parallel execution servers are larger than what is needed to store the rows.

For example, if you specify a degree of parallelism of three for a CREATE TABLE ... AS SELECT statement but there is only one datafile in the tablespace, then the internal fragmentation illustrated in Figure 23-7 can arise. The pockets of free space within internal table extents of a datafile cannot be coalesced with other free space and allocated as extents.


Figure 23-7 Unusable Free Space (Internal Fragmentation)


Parallel DML

Parallel DML (parallel insert, update, and delete) uses parallel execution mechanisms to speed up or scale up large DML operations against large database tables and indexes.


Note:

Although generally data manipulation language (DML) includes queries, in this chapter the term "DML" refers only to inserts, updates, and deletes.  


This section discusses the following parallel DML topics:

Advantages of Parallel DML over Manual Parallelism

You can parallelize DML operations manually by issuing multiple DML statements simultaneously against different sets of data. For example, you can parallelize manually by:

However, manual parallelism has the following disadvantages:

Parallel DML removes these disadvantages by performing inserts, updates, and deletes in parallel automatically.

When to Use Parallel DML

Parallel DML operations are mainly used to speed up large DML operations against large database objects. Parallel DML is useful in a decision support system (DSS) environment where the performance and scalability of accessing large objects are important. Parallel DML complements parallel query in providing you with both querying and updating capabilities for your DSS databases.

The overhead of setting up parallelism makes parallel DML operations infeasible for short OLTP transactions. However, parallel DML operations can speed up batch jobs running in an OLTP database.

Refresh Tables of a Data Warehouse System

In a data warehouse system, large tables need to be refreshed (updated) periodically with new or modified data from the production system. You can do this efficiently by using parallel DML combined with updatable join views.

The data that needs to be refreshed is generally loaded into a temporary table before starting the refresh process. This table contains either new rows or rows that have been updated since the last refresh of the data warehouse. You can use an updatable join view with parallel UPDATE to refresh the updated rows, and you can use an anti-hash join with parallel INSERT to refresh the new rows.
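
A hedged sketch of the refresh pattern follows. CUST is a hypothetical partitioned warehouse table declared PARALLEL; CUST_DELTA holds the changed rows and has a unique key on CUST_ID, so the join view is key-preserved with respect to CUST and therefore updatable:

ALTER SESSION ENABLE PARALLEL DML;

UPDATE (SELECT c.balance AS old_bal, d.balance AS new_bal
          FROM cust c, cust_delta d
         WHERE c.cust_id = d.cust_id)
   SET old_bal = new_bal;

COMMIT;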

See Also:

Oracle8i Designing and Tuning for Performance.  

Intermediate Summary Tables

In a DSS environment, many applications require complex computations that involve constructing and manipulating many large intermediate summary tables. These summary tables are often temporary and frequently do not need to be logged. Parallel DML can speed up the operations against these large intermediate tables. One benefit is that you can put incremental results in the intermediate tables and perform parallel UPDATEs.

In addition, the summary tables may contain cumulative or comparison information which has to persist beyond application sessions; thus, temporary tables are not feasible. Parallel DML operations can speed up the changes to these large summary tables.

Scoring Tables

Many DSS applications score customers periodically based on a set of criteria. The scores are usually stored in large DSS tables. The score information is then used in making a decision, for example, inclusion in a mailing list.

This scoring activity queries and updates a large number of rows in the large table. Parallel DML can speed up the operations against these large tables.

Historical Tables

Historical tables describe the business transactions of an enterprise over a recent time interval. Periodically, the DBA deletes the set of oldest rows and inserts a set of new rows into the table. Parallel INSERT... SELECT and parallel DELETE operations can speed up this rollover task.

Although you can also use the parallel direct loader (SQL*Loader) to insert bulk data from an external source, parallel INSERT ... SELECT is faster for inserting data that already exists in another table in the database.

Dropping a partition can also be used to delete old rows, but to do this, the table must be partitioned by date with an appropriate time interval.
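
For example, a sketch of the rollover (SALES and the staging table SALES_NEW are hypothetical, with SALES partitioned by month):

ALTER SESSION ENABLE PARALLEL DML;

-- Roll in the newest rows from the staging table.
INSERT /*+ PARALLEL(sales,4) */ INTO sales
  SELECT /*+ PARALLEL(sales_new,4) */ * FROM sales_new;
COMMIT;

-- Roll out the oldest rows with a parallel DELETE ...
DELETE /*+ PARALLEL(sales,4) */ FROM sales
  WHERE sale_date < TO_DATE('01-JAN-1999','DD-MON-YYYY');
COMMIT;

-- ... or, when the partitioning interval matches, by dropping the partition.
ALTER TABLE sales DROP PARTITION sales_dec_1998;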

Batch Jobs

Batch jobs executed in an OLTP database during off hours have a fixed time window in which the jobs must complete. A good way to ensure timely job completion is to parallelize their operations. As the work load increases, more machine resources can be added; the scaleup property of parallel operations ensures that the time constraint can be met.

Enabling Parallel DML

A DML statement can be parallelized only if you have explicitly enabled parallel DML in the session via the ENABLE PARALLEL DML clause of the ALTER SESSION statement. This mode is required because parallel DML and serial DML have different locking, transaction, and disk space requirements.

The default mode of a session is DISABLE PARALLEL DML. When PARALLEL DML is disabled, no DML will be executed in parallel even if the PARALLEL hint or PARALLEL clause is used.

When PARALLEL DML is enabled in a session, all DML statements in this session will be considered for parallel execution. However, even if PARALLEL DML is enabled, the DML operation may still execute serially if there are no parallel hints or parallel clauses or if restrictions on parallel operations are violated.

The session's PARALLEL DML mode does not influence the parallelism of SELECT statements, DDL statements, and the query portions of DML statements. Thus, if this mode is not set, the DML operation is not parallelized but scans or join operations within the DML statement may still be parallelized.
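
For example, using the hypothetical tables from the INSERT ... SELECT example earlier in this chapter:

ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ PARALLEL(tbl_ins,2) */ INTO tbl_ins
  SELECT /*+ PARALLEL(tbl_sel,4) */ * FROM tbl_sel;

COMMIT;

ALTER SESSION DISABLE PARALLEL DML;     -- return to the default mode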


Transactions with PARALLEL DML Enabled

A session that is enabled for PARALLEL DML may put transactions in the session in a special mode: If any DML statement in a transaction modifies a table in parallel, no subsequent serial or parallel query or DML statement can access the same table again in that transaction. This means that the results of parallel modifications cannot be seen during the transaction.

Serial or parallel statements that attempt to access a table which has already been modified in parallel within the same transaction are rejected with an error message.

If a PL/SQL procedure or block is executed in a PARALLEL DML enabled session, then this rule applies to statements in the procedure or block.
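
For example, a sketch using the hypothetical TBL_1 table:

ALTER SESSION ENABLE PARALLEL DML;

DELETE /*+ PARALLEL(tbl_1,4) */ FROM tbl_1 WHERE c1 > 100;

-- Any further access to TBL_1 in this transaction is rejected with an error
-- because the table was modified in parallel.
SELECT COUNT(*) FROM tbl_1;

COMMIT;

-- After the transaction ends, the table can be accessed again.
SELECT COUNT(*) FROM tbl_1;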

Transaction Model for Parallel DML

To execute a DML operation in parallel, the parallel execution coordinator acquires or spawns parallel execution servers and each parallel execution server executes a portion of the work under its own parallel process transaction.

The coordinator also has its own coordinator transaction, which can have its own rollback segment.

Rollback Segments

Oracle assigns transactions to rollback segments that have the fewest active transactions. To speed up both forward and undo operations, you should create and bring online enough rollback segments so that at most two parallel process transactions are assigned to one rollback segment.

Create the rollback segments in tablespaces that have enough space for them to extend when necessary and set the MAXEXTENTS storage parameters for the rollback segments to UNLIMITED. Also, set the OPTIMAL value for the rollback segments so that after the parallel DML transactions commit, the rollback segments will be shrunk to the OPTIMAL size.
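
For example, a sketch of one such rollback segment (the name, tablespace, and sizes are illustrative):

CREATE ROLLBACK SEGMENT rbs_pdml01
  TABLESPACE rbs_ts
  STORAGE (INITIAL 1M NEXT 1M MAXEXTENTS UNLIMITED OPTIMAL 20M);

ALTER ROLLBACK SEGMENT rbs_pdml01 ONLINE;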

Two-Phase Commit

A parallel DML operation is executed by more than one independent parallel process transaction. In order to ensure user-level transactional atomicity, the coordinator uses a two-phase commit protocol to commit the changes performed by the parallel process transactions.

This two-phase commit protocol is a simplified version which makes use of shared disk architecture to speed up transaction status lookups, especially during transactional recovery. It does not require the Oracle XA library. In-doubt transactions never become visible to users.

Recovery for Parallel DML

The time required to roll back a parallel DML operation is roughly equal to the time it took to perform the forward operation.

Oracle supports parallel rollback after transaction and process failures, and after instance and system failures. Oracle can parallelize both the rolling forward stage and the rolling back stage of transaction recovery.

See Also:

Oracle8i Backup and Recovery Guide for details about parallel rollback 

Transaction Recovery for User-Issued Rollback

A user-issued rollback of a transaction that failed because of a statement error is performed in parallel by the parallel execution coordinator and the parallel execution servers. The rollback takes approximately the same amount of time as the forward transaction.

Process Recovery

Recovery from the failure of a parallel DML coordinator or parallel execution server is performed by the PMON process. PMON performs the following tasks:

System Recovery

Recovery from a system failure requires a new instance startup. Recovery is performed by the SMON process and any recovery server processes spawned by SMON. Parallel DML statements may be recovered in parallel using parallel rollback. If the initialization parameter COMPATIBLE is set to 8.1.3 or greater, Fast-Start On-Demand Rollback enables dead transactions to be recovered, on demand, one block at a time.

See Also:

"Fast-Start On-Demand Rollback" 

Instance Recovery (Oracle Parallel Server)

Recovery from an instance failure in an Oracle Parallel Server is performed by the recovery processes (that is, the SMON processes and any recovery server processes they spawn) of other live instances. Each recovery process of the live instances can recover the parallel execution coordinator and/or parallel execution server transactions of the failed instance independently.

Space Considerations for Parallel DML

Parallel UPDATE uses the space in the existing object, as opposed to direct-load INSERT which gets new segments for the data.

Space usage characteristics may be different in parallel than they would be if the statement executed sequentially, because multiple concurrent child transactions modify the object.

See Also:

"Space Considerations" for information about space for direct-load INSERT 

Lock and Enqueue Resources for Parallel DML

A parallel DML operation's lock and enqueue resource requirements are very different from the serial DML requirements. Parallel DML holds many more locks, so you should increase the value of the ENQUEUE_RESOURCES and DML_LOCKS parameters.

The processes for a parallel UPDATE, DELETE, or INSERT statement acquire the following locks:

A parallel execution server can work on one or more partitions, but a partition can only be worked on by one parallel execution server.

For example, for a table with 600 partitions running with parallel degree 100, a parallel DML statement needs the following locks (assuming all partitions are involved in the statement):

A special type of parallel UPDATE, called a row-migrating parallel UPDATE, is used only when a table is defined with the row movement clause enabled; row movement allows updated rows to be moved to different partitions or subpartitions.

Table 23-2 summarizes the types of locks acquired by coordinator and parallel execution servers for different types of parallel DML statements.

Table 23-2 Locks Acquired by Parallel DML Statements

Parallel UPDATE or DELETE into partitioned table; WHERE clause pruned to a subset of partitions/subpartitions

  Parallel execution coordinator acquires:
    1 table lock SX
    1 partition lock X per pruned (sub)partition

  Each parallel execution server acquires:
    1 table lock SX
    1 partition lock NULL per pruned (sub)partition owned by the parallel execution server
    1 partition-wait lock S per pruned (sub)partition owned by the parallel execution server

Parallel row-migrating UPDATE into partitioned table; WHERE clause pruned to a subset of (sub)partitions

  Parallel execution coordinator acquires:
    1 table lock SX
    1 partition lock X per pruned (sub)partition
    1 partition lock SX for all other (sub)partitions

  Each parallel execution server acquires:
    1 table lock SX
    1 partition lock NULL per pruned (sub)partition owned by the parallel execution server
    1 partition-wait lock S per pruned partition owned by the parallel execution server
    1 partition lock SX for all other (sub)partitions

Parallel UPDATE, DELETE, or INSERT into partitioned table

  Parallel execution coordinator acquires:
    1 table lock SX
    Partition locks X for all (sub)partitions

  Each parallel execution server acquires:
    1 table lock SX
    1 partition lock NULL per (sub)partition owned by the parallel execution server
    1 partition-wait lock S per (sub)partition owned by the parallel execution server

Parallel INSERT into nonpartitioned table

  Parallel execution coordinator acquires:
    1 table lock X

  Each parallel execution server acquires:
    None

Restrictions on Parallel DML

The following restrictions apply to parallel DML (including direct-load INSERT); the sections below give further details about them.

Violating a restriction causes the statement to execute serially, without warnings or error messages (except for the restriction on statements accessing the same table in a transaction, which can cause error messages). For example, an update is serialized if it is on a nonpartitioned table.

See Also:

Oracle8i Application Developer's Guide - Large Objects (LOBs) for more information about LOB restrictions 

Partitioning Key Restriction

You can update the partitioning key of a partitioned table to a new value only if the update does not cause the row to move to a new partition, unless the table is defined with the row movement clause enabled.
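
As an illustrative sketch, assume a hypothetical table accounts that is partitioned on the column branch_id and defined with ENABLE ROW MOVEMENT. The following statements change the partitioning key in parallel, migrating the affected rows to their new partitions:

ALTER SESSION ENABLE PARALLEL DML;
UPDATE /*+ PARALLEL (accounts, 4) */ accounts SET branch_id = 30 WHERE branch_id = 10;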

Function Restrictions

The function restrictions for parallel DML are the same as those for parallel DDL and parallel query.

See Also:

"Parallel Execution of Functions" 

Data Integrity Restrictions

This section describes the interactions of integrity constraints and parallel DML statements.

NOT NULL and CHECK

These types of integrity constraints are allowed. They are not a problem for parallel DML because they are enforced on the column and row level, respectively.

UNIQUE and PRIMARY KEY

These types of integrity constraints are allowed.

FOREIGN KEY (Referential Integrity)

Restrictions apply to referential integrity whenever a DML operation on one table could cause a recursive DML operation on another table, or whenever performing the integrity check would require seeing, simultaneously, all changes made to the object being modified.

Table 23-3 lists all of the operations that are possible on tables that are involved in referential integrity constraints.

Table 23-3 Referential Integrity Restrictions

DML Statement       Issued on Parent    Issued on Child     Self-Referential
INSERT              (Not applicable)    Not parallelized    Not parallelized
UPDATE No Action    Supported           Supported           Not parallelized
DELETE No Action    Supported           Supported           Not parallelized
DELETE Cascade      Not parallelized    (Not applicable)    Not parallelized

Delete Cascade

DELETE on a table having a foreign key with DELETE CASCADE is not parallelized, because parallel execution servers would try to delete rows from multiple partitions (parent and child tables).

Self-Referential Integrity

DML on tables with self-referential integrity constraints is not parallelized if the referenced keys (primary keys) are involved. For DML on all other columns, parallelism is possible.

Deferrable Integrity Constraints

If there are any deferrable constraints on the table being operated on, the DML operation will not be parallelized.

Trigger Restrictions

A DML operation is not parallelized if the affected tables have any enabled triggers that may fire as a result of the statement. This implies that DML statements on tables that are being replicated are not parallelized.

Relevant triggers must be disabled in order to parallelize DML on the table, as in the sketch below. Note that enabling or disabling triggers invalidates dependent shared cursors.
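
For example, assuming a hypothetical trigger named sales_audit_trg on the sales table, the trigger can be disabled before the parallel DML and re-enabled afterward:

ALTER TRIGGER sales_audit_trg DISABLE;
-- parallel DML statements on sales execute here
ALTER TRIGGER sales_audit_trg ENABLE;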

Distributed Transaction Restrictions

A DML operation cannot be parallelized if it is in a distributed transaction or if the DML or the query operation is against a remote object.

Example 1:

DML statement which queries a remote object:

INSERT /*+ APPEND PARALLEL (t3,2) */ INTO t3 SELECT * FROM t4@dblink; 

The query operation is executed serially without notification because it references a remote object.

Example 2:

DML operation on a remote object:

DELETE /*+ PARALLEL (t1, 2) */ FROM t1@dblink; 

The DELETE operation is not parallelized because it references a remote object.

Example 3:

In a distributed transaction:

SELECT * FROM t1@dblink; 
DELETE /*+ PARALLEL (t2,2) */ FROM t2; 
COMMIT; 

The DELETE operation is not parallelized because it occurs in a distributed transaction (which is started by the SELECT statement).

Parallel Execution of Functions

User-written functions, whether written in PL/SQL, in Java, or as external procedures in C, can be executed in parallel. However, any PL/SQL package variables or Java static attributes used by the function are entirely private to each individual parallel execution process and are newly initialized when each parallel execution process starts, rather than being copied from the original session. Because of this, not all functions generate correct results if executed in parallel.

To allow a user-written function to be executed in parallel, use the PARALLEL_ENABLE keyword when you declare the function in either the CREATE FUNCTION or CREATE PACKAGE statement.
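
A minimal sketch of such a declaration follows; the function name and body are assumptions for illustration only:

CREATE OR REPLACE FUNCTION last4 (p_value VARCHAR2) RETURN VARCHAR2
  PARALLEL_ENABLE
IS
BEGIN
  RETURN SUBSTR(p_value, -4);
END;
/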

Functions in Parallel Queries

In a SELECT statement or a subquery in a DML or DDL statement, a user-written function may be executed in parallel if it has been declared with the PARALLEL_ENABLE keyword; if it is declared in a package or type and has a PRAGMA RESTRICT_REFERENCES that indicates all of WNDS, RNPS, and WNPS; or if it is declared with CREATE FUNCTION and the system can analyze the body of the PL/SQL code and determine that the code neither writes to the database nor reads or modifies package variables.
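
A minimal sketch of the package case, using an assumed package and function name, might look like this:

CREATE OR REPLACE PACKAGE conv_pkg IS
  FUNCTION to_euro (p_amount NUMBER) RETURN NUMBER;
  PRAGMA RESTRICT_REFERENCES (to_euro, WNDS, RNPS, WNPS);
END conv_pkg;
/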

Other parts of a query or subquery can sometimes execute in parallel even if a given function execution must remain serial.


Functions in Parallel DML and DDL Statements

In a parallel DML or DDL statement, as in a parallel query, a user-written function may be executed in parallel if it has been declared with the PARALLEL_ENABLE keyword; if it is declared in a package or type and has a PRAGMA RESTRICT_REFERENCES that indicates all of RNDS, WNDS, RNPS, and WNPS; or if it is declared with CREATE FUNCTION and the system can analyze the body of the PL/SQL code and determine that the code neither reads nor writes to the database nor reads or modifies package variables.

For a parallel DML statement, any function call that cannot be executed in parallel causes the entire DML statement to be executed serially.

For an INSERT ... SELECT or CREATE TABLE ... AS SELECT statement, function calls in the query portion are parallelized according to the parallel query rules in the prior paragraph; the query may be parallelized even if the remainder of the statement must execute serially, or vice versa.
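
As an illustrative sketch (the table names are assumptions, and to_euro is assumed to be parallel-safe as in the earlier package sketch), the query portion of the following statement can be parallelized under the parallel query rules even though the DML portion is evaluated separately:

ALTER SESSION ENABLE PARALLEL DML;
INSERT /*+ APPEND PARALLEL (sales_summary, 4) */ INTO sales_summary
  SELECT /*+ PARALLEL (sales, 4) */ region_id, conv_pkg.to_euro(SUM(amount))
  FROM sales
  GROUP BY region_id;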

Affinity


Note:

The features described in this section are available only if you have purchased Oracle8i Enterprise Edition with the Parallel Server Option. See Getting to Know Oracle8i for information about the features and options available with Oracle8i Enterprise Edition.  


In a shared-disk cluster or massively parallel processing (MPP) configuration, an instance of the Oracle Parallel Server is said to have affinity for a device if the device is directly accessed from the processor(s) on which the instance is running. Similarly, an instance has affinity for a file if it has affinity for the device(s) that the file is stored on.

Determining affinity can be somewhat arbitrary for files that are striped across multiple devices. By convention, an instance is said to have affinity for a tablespace (or a partition of a table or index within a tablespace) if the instance has affinity for the first file in the tablespace.

Oracle considers affinity when allocating work to parallel execution servers. The use of affinity for parallel execution of SQL statements is transparent to users.

Affinity and Parallel Queries

Affinity in parallel queries increases the speed of scanning data from disk by doing the scans on a processor that is near the data. This can provide a substantial performance increase for machines that do not naturally support shared disks.

The most common use of affinity is for a table or index partition to be stored in one file on one device. This configuration provides the highest availability by limiting the damage done by a device failure and makes best use of partition-parallel index scans.

DSS customers might prefer to stripe table partitions over multiple devices (probably a subset of the total number of devices). This allows some queries to prune the total amount of data being accessed using partitioning criteria and still obtain parallelism through rowid-range parallel table (partition) scans. If the devices are configured as a RAID, availability can still be very good. Even when used for DSS, indexes should probably be partitioned on individual devices.

Other configurations (for example, multiple partitions in one file striped over multiple devices) will yield correct query results, but you may need to use hints or explicitly set object attributes to select the correct degree of parallelism.

Affinity and Parallel DML

For parallel DML (inserts, updates, and deletes), affinity enhancements improve cache performance by routing the DML operation to the node that has affinity for the partition.

Affinity determines how to distribute the work among the set of instances and parallel execution servers that perform the DML operation in parallel. Affinity can improve performance in several ways, as described below.

For partitioned tables and indexes, partition-to-node affinity information determines process allocation and work assignment. For shared-nothing MPP systems, the Oracle Parallel Server tries to assign partitions to instances taking the disk affinity of the partitions into account. For shared-disk MPP and cluster systems, partitions are assigned to instances in a round-robin manner.

Affinity is available for parallel DML only when running in an Oracle Parallel Server configuration. Affinity information that persists across statements improves buffer cache hit ratios and reduces block pings between instances.

See Also:

Oracle8i Parallel Server Administration, Deployment, and Performance  

Other Types of Parallelism

In addition to parallel SQL execution, Oracle can use parallelism for the following types of operations: parallel recovery, parallel propagation, and parallel load.

Like parallel SQL, parallel recovery and parallel propagation are executed by a parallel execution coordinator and multiple parallel execution servers. Parallel load, however, uses a different mechanism.

The behavior of the parallel execution coordinator and parallel execution servers may differ, depending on what kind of operation they perform (SQL, recovery, or propagation). For example, each kind of operation responds differently when all parallel execution servers in the pool are occupied and the maximum number of parallel execution servers has been started.

For a given session, the parallel execution coordinator coordinates only one kind of operation. A parallel execution coordinator cannot coordinate, for example, parallel SQL and parallel propagation or parallel recovery at the same time.


