|Oracle9i Real Application Clusters Deployment and Performance
Release 1 (9.0.1)
Part Number A89870-02
When designing database layouts for shared Oracle Real Application Clusters databases, remember that accessing globally shared data from multiple nodes increases transaction processing costs. In other words, multi-node transactions incur more wait time and higher CPU consumption than transactions processed on single node systems. Because of this, if you carefully consider the data access patterns of your applications, your resulting database design will enhance scalability.
In general, you can improve scalability by:
The most scalable and efficient application designs for clustered systems enable a high degree of transaction affinity to the data that the transactions access. The more local your application's data access, the more efficient your application. In this case, the application minimizes the costs of cross-instance synchronization.
All applications running on multi-node systems have some data with low node affinity. This data is shared across the cluster and thus requires synchronization. Cache Fusion, however, reduces the costs associated with globally shared database partitions by more efficiently synchronizing this data across multiple nodes.
Some database resources can become critical when certain transactions execute in Real Application Clusters environments. For example, an excessive rate of inter-instance changes to a small number of hot data blocks that are in the same table can cause increased inter-instance messaging, context switches, and general processing overhead. If a table has one or more indexes, then the maintenance cost can increase even more due to the relative complexity of index changes.
Searching for free space and allocating it when inserting new data can require access to space management structures, such as segment free lists. Also, you must carefully configure sequence number generation if every node in the cluster uses sequence numbers.
When data is frequently inserted into a table from multiple nodes and the table is not partitioned, use free list groups to avoid performance issues. In such situations, contention can be due to concurrent access to data blocks, table segment headers, and other global resource demands.
Free list groups separate the data structures associated with the free space management of a table into disjoint sets that are available for individual instances. With free list groups, the contention among processes working on different instances is reduced because data blocks with sufficient free space for inserts are managed separately for each instance.
Another efficient way of avoiding overhead due to concurrency when inserting data from different nodes is to use partitioned tables. However, in this case the application has to make sure that there is affinity between data in the partitions.
Cache Fusion resolves concurrency on shared data between instances by using cache-to-cache transfers. This reduces the overhead associated with maintaining cache coherency. To avoid inter-instance concurrency altogether, use free list groups. However, before building tables, indexes, or clusters with free list groups and free lists, determine whether the feature is useful for the application.
Oracle9i Real Application Clusters Concepts for a conceptual overview of free list groups
Before designing your database for a particular application, you should understand how frequently data is added to, modified, or read from your database tables. If you use multiple nodes and users or application modules are routed to a particular node, then concurrency among instances can be low. Thus, you would not have to use any particular design strategy.
Free lists and free list groups are usually needed when random inserts to a table from multiple instances occur frequently. Processes looking for space in data blocks can contend for the same blocks and table headers. Performance can be adversely affected by the degree of concurrency and the overhead of shipping data and header blocks from one instance to another. In these cases, using free list groups can improve performance.
For more information on partitioning, refer to "Workload Distribution Concepts in Real Application Clusters".
To migrate your application from a single instance environment to Real Application Clusters, identify the tables that are subject to a high rate of inserts. Do this by querying
V$SQL and searching for
INSERT commands as in the following example:
SELECT SUBSTR(SQL_TEXT,80), DECODE(COMMAND_TYPE,2,'INSERT'),EXECUTIONS FROM V$SQL WHERE COMMAND_TYPE = 2 ORDER BY EXECUTIONS;
Search for the table name in the string for the statements with the highest number of executions. These statements and the indexes that are built on them are candidates for free list groups. Remember to also consider the application partitioning strategy. In other words, a table can be subject to excessive insert rates, but if the
INSERT statements always occur from the same instance, you do not need to increase the
FREELIST GROUPS parameter for the table. In these cases, changing to free lists would still be beneficial for performance.
You can monitor free list group performance by examining the rate of cache transfers and forced disk writes using the
FREELIST GROUPS parameter for some tables to improve performance.
SELECT CLASS, (X_2_NULL_FORCED_STALE + X_2_S_FORCED_STALE) CACHE_TRANSFER FROM V$CLASS_CACHE_TRANSFER;
V$CLASS_CACHE_TRANSFER does not identify cache transfers by object name, you can use other views to identify the objects that significantly contribute to the number of cache transfers. For example,
V$CACHE_TRANSFER has information about each block in the buffer cache that is transferred. Block class number 4 identifies segment headers and block class number 6 identifies free list blocks. The output from the following select statement can show objects that could benefit from increased free list groups values:
SELECT NAME, CLASS#, SUM(XNC) CACHE_TRANSFER FROM V$CACHE_TRANSFER WHERE CLASS# IN (4,6) GROUP BY NAME, CLASS# ORDER BY CACHE_TRANSFER DESC;
Create free lists and free list groups by specifying the
FREELIST GROUPS storage parameters in
INDEX statements. The database can be opened in either exclusive or shared mode. If you need to use free list groups, then the general rule is to create at least one free list group for each Real Application Clusters instance.
You cannot change the value of
1. The maximum value depends on the data block size. If you specify a value that is too large, then an error message informs you of the maximum value. The optimal value for
FREELISTS depends on the expected number of concurrent inserts for each free list group for a particular table.
Each free list group is associated with one or more instances at startup. The default value of
FREELIST GROUPS is
1. This means that all existing free lists of a segment are available to all instances. As mentioned, you would typically set
FREELIST GROUPS equal to the number of instances in Real Application Clusters.
Free list group blocks with enough free space for inserts and updates are effectively disjoint once Oracle allocates them to a particular free list group. However, once data blocks that are allocated to one instance are freed by another instance, they are no longer available to the original instance. This might render some space unusable and possibly create a skew.
The following statement creates a table named
department that has seven free list groups, each of which contains four free lists:
CREATE TABLE department (deptno NUMBER(2), dname VARCHAR2(14), loc VARCHAR2(13) ) STORAGE ( INITIAL 100K NEXT 50K MAXEXTENTS 10 PCTINCREASE 5 FREELIST GROUPS 7 FREELISTS 4 );
Use clustered tables to store records from different tables if the records are frequently accessed as a group by one or more
SELECT statements. Using clustered tables can thus improve performance by reducing the overhead for processing reads. However, clustered tables may be less useful for DML statements.
You cannot specify
FREELIST GROUPS storage parameters in the
CREATE TABLE statement for a clustered table. Instead, specify free list parameters for the entire cluster rather than for individual tables. This is because clustered tables use the storage parameters of the
CREATE CLUSTER statement.
Real Application Clusters allows clusters (other than hash clusters) to use multiple free lists and free list groups. Some hash clusters can also use multiple free lists and free list groups if you created them with a user-defined key for the hashing function and the key is partitioned by instance.
You can also use the
FREELIST GROUPS parameters in the
CREATE INDEX statement. However, you should be aware that inserting into an index differs from inserting into a table because the block Oracle uses is determined by the index key value.
For example, assume you have a table with multiple free list groups that also has an index with multiple free list groups. If two sessions connect to different instances and insert rows into that table, then Oracle uses different blocks to store the table data. This minimizes cache block transfers for the affected data segment. However, index segment cache block transfers can still occur if these sessions insert similar index key values. Therefore, you can only anticipate a slight reduction in cache transfers for the index segment header because Oracle must use more header blocks to store the index free lists.
Oracle9i SQL Reference for more information on the SQL mentioned in this section
When Oracle creates an object with multiple free list groups, the number of a free list group block becomes part of the object's data dictionary definition. It is important to realize that instances and users need to be associated with a free list group block. You can establish this association statically by assigning a fixed instance number to an instance using an initialization parameter, or by specifying the instance number in DDL statements.
The following topics describe:
You can associate an instance with free list groups as follows:
You can use various SQL clauses with the
You can use the
SET INSTANCE clause is useful when an instance fails and users re-connect to other instances. For example, consider a database where space is preallocated to the free list groups in a table. If an instance fails and all the users are failed over to other instances, then their session can be set to use the free list group associated with the failed instance.
If you omit the
SET INSTANCE clause, then the failed over sessions would start inserting data into blocks and extents allocated to the instance they failed over to. Later, when the failed instance is restored and the users connect to it again, the data they inserted would be part of a set of blocks associated with the other instance's free list group. Thus, inter-instance communication could increase.
User processes are automatically associated with free lists based on the Oracle process ID of the process in which they are running as shown in the following example:
You can use the
ALTER SESSION SET INSTANCE statement to use the free list group associated with a particular instance.
Before Oracle inserts rows into a table, the table only has an initial extent with a number of free blocks allocated to it. Otherwise the table is empty. Therefore, you should attempt to preallocate space for the table in a free list group. This guarantees an optimal allocation of extents containing free blocks to the free list groups, and therefore to the instances. Preallocation also avoids extent allocation overhead.
The advantage of doing this is that the physical storage layout can be determined in advance. Moreover, the technique of allocating extents enables you to select the physical file or volume from which the new extents are allocated. However, you should consider whether and how to implement the
ALLOCATE EXTENT clause and a few Oracle initialization parameters when you preallocate as described in the following paragraphs:
ALLOCATE EXTENT clause of the
ALTER TABLE or
ALTER CLUSTER statement enables you to preallocate an extent to a table, index, or cluster with parameters to specify the extent size, datafile, and a group of free lists with which to associate the object.
You can use the
ALTER TABLE (or
ALLOCATE EXTENT statement while the database is running in exclusive mode, as well as in shared mode. When an instance runs in exclusive mode, the instance still follows the same rules for locating space. A transaction can use the master free list or the specific free list group for that instance.
SIZE parameter of the
ALLOCATE EXTENT clause is the extent size in bytes, rounded up to a multiple of the block size. If you do not specify
SIZE, then Oracle calculates the extent size according to the values of the
PCTINCREASE storage parameters.
Oracle does not use the value of
SIZE as a basis for calculating subsequent extent allocations, which are determined by the values set for the
This parameter specifies the datafile from which to take space for an extent. If you omit this parameter, then Oracle allocates space from any accessible datafile in the tablespace containing the table.
The filename must exactly match the string stored in the control file; it is case-sensitive. You can check the
FILE_NAME column of the
DBA_DATA_FILES data dictionary view for this string.
This parameter assigns the new space to the free list group associated with the instance number integer. At startup, each instance acquires a unique instance number that maps the instance to a group of free lists. The lowest instance number is 1, not 0; the maximum value is operating system-specific. The syntax is:
where n maps to the free list group with the same number. If the instance number is greater than the number of free list groups, then it is hashed as follows to determine the free list group to which it is assigned:
If you do not specify the
INSTANCE parameter, then the new space is assigned to the table but not allocated to any group of free lists. Such space is included in the master free list of free blocks as needed when no other space is available.
Oracle9i Real Application Clusters Administration for more information about the
You can prevent automatic extent allocations by preallocating extents to free list groups associated with particular instances, and by setting
MAXEXTENTS to the current number of extents (preallocated extents plus
MINEXTENTS). You can minimize the initial allocation when you create the table or cluster by setting
MINEXTENTS to 1 (the default) and by setting
INITIAL to its minimum value (two data blocks, or 10K for a block size of 2048 bytes).
To minimize contention among instances for data blocks, create multiple datafiles for each table and associate each instance with a different file.
If you expect to increase the number of nodes in your system, then allow for additional instances by creating tables or clusters with more free list groups than the current number of instances. You do not have to allocate space to those free list groups until it is needed. Only the master free list of free blocks has space allocated to it automatically.
To associate a data block with a free list group, either bring the data block below
PCTUSED by a process running on an instance using that free list group, or specifically allocate the block to that free list group. Therefore, a free list group that is never used does not leave unused free data blocks.
INSTANCE_NUMBER initialization parameter enables you to start an instance and ensure that it uses the extents allocated to it for inserts and updates. This ensures that it does not use space allocated for other instances. The instance cannot use data blocks in another free list belonging to another instance, unless the instance is restarted with the other instance's
INSTANCE_NUMBER. However, you can override the instance number during a session by using an
ALTER SESSION statement.
This section provides examples in which extents are preallocated.
The following example statement allocates an extent for table
DEPARTMENT from the datafile
DEPT_FILE7 to instance number 7:
The following SQL statement creates a table with three free list groups, each containing ten free lists:
The next SQL statement then allocates new space, dividing the allocated blocks among the free lists in the second free list group:
In a Real Application Clusters environment that runs more instances than the value you have set for the
FREELIST GROUPS storage parameter, multiple instances share the new space allocation. In this example, every third instance to start up is associated with the same group of free lists.
CREATE TABLE statement creates a table named
EMPLOYEE with one initial extent and three groups of free lists. The three
ALTER TABLE statements allocate one new extent to each group of free lists:
CREATE TABLE employee ... STORAGE ( INITIAL 4096 MINEXTENTS 1 MAXEXTENTS 4 FREELIST GROUPS 3 ); ALTER TABLE employee ALLOCATE EXTENT ( SIZE 100K DATAFILE 'empfile1' INSTANCE 1 ) ALLOCATE EXTENT ( SIZE 100K DATAFILE 'empfile2' INSTANCE 2 ) ALLOCATE EXTENT ( SIZE 100K DATAFILE 'empfile3' INSTANCE 3 );
To prevent automatic allocations,
MAXEXTENTS is set to 4 which is the sum of the values of
When you need additional space beyond this allocation, use the
ALTER TABLE statement to increase
MAXEXTENTS before allocating additional extents. For example, if the second group of free lists requires additional free space for inserts and updates, you could set
5 and allocate another extent for that free list group:
ALTER TABLE employee ... STORAGE ( MAXEXTENTS 5 ) ALLOCATE EXTENT ( SIZE 100K DATAFILE 'empfile2' INSTANCE 2 );
When designing applications for Real Application Clusters, use Oracle sequence numbers whenever possible. To optimize sequence number use, each instance's cache must be large enough to accommodate the sequences. The default cache size holds 20 sequence numbers. To increase this, for example to hold 200, use this syntax:
Using the ordering feature suppresses caching in Real Application Clusters. But note that it is normal to lose some numbers after executing the
SHUTDOWN command or after instance failures. This is true even in single instance configurations.
If ordering is essential, then you may need to disable sequence caching. In this case, expect some performance overhead.
Oracle9i Database Concepts for more information about sequences
If sequences are insufficiently cached or not cached at all, then performance problems can result with an increase in service times. If you experience performance problems, then examine the statistics in the
V$SYSTEM_EVENT view as described in the following two points to determine whether the problem is due to the use of Oracle sequences:
V$SYSTEM_EVENTas extended average wait times for row cache locks in the range of a few hundred milliseconds. The proportion of time waited for row cache locks to the total time waited for non-idle events will be relatively high.
DC_SEQUENCESparameter, the ratio of
DLM_REQUESTSwill be high. If this ratio exceeds 10 to 15%, and the row cache lock wait time is a significant portion of the total wait time, then it is likely that the service time deterioration is due to insufficiently cached sequences.
If your application cannot afford to lose a sequence number, then you may want to implement sequences by storing them in database tables. However, there is significant performance overhead associated with the mechanism required for implementing this strategy. This is true even in single instance environments. As a general recommendation, rows storing sequence numbers should be locked for only a very brief period.
In Real Application Clusters, there can be additional overhead associated with the cache coherence needed for buffers storing sequence numbers. If a single data block stores several sequence numbers, and if more than one instance needs those sequence numbers, then the data block can be frequently transferred among the instances.
To minimize that overhead, set
PCTFREE to a very high value so Oracle stores only a single row of the table containing the sequence numbers in each data block. In that case, the cache transfers only occur when the instances concurrently request the same sequence number.
Your goal in tablespace design is to group database objects according to their data access distribution patterns. If you consider the dependency analyses and transaction profiles of your database objects, then you can divide tablespaces into containers for the following objects:
READ-ONLYand infrequently modified
Consider the following additional criteria for separating database objects into tablespaces:
Grouping database objects that belong to different functional areas into different tablespaces using this strategy can improve dynamic resource mastering. This works best if you adopt a functional partitioning strategy as described in Chapter 3. Oracle's dynamic resource re-mastering by datafiles algorithm re-distributes GCS resources where they are needed most. This re-mastering strategy improves resource operations efficiency. That is, Oracle re-masters resources to the instance with which the resources are most closely associated based on access patterns. As a result, resource operations after re-mastering require minimal communication with remote instances through the Global Enqueue Service (GES) and Global Cache Service (GCS).
In rare cases, you can further reduce GCS traffic by changing the default resource control policy for some tablespaces, as described in Appendix A, "Configuring Pre-Release 1 (9.0.1) Multi-Block Lock Assignments (Optional)".
Oracle9i Real Application Clusters Concepts for more information about dynamic resource remastering
Allocating and deallocating extents are expensive operations that you should minimize. Most of these operations in Real Application Clusters require inter-instance coordination. In addition, a high rate of extent management operations can more adversely affect performance in Real Application Clusters environments than in single instance environments. This is especially true for dictionary managed tablespaces.
If the "row cache lock" event is a significant contributor to the non-idle wait time in
V$SYSTEM_EVENT, then there is contention in the data dictionary cache. Extent allocation and deallocation operations could cause this.
V$ROWCACHE provides data dictionary cache information for
DC_FREE_EXTENTS. This is particularly true when the values for
DLM_CONFLICTS for those parameters increase significantly over time. This means that excessive extent management activity is occurring.
Proper storage parameter configuration for tables, indexes, temporary segments, and rollback segments decreases extent allocation and deallocation frequency. Do this using the
INITIAL, NEXT, PCTINCREASE, MINEXTENTS, and
You can greatly reduce extent allocation and deallocation overhead if you use locally managed tablespaces. For optimal performance and space use, segments in locally managed tablespaces should ideally have similar space allocation characteristics. This enables you to create the tablespace with the proper uniform extent size that corresponds to the ideal extent size increment calculated for the segments.
For example, you could put tables with relatively high insert rates in a tablespace with a 10MB uniform extent size. On the other hand, you can place small tables with limited DML activity in a tablespace with a 100K uniform extent size. For an existing system where tablespaces are not organized by segment size, this type of configuration can require significant reorganization efforts with limited benefits. For that reason, the compromise is to create most of your tablespaces as locally managed with
AUTOALLOCATE instead of
UNIFORM extent allocation.
Oracle9i SQL Reference for more information about the
In high volume OLTP systems, inter-instance concurrent index block accesses can increase the cost of Real Application Clusters processing. This is because the commonly used B+-Tree index structures usually contribute to higher Cache Fusion activity. A right-growing tree can incur frequent cache transfers of one particular leaf block.
While traversing the tree structure, branch blocks might have to be requested from another instance that recently modified them. Leaf block splits are vulnerable because three blocks need to be modified in one transaction. For very high transaction volumes occurring from different instances, you may need to reduce inter-instance concurrent changes to:
The following section addresses how to reduce leaf, branch, and root block contention. You can reduce index segment header concurrent changes by using free list groups as described under the heading "Using Free List Groups For Concurrent Inserts from Multiple Nodes".
This section describes the following four strategies to isolate or distribute access to different parts of an index and to improve performance:
Use reverse key indexes to avoid right-growing index trees. By reversing the keys, you can achieve a broader spread of index keys over the leaf blocks of an index and thus reduce the probability of accessing the same leaf and branch blocks from multiple instances.
For indexes based on sequence numbers, you can assign different subsequences to each instance. In the case of database objects that can be partitioned based on certain characteristics, this might adequately distribute the access patterns.
For other sequentially assigned values, adjust the index value and use
INSTANCE_NUMBER to generate the index key, as shown in the following formula:
Another effective way to reduce index contention is to partition tables by range and to create local indexes on them.
"Scaling Applications for Real Application Clusters" for guidelines on physical table partitioning implementation.
Figure 4-1 shows how transactions operating on records stored in tables partitioned by range can minimize leaf and branch block contention.
In Real Application Clusters, Oracle uses inter-instance communication to globally coordinate table locks. Because most applications do not need to lock entire tables, you can disable table locks to improve locking efficiency with minimal adverse side-effects. There are two methods for disabling table locks as described under the following headings:
To prevent users from acquiring table locks, use the following statement:
Users attempting to lock tables with disabled locks receive an error. To re-enable table locking, use the following statement:
This syntax forces all currently executing transactions to commit before enabling the table lock. The statement does not wait for new transactions to start after issuing the
To determine whether a table has its table lock enabled or disabled, query the
TABLE_LOCK column in the
USER_TABLES data dictionary table. If you have select privilege on
ALL_TABLES, then query the table lock state of other user's tables.
You can set table locks set for an entire instance using the
DML_LOCKS initialization parameter. If you do not need to use the
DROP TABLE, CREATE INDEX, and
LOCK TABLE statements, then set
DML_LOCKS to zero to minimize lock conversions and achieve maximum performance.
SQL*Loader checks the flag to ensure that there is not a non-parallel direct load running against the same table. This forces Oracle to create new extents for each session.
Oracle9i Database Utilities for more information on SQL*Loader
As a general database design rule, you should only use DDL statements for maintenance tasks, not during normal system operations. Therefore, in most systems, the frequency of new object creation and other DDL statements should be very small.
However, if your application frequently creates objects, some performance degradation may occur. This is because object creation requires inter-instance coordination. A high ratio of
DLM_REQUESTS on the
DC_OBJECT_IDS row cache in
V$ROWCACHE, along with excessive wait times for the row cache lock event in
V$SYSTEM_EVENT, indicates that different instances in your cluster are issuing significant amounts of concurrent DDL statements.
To improve object creation performance is such situations, set event 10297 so that it caches
OBJECT_ID values. This improves concurrent object creation. To set event 10297, add the following line to your initialization parameter file:
If you set the additional level argument to 1, then the caching behavior is automatically adjustable. Otherwise, you can set
level to the desired cache size.
Cache Fusion introduces an improved diskless algorithm that handles cache coherency more efficiently than Oracle's earlier architectures. This enables you to implement simpler database designs while achieving optimal performance.
Response time and throughput requirements ultimately determine whether you should implement a partitioning strategy and how stringent your strategy needs to be. Response time and throughput requirements also determine how much effort you should invest to achieve an optimal database design.
A careful analysis of your system's workload should serve as the optimal basis for allocating work to particular instances. This analysis should consider:
Moreover, implementing a strategy that considers these points makes your system more robust and thus more scalable.
Generally speaking, 80% or more of your overhead results from 20% or less of a given workload. If you first attempt to deal with the 20% by observing some simple guidelines, then you can produce tangible benefits with minimal effort. You can address these workload problems by implementing any or all of the following:
CACHEparameter to a high value if needed.