|Oracle8i Parallel Server Concepts
Release 2 (8.1.6)
Part Number A76968-01
This chapter describes storage considerations for Oracle Parallel Server applications. Topics in this chapter include:
This section describes storage issues specific to Oracle Parallel Server, and discusses:
All Oracle Parallel Server instances access the same data files. The composition of database files is the same for Oracle in parallel mode and in exclusive mode. You do not have to alter the data files to start Oracle in parallel mode.
Oracle Parallel Server requires special naming conventions for data files as specified in Oracle8i Parallel Server Setup and Configuration Guide.
To improve performance, you can control the physical placement of data so that the instances use separate sets of data blocks. Free lists, for example, enable you to allocate space for inserts to particular instances.
Whenever an instance starts up, it verifies access to all online data files. The first Oracle Parallel Server instance to start must verify access to all online files so it can determine if media recovery is required. Additional instances can operate without access to all of the online data files, but any attempt to use an unverified file fails and a message is generated.
When an instance adds a data file or brings a data file online, all instances verify access to the file. If an instance adds a new data file on a disk that other instances cannot access, verification fails, but the instances continue running. Verification can also fail if instances access different copies of the same data file.
If verification fails for any instance, diagnose and fix the problem, then use the ALTER SYSTEM CHECK DATAFILES statement to verify access. This statement has a GLOBAL option, which is the default, that makes all instances verify access to online data files. It also has a LOCAL option that makes the current instance verify access.
ALTER SYSTEM CHECK DATAFILES makes online data files available to the instance or instances for which access is verified.
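For example, after diagnosing and fixing an access problem, you could issue either form of the statement (GLOBAL may be omitted, since it is the default):

```sql
-- All instances verify access to all online data files (the default):
ALTER SYSTEM CHECK DATAFILES GLOBAL;

-- Only the issuing instance verifies access:
ALTER SYSTEM CHECK DATAFILES LOCAL;
```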
Oracle cannot recover from instance failure or media failure unless the instance that performs recovery can verify access to all required online data files.
Oracle automatically maps absolute file numbers to relative file numbers. Use of Oracle Parallel Server does not affect these values. Query the V$DATAFILE view to see both numbers for your data files.
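A query along the following lines shows the absolute file number (FILE#) and relative file number (RFILE#) side by side for each data file:

```sql
SELECT file#, rfile#, name
FROM   v$datafile;
```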
In Oracle Parallel Server, each instance writes to its own set of online redo log files. The redo written by a single instance is called a "thread of redo". Each online redo log file is associated with a particular thread number. When an online redo log is archived, Oracle records its thread number to identify it during recovery.
A "private thread" is a redo log thread created using the ALTER DATABASE ADD LOGFILE command with the THREAD clause. A "public thread" is a redo log thread created using the ALTER DATABASE ADD LOGFILE command without a THREAD clause.
If the THREAD initialization parameter is specified, the instance starting up acquires the thread identified by that value as a private thread. If THREAD has the default of zero, the instance acquires a public thread. Once acquired, the acquiring instance uses the redo thread exclusively.
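The two cases can be sketched as follows; the group numbers, file names, and sizes are illustrative only:

```sql
-- Private thread: the new group is explicitly associated with thread 2
ALTER DATABASE ADD LOGFILE THREAD 2
  GROUP 4 ('/u01/oradata/db/log_t2_g4a.rdo',
           '/u02/oradata/db/log_t2_g4b.rdo') SIZE 10M;

-- Public thread: no THREAD clause, so the group is available to any
-- instance starting with THREAD = 0
ALTER DATABASE ADD LOGFILE
  GROUP 6 ('/u01/oradata/db/log_g6a.rdo') SIZE 10M;
```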
Online redo log files can be multiplexed, or "mirrored". A multiplexed redo log consists of two or more groups of files and all members of a group are written to concurrently when that group is active. Figure 7-1 shows the threads of redo for three instances of Oracle Parallel Server.
Group numbers must be unique within the database, therefore they are unique within a thread. However, the order of assigning groups to threads, and threads to instances, is arbitrary.
For example, although in Figure 7-1 thread 1 contains groups 1, 2, and 3 while thread 2 contains groups 4 and 5, you could instead assign groups 2, 4, and 5 to thread 1 while assigning groups 1 and 3 to thread 2. The V$LOGFILE view displays the group number associated with each redo log file.
Although it is possible to have different numbers of groups and members per thread, we recommend that all threads be configured to a common standard to facilitate administration.
Different instances of Oracle Parallel Server can have different degrees of mirroring, or different numbers of members per group. The different instances can also have different numbers of groups. For example, one instance could have three groups with two members per group, a second instance could have four non-multiplexed log files, and a third instance could have two groups with four members per group. While such a configuration may be inconvenient to administer, it may be necessary to achieve the full potential of the system.
Each instance must have at least two groups of online redo log files. When the current group fills, an instance begins writing to the next log file group. At a log switch, information is written to the control file that can be used to identify the filled group and its thread number after it has been archived.
The MAXLOGHISTORY option of the CREATE DATABASE statement limits the number of redo log files about which the control file can keep information; only one member per group needs to be tracked. In Oracle Parallel Server, set the value of MAXLOGHISTORY higher than you normally would for single instance Oracle, because the history of multiple redo threads must be tracked.
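For example, MAXLOGHISTORY is set when the database is created; the values below are illustrative, and the other CREATE DATABASE clauses are omitted for brevity:

```sql
CREATE DATABASE sample
  MAXINSTANCES  8
  MAXLOGHISTORY 500;
```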
See Also: Oracle8i Concepts for a full description of multiplexed redo log files.
This section describes rollback segments as they relate to Oracle Parallel Server.
Rollback segments contain information that Oracle requires to maintain read consistency and to be able to undo changes made by transactions that roll back or abort. Each instance in Oracle Parallel Server shares use of the SYSTEM rollback segment and requires at least one dedicated rollback segment per instance.
Both private and public rollback segments can be acquired at instance startup; once acquired, a rollback segment is used exclusively by the acquiring instance until it is taken offline or the acquiring instance shuts down. Private rollback segments, named in the acquiring instance's ROLLBACK_SEGMENTS parameter, are unique to a particular instance; other instances cannot use them. A public rollback segment is offline and not used by any instance until an instance that needs an extra rollback segment starts up, acquires it, and brings it online. Once online, the acquiring instance uses the public rollback segment exclusively.
Only one instance writes to a given rollback segment (except for the SYSTEM rollback segment). However, other instances can read from it to create read-consistent snapshots or to perform instance recovery.
Oracle Parallel Server needs at least as many rollback segments as the maximum number of concurrent instances plus one; the extra one is the SYSTEM rollback segment. An instance cannot start up in shared (parallel) mode without exclusive access to at least one rollback segment, whether it is public or private.
You can create new rollback segments in any tablespace. To reduce contention between rollback data and table data, partition your rollback segments in a separate tablespace. This also facilitates taking tablespaces offline because a tablespace cannot be taken offline if it contains active rollback segments.
In general, make all rollback segment extents the same size by specifying identical values for the storage parameters INITIAL and NEXT.
The data dictionary view DBA_ROLLBACK_SEGS shows each rollback segment's name, segment ID number, and owner (PUBLIC or other).
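For example, the following query lists these attributes for every rollback segment in the database:

```sql
SELECT segment_name, segment_id, owner
FROM   dba_rollback_segs;
```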
See Also: Oracle8i Administrator's Guide for information about contention for a rollback segment and for information on the performance implications of adding rollback segments.
These initialization parameters control rollback segment use:
ROLLBACK_SEGMENTS specifies the names of rollback segments that the instance acquires at startup.
reserves instance locks to reduce contention for blocks containing rollback entries. In particular, it reserves instance locks for deferred rollback segments, which contain rollback entries for transactions in tablespaces that were taken offline.
See Also: Oracle8i Concepts for more information about data blocks, extents, and segments.
Public and private rollback segments do not have performance differences. However, private rollback segments provide more control over the matching of instances with rollback segments. This allows you to locate the rollback segments for different instances on different disks to improve performance. Therefore, use private rollback segments to reduce disk contention in high-performance systems.
Public rollback segments form a pool of rollback segments that can be acquired by any instance needing an additional rollback segment. Using public rollback segments can be disadvantageous, however, when instances are shut down and started up at the same time. For example, instance X shuts down and releases public rollback segments. Instance Y starts up and acquires the released rollback segments. Finally, instance X starts up and cannot acquire its original rollback segments. An instance can also acquire a public rollback segment at startup if TRANSACTIONS and TRANSACTIONS_PER_ROLLBACK_SEGMENT are not properly set.
You can use public rollback segments to improve space utilization. If you create only one large public rollback segment for long-running transactions that run on different instances each month, the rollback segment can be taken offline and brought back online or "moved" from one instance to another to better serve instances with the heavier workloads.
By default a rollback segment is private and is used by the instance specifying it in the parameter file. Specify private rollback segments using the parameter ROLLBACK_SEGMENTS.
Once a public rollback segment is acquired by an instance, it is then used exclusively by that instance.
Once created, both public and private rollback segments can be brought online using the ALTER ROLLBACK SEGMENT command.
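The following statements sketch this life cycle; the segment and tablespace names are illustrative, and the extents use identical INITIAL and NEXT values as recommended above:

```sql
-- Create a private rollback segment (private is the default)
CREATE ROLLBACK SEGMENT rbs_01 TABLESPACE rbs_ts
  STORAGE (INITIAL 1M NEXT 1M);

-- Create a public rollback segment for the shared pool
CREATE PUBLIC ROLLBACK SEGMENT rbs_pub TABLESPACE rbs_ts
  STORAGE (INITIAL 1M NEXT 1M);

-- Bring a rollback segment online without restarting the instance
ALTER ROLLBACK SEGMENT rbs_01 ONLINE;
```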
When an instance starts, it uses the TRANSACTIONS and TRANSACTIONS_PER_ROLLBACK_SEGMENT initialization parameters to determine how many rollback segments to acquire, as shown in the following equation:

total_rollback_segments_required = TRANSACTIONS / TRANSACTIONS_PER_ROLLBACK_SEGMENT

The value for total_rollback_segments_required is rounded up. For example, with TRANSACTIONS = 40 and TRANSACTIONS_PER_ROLLBACK_SEGMENT = 16, the instance attempts to acquire CEIL(40/16) = 3 rollback segments.
At startup, an instance attempts to acquire rollback segments by executing the following steps:
This section explains space management and free list group concepts by covering the following topics:
Oracle Parallel Server enables transactions running on separate instances to insert and update data in the same table concurrently. This occurs without contention to locate free space for new records. However, to take advantage of this capability, you must accurately manage free space in your database using several structures that are described in this section.
Oracle keeps track of blocks with space available for transactions that may cause rows to exceed the space available in their original block. Oracle does this for each database object, such as a table, cluster, or index. A transaction requiring free space can examine the free list of blocks. If the free list does not contain a block with enough space to accommodate it, Oracle allocates a new extent.
When Oracle allocates new extents to a table, Oracle adds their blocks to the master free list. This can eventually result in contention for free space among multiple instances on Oracle Parallel Server because the free space contained in newly allocated extents cannot be reallocated to any group of free lists. You can have more control over free space if you specifically allocate extents to instances; in this way you minimize free space contention.
A segment is a unit of logical database storage. Oracle allocates space for segments in smaller units called extents. An extent is a specific number of contiguous data blocks allocated for storing a specific type of information. A segment thus comprises a set of extents allocated for a specific type of data structure. For example, each table's data is stored in its own data segment, while each index's data is stored in its own index segment.
A segment's extents are stored in the same tablespace. However, they may or may not be contiguous on disk. The segments can span files, but individual extents cannot.
Although you can allocate additional extents, the blocks themselves are allocated separately. If you allocate an extent to a specific instance, the blocks are immediately allocated to the free list. However, if the extent is not allocated to a specific instance, then the blocks themselves are allocated only when the high water mark moves.
The high water mark is the boundary between used and unused space in a segment. As requests for new free blocks that cannot be satisfied by existing free lists are received, the block to which the high water mark points becomes a used block, and the high water mark is advanced to the next block. In other words, the segment space to the left of the high water mark is used, and the space to the right of it is unused.
Figure 7-2 shows a segment consisting of three extents containing 10K, 20K, and 30K of space, respectively. The high water mark is in the middle of the second extent. Thus, the segment contains 20K of used space to the left of the high water mark, and 40K of unused space to the right of the high water mark.
See Also: Oracle8i Concepts for more information about segments and extents.
Single-instance Oracle uses multiple free lists to reduce block contention. Every tablespace has a free list that identifies data blocks with free space. Oracle uses blocks with free space when inserts or updates are made to a database object such as a table or a cluster.
Blocks in free lists contain free space greater than PCTFREE. This is the percentage of a block to be reserved for updates to existing rows. In general, blocks included in process free lists for a database object must satisfy the PCTFREE and PCTUSED constraints.
You can specify the number of free lists by setting the FREELISTS parameter when you create a table, index or cluster. The maximum value of the FREELISTS parameter depends on the Oracle block size on your system. In addition, for each free list you need to store a certain number of bytes in a block to accommodate overhead.
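For example, free lists (and free list groups, discussed below) are specified in the STORAGE clause when the object is created; the table definition and numbers here are illustrative:

```sql
CREATE TABLE orders (
  order_id  NUMBER,
  status    VARCHAR2(10)
)
STORAGE (INITIAL 1M NEXT 1M
         FREELIST GROUPS 2
         FREELISTS 4);
```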
Within free list groups there are two subsets of free lists:
Because Parallel Server has multiple instances, free lists alone cannot solve contention problems. Free list groups, however, effectively reduce pinging between instances.
A free list group is a set of free lists for use by one or more instances. Each free list group provides free data blocks to accommodate inserts or updates on tables and clusters, and is associated with instance(s) at startup. By default, only one free list group is available. This means all free lists for an object reside in the segment header block. Free list groups are supported on all database objects.
If multiple free lists reside in a single block in an Oracle Parallel Server environment, the block with the free lists could thus experience pinging, or forced reads/writes among all the instances. Avoid this by grouping the free lists into separate groups and assigning each group to an instance. Each instance then has its own block containing free lists. Since each instance uses its own free lists, there is no contention among instances to access the same block containing free lists.
Inter-instance contention occurs when different instances' transactions insert data into the same table. This occurs because all free lists are held in the segment header if you do not define free list groups. The free list may be from a common pool of blocks, or you can partition multiple free lists so specific extents in files are allocated to objects.
A highly concurrent environment has potential contention for the segment header, which contains the free list.
In multi-instance environments, as illustrated in Figure 7-3, free lists provide free data blocks from available extents to different instances. You can partition multiple free lists so that extents are allocated to specific database instances. Each instance hashes to one or more free list groups, and each group's header block points to free lists. Without free list groups, every instance must read the segment header block to access the free lists.
Figure 7-4 shows the blocks of a file in which the master free list is stored in the segment header block. Three instances are forced to read this block to obtain free space. Because there is only one free list, there is only one insertion point. Free list groups help reduce contention by spreading this insertion point over multiple blocks, each of which will be accessed less frequently.
Locally managed tablespaces are also useful because they help avoid dictionary contention. This can occur if Oracle Parallel Server performs a lot of space management for sorting segments in tablespaces, creating or dropping tables, and so on. Locally managed tablespaces eliminate this contention, and their benefits are even greater in Oracle Parallel Server than in single instance Oracle.
This section describes two free list group examples:
Figure 7-5 illustrates the division of free space for a table into a master free list and two free list groups, each of which contains three free lists. This example involves a well-partitioned application in which deletes occur. The master free list pictured is the master free list for this particular free list group.
The table was created with one initial extent, after which extents 2 and 5 were allocated to instance X, extents 3 and 4 were allocated to instance Y, and extent 6 was allocated automatically, but not to a particular instance. Notice the following:
Each user process running on instance X uses one of the free lists in group X, and each user process on instance Y uses one of the free lists in group Y. If more instances start up, their user processes share free lists with instance X or Y.
The simple case in Figure 7-5 becomes more complicated when you consider that extents are not permanently allocated to instances, and that space allocated to one instance can later be used by another instance. Each free list group has its own master free list. After allocation, some blocks go onto the master free list for the group, some go to a process free list, and some do not belong to a free list at all. If the application is totally partitioned, then once blocks are allocated to a given instance, they stay with that instance. However, blocks can move from one instance to another if the application is not totally partitioned.
Consider a situation where instance Y fills a block, takes it off the free list, and then instance X frees the block. The block then goes to the free list of instance X, the instance that freed it. If instance Y needs space, it cannot reclaim this block. Instance Y can only obtain free space from its own free list group.
In general, all tables should have the same number of free list groups, but the number of free lists within a group may vary, depending on the type and amount of activity of each table.
Partitioning free space can particularly improve the performance of applications that have a high volume of concurrent inserts, or updates requiring new space, from multiple instances. Performance improvements also depend, of course, on your operating system, hardware, data block size, and so on.
In a multi-instance environment, information about multiple free lists and free list groups is not preserved upon import. If you use Export and Import to back up and restore your data, it will be difficult to import the data so that it is partitioned again.
The actual free list group block is determined by hashing the Oracle process ID by the number of free list groups. For example, if there are 3 instances and 35 free list groups, then instance 1 handles the first twelve free list groups, instance 2 the next twelve, and instance 3 the remaining eleven.
This section describes:
Data partitioning can reduce contention for data blocks. The PCM locks that often cover blocks in one free list group tend to be held primarily by the instance using that free list group. This is because an instance that modifies data is usually more likely to reuse that data than other instances. However, if multiple instances take free space from the same extent, they are more likely to contend for blocks in that extent if they subsequently modify the data that they inserted.
If MAXINSTANCES is greater than the number of free list groups in the table or cluster, then an instance number maps to the free list group associated with:
instance_number modulo number_of_free_list_groups
"Modulo" (or "rem" for "remainder") is a formula for determining which free list group should be used by calculating a remainder value. In the following example there are 2 free list groups and 10 instances. To determine which free list group instance 6 will use, the formula would read 6 modulo 2 = 0. Six divided by 2 is 3 with zero remainder, so instance 6 will use free list group 0. Similarly, instance 5 would use free list group 1 because 5 modulo 2 = 1. Five is divisible by 2 with a remainder of 1.
If there are more free list groups than MAXINSTANCES, then a different hashing mechanism is used. If multiple instances share one free list group, they share access to every extent specifically allocated to any instance sharing that free list group.
In a system with relatively few nodes, the FREELIST GROUPS option for a table should generally have the same value as the MAXINSTANCES option of CREATE DATABASE, which limits the number of instances that can access a database concurrently. In a Massively Parallel Processing system, however, MAXINSTANCES could be many times larger than FREELIST GROUPS so that many instances share one group of free lists.
See Also: Oracle8i Parallel Server Administration, Deployment, and Performance for more information on associating instances, users, and locks with free list groups.
User processes associate with process free lists based on their Oracle process IDs. Each user process has access to only one free list in the free list group for the instance on which it is running. Every user process also has access to the master free list of free blocks.
If a table has multiple free lists but does not have multiple free list groups, or has fewer free list groups than the number of instances, then each free list is shared by user processes from different instances.
If each extent in a table is in a separate data file, you can use the GC_FILES_TO_LOCKS parameter to allocate specific ranges of PCM locks to each extent, so that each set of PCM locks is associated with only one group of free lists.
Figure 7-6 shows multiple extents in separate files. The GC_FILES_TO_LOCKS parameter allocates 10 locks to files 8 and 10, and 10 locks to files 9 and 11. Extents A and C are in the same free list group, and extents B and D are in another free list group. One set of PCM locks is associated with files 8 and 10, and a different set of PCM locks is associated with files 9 and 11. You do not need separate locks for files that are in the same free list group, such as files 8 and 10, or files 9 and 11.
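Assuming the extents reside in files 8 through 11 as in the figure, a GC_FILES_TO_LOCKS setting of the following shape produces this allocation (an illustrative init.ora fragment; each colon-separated clause assigns a set of locks to a list of files):

```
# Illustrative init.ora fragment:
# 10 locks shared by files 8 and 10, 10 locks shared by files 9 and 11
GC_FILES_TO_LOCKS = "8,10=10:9,11=10"
```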
This example assumes total partitioning for reads as well as writes. If more than one instance is to update blocks, then it would still be desirable to have more than one lock per file to minimize forced reads and writes. This is because even with a shared lock, all blocks held by a lock are subject to forced reads when another instance tries to read even one of the locked blocks.
Figure 7-7 illustrates how free lists and free list groups are assigned to instances.
Using the statement ALTER SESSION INSTANCE_NUMBER, you can increase the instance number value beyond the value of MAXINSTANCES. Figure 7-7 shows how this is taken into account: for the purposes of the internal calculation whereby free list groups are assigned, the instance number is brought back within the boundaries of MAXINSTANCES.
Several SQL options enable you to allocate free lists and free list groups for tables, clusters, and indexes. You can explicitly specify that new space for an object be taken from a specific data file. You can also associate free space with particular free list groups that you can then associate with particular instances.
The SQL statements include:
CREATE [TABLE | CLUSTER | INDEX] ... STORAGE ( ... FREELISTS ... FREELIST GROUPS ... )
ALTER [TABLE | CLUSTER | INDEX] ... ALLOCATE EXTENT ( [SIZE ...] [DATAFILE ...] [INSTANCE ...] )
You can use these SQL options with the initialization parameter INSTANCE_NUMBER to associate data blocks with instances.
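For example, the following statement pre-allocates an extent from a specific data file to the free list group associated with instance 2; the table name, size, and file name are illustrative:

```sql
ALTER TABLE orders
  ALLOCATE EXTENT (SIZE 100K
                   DATAFILE '/u02/oradata/db/orders_02.dbf'
                   INSTANCE 2);
```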
See Also: Oracle8i SQL Reference for complete syntax of these statements.
This section covers the following topics:
When a row is inserted into a table and new extents need to be allocated, a certain number of contiguous blocks, as specified by !blocks in the GC_FILES_TO_LOCKS parameter, is allocated to the free list group associated with an instance. Extents allocated when the table or cluster is first created, as well as new extents that are automatically allocated, add their blocks to the master free list or to the space above the high water mark.
When you explicitly allocate an extent without specifying an instance, or when an extent is automatically allocated to a segment because the system is running out of space (the high water mark cannot be advanced any more), the new extent becomes part of the unused space. It is placed at the end of the extent map, which means that the current high water mark is now in an extent "to the left" of the new one. The new extent is thus added "above" the high water mark.
You have two options for controlling the allocation of new extents.
Pre-allocating extents is a static approach to the problem of preventing automatic allocation of extents by Oracle. You can pre-allocate extents to tables that have free list groups. This means that all free blocks are formatted into free lists, which will reside in the free list group of the instance to which you are pre-allocating the extent. This approach is useful if you need to partition data so as to greatly reduce all pinging on insert, or if you need to accommodate objects that you expect will grow in size.
To accommodate growth, the strategy of dynamically allocating blocks to free list groups is more effective than pre-allocating extents. You can use the !blocks option of GC_FILES_TO_LOCKS to dynamically allocate blocks to a free list from the high water mark within a lock boundary. This method does not eliminate all pinging on the segment header. Instead, this method allocates blocks as needed so you do not have to pre-allocate extents.
Because locks are owned by instances, blocks are allocated on a per-instance basis, which is why they are allocated to free list groups. Within an instance, blocks can be allocated to different free lists.
Using this method, you can either explicitly allocate the !blocks value, or leave the balance of new blocks still covered by the existing PCM lock. If you choose the latter, remember there still may be contention for the existing PCM lock by allocation to other instances. If the PCM lock covers multiple groups of blocks, there may still be unnecessary forced reads and writes of all the blocks covered by the lock.
A segment's high water mark is the current limit to the number of blocks that have been allocated within the segment. If you are allocating extents dynamically, the high water mark is also the lock boundary. The lock boundary and the number of blocks that will be allocated at one time within an extent must coincide. This value must be the same for all instances.
Consider the following example with 4 blocks per lock (!4). Locks have been allocated before the block content has been entered. If we have filled data block D2, held by Lock 2, and then allocated another range of 4 blocks, only the number of blocks fitting within the lock boundary are allocated. In this case, this includes blocks 7 and 8. Both of these are protected by the current lock. With the high water mark at 8, when instance 2 allocates a range of blocks, all four blocks 9 to 12 are allocated, covered by lock 3. The next time instance 1 allocates blocks it will get blocks 13 to 16, covered by lock 4.
This example assumes that GC_FILES_TO_LOCKS has the following setting for both instances:
With the EACH option specified, each file in file_list is allocated #locks number of PCM locks. Within each file, !blocks specifies the number of contiguous data blocks to be covered by each lock.
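A setting of the following shape matches this description and the !4 example above; the file range and lock count are illustrative, not the actual values used in the example:

```
# Illustrative init.ora fragment:
# files 1 and 2 each get 10 PCM locks; each lock covers 4 contiguous blocks
GC_FILES_TO_LOCKS = "1-2=10!4EACH"
```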
Figure 7-9 shows the incremental process by which the segment grows:
In this way, if user A on Instance 1 is working on block 10, no one else from either instance can work on any block in the range of blocks covered by Lock 2. This includes blocks 6 through 10.