|Oracle® Database VLDB and Partitioning Guide
11g Release 1 (11.1)
|PDF · Mobi · ePub|
Partitioning was initially adopted to cope with the performance requirements for data warehouses. With the explosive growth of OLTP systems and their user populations, partitioning is particularly useful for OLTP systems as well.
Partitioning is often used for OLTP systems to reduce contention in order to support a very large user population. It also helps in addressing regulatory requirements facing OLTP systems, including storing larger amounts of data in a cost-effective manner. This chapter contains the following topics:
Online Transaction Processing (OLTP) systems are one of the most common data processing systems in today's enterprises. Classical examples of OLTP systems are order entry, retail sales, and financial transaction systems.
OLTP systems are primarily characterized through a specific data usage that is different from data warehousing environments, yet some of the characteristics, such as having large volumes of data and lifecycle-related data usage and importance, are identical.
The main characteristics of an OLTP environment are:
Short response time
The nature of OLTP environments is predominantly any kind of interactive ad hoc usage, such as telemarketeers entering telephone survey results. OLTP systems require short response times in order for users to remain productive.
OLTP systems normally read and manipulate highly selective, small amounts of data; the data processing is mostly simple and complex joins are relatively rare. There is always a mix of queries and DML workload. For example, one of many call center employees retrieves customer details for every call and enters customer complaints while reviewing past communication with the customer.
Data maintenance operations
It is not uncommon to have reporting programs and data updating programs that need to run either periodically or on an ad hoc basis. These programs, which run in the background while users continue to work on other tasks, may require a large number of data-intensive computations. For example, a University may start batch jobs assigning students to classes while students can still sign up online for classes themselves.
Large user populations
OLTP systems can have immeasurably large user populations where many users are trying to access the same data at once. For example, an online auction Web site can have hundreds of thousands (if not millions) of users accessing data on its Web site at the same time.
Due to the large user population, the short response times, and small transactions, the concurrency in OLTP environments is very high. A requirement for thousands of concurrent users is not uncommon.
Large data volumes
Depending on the application type, the user population, and the data retention time, OLTP systems can become very large. For example, every customer of a bank could have access to the online banking system which shows all their transactions for the last 12 months.
The availability requirements for OLTP systems are often extremely high. An unavailable OLTP system can impact a very large user population, and organizations can suffer major losses if OLTP systems are unavailable. For example, a stock exchange system has extremely high availability requirements during trading hours.
Lifecycle related data usage
Similar to data warehousing environments, OLTP systems often experience different data access patterns over time. For example, at the end of the month, monthly interest is calculated for every active account.
The following are benefits of partitioning for OLTP environments:
Support for bigger databases
Backup and recovery, as part of a high availability strategy, can be performed on a low level of granularity to cope with the size of the database. OLTP systems usually remain online during backups and users may continue to access the system while the backup is running. The backup process should not introduce major performance degradation for the online users.
Partitioning helps to reduce the space requirements for the OLTP system because part of a database object can be stored compressed while other parts can remain uncompressed. Update transactions against uncompressed rows are more efficient than updates on compressed data.
Partitioning can be used to store data transparently on different storage tiers to lower the cost of storing vast amounts of data.
Partition maintenance operations for data maintenance (instead of DML)
In the case of data maintenance operations (purging being the most common operation), you can leverage partition maintenance operations in conjunction with Oracle's capability of online index maintenance. A partition management operation generates less redo than the equivalent DML operations.
Potential higher concurrency through elimination of hot spots
A common scenario for OLTP environments is to have monotonically increasing index values that are used to enforce primary key constraints, thus creating areas of high concurrency and potential contention: every new insert tries to update the same set of index blocks. Partitioned indexes, in particular hash-partitioned indexes, can help to alleviate this situation.
Performance in OLTP environments heavily relies on performant index access, thus the choice of the most appropriate index strategy becomes crucial. The following section discusses best practices for deciding whether or not to partition indexes in an OLTP environment.
Due to the selectivity of queries and high concurrency of OLTP applications, the choice of the right index strategy is indisputably one of the most important decisions for the use of partitioning in an OLTP environment. The following basic rules help to understand the main benefits and trade-offs for the various possible index structures:
A non-partitioned index, while larger than individual partitioned index segments, always leads to a single index probe (or scan) if an index access path is chosen; there is only one segment for a table. The data access time and number of blocks being accessed is identical for both a partitioned and a non-partitioned table.
A non-partitioned index does not provide partition autonomy and requires an index maintenance operation for every partition maintenance operation that affects rowids (for example, drop, truncate, move, merge, coalesce, or split operations).
With partitioned indexes, there are always multiple segments. Whenever the Oracle Database cannot prune down to a single index segment, the database has to touch more than one segment. This potentially leads to higher I/O requirements (n index segment probes compared with one probe with a nonpartitioned index) and can have an impact (measurable or not) on the runtime performance. This is true for all partitioned indexes.
Partitioned indexes can either be local partitioned indexes or global partitioned indexes. Local partitioned indexes always inherit the partitioning key from the table and are fully aligned with the table partitions. Consequently, any kind of partition maintenance operation requires little to no index maintenance work. For example, dropping or truncating a partition does not incur any measurable overhead for index maintenance; the local index partitions will be either dropped or truncated.
Partitioned indexes that are not aligned with the table are called global partitioned indexes. Unlike local indexes, there is no relation between a table and an index partition. Global partitioned indexes give the flexibility to choose a partitioning key that is most optimal for a performant partition index access. Partition maintenance operations normally affect more (if not all) partitions of a global partitioned index, depending on the operation and partitioning key of the index.
Under some circumstances, having multiple segments for an index can be beneficial for performance. It is very common in OLTP environments to leverage sequences to create artificial keys; consequently you create key values that are monotonically increasing, which results in many insert processes competing for the same index blocks. Introducing a global partitioned index (for example, using global hash partitioning on the key column) can alleviate this situation. If you have, for example, four hash partitions for such an index, then you now have four index segments you are inserting data into, reducing the concurrency on these segments by a factor of four for the insert processes.
With less contention, the application can support a larger user population. The following example shows the creation of a unique index on the
order_id column of the
orders table. The
order_id in the OLTP application is filled using a sequence number. The unique index uses hash partitioning in order to reduce contention for the monotonically increasing
order_id values. The unique key is then used to create the primary key constraint.
CREATE UNIQUE INDEX orders_pk ON orders(order_id) GLOBAL PARTITION BY HASH (order_id) ( PARTITION p1 TABLESPACE tbs1 , PARTITION p2 TABLESPACE tbs2 , PARTITION p3 TABLESPACE tbs3 , PARTITION p4 TABLESPACE tbs4 ) NOLOGGING; ALTER TABLE orders ADD CONSTRAINT orders_pk PRIMARY KEY (order_id) USING INDEX;
Enforcing uniqueness is important database functionality for OLTP environments. Uniqueness can be enforced with non-partitioned as well as with partitioned indexes. However, since partitioned indexes provide partition autonomy, the following requirements must be met to implement unique indexes:
A non-partitioned index can enforce uniqueness for any given column or combination of columns. The behavior of a non-partitioned index is no different for a partitioned table compared to a non-partitioned table.
Each partition of a partitioned index is considered an autonomous segment. To enforce the autonomy of these segments, you always have to include the partitioning key columns as a subset of the unique key definition.
Unique global partitioned indexes must always be prefixed with the partitioning columns.
Unique local indexes must have the partitioning key of the table as a subset of the unique key definition.
When your workload fits the use of index-organized tables, then you have to consider how to use partitioning on your index-organized table and on any secondary indexes.
Whether or not to partition secondary indexes on index-organized tables follows the same considerations as indexes on regular heap tables. You can partition an index-organized table, but the partitioning key must be a subset of the primary key. A common reason to partition an index-organized table is to reduce contention; this is typically achieved using hash partitioning.
Another reason to partition an index-organized table is to be able to physically separate data sets based on one of the primary key columns. For example, an application hosting company can physically separate application instances for different customers by list partitioning on the company identifier. Queries in such a scenario can often take advantage of index partition pruning, shortening the time for the index scan. ILM scenarios with index-organized tables and partitioning are less common because they require a date column to be part of the primary key.
In addition to the performance benefits, Partitioning also enables the optimal data management for large objects in an OLTP environment. Every partition maintenance operation in the Oracle Database can be extended to atomically include global and local index maintenance, enabling the execution of any partition maintenance operation without affecting the 24x7 availability of an OLTP environment.
Partition maintenance operations in OLTP systems occur often because of ILM scenarios. In these scenarios, [ range | interval ] partitioned tables, or [ range | interval ]-* composite partitioned tables, are common.
Other business cases for partition maintenance operations include scenarios surrounding the separation of application data. For example, a retail company runs the same application for multiple branches in a single schema. Depending on the branch revenues, the application (as separate partitions) is stored on more performant storage. List partitioning, or list-* composite partitioning, is a common partitioning strategy for this type of business case.
Hash partitioning, or hash subpartitioning for tables, can be used in OLTP systems to obtain similar performance benefits to the performance benefits achieved in data warehouse environments. The majority of the daily OLTP workload consists of relatively small operations, executed in serial. Periodic batch operations, however, may execute in parallel and benefit from the distribution benefits hash partitioning and subpartitioning can provide for partition-wise joins. For example, end-of-the-month interest calculation may be executed in parallel in order to complete within a nightly batch window.
See Also:Chapter 4, "Partitioning for Availability, Manageability, and Performance" for more information about the performance benefits of partitioning
Whenever a partition maintenance operation takes place, Oracle locks the affected table partitions for any DML operation. Data in the affected partitions, with the exception of a
TRUNCATE operation, is still fully accessible for any
SELECT operation. Since local indexes are logically coupled with the table (data) partitions, only the local index partitions of the affected table partitions have to be maintained as part of a partition maintenance operation, enabling the most optimal processing for the index maintenance.
For example, when you move an older partition from a high end storage tier to a low cost storage tier, the data and the index are always available for
SELECT operations; the necessary index maintenance is either to update the existing index partition to reflect the new physical location of the data or, more commonly, a move and rebuild of the index partition to a low cost storage tier as well. If you drop an older partition after you have archived it, then its local index partitions get dropped as well, enabling a split-second partition maintenance operation that only affects the data dictionary.
Whenever a global index is defined on a partitioned or non-partitioned table, there is no correlation between a distinct table partition and the index. Consequently, any partition maintenance operation affects all global indexes or index partitions. As with tables containing local indexes, the affected partitions are locked to prevent DML operations against the affected table partitions. However, unlike the index maintenance for local indexes, any global index will still be fully available for DML operations and will not affect the online availability of the OLTP system. Conceptually and technically, the index maintenance for global indexes for a partition maintenance operation is comparable to the index maintenance that would become necessary for a semantically identical DML operation.
For example, dropping an old partition is semantically equivalent to deleting all the records of the old partition using the SQL
DELETE statement. In both cases, all index entries of the deleted data set have to be removed from any global index as a normal index maintenance operation which will not affect the availability of an index for
SELECT and DML operations. In this scenario, a drop operation represents the most optimal approach: data is removed without the overhead of a conventional
DELETE operation and the global indexes are maintained in a non-intrusive manner.
The two most common partition maintenance operations are the removal of data and the relocation of data onto lower cost storage tier devices.
Using either a
TRUNCATE operation will remove older data based on the partitioning key criteria. The drop operation will remove the data as well as the partition metadata, while a truncate operation will only remove the data but preserve the metadata. All local index partitions are dropped respectively, as well as truncated. Normal index maintenance will be done for partitioned or non-partitioned global indexes and is fully available for select and DML operations.
The following example drops all data older than January 2006 from the
orders table. Note that as part of the drop statement, an
UPDATE GLOBAL INDEXES statement is executed, so that the global index remains usable throughout the maintenance operation. Any local index partitions are dropped as part of this operation.
ALTER TABLE orders DROP PARTITION p_before_jan_2006 UPDATE GLOBAL INDEXES;
MERGE operation as part of an Information lifecycle Management strategy, you can relocate older partitions to the most cost-effective storage tier. The data is available for
SELECT but not for DML operations during the operation. Local indexes are maintained and you will most likely relocate those as part of the merge or move operation as well. Normal index maintenance will be done for partitioned or non-partitioned global indexes and is fully available for select and DML operations.
The following example shows how to merge the January 2006 and February 2006 partitions in the
orders table, and store them in a different tablespace. Any local index partitions are also moved to the
ts_low_cost tablespace as part of this operation. The
UPDATE INDEXES clause ensures that all indexes remain usable throughout and after the operation without additional rebuilds.
ALTER TABLE orders MERGE PARTITIONS p_2006_jan,p_2006_feb INTO PARTITION p_before_mar_2006 COMPRESS TABLESPACE ts_low_cost UPDATE INDEXES;
See Also:Chapter 5, "Using Partitioning for Information Lifecycle Management" for more information about the benefits of partition maintenance operations for Information Lifecycle Management