1.2.8 Exadata Hybrid Columnar Compression
Exadata Hybrid Columnar Compression stores data using column organization, which brings similar values close together and enhances compression ratios.
Using Exadata Hybrid Columnar Compression, data is organized into sets of rows called compression units. Within a compression unit, data is organized by column and then compressed. Each row is self-contained within a compression unit.
Database operations work transparently against compressed objects, so no application changes are required. The database compresses data manipulated by any SQL operation, although compression levels are higher for direct path loads.
You can specify the following types of Exadata Hybrid Columnar Compression, depending on your requirements:
- Warehouse compression: This type of compression is optimized for query performance, and is intended for data warehouse applications.
- Archive compression: This type of compression is optimized for maximum compression levels, and is intended for historic data and data that does not change.
Assume that you apply Exadata Hybrid Columnar Compression to a daily_sales
table. At the end of every day, the table is populated with items and the number sold, with the item ID and date forming a composite primary key. A row subset is shown in the following table.
Table 1-1 Sample Table daily_sales
Item_ID | Date | Num_Sold | Shipped_From | Restock |
---|---|---|---|---|
1000 |
01-JUN-07 |
2 |
WAREHOUSE1 |
Y |
1001 |
01-JUN-07 |
0 |
WAREHOUSE3 |
N |
1002 |
01-JUN-07 |
1 |
WAREHOUSE3 |
N |
1003 |
01-JUN-07 |
0 |
WAREHOUSE2 |
N |
1004 |
01-JUN-07 |
2 |
WAREHOUSE1 |
N |
1005 |
01-JUN-07 |
1 |
WAREHOUSE2 |
N |
The database stores a set of rows in an internal structure called a compression unit. For example, assume that the rows in the previous table are stored in one unit. Exadata Hybrid Columnar Compression stores each unique value from column 4 with metadata that maps the values to the rows. Conceptually, the compressed value can be represented as:
WAREHOUSE1WAREHOUSE3WAREHOUSE2
The database then compresses the repeated word WAREHOUSE
in this value by storing it once and replacing each occurrence with a reference. If the reference is smaller than the original word, then the database achieves compression. The compression benefit is particularly evident for the Date
column, which contains only one unique value.
As shown in the following illustration, each compression unit can span multiple data blocks. The values for a particular column may or may not span multiple blocks.
Exadata Hybrid Columnar Compression has implications for row locking. When an update occurs for a row in an uncompressed data block, only the updated row is locked. In contrast, the database must lock all rows in the compression unit if an update is made to any row in the unit. Updates to rows using Exadata Hybrid Columnar Compression cause rowids to change.
Note:
When tables use Exadata Hybrid Columnar Compression, Oracle DML locks larger blocks of data (compression units) which may reduce concurrency.Oracle Database supports four methods of table compression.
Table 1-2 Table Compression Methods
Table Compression Method | Compression Level | CPU Overhead | Applications |
---|---|---|---|
Basic compression |
High |
Minimal |
DSS |
OLTP compression |
High |
Minimal |
OLTP, DSS |
Warehouse compression |
Higher (compression level depends on compression level specified (LOW or HIGH)) |
Higher (CPU overhead depends on compression level specified (LOW or HIGH)) |
DSS |
Archive compression |
Highest (compression level depends on compression level specified (LOW or HIGH)) |
Highest (CPU overhead depends on compression level specified (LOW or HIGH)) |
Archiving |
Warehouse compression and archive compression achieve the highest compression levels because they use Exadata Hybrid Columnar Compression technology. Exadata Hybrid Columnar Compression technology uses a modified form of columnar storage instead of row-major storage. This enables the database to store similar data together, which improves the effectiveness of compression algorithms. Because Exadata Hybrid Columnar Compression requires high CPU overhead for DML, use it only for data that is updated infrequently.
The higher compression levels of Exadata Hybrid Columnar Compression are achieved only with data that is direct-path inserted. Conventional inserts and updates are supported, but result in a less compressed format, and reduced compression level.
The following table lists characteristics of each table compression method.
Table 1-3 Table Compression Characteristics
Table Compression Method | CREATE/ALTER TABLE Syntax | Direct-Path Insert | DML |
---|---|---|---|
Basic compression |
|
Yes |
Yes Note: Inserted and updated rows are uncompressed. |
OLTP compression |
|
Yes |
Yes |
Warehouse compression |
|
Yes |
Yes High CPU overhead. Note: Inserted and updated rows go to a block with a less compressed format and have lower compression level. |
Archive compression |
|
Yes |
Yes Note: Inserted and updated rows are uncompressed. Inserted and updated rows go to a block with a less compressed format and have lower compression level. |
The COMPRESS FOR QUERY HIGH
option is the default data warehouse compression mode. It provides good compression and performance. The COMPRESS FOR QUERY LOW
option should be used in environments where load performance is critical. It loads faster than data compressed with the COMPRESS FOR QUERY HIGH
option.
The COMPRESS FOR ARCHIVE LOW
option is the default archive compression mode. It provides a high compression level and good query performance. It is ideal for infrequently-accessed data. The COMPRESS FOR ARCHIVE HIGH
option should be used for data that is rarely accessed.
A compression advisor, provided by the DBMS_COMPRESSION
package, helps you determine the expected compression level for a particular table with a particular compression method.
You specify table compression with the COMPRESS
clause of the CREATE
TABLE
command. You can enable compression for an existing table by using these clauses in an ALTER
TABLE
statement. In this case, only data that is inserted or updated is compressed after compression is enabled. Similarly, you can disable table compression for an existing compressed table with the ALTER
TABLE
...NOCOMPRESS
command. In this case, all data that was already compressed remains compressed, and new data is inserted uncompressed.
Parent topic: Key Features of Oracle Exadata System Software