Oracle Database In-Memory (Database In-Memory) is a suite of features, first introduced in Oracle Database 12c Release 1 (12.1.0.2), that greatly improves performance for real-time analytics and mixed workloads. The In-Memory Column Store (IM column store) is the key feature of Database In-Memory.
Database In-Memory features require the Oracle Database In-Memory option.
This chapter contains the following topics:
Traditionally, obtaining good performance for analytic queries meant satisfying a number of requirements.
In a typical data warehouse or mixed-use database, requirements include the following:
You must understand user access patterns.
You must provide good performance, which typically requires creating indexes, materialized views, and OLAP cubes.
For example, if you create 1 to 3 indexes for a table (1 primary key and 2 foreign key indexes) to provide good performance for an OLTP application, then you may need to create additional indexes to provide good performance for analytic queries.
Figure 1-1 Multiple Indexes
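As a sketch, the index proliferation described above might look like the following, assuming a hypothetical sales table (all table and index names are illustrative):

```sql
-- OLTP needs only a few indexes on the sales table:
CREATE INDEX sales_pk_idx      ON sales (sale_id);   -- primary key lookups
CREATE INDEX sales_cust_fk_idx ON sales (cust_id);   -- foreign key
CREATE INDEX sales_prod_fk_idx ON sales (prod_id);   -- foreign key

-- Analytic queries tend to demand many more:
CREATE INDEX sales_time_idx    ON sales (time_id);
CREATE INDEX sales_channel_idx ON sales (channel_id);
CREATE INDEX sales_promo_idx   ON sales (promo_id);
-- ...and so on. Each additional index slows every INSERT, UPDATE, and DELETE.
```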
Meeting the preceding requirements creates manageability and performance problems. Additional access structures cause performance overhead because you must create, manage, and tune them. For example, inserting a single row into a table requires an update to all indexes on this table, which increases response time.
The demand for real-time analytics means that more analytic queries are being executed in a mixed-workload database. The traditional approach is not sustainable.
Traditionally, relational databases store data in either row or columnar formats. Memory and disk store data in the same format.
An Oracle database stores rows contiguously in data blocks. For example, in a table with three rows, an Oracle data block stores the first row, and then the second row, and then the third row. Each row contains all column values for the row. Data stored in row format is optimized for transaction processing. For example, updating all columns in a small number of rows may modify only a small number of blocks.
To address the problems relating to analytic queries, some database vendors have introduced a columnar format. A columnar database stores selected columns—not rows—contiguously. For example, in a large sales table, the sales IDs reside in one column, and sales regions reside in a different column.
Analytical workloads access few columns while scanning, but scan the entire data set. For this reason, the columnar format is the most efficient for analytics. Because columns are stored separately, an analytical query can access only required columns, and avoid reading inessential data. For example, a report on sales totals by region can rapidly process many rows while accessing only a few columns.
Database vendors typically force customers to choose between a columnar and row-based format. For example, if the data format is columnar, then the database stores data in columnar format both in memory and on disk. Gaining the advantages of one format means losing the advantages of the alternate format. Applications either achieve rapid analytics or rapid transactions, but not both. The performance problems for mixed-use databases are not solved by storing data in a single format.
The Oracle Database In-Memory (Database In-Memory) feature set includes the In-Memory Column Store (IM column store), advanced query optimizations, and availability solutions.
The Database In-Memory optimizations enable analytic queries to run orders of magnitude faster on data warehouses and mixed-use databases.
This section contains the following topics:
The Database In-Memory feature set includes the IM column store, advanced query optimizations, and availability solutions. These features combine to accelerate analytic queries by orders of magnitude without sacrificing OLTP performance or availability.
This section contains the following topics:
The compressed columnar format enables faster scans, queries, joins, and aggregates.
This section contains the following topics:
The columnar format provides fast throughput for scanning large amounts of data. You can analyze data in real time, enabling you to explore different possibilities and perform iterations.
The IM column store can drastically improve performance for the following types of queries:
A query that scans a large number of rows and applies filters that use operators such as =, <, >, and IN
A query that selects a small number of columns from a table or a materialized view having a large number of columns, such as a query that accesses 5 out of 100 columns
Columnar format uses fixed-width columns for most numeric and short string data types. This optimization enables rapid vector processing, which enables the database to answer queries faster.
Scans of the IM column store are faster than scans of row-based data for the following reasons:
Elimination of buffer cache overhead
The IM column store stores data in a pure, in-memory columnar format. The data does not persist in the data files (or generate redo), so the database avoids the overhead of reading data from disk into the buffer cache.
The database scans only the columns necessary for the query rather than entire rows of data. Furthermore, the database uses storage indexes and an internal dictionary to read only the necessary IMCUs for a specific query. For example, if a query requests all sales for a store with a store ID less than 8, then the database can use IMCU pruning to eliminate IMCUs that do not contain this value.
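The store example in the preceding paragraph corresponds to a query of the following shape (the stores table and store_id column are hypothetical names):

```sql
-- Each IMCU records the minimum and maximum store_id value it contains,
-- so IMCUs whose minimum store_id is 8 or greater are skipped entirely.
SELECT *
FROM   sales
WHERE  store_id < 8;
```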
Traditionally, the goal of compression is to save space. In the IM column store, the goal of compression is to accelerate scans. The database automatically compresses columnar data using algorithms that allow WHERE clause predicates to be applied against the compressed formats. Depending on the type of compression applied, Oracle Database can scan data in its compressed format without decompressing it first. Therefore, the volume of data that the database must scan in the IM column store is less than the corresponding volume in the database buffer cache.
Each CPU core scans local in-memory columns. To process data as an array (set), the scans use SIMD vector instructions. For example, a query can read a set of values in a single CPU instruction rather than read the values one by one. Vector scans by a CPU core are orders of magnitude faster than row scans.
For example, suppose a user executes the following ad hoc query:
SELECT cust_id, time_id, channel_id FROM sales WHERE prod_id BETWEEN 14 AND 29;
When using the buffer cache, the database would typically scan an index to find the product IDs, use the rowids to fetch the rows from disk into the buffer cache, and then discard the unwanted column values. Scanning data in row format in the buffer cache requires many CPU instructions, and can result in suboptimal CPU efficiency.
When using the IM column store, the database can scan only the requested sales columns, avoiding disk altogether. Scanning data in columnar format pipelines only necessary columns to the CPU, increasing efficiency. Each CPU core scans local in-memory columns using SIMD vector instructions.
A Bloom filter is a low-memory data structure that tests membership in a set. The IM column store takes advantage of Bloom filters to improve the performance of joins.
Bloom filters speed up joins by converting predicates on small dimension tables into filters on large fact tables. This optimization is useful when joining multiple dimensions with one large fact table. Because the dimension key columns on fact tables contain many repeated values, the combination of fast scans and repeated-value optimization speeds up joins by orders of magnitude.
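A typical case is a star query such as the following sketch (table and column names are hypothetical):

```sql
-- The predicate on the small stores dimension is converted into a
-- Bloom filter on store_id, which is applied while scanning the large
-- sales fact table, so most fact rows are discarded before the join.
SELECT st.region, SUM(s.amount_sold) AS total_sold
FROM   sales s
JOIN   stores st ON s.store_id = st.store_id
WHERE  st.region = 'West'
GROUP BY st.region;
```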
An important aspect of analytics is to determine patterns and trends by aggregating data. Aggregations and complex SQL queries run faster when data is stored in the IM column store.
In Oracle Database, aggregation typically involves a GROUP BY clause. Traditionally, the database used HASH operators. Starting in Oracle Database 12c Release 1 (12.1), the database offered VECTOR GROUP BY transformations to enable efficient in-memory, array-based aggregation.
During a fact table scan, the database accumulates aggregate values into in-memory arrays, and uses efficient algorithms to perform aggregation. Joins based on the primary key and foreign key relationships are optimized for both star schemas and snowflake schemas.
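For example, a query of the following shape (schema names are illustrative) is a candidate for the VECTOR GROUP BY transformation:

```sql
-- During the scan of the sales fact table, the database accumulates
-- SUM(amount_sold) into in-memory arrays keyed by year and category
-- instead of building a conventional hash table.
SELECT t.calendar_year, p.prod_category, SUM(s.amount_sold) AS total_sold
FROM   sales s
JOIN   times t    ON s.time_id = t.time_id
JOIN   products p ON s.prod_id = p.prod_id
GROUP BY t.calendar_year, p.prod_category;
```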
Although OLTP applications do not benefit from accessing data in the IM column store, the dual-memory format can indirectly improve OLTP performance.
When all data is stored in rows, improving analytic query performance requires creating access structures. The standard approach is to create analytic indexes, materialized views, and OLAP cubes. For example, a table might require 3 indexes to improve the performance of the OLTP application (1 primary key and 2 foreign key indexes) and 10-20 additional indexes to improve the performance of the analytic queries. While this technique can improve analytic query performance, it slows down OLTP performance. Inserting a row into the table requires modifying all indexes on the table. As the number of indexes increases, insertion speed decreases.
When you populate data into the IM column store, you can drop analytic access structures. This technique reduces storage space and processing overhead because fewer indexes, materialized views, and OLAP cubes are required. For example, an insert results in modifying 1-3 indexes instead of 11-23 indexes.
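A minimal sketch of this technique, assuming hypothetical index names:

```sql
-- Populate the table into the IM column store...
ALTER TABLE sales INMEMORY;

-- ...then drop the indexes that existed only to serve analytic queries,
-- keeping the primary and foreign key indexes for OLTP.
DROP INDEX sales_channel_idx;
DROP INDEX sales_promo_idx;
```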
While the IM column store can drastically improve performance for analytic queries in business applications, ad-hoc analytic queries, and data warehouse workloads, pure OLTP databases that perform short transactions using index lookups benefit less. The IM column store does not improve performance for the following types of queries:
A query with complex predicates
A query that selects a large number of columns
A query that returns a large number of rows
See Oracle Database Data Warehousing Guide to learn more about physical data warehouse design.
The IM column store is fully integrated into Oracle Database. All High Availability features are supported.
The columnar format does not change the Oracle database on-disk storage format. Thus, buffer cache modifications and redo logging function in the same way. Features such as RMAN, Oracle Data Guard, and Oracle ASM are fully supported.
In an Oracle Real Application Clusters (Oracle RAC) environment, each node has its own IM column store by default. Depending on your requirements, you can populate objects in different ways:
Different tables are populated on every node. For example, the sales fact table is on one node, whereas the products dimension table is on a different node.
A single table is distributed among different nodes. For example, different partitions of the same hash-partitioned table are on different nodes, or different rowid ranges of a single nonpartitioned table are on different nodes.
Some objects appear in the IM column store on every node. For example, you might populate the products dimension table in every node, but distribute partitions of the sales fact table across different nodes.
Database In-Memory is simple to implement, and requires no application changes.
Key aspects of Database In-Memory adoption include:
Ease of deployment
No user-managed data migration is required. The database stores data in row format on disk and automatically converts row data into columnar format when populating the IM column store.
Compatibility with existing applications
No application changes are required. The optimizer automatically takes advantage of the columnar format. If your application connects to the database and issues SQL, then it can benefit from Database In-Memory features.
Full SQL compatibility
Database In-Memory places no restrictions on SQL. Analytic queries can benefit whether they use Oracle analytic functions or customized PL/SQL code.
Ease of use
No complex setup is required. The INMEMORY_SIZE initialization parameter specifies the amount of memory reserved for use by the IM column store. The INMEMORY clause in DDL statements specifies the objects or columns to be populated into the IM column store. By configuring the IM column store, you can immediately improve the performance of existing analytic workloads and ad-hoc queries.
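A minimal configuration might look like the following sketch (the size and table names are illustrative):

```sql
-- Reserve 4 GB for the IM column store. INMEMORY_SIZE is a static
-- parameter, so the change takes effect at the next restart.
ALTER SYSTEM SET INMEMORY_SIZE = 4G SCOPE=SPFILE;

-- Mark objects for population:
ALTER TABLE sales     INMEMORY;                  -- default priority
ALTER TABLE products  INMEMORY PRIORITY HIGH;    -- populate eagerly
ALTER TABLE audit_log NO INMEMORY;               -- exclude from the store
```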
The Oracle Database In-Memory option is required for all Database In-Memory features. No special hardware is required for an IM column store.
The IM column store requires a minimum of 100 MB of memory. The store size is included in MEMORY_TARGET.
For Oracle RAC databases, the DUPLICATE ALL option requires Oracle Engineered Systems.
See Oracle Database Licensing Information for all licensing-related information.
For queries to benefit from the IM column store, the only required tasks are specifying a size for the IM column store, and specifying objects and columns for population. Query optimization and availability features require additional configuration.
Principal Tasks for Configuring the IM Column Store
The following table lists the principal configuration tasks.
Table 1-1 Configuration Tasks
Enable the IM column store by specifying its size.
Specify tables (internal or external), columns (nonvirtual or virtual), tablespaces, or materialized views for population into the IM column store.
Optionally, create Automatic Data Optimization (ADO) policies to manage objects in the IM column store. For example, a policy can evict an object from the IM column store after a specified period of no access.
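An ADO In-Memory policy might look like the following sketch (the sales table and the 30-day window are illustrative):

```sql
-- Evict the segment from the IM column store after 30 days of no access.
ALTER TABLE sales ILM ADD POLICY NO INMEMORY SEGMENT
  AFTER 30 DAYS OF NO ACCESS;
```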
Principal Tasks for Optimizing In-Memory Queries
In-Memory query optimizations are not required for the IM column store to function. The following optimization tasks are optional.
Table 1-2 Query Optimization Tasks
Manage automatic detection of IM expressions in the IM column store by using the DBMS_INMEMORY_ADMIN package. For example, invoke the IME_CAPTURE_EXPRESSIONS procedure.
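One way to capture IM expressions is the DBMS_INMEMORY_ADMIN.IME_CAPTURE_EXPRESSIONS procedure; a sketch:

```sql
-- Identify frequently evaluated ("hot") expressions from the most
-- recent expression statistics and mark them for In-Memory population.
EXEC DBMS_INMEMORY_ADMIN.IME_CAPTURE_EXPRESSIONS('CURRENT');
```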
Define join groups using the CREATE INMEMORY JOIN GROUP statement.
Candidates are columns that are frequently paired in a join predicate, for example, a column joining a fact and dimension table.
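A join group on such a fact-dimension key might be created as follows (the join group, table, and column names are hypothetical):

```sql
-- Both columns share a common compression dictionary, so the join can
-- compare compressed codes instead of decompressed values.
CREATE INMEMORY JOIN GROUP sales_products_jg (sales(prod_id), products(prod_id));
```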
If necessary for a particular query block, specify the VECTOR_TRANSFORM or NO_VECTOR_TRANSFORM hint.
In-memory aggregation is an automatically enabled feature that cannot be controlled with initialization parameters or DDL.
Limit the number of IMCUs updated through trickle repopulation within a two-minute interval by setting the INMEMORY_TRICKLE_REPOPULATE_SERVERS_PERCENT initialization parameter.
You can disable trickle repopulation by setting this initialization parameter to 0.
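Assuming the parameter in question is INMEMORY_TRICKLE_REPOPULATE_SERVERS_PERCENT, a sketch (the 5 percent value is illustrative):

```sql
-- Allow trickle repopulation to use at most 5% of the population
-- servers in each two-minute interval; a value of 0 disables it.
ALTER SYSTEM SET INMEMORY_TRICKLE_REPOPULATE_SERVERS_PERCENT = 5;
```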
Principal Tasks for Managing Availability
The principal tasks are shown in the following table.
Table 1-3 Availability Tasks
Specify an In-Memory FastStart (IM FastStart) tablespace using the FASTSTART_ENABLE procedure of the DBMS_INMEMORY_ADMIN package.
IM FastStart optimizes the population of database objects in the IM column store when the database is restarted. IM FastStart stores information on disk for faster population of the IM column store.
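For example, assuming a previously created tablespace named fs_tbs:

```sql
-- Designate fs_tbs as the IM FastStart tablespace; the database then
-- checkpoints columnar data there for faster population after restart.
EXEC DBMS_INMEMORY_ADMIN.FASTSTART_ENABLE('FS_TBS');
```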
For an object or tablespace, specify the DISTRIBUTE or DUPLICATE option of the INMEMORY clause.
By default, each In-Memory object is distributed among the Oracle RAC instances, effectively employing a share-nothing architecture for the IM column store.
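The two approaches might be expressed as follows (table names are hypothetical):

```sql
-- Spread partitions of the fact table across the RAC instances:
ALTER TABLE sales INMEMORY DISTRIBUTE BY PARTITION;

-- Keep a full copy of the small dimension table on every instance
-- (DUPLICATE ALL requires Oracle Engineered Systems):
ALTER TABLE products INMEMORY DUPLICATE ALL;
```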
In an Oracle Data Guard environment, you can use the same Database In-Memory initialization parameters and statements on a primary or standby database.
For example, you can enable the IM column store on both a primary and standby database by setting INMEMORY_SIZE to a nonzero value.
No special tools or utilities are required to manage the IM column store or other Database In-Memory features. Administrative tools such as SQL*Plus, SQL Developer, and Oracle Enterprise Manager (Enterprise Manager) are fully supported.
This section describes tools that have specific Database In-Memory feature support:
The In-Memory Advisor is a downloadable PL/SQL package that analyzes the analytical processing workload in your database. This advisor recommends a size for the IM column store and a list of objects that would benefit from In-Memory population.
The In-Memory Advisor differentiates analytics processing from other database activity based upon SQL plan cardinality, Active Session History (ASH), parallel query usage, and other statistics. The In-Memory Advisor estimates the size of objects in the IM column store based on statistics and heuristic compression factors.
The In-Memory Advisor estimates analytic processing performance improvement factors based on the following:
Elimination of wait events such as user I/O waits, cluster transfer waits, and buffer cache latch waits
Query processing advantages related to specific compression types
Decompression cost heuristics for specific compression types
SQL plan cardinality, number of columns in the result set, and so on
The output of the In-Memory Advisor is a report with recommendations. The advisor also generates a SQL*Plus script that alters the recommended objects with the INMEMORY clause.
The In-Memory Advisor is not included in the stored PL/SQL packages. You must download it from Oracle Support.
See My Oracle Support note 1965343.1 to learn more about the In-Memory Advisor.
Enterprise Manager Cloud Control (Cloud Control) provides the In-Memory Column Store Central Home page. This page gives a dashboard interface to the IM column store.
Use this page to monitor in-memory support for database objects such as tables, indexes, partitions and tablespaces. You can view In-Memory functionality for objects and monitor their In-Memory usage statistics. Unless otherwise stated, this manual describes the command-line interface to Database In-Memory features.
"Using IM Column Store In Cloud Control" explains how to use Cloud Control to manage the IM column store.
Oracle Compression Advisor estimates the compression ratio that you can realize using the MEMCOMPRESS clause. The advisor uses the DBMS_COMPRESSION package.
See Oracle Database PL/SQL Packages and Types Reference to learn more about DBMS_COMPRESSION.
You can import database objects that are enabled for the IM column store using the TRANSFORM=INMEMORY:y option of the impdp command. With this option, Oracle Data Pump keeps the IM column store clause for all objects that have one. When the TRANSFORM=INMEMORY:n option is specified, Data Pump drops the IM column store clause from all objects that have one.
You can also use the TRANSFORM=INMEMORY_CLAUSE:string option to override the IM column store clause for a database object in the dump file during import. For example, you can use this option to change the IM column store compression for an imported database object.
See Oracle Database Utilities for more information about the TRANSFORM impdp parameter.