The In-Memory Column Store (IM column store) stores tables and partitions in memory using a columnar format optimized for rapid scans. Oracle Database uses a sophisticated architecture to manage data in columnar and row formats simultaneously.
2.1 Dual-Format: Column and Row
When you enable an IM column store, the SGA manages data in separate locations: the In-Memory Area and the database buffer cache.
The IM column store encodes data in a columnar format: each column is a separate structure. The columns are stored contiguously, which optimizes them for analytic queries. The database buffer cache can modify objects that are also populated in the IM column store. However, the buffer cache stores data in the traditional row format. Data blocks store the rows contiguously, optimizing them for transactions.
The following figure illustrates the difference between row-based storage and columnar storage.
Figure 2-1 Columnar and Row-Based Storage
Description of "Figure 2-1 Columnar and Row-Based Storage"
2.1.1 Columnar Data in the In-Memory Area
The In-Memory Area is an optional SGA component that contains the IM column store.
220.127.116.11 Size of the In-Memory Area
The In-Memory Area is controlled by the
INMEMORY_SIZE initialization parameter. By default, the size of the In-Memory Area is 0, which means the IM column store is disabled.
To enable the IM column store, set the In-Memory Area to at least 100 MB. The size is shown in
The In-Memory Area and SGA_TARGET
The In-Memory Area is subtracted from the
SGA_TARGET initialization parameter setting. For example, if you set
SGA_TARGET to 10 GB, and if you set the
INMEMORY_SIZE to 4 GB, then 40% of the
SGA_TARGET setting is allocated to the In-Memory Area. The following graphic illustrates the relationship.
Figure 2-2 INMEMORY_SIZE and SGA_TARGET
Description of "Figure 2-2 INMEMORY_SIZE and SGA_TARGET"
Unlike the other components of the SGA, including the buffer cache and the shared pool, the In-Memory Area size is not controlled by automatic memory management. The database does not automatically shrink the In-Memory Area when the buffer cache or shared pool requires more memory, or increase the In-Memory Area when it runs out of space.
Dynamic Resizing of the In-Memory Area
Starting in Oracle Database 12c Release 2 (12.2), you can dynamically increase
INMEMORY_SIZE by using the
ALTER SYSTEM statement. The database allocates increased memory when the following conditions are met:
Free memory is available in the SGA.
The new size for
INMEMORY_SIZEis at least 128 MB greater than the current setting.
You cannot use
ALTER SYSTEMto reduce
V$SGA views immediately reflect the change.
The In-Memory Area in a Multitenant Environment
In a CDB, the size of the IM column store is set by the
INMEMORY_SIZE parameter in the CDB root. By default, the IM column store is shared among the PDBs. Consequently, a PDB can "starve" other PDBs by consuming the available memory.
Within a PDB, you can limit memory consumption by using
ALTER SYSTEM SET INMEMORY_SIZE. For example, at the CDB level, you might set
20G, and then configure the PDBs as follows:
In the preceding example, the
INMEMORY_SIZE settings at the PDB level add up to
21G, even though
INMEMORY_SIZE at the CDB level is only
20G. Oversubscription ensures that valuable space in the IM column store is not wasted if a PDB is shut down or unplugged.
Oracle Database Administrator’s Guide to learn more about automatic memory management
18.104.22.168 Memory Pools in the In-Memory Area
The In-Memory Area is divided into subpools for columnar data and metadata.
The In-Memory area is subdivided into the following subpools:
This subpool stores the IMCUs, which contain the columnar data. The
V$INMEMORY_AREA.POOLcolumn identifies this subpool as
1MB POOL, as shown in Example 2-1.
The metadata pool
This subpool stores metadata about the objects that reside in the IM column store. The
V$INMEMORY_AREA.POOLcolumn identifies this subpool as
64KB POOL, as shown in Example 2-1.
Figure 2-3 Subpools in the In-Memory Area
Description of "Figure 2-3 Subpools in the In-Memory Area"
The database determines the relative size of the two subpools using internal heuristics. The database allocates the majority of space in the In-Memory Area to the columnar data pool (1 MB pool).
Oracle Database automatically determines the subpool sizes. You cannot change the space allocations.
Example 2-1 V$INMEMORY_AREA View
This example queries the
V$INMEMORY_AREA view to determine the amount of available memory in each subpool (sample output included):
COL POOL FORMAT a9
COL POPULATE_STATUS FORMAT a15
SSELECT POOL, TRUNC(ALLOC_BYTES/(1024*1024*1024),2) "ALLOC_GB",
POOL ALLOC_GB USED_GB POPULATE_STATUS
--------- ---------- ---------- ---------------
1MB POOL 7.99 0 DONE
64KB POOL 1.98 0 DONE
The current size of the In-Memory area is visible in
SELECT NAME, VALUE/(1024*1024*1024) "SIZE_IN_GB"
WHERE NAME LIKE '%Mem%';
In-Memory Area 10
In this example, the memory allocated to the subpools is 9.97 GB, whereas the size of the In-Memory Area is 10 GB. The database uses a small percentage of memory for internal management structures.
See Also:Oracle Database Reference to learn about
2.1.2 Row Data in the Database Buffer Cache
The database buffer cache stores and processes data blocks in the same way whether the IM column store is enabled or disabled. Buffer I/O and buffer pools function the same.
The IM column store enables data to be simultaneously populated in the SGA in both the traditional row format (the buffer cache) and the columnar format. The database transparently sends OLTP queries (such as primary key lookups) to the buffer cache, and analytic and reporting queries to the IM column store. When fetching data, Oracle Database can also read data from both memory areas within the same query.
In the execution plan, the operation
TABLE ACCESS IN MEMORY FULL indicates that some or all data is accessed in the IM column store.
The dual-format architecture does not double memory requirements. The buffer cache is optimized to run with a much smaller size than the size of the database.
The following figure shows a sample IM column store. The database stores the
sh.sales table on disk in traditional row format. The SGA stores the data in columnar format in the IM column store, and in row format in the database buffer cache.
Figure 2-4 IM Column Store
Description of "Figure 2-4 IM Column Store"
Every on-disk data format for permanent, heap-organized tables is supported by the IM column store. The columnar format does not affect the format of data stored in data files or in the buffer cache, nor does it affect undo data and online redo logging.
The database processes DML modifications in the same way, regardless of whether the IM column store is enabled, by updating the buffer cache, online redo log, and undo tablespace. However, the database uses an internal mechanism to track changes and ensure that the IM column store is consistent with the rest of the database. For example, if the
sales table is populated in the IM column store, and if an application updates a row in
sales, then the database automatically keeps the copy of the
sales table in the IM column store transactionally consistent. A query that accesses the IM column store always returns the same results for a query that accesses the buffer cache.
Oracle Database Concepts to learn more about the database buffer cache
2.2 In-Memory Storage Units
The IM column store manages both data and metadata in optimized storage units, not in traditional Oracle data blocks.
Oracle Database maintains the storage units in the In-Memory Area. The following graphic gives an overview of the In-Memory Area and the database processes that interact with it. The remaining chapter describes the various memory components.
Figure 2-5 IM Column Store: Memory and Process Architecture
Description of "Figure 2-5 IM Column Store: Memory and Process Architecture"
2.2.1 In-Memory Compression Units (IMCUs)
An In-Memory Compression Unit (IMCU) is a compressed, read-only storage unit that contains data for one or more columns.
An IMCU is analogous to a tablespace extent. An IMCU has two parts: a set of Column Compression Units (CUs), and a header that contains metadata such as the IM storage index.
22.214.171.124 IMCUs and Schema Objects
The IM column store stores data for a single object (table, partition, materialized view) in a set of IMCUs. An IMCU stores columnar data for one and only one object.
126.96.36.199.1 IMCUs and INMEMORY Columns
For an object specified as
INMEMORY, every column listed in the
INMEMORY clause is included in every IMCU.
For example, the
sh.sales table has 7 columns. The following DDL statement specifies the table as
INMEMORY, which means that every IMCU for
sales includes columnar data for these 7 columns:
ALTER TABLE sh.sales INMEMORY MEMCOMPRESS FOR QUERY LOW;
NO INMEMORY Columns in INMEMORY Objects
You can specify that some but not all columns in an
INMEMORY table have the
INMEMORY attribute. For example, the
sh.customers table has 23 columns. The following DDL statement specifies that 15 columns in
sh.customers have the
NO INMEMORY attribute, which means that the other 8 columns in the table have the
ALTER TABLE sh.customers INMEMORY
MEMCOMPRESS FOR QUERY LOW
NO INMEMORY ( cust_gender, cust_year_of_birth, cust_marital_status,
cust_postal_code, cust_city, cust_state_province,
cust_main_phone_number, cust_income_level, cust_credit_limit,
cust_email, cust_total, cust_total_id, cust_eff_from,
cust_eff_to, cust_valid );
The following query shows the compression levels of the columns in
sh.customers, indicating which columns are
SET LINESIZE 200
COL TABLE_NAME FORMAT a25
COL SEG_COL_ID FORMAT 999
COL COLUMN_NAME FORMAT a25
COL INMEMORY_COMPRESSION FORMAT a11
SELECT SEGMENT_COLUMN_ID AS "SEG_COL_ID", COLUMN_NAME, INMEMORY_COMPRESSION
FROM V$IM_COLUMN_LEVEL WHERE TABLE_NAME = 'CUSTOMERS'
ORDER BY SEG_COL_ID;
SEG_COL_ID COLUMN_NAME INMEMORY_CO
---------- ------------------------- -----------
1 CUST_ID DEFAULT
2 CUST_FIRST_NAME DEFAULT
3 CUST_LAST_NAME DEFAULT
4 CUST_GENDER NO INMEMORY
5 CUST_YEAR_OF_BIRTH NO INMEMORY
6 CUST_MARITAL_STATUS NO INMEMORY
7 CUST_STREET_ADDRESS DEFAULT
8 CUST_POSTAL_CODE NO INMEMORY
9 CUST_CITY NO INMEMORY
10 CUST_CITY_ID DEFAULT
11 CUST_STATE_PROVINCE NO INMEMORY
12 CUST_STATE_PROVINCE_ID DEFAULT
13 COUNTRY_ID DEFAULT
14 CUST_MAIN_PHONE_NUMBER NO INMEMORY
15 CUST_INCOME_LEVEL NO INMEMORY
16 CUST_CREDIT_LIMIT NO INMEMORY
17 CUST_EMAIL NO INMEMORY
18 CUST_TOTAL NO INMEMORY
19 CUST_TOTAL_ID NO INMEMORY
20 CUST_SRC_ID DEFAULT
21 CUST_EFF_FROM NO INMEMORY
22 CUST_EFF_TO NO INMEMORY
23 CUST_VALID NO INMEMORY
The following graphic represents three tables from the
sh schema populated in the IM column store:
sales. In this example, each table has a different number of columns specified
INMEMORY. The IMCUs for each table include only data for the
Figure 2-6 Columns and IMCUs
Description of "Figure 2-6 Columns and IMCUs"
Queries That Reference NO INMEMORY Columns
When a query references a
NO INMEMORY column, the table scan retrieves data from the row store rather than the IMCUs in the IM column store. Row store access occurs even if all other columns referenced in the query are populated
For example, assume that the
customers table is populated into the IM column store. The
cust_last_name columns are specified
INMEMORY, but the
cust_postal_code column is specified as
NO INMEMORY. You issue the following query:
SELECT cust_id, cust_last_name, cust_postal_code
WHERE cust_id < 5001
ORDER BY cust_id;
In this case, the database accesses the row store, not the IM column store, even though
cust_postal_code is the only
NO INMEMORY column referenced in the query. The following query, which has
cust_postal_code in the predicate but not the
SELECT list, must also access the row store:
SELECT cust_id, cust_last_name
WHERE cust_postal_code = 77501
ORDER BY cust_id;
https://blogs.oracle.com/in-memory/what-happens-if-a-column-is-not-populated for a blog entry on accessing columns that are not populated in the IM column store
Oracle Database SQL Language Reference to learn about the
188.8.131.52.2 In-Memory Compression
The IM column store uses special compression formats optimized for access speed rather than storage reduction. The columnar format enables queries to execute directly against the compressed columns.
Compression enables scanning and filtering operations to process a much smaller amount of data, which optimizes query performance. Oracle Database only decompresses data when it is required for the result set.
The compression applied in the IM column store is closely related to Hybrid Columnar Compression. Both technologies process column vectors. The primary difference is that the column vectors for the IM column store are optimized for SIMD vector processing, whereas the column vectors for Hybrid Columnar Compression are optimized for disk storage.
When you enable an object for population into the IM column store, you specify the type of compression in the
FOR QUERY (
FOR CAPACITY (
184.108.40.206.3 IMCUs and Rows
Each IMCU contains all column values (including nulls) for a subset of rows in a table segment. A subset of rows is called a granule.
All IMCUs for a given segment contain approximately the same number of rows. Oracle Database determines the size of a granule automatically depending on data type, data format, and compression type. A higher compression level results in more rows in the IMCU.
A one-to-many mapping exists between an IMCU and a set of database blocks. As illustrated in Example 2-2, each IMCU stores the values for columns for a different set of blocks.
The columns in an IMCU are not sorted. Oracle Database populates them in the order that they are read from disk.
The number of rows in an IMCU dictates the amount of space an IMCU consumes. If the target number of rows causes an IMCU to grow beyond the amount of contiguous 1 MB extents available in the 1 MB pool, then the IMCU creates additional extents (pieces) to hold the remaining column CUs. An IMCU always allocates space in 1 MB increments.
Example 2-2 IMCUs and Row Subsets
In this simplified example, only the following 4 columns of the
customers table have the
cust_gender. Only 5 rows exist in the table, stored in 2 data blocks. Conceptually, the first data block stores its rows as follows:
The second data block stores rows as follows:
Assume IMCU 1 stores the data for the first data block. In this case, the
cust_id column values for the 3 rows in this data block stores are stored “vertically” within a CU as follows:
IMCU 2 stores the data from the second data block. The
cust_id column values for these 2 rows are stored within a CU as follows:
cust_id value is the first value for each row in the data block, the
cust_id column is in the first position within the IMCU. Columns always occupy the same position, so Oracle Database can reconstruct the rows by reading the IMCUs for a segment.
220.127.116.11 Column Compression Units (CUs)
A Column Compression Unit (CU) is contiguous storage for a single column in an IMCU. Every IMCU has one or more CUs.
18.104.22.168.1 Structure of a CU
A CU is divided into a body and a header.
The body of every CU stores the column values for the range of rows included in the IMCU. The header contains metadata about the values stored in the CU body, for example, the minimum and maximum value within the CU. It may also contain a local dictionary, which is a sorted list of the distinct values in that column and their corresponding dictionary codes.
The following figure shows an IMCU with 4 CUs for the
channel_id. Each CU stores the column values for the range of rows included in the IMCU.
Figure 2-7 CUs in an IMCU
Description of "Figure 2-7 CUs in an IMCU"
The CUs store values in rowid order. For this reason, the database can answer queries by “stitching” the rows back together. For example, an application issues the following query:
SELECT cust_id, time_id, channel_id
WHERE prod_id =5;
The database begins by scanning the
prod_id column for entries with the value
5. Assume that the database finds
5 in position two in the
prod_id column. The database now must find the corresponding
channel_id for this row. Because the CUs store data in rowid order, the database can find the corresponding
channel_id values in position 2 in those columns. Thus, to answer the query, the database must extract the values from position 2 in the
channel_id columns, and then stitch the row back together to return it to the end user.
22.214.171.124.2 Local Dictionary
In a CU, the local dictionary has a list of distinct values and their corresponding dictionary codes.
The local dictionary stores the symbol contained in the column. The following figure illustrates how a CU stores a
name column in a
Figure 2-8 Local Dictionary
Description of "Figure 2-8 Local Dictionary"
In the preceding figure, the CU contains only 7 rows. Every distinct value in this CU, such as
Audi, is assigned a different dictionary code, such as
Audi. The CU stores the dictionary code rather than the original value.
When the database uses a common dictionary for a join group, the local dictionary contains references to the common dictionary rather than the symbols. For example, rather than storing the values
Cadillac for the
vehicles.name column, the local dictionary stores dictionary codes such as
The CU header contains the minimum and maximum values for the column. In this example, the minimum value is
Audi and the maximum value is
Cadillac. The local dictionary stores the list of distinct values:
Cadillac. Their corresponding dictionary codes (
2) are implicit. The local dictionary for a CU in each IMCU is independent of the local dictionaries in other IMCUs.
If a query filters on Audi automobiles, then the database scans this IMCU for only
126.96.36.199 In-Memory Storage Indexes
Every IMCU header automatically creates and manages In-Memory Storage Indexes (IM storage indexes) for its CUs. An IM storage index stores the minimum and maximum for all columns within the IMCU.
sales is populated in the IM column store. Every IMCU for this table has all columns. The
sales.prod_id column is stored in a separate CU within every IMCU. The IMCU header has the minimum and maximum values of each
prod_id CU (and every other CU).
To eliminate unnecessary scans, the database can perform IMCU pruning based on SQL filter predicates. The database scans only the IMCUs that satisfy the query predicate, as shown in the
WHERE prod_id > 14 AND prod_id < 29 example in the following graphic.
Figure 2-9 Storage Index for Columnar Data
Description of "Figure 2-9 Storage Index for Columnar Data"
2.2.2 Snapshot Metadata Units (SMUs)
A Snapshot Metadata Unit (SMU) contains metadata and transactional information for an associated IMCU.
188.8.131.52 IMCUs and SMUs
The columnar pool of the In-Memory Area stores the actual data: IMCUs and IMEUs. The metadata pool in the In-Memory Area stores the SMUs.
Figure 2-10 IMCUs and SMUs
This figure shows IMCUs in the data pool, and SMUs in the metadata pool.
Description of "Figure 2-10 IMCUs and SMUs"
Every IMCU maps to a separate SMU. Thus, if the columnar data pool contains 100 IMCUs, then the metadata pool contains 100 SMUs. The SMUs store several types of metadata for their associated IMCUs, including the following:
Mapping information for rows
184.108.40.206 Transaction Journal
Every SMU contains a transaction journal. The database uses the transaction journal to keep the IMCU transactionally consistent.
The database uses the buffer cache to process DML, just as when the IM column store is not enabled. For example, an
UPDATE statement might modify a row in an IMCU. In this case, the database adds the rowid for the modified row to the transaction journal and marks it stale as of the SCN of the DML statement. If a query needs to access the new version of the row, then the database obtains the row from the database buffer cache.
Figure 2-11 Transaction Journal
Description of "Figure 2-11 Transaction Journal"
The database achieves read consistency by merging the contents of the column, transaction journal, and buffer cache. When the IMCU is refreshed during repopulation, queries can access the up-to-date row directly from the IMCU.
"Optimizing Repopulation of the IM Column Store" for an in-depth discussion of how the IM column store maintains transactional consistency
2.2.3 In-Memory Expression Units (IMEUs)
An In-Memory Expression Unit (IMEU) is a storage container for materialized In-Memory Expressions (IM expressions) and user-defined virtual columns.
The database treats materialized expressions just like other columns in the IMCU. Conceptually, an IMEU is a logical extension of its parent IMCU. Just as an IMCU can contain multiple columns, an IMEU can contain multiple virtual columns.
Every IMEU maps to exactly one IMCU, mapping to the same row set. The IMEU contains expression results for the data contained in its associated IMCU. When the IMCU is populated, the associated IMEU is also populated.
A typical IM expression involves one or more columns, possibly with constants, and has a one-to-one mapping with the rows in the table. For example, an IMCU for an
employees table contains rows 1–1000 for the column
weekly_salary. For the rows stored in this IMCU, the IMEU calculates the automatically detected IM expression
weekly_salary*52, and the user-defined virtual column
quarterly_salary defined as
weekly_salary*12. The 3rd row down in the IMCU maps to the 3rd row down in the IMEU.
The IMEU is a logical extension of the IMCUs of a particular segment. By default, the IMEU inherits the
INMEMORY clause properties, including Oracle Real Application Clusters (Oracle RAC) properties such as
DUPLICATE, from the base segment. You can selectively enable or disable virtual columns for storage in IMEUs. You can also specify compression levels for different columns.
2.3 Expression Statistics Store (ESS)
The Expression Statistics Store (ESS) is a repository maintained by the optimizer to store statistics about expression evaluation. The ESS resides in the SGA and persists on disk.
When an IM column store is enabled, the database leverages the ESS for its In-Memory Expressions (IM expressions) feature. However, the ESS is independent of the IM column store. The ESS is a permanent component of the database and cannot be disabled.
The database uses the ESS to determine whether an expression is “hot” (frequently accessed), and thus a candidate for an IM expression. During a hard parse of a query, the ESS looks for active expressions in the
GROUP BY clause, and so on.
For each segment, the ESS maintains expression statistics such as the following:
Frequency of execution
Cost of evaluation
The optimizer assigns each expression a weighted score based on cost and the number of times it was evaluated. The values are approximate rather than exact. More active expressions have higher scores. The ESS maintains an internal list of the most frequently accessed expressions.
Control the behavior of IM expressions using the
DBMS_INMEMORY_ADMIN package. For example, the
IME_CAPTURE_EXPRESSIONS procedure prompts the database to identify and gradually populate the hottest expressions in the database. The
IME_POPULATE_EXPRESSIONS procedure forces the database to populate the expressions immediately.
ESS information is stored in the data dictionary and exposed in the
DBA_EXPRESSION_STATISTICS view. This view shows the metadata that the optimizer has collected in the ESS. IM expressions are exposed as system-generated virtual columns, prefixed by the string
SYS_IME, in the
Oracle Database SQL Tuning Guide to learn more about ESS
Oracle Database Reference to learn more about the
Oracle Database PL/SQL Packages and Types Reference to learn more about the
2.4 In-Memory Process Architecture
In response to queries and DML, server processes scan columnar data and update SMU metadata. Background processes populate row data from disk into the IM column store.
2.4.1 In-Memory Coordinator Process (IMCO)
The In-Memory Coordinator Process (IMCO) manages many tasks for the IM column store. Its primary task is to initiate background population and repopulation of columnar data.
Population is a streaming mechanism, converting row data into columnar format, and then compressing it. IMCO automatically initiates population of
INMEMORY objects with any priority other than
NONE. When objects with priority
NONE are accessed, IMCO populates them using Space Management Worker Process (Wnnn) processes.
The IMCO background process also initiates threshold-based repopulation of IM column store objects when they meet a staleness threshold. IMCO may instigate trickle repopulation for any IMCU in the IM column store that has stale entries but does not meet the staleness threshold.
Trickle repopulation occurs automatically in the background. The steps are as follows:
IMCO wakes up.
IMCO determines whether population tasks need to be performed, including whether any stale entries exist in an IMCU.
If IMCO finds stale entries, then it triggers a Space Management Worker Process to repopulate these entries in the IMCU.
IMCO sleeps for two minutes, and then returns to Step 1.
2.4.2 Space Management Worker Processes (Wnnn)
Space Management Worker Processes (Wnnn) populate or repopulate data on behalf of IMCO.
During population, Wnnn processes are responsible for creating IMCUs, SMUs, and IMEUs. When creating IMEUs, the worker processes perform the following tasks:
Identify virtual columns for population
Create virtual column values
Compute values for each row, transform the data into columnar format, and compress it
Register the objects with the space layer
Associate the IMEUs with their corresponding IMCUs
During IMEU creation, parent IMCUs remain available for queries.
During repopulation, the Wnnn processes create new versions of the IMCUs based on the existing IMCUs and transactions journals, while temporarily retaining the old versions. This mechanism is called double buffering.
The database can quickly move IM expressions in and out of the IM column store. For example, if an IMCU was created without an IMEU, then the database can add an IMEU later without forcing the IMCU to undergo the full repopulation mechanism.
INMEMORY_MAX_POPULATE_SERVERS initialization parameter controls the maximum number of worker processes that can be started for population. The
INMEMORY_TRICKLE_REPOPULATE_PERCENT initialization parameter controls the maximum percentage of time that worker processes can perform trickle repopulation.
Oracle Database Reference to learn more about background processes
2.4.3 In-Memory Dynamic Scans
In-Memory Dynamic Scans (IM dynamic scans) use lightweight threads to parallelize In-Memory table scans.
220.127.116.11 Purpose of IM Dynamic Scans
When additional CPU is available, IM dynamic scans accelerate In-Memory table scans that are CPU bound.
IM dynamic scans automatically use idle CPU resources to scan IMCUs in parallel and maximize CPU usage. When CPU resources are available, applications can get even faster analytic query results automatically. Because the scans are dynamic, they enable the use of excess CPU bandwidth without affecting existing workload.
IM dynamic scans are more flexible than traditional Oracle parallel execution, although the two are not mutually exclusive. Dynamic scans use multiple lightweight threads of execution within a process. Typically, the performance overhead for dynamic scans is low.
Oracle Database Administrator’s Guide to learn more about Resource Manager
18.104.22.168 How IM Dynamic Scans Work
IM Dynamic Scans attain optimal performance by reading IMCUs in parallel.
22.214.171.124.1 About Lightweight Threads
A lightweight thread is an execution entity that helps to parallelize full table scans. It is “lightweight” because it does not incur the higher memory overhead of Oracle processes.
A lightweight thread used by IM dynamic scans is not the same as a regular thread in the multithreaded Oracle Database model.
Lightweight threads share the resources of the parent foreground or PQ process, called the table scan process, that coordinates the scan of a set of IMCUs. Threads maintain their own independent flow of execution. The database can parallelize scans by prioritizing threads and executing them asynchronously.
For eligible queries, the process allocates a pool of threads. Resource Manager automatically determines the number of threads in the pool based on the CPU count in the database host and the current load on the system. The pool of threads remains available to the session for subsequent queries unless the idle time reaches an internal threshold, at which point the database terminates the threads.
Communication between threads occurs exclusively within a process. For this reason, contention does not occur at the database instance level.
Oracle Database Concepts to learn about the multithreaded Oracle Database model
126.96.36.199.2 When the Database Considers IM Dynamic Scans
Lightweight threads are enabled when a CPU resource plan is enabled (for example,
RESOURCE_MANAGER_PLAN=DEFAULT_PLAN) and CPU utilization of the database is low.
If lightweight threads are enabled, then the database considers an IM dynamic scan when an application queries an object that is currently populated in the IM column store. Typically, a serial or parallel query is a candidate for IM dynamic scans when it has the following characteristics:
Accesses a high number of IMCUs or columns
Consumes all rows in the table
Oracle Database Resource Manager (the Resource Manager), which is automatically enabled when
INMEMORY_SIZE is greater than
0, is required for IM dynamic scans. The Resource Manager decides when and how to use the lightweight threads. Lightweight threads are the lowest priority operation in the database because they are capitalizing on unused resources.
CPU_COUNT must be greater than or equal to
24 in order
to perform IM dynamic scans.
188.8.131.52.3 How IM Dynamic Scans Work
Resource Manager allocates lightweight threads to parallelize the scan of IMCUs.
When the database determines that a query can benefit from an IM dynamic scan, it typically proceeds as follows:
A table scan process spawns a pool of lightweight threads.
The table scan process creates a separate task for every IMCU that must be scanned, and then adds each task to a task queue.
Resource Manager determines how many threads can participate in the table scan.
Active threads pick up tasks from the task queue, with the table scan process consuming results from completed tasks.
Depending on the database load, Resource Manager continuously adjusts the number of active lightweight threads while the query is running. If CPU resources are not available, then the table scan process performs the scan without using lightweight threads.
The following graphic illustrates an IM dynamic scan of 12 IMCUs in the
Figure 2-12 IM Dynamic Scan
Description of "Figure 2-12 IM Dynamic Scan"
In the preceding graphic, the database host has 8 CPU cores. Based on an internal algorithm, Resource Manager assigns 4 threads to assist the table scan process. In this scenario, 4 CPU cores remain idle for other concurrent database operations to use.
184.108.40.206 Interface for IM Dynamic Scans
IM dynamic scans are transparent, which means that they require no application changes and are automatically controlled by the Resource Manager.
IM dynamic scans require the Resource Manager, which is automatically enabled when
INMEMORY_SIZE is greater than
0. No specific resource plan is required.
Several new session statistics track the usage of IM dynamic scans. Each thread writes trace data to a separate trace file.
Execution plans are unchanged. The following figure shows a sample execution plan.
SQL> SELECT MAX(l_quantity) largest_order FROM lineitem;
Plan hash value: 1885658499
|Id| Operation | Name |Rows| Bytes |Cost(%CPU)| Time |Pstart| Pstop|
|0| SELECT STATEMENT | | 1 | 3| 116K (4)| 00:00:05 | | |
|1| SORT AGGREGATE | | 1 | 3| | | | |
|2| PARTITION RANGE ALL | | 600M |1716M| 116K (4)| 00:00:05 | 1 | 84 |
|3| TABLE ACCESS INMEMORY FULL| LINEITEM | 600M |1716M| 116K (4)| 00:00:05 | 1 | 84 |
IM scan CUs memcompress for query low 1147
IM scan bytes in-memory 5.1790E+10
IM scan bytes uncompressed 7.6722E+10
IM scan CUs columns accessed 1147
IM scan rows 600037902
IM scan rows projected 29
IM scan (dynamic) rows 600037902
IM scan (dynamic) multi-threaded scans 1
IM scan (dynamic) tasks processed by thread 1146
Consider the characteristics of the plan:
The execution plan is unchanged.
Note that the plan does not mention IM dynamic scans in Step 3. However, clicking the binocular icon in a SQL Monitor report would show “Dynamic Scan Tasks on Thread.”
IM scan (dynamic) multi-threaded scansis nonzero, which means that the database used an IM dynamic scan.
IM scan CUs memcompress for query lowindicates that 1147 IMCUs exist in the
IM scan (dynamic) tasks processed by threadindicates how many IMCUs were processed in parallel.
The number is 1146, which is less than the total number of 1147 shown in
IM scan CUs memcompress for query low. The database analyzed the first IMCU without parallelization to determine whether parallelization was worthwhile. Because the answer was yes, the database proceeded to scan the remaining 1146 IMCUs in parallel.
IM scan (dynamic) rowsand
IM scan rows are equal, which means that the threads retrieved all rows for the query.
2.5 CPU Architecture: SIMD Vector Processing
For data that is populated in the IM column store, the database uses SIMD (single instruction, multiple data) processing.
A SIMD unit is a processor that enables a single instruction to process data as a unit, called a vector, rather than processing data in separate instructions. For example, instead of using a loop to execute four addition operations, SIMD could load the four sets of numbers into vectors and perform one addition operation. SIMD processing is sometimes called vectorization.
The IM column store maximizes the number of column entries that the CPU can load into the vector registers and evaluate. Instead of evaluating each entry in the column one at a time, the database evaluates a set of column values in a single CPU instruction. SIMD vector processing enables the database to scan billions of rows per second.
For example, an application issues a query to find the total number of orders in the
sales table that use the
promo_id value of
sales table resides in the IM column store. The query begins by scanning only the
sales.promo_id column, as shown in the following diagram:
Figure 2-13 SIMD Vector Processing
Description of "Figure 2-13 SIMD Vector Processing"
The CPU evaluates the data as follows:
Loads the first 8 values (the number varies depending on data type and compression mode) from the
promo_idcolumn into the SIMD register, and then compares them with the value
9999in a single instruction
Discards the entries.
Loads another 8 values into the SIMD register, and then continues in this way until it has evaluated all entries.
2.5.1 SIMD and Oracle LOBs
Oracle Database 18c provides SIMD vector support for queries involving SQL operators on specific LOB columns.
The nature of the support depends on the type of LOB:
The IM column store provides contiguous storage for inline LOBs, which are LOBs less than 4 KB, within the IMCUs. Columnar storage enables faster query processing by removing the overhead of assembling LOB data from the database buffer cache.
In this case, the IM column store only stores the LOB locator, which is 40 byes. Out-of-line columns do not benefit from columnar optimization.
There is one exception to the preceding rule. An IMEU can allocate up to 32 KB of contiguous storage for JSON columns defined as a LOB data type. The IMEU stores these columns in the OSON format, which can provide faster query performance using SIMD processing.
Oracle Database SecureFiles and Large Objects Developer's Guide to learn more about LOBs
2.5.2 SIMD and Oracle Numbers
For tables compressed with
NUMBER columns are encoded using an optimized format that enables native calculations in hardware.
SIMD vector processing enables simple aggregations,
GROUP BY aggregations, and arithmetic operations to benefit significantly. The performance improvement depends on the amount of time the aggregation spends on arithmetic computation. Some aggregations may benefit by up to a factor of 9.
2.5.3 SIMD and Exadata Smart Flash Cache
Besides storing data in Hybrid Columnar Compression format, Exadata Smart Flash Cache can store data in pure columnar format.
Exadata Smart Scan supports SIMD predicates. The advantage is that In-Memory performance extends from DRAM storage to secondary storage.
By default, Exadata Smart Flash Cache compresses data using the level
MEMCOMPRESS FOR CAPACITY LOW. To change the compression level or disable the columnar format altogether, use the
ALTER TABLE ... NO CELLMEMORY statement.