8 Optimizing In-Memory Joins
Database In-Memory supports multiple features to improve the performance of In-Memory joins.
8.1 About In-Memory Joins
Joins are an integral part of data warehousing workloads. The IM column store enhances the performance of joins when the tables being joined are stored in memory.
Because of faster scan and join processing, complex multitable joins and simple joins that use Bloom filters benefit from the IM column store. In a data warehousing environment, the most frequently-used joins involved a fact table and one or more dimension tables.
The following joins run faster when the tables are populated in the IM column store:
- 
                        Joins that are amenable to using Bloom filters 
- 
                        Joins of multiple small dimension tables with one fact table 
- 
                        Joins between two tables that have a primary key-foreign key relationship 
8.2 Optimizing Joins with Join Groups
You can optimizing In-Memory joins by creating join groups using the CREATE INMEMORY JOIN GROUP statement.
                  
8.2.1 About Join Groups
A join group is a group of between 1 and 255 columns that are frequently joined.
The table set for the join group includes one or more internal tables. External tables are not supported. When the IM column store is enabled, the database can use join groups to optimize joins of populated tables.
The columns in the join group can be in the same or different tables. For example, if the sales and times tables frequently join on the time_id column, then you might create a join group for (times(time_id), sales(time_id)). If the employees table often joins to itself on the employee_id column, then you could create the join group (employees(employee_id)).
                     
Note:
The same column cannot be a member of multiple join groups.
When you create a join group, the database invalidates the current In-Memory contents of the tables referenced in the join group. Subsequent repopulation causes the database to re-encode the IMCUs of the tables with the common dictionary. For this reason, Oracle recommends that you first create the join group, and then populate the tables.
Create join groups using the CREATE INMEMORY JOIN GROUP statement. To add columns to or drop columns from a join group, use an ALTER INMEMORY JOIN GROUP statement. Drop a join group using the DROP INMEMORY JOIN GROUP statement.
                     
Note:
In Oracle Active Data Guard, a standby database ignores join group definitions. A standby database does not use common dictionaries, and executes queries as if join groups did not exist.
Example 8-1 Creating a Join Group
This example creates a join group named deptid_jg that includes the department_id column in the hr.employees and hr.departments tables.
                        
CREATE INMEMORY JOIN GROUP deptid_jg (hr.employees(department_id),hr.departments(department_id));8.2.2 Purpose of Join Groups
In certain queries, join groups eliminate the performance overhead of decompressing and hashing column values.
Without join groups, if the optimizer uses a hash join but cannot use a Bloom filter, or if the Bloom filter does not filter rows effectively, then the database must decompress IMCUs and use an expensive hash join. To illustrate the problem, assume a star schema has a sales fact table and a vehicles dimension table. The following query joins these tables, but does not filter the output, which means that the database cannot use a Bloom filter:
                     
SELECT v.year, v.name, s.sales_price
FROM   vehicles v, sales s
WHERE  v.name = s.name;The following figure illustrates how the database joins the two data sets.
The database performs a hash join as follows:
- 
                           Scans the vehiclestable, decompresses the rows that satisfy the predicate (in this case, all rows satisfy the predicate because no filters exist), and sends the rows to the hash join
- 
                           Builds a hash table in the PGA based on the decompressed rows 
- 
                           Scans the salestable and applies any filters (in this case, the query does not specify filters)
- 
                           Processes matching rows from the IMCUs and then sends the rows to the join When the hash join can consume row sets from the probe side (in this case, the salestable), the row sets sent by the table scan are in compressed form. Depending on whether the local dictionary or join group is leveraged to find matching rows from the build side, the hash join either decompresses the rows or leaves them uncompressed.
- 
                           Probes the hash table using the join column, which in this case is the vehicle name 
If a join group exists on the v.name and s.name columns, then the database can make the preceding steps more efficient, eliminating the decompression and filtering overhead. The benefits of join groups are:
                     
- 
                           The database operates on compressed data. 
- 
                           The database avoids hashing on the join key and probing the hash table, which requires comparing the join keys of the probe rows and hashed rows. When a join group exists, the database stores codes for each join column value in a common dictionary. The database builds a join group array using dictionary codes. Every array element points to a build-side row stored in the hash area (typically, PGA memory). During the probe, each probe row has a code associated with the join key. The database uses this code to search the array to determine whether a pointer exists in the array element. If a pointer exists, then there is a match; otherwise, there is no match. 
- 
                           The dictionary codes are dense and have a fixed length, which makes them space efficient. 
- 
                           Optimizing a query with a join group is sometimes possible when it is not possible to use a Bloom filter. 
8.2.3 How Join Groups Work
In a join group, the database compresses all columns in the join group using the same common dictionary.
8.2.3.1 How a Join Group Uses a Common Dictionary
A common dictionary is a table-level, instance-specific set of dictionary codes.
The database automatically creates a common dictionary in the IM column store when a join group is defined on the underlying columns. The common dictionary enables the join columns to share the same dictionary codes.
A common dictionary provides the following benefits:
- 
                              Encodes the values in the local dictionaries with codes from the common dictionary, which provides compression and increases the cache efficiency of the IMCU 
- 
                              Enables joins to use dictionary codes to construct and probe the data structures used during hash joins 
- 
                              Enables the optimizer to obtain statistics such as cardinality, distribution of column values, and so on 
The following figure illustrates a common dictionary that corresponds to a join group created on the sales.name and vehicles.name columns. 
                        
Figure 8-2 Common Dictionary for a Join Group

Description of "Figure 8-2 Common Dictionary for a Join Group"
When the database uses a common dictionary, the local dictionary for each CU does not store the original values: AUDI, BMW, CADILLAC, FORD, and so on. Instead, the local dictionary stores references to the values stored in the common dictionary. For example, the local dictionary might store the value 101 for Audi and 220 for BMW. The common dictionary might store the value 0 for Audi and 1 for BMW. The 101 (AUDI) in the local dictionary is a pointer to the 0 (AUDI) in the common dictionary.
                        
8.2.3.2 How a Join Group Optimizes Scans
The key optimization is joining on common dictionary codes instead of column values, thereby avoiding the use of a hash table for the join.
Consider the following query, which uses a join group to join vehicles and sales on the name column:
                        
SELECT v.year, v.name, s.sales_price
FROM   vehicles v, sales s
WHERE  v.name = s.name
AND    v.name IN ('Audi', 'BMW', 'Porsche', 'VW');The following figure illustrates how the join benefits from the common dictionary created on the join group.
As illustrated in the preceding diagram, the database performs a hash join on the compressed data as follows:
- 
                              Scans the vehiclestable, and sends the dictionary codes (not the original column values) to the hash join:0(Audi),1(BMW),2(Cadillac), and so on
- 
                              Builds an array of distinct common dictionary codes in the PGA 
- 
                              Scans the salestable and applies any filters (in this case, the filter is for German cars only)
- 
                              Sends matching rows to the join in compressed format 
- 
                              Looks up corresponding values in the array rather than probing a hash table, thus avoiding the need to compute a hash function on the join key columns 
In this example, the vehicles table has only seven rows. The vehicles.name column has the following values:
                        
Audi
BMW
Cadillac
Ford
Porsche
Tesla
VWThe common dictionary assigns a dictionary code to each distinct value. Conceptually, the common dictionary looks as follows:
Audi     0
BMW      1
Cadillac 2
Ford     3
Porsche  4
Tesla    5
VW       6
The database scans vehicles.name, starting at the first dictionary code in the first IMCU and ending at the last code in the last IMCU. It stores a 1 for every row that matches the filter (German cars only), and 0 for every row that does not match the filter. Conceptually, the array might look as follows:
                        
array[0]: 1
array[1]: 1
array[2]: 0
array[3]: 0
array[4]: 1
array[5]: 0
array[6]: 1The database now scans the sales fact table. To simplify the example, assume that the sales table only has 6 rows. The database scans the rows as follows (the common dictionary code for each value is shown in parentheses):
                        
Cadillac (2)
Cadillac (2)
BMW      (1)
Ford     (3)
Audi     (0)
Tesla    (5)The database then proceeds through the vehicles.name array, looking for matches. If a row matches, then the database sends the matching row with its associated common dictionary code, and retrieves the corresponding column value from the vehicles.name and sales.name IMCUs:
                        
2  -> array[2] is 0, so no join
2  -> array[2] is 0, so no join
1  -> array[1] is 1, so join
3  -> array[3] is 0, so no join
0  -> array[0] is 1, so join
5  -> array[5] is 0, so no join8.2.4 When a Hash Join Uses Common Dictionary Encodings
Joins on columns in a join group typically see a performance benefit.
At join group creation, the database does the following:
- 
                           Caches the hash of the dictionary values for the join key columns 
- 
                           Caches the binary representation of the NUMBERdata for the join key columns
- 
                           Encodes columns with the same common dictionary 
A join on columns in a join group always uses the first two optimizations to improve performance. For example, if the optimizer chooses a hash join, then the query uses the cached hash values to probe the bloom filter. If the query uses an IM aggregation join, then the query uses the cached binary number to index into the key vector.
A hash join may or may not use dictionary encodings. When dictionary encodings are present in at least one column of the hash join, the query can leverage the encodings in the following ways:
- 
                           Join group-aware hash join Both columns in the hash join carry common dictionary encoding data during runtime. The execution plan must show either a parallel hash join plan without any distribution involved from both sides of the hash join, or a serial hash join plan. 
- 
                           Encoding-aware hash join One fact table column in the hash join carries dictionary encoding data during runtime. The execution plan must show either a parallel hash join without any distribution from the right side of the hash join, or a serial hash join plan. In some cases, if the common dictionary has good compression ratio, and if a parallel hash join plan cannot leverage a join group-aware hash join (for example, in a parallel broadcast-none plan), then the query can use an encoding-aware hash join for the common dictionary. 
In a SQL Monitor report, the following fields show dictionary usage:
                Columnar Encodings Observed, and Columnar Encodings
                Leveraged. The statistics are cumulative. In a parallel hash join, the
            fields summarize statistics collected from all child processes involved in executing a
            row source. In the context of the local dictionary in an IMCU, the statistics show the
            number of encoding IDs observed from the right child row source and the number of
            encodings leveraged by the join. If a hash join on a single process leverages the common
            dictionary, then Columnar Encodings Leveraged shows the number of
            encodings leveraged in the join.
The following table indicates the possible values for Columnar Encodings Observed and Columnar Encodings Leveraged, and what the combinations mean.
                     
Table 8-1 Join Group Usage in a SQL Monitor Report
| Columnar Encodings Observed | Columnar Encodings Leveraged | Encoding-Aware Hash Join Used? | Join Group-Aware Hash Join Used? | 
|---|---|---|---|
| Not present | Not present | No | No | 
| Positive value | Not present | No | No | 
| Positive value | Positive value | Yes | No | 
| Not present | Positive value | No | Yes | 
For example, if the report shows that the Columnar Encodings Leveraged field is 4 (for example, because the parallel degree is 4) but the Columnar Encodings Observed field is absent, then the query leveraged the join group for the hash join. If the Columnar Encodings Observed field is 4 but the Columnar Encodings Leveraged field is absent, then dictionary encodings existed, but the query did not use them.
                     
Various factors can prevent a query from engaging an encoding-aware hash join. Factors include the following:
- 
                           The compression ratio of the common dictionary is suboptimal. 
- 
                           The query observes too many row sets passed from the table scan without a common dictionary. 
- 
                           The build-side row length is too large. 
- 
                           The build-side rows cannot fit into PGA memory. 
- 
                           The build side has duplicate join keys. 
See Also:
8.2.5 Creating Join Groups
Define join groups using the CREATE INMEMORY JOIN GROUP statement. 
                     
Candidates for join groups are columns that are frequently paired in a join predicate. Typical examples include a column joining a fact and dimension table, or a column joining a table to itself.
The CREATE INMEMORY JOIN GROUP statement immediately defines a join group, which means that its metadata is visible in the data dictionary. The database does not immediately construct the common dictionary. Rather, the database builds the common dictionary the next time that a table referenced in the join group is populated or repopulated in the IM column store.
                        
Guidelines
Creating, modifying, or dropping a join group typically invalidates all the underlying tables referenced in the join group. Thus, Oracle recommends that you create join groups before initially populating the tables.
To create a join group:
- 
                              In SQL*Plus or SQL Developer, log in to the database as a user with the necessary privileges. 
- 
                              Create a join group by using a statement in the following form: CREATE INMEMORY JOIN GROUP join_group_name ( table1(col1), table2(col2) );For example, the following statement creates a join group named sales_products_jg:CREATE INMEMORY JOIN GROUP sales_products_jg (sales(prod_id), products(prod_id));
- 
                              Optionally, view the join group definition by querying the data dictionary (sample output included): COL JOINGROUP_NAME FORMAT a18 COL TABLE_NAME FORMAT a8 COL COLUMN_NAME FORMAT a7 SELECT JOINGROUP_NAME, TABLE_NAME, COLUMN_NAME, GD_ADDRESS FROM DBA_JOINGROUPS; JOINGROUP_NAME TABLE_NA COLUMN_ GD_ADDRESS ------------------ -------- ------- ---------------- SALES_PRODUCTS_JG SALES PROD_ID 00000000A142AE50 SALES_PRODUCTS_JG PRODUCTS PROD_ID 00000000A142AE50
- 
                              Populate the tables referenced in the join group, or repopulate them if they are currently populated. 
Example 8-2 Optimizing a Query Using a Join Group
In this example,  you log in to the database as SYSTEM, and then create a join group on the prod_id column of sales and products, which are not yet populated in the IM column store:
                        
CREATE INMEMORY JOIN GROUP 
  sh.sales_products_jg (sh.sales(prod_id), sh.products(prod_id));
You enable the sh.sales and sh.products tables for population in the IM column store:
                        
ALTER TABLE sh.sales INMEMORY;
ALTER TABLE sh.products INMEMORY;The following query indicates the tables are not yet populated in the IM column store (sample output included):
COL OWNER FORMAT a3
COL NAME FORMAT a10
COL STATUS FORMAT a20
SELECT OWNER, SEGMENT_NAME NAME,
       POPULATE_STATUS STATUS
FROM   V$IM_SEGMENTS;
no rows selectedQuery both tables to populate them in the IM column store:
SELECT /*+ FULL(s) NO_PARALLEL(s) */ COUNT(*) FROM sh.sales s;
SELECT /*+ FULL(p) NO_PARALLEL(p) */ COUNT(*) FROM sh.products p;
The following query indicates the tables are now populated in the IM column store (sample output included):
COL OWNER FORMAT a3
COL NAME FORMAT a10
COL PARTITION FORMAT a13
COL STATUS FORMAT a20
SELECT OWNER, SEGMENT_NAME NAME, PARTITION_NAME PARTITION,
       POPULATE_STATUS STATUS, BYTES_NOT_POPULATED
FROM   V$IM_SEGMENTS;
OWN NAME       PARTITION     STATUS               BYTES_NOT_POPULATED
--- ---------- ------------- -------------------- -------------------
SH  SALES      SALES_Q3_1998 COMPLETED            0
SH  SALES      SALES_Q4_2001 COMPLETED            0
SH  SALES      SALES_Q4_1999 COMPLETED            0
SH  PRODUCTS                 COMPLETED            0
SH  SALES      SALES_Q1_2001 COMPLETED            0
SH  SALES      SALES_Q1_1999 COMPLETED            0
SH  SALES      SALES_Q2_2000 COMPLETED            0
SH  SALES      SALES_Q2_1998 COMPLETED            0
SH  SALES      SALES_Q3_2001 COMPLETED            0
SH  SALES      SALES_Q3_1999 COMPLETED            0
SH  SALES      SALES_Q4_2000 COMPLETED            0
SH  SALES      SALES_Q4_1998 COMPLETED            0
SH  SALES      SALES_Q1_2000 COMPLETED            0
SH  SALES      SALES_Q1_1998 COMPLETED            0
SH  SALES      SALES_Q2_2001 COMPLETED            0
SH  SALES      SALES_Q2_1999 COMPLETED            0
SH  SALES      SALES_Q3_2000 COMPLETED            0
Query DBA_JOINGROUPS to get information about the join group (sample output included):
                        
COL JOINGROUP_NAME FORMAT a18
COL TABLE_NAME FORMAT a8
COL COLUMN_NAME FORMAT a7
SELECT JOINGROUP_NAME, TABLE_NAME, COLUMN_NAME, GD_ADDRESS 
FROM   DBA_JOINGROUPS;
JOINGROUP_NAME     TABLE_NA COLUMN_ GD_ADDRESS
------------------ -------- ------- ----------------
SALES_PRODUCTS_JG  SALES    PROD_ID 00000000A142AE50
SALES_PRODUCTS_JG  PRODUCTS PROD_ID 00000000A142AE50
The preceding output shows that the join group sales_products_jg joins on the same common dictionary address.
                        
See Also:
- 
                                 Oracle Database SQL Language Reference to learn about the CREATE INMEMORY JOIN GROUPstatement
- 
                                 Oracle Database Reference to learn about the DBA_JOINGROUPSview
8.2.6 Monitoring Join Group Usage
To determine whether queries are using the join group, you can use either a graphical SQL Monitor report (recommended) or a SQL query that uses the DBMS_SQLTUNE.REPORT_SQL_MONITOR_XML function.
                     
"When a Hash Join Uses Common Dictionary Encodings" explains how to interpret the SQL Monitor output.
Prerequisites
To monitor join groups, you must meet the following prerequisites:
- 
                              A join group must exist. 
- 
                              The columns referenced by the join group must have been populated after join group creation. 
- 
                              You must execute a join query that could potentially use the join group. 
To monitor join group usage:
- 
                              Log in to the database as a user with the necessary privileges. 
- 
                              Create a SQL*Plus variable to store the SQL ID as follows: VAR b_sqlid VARCHAR2(13)
- 
                              Execute a query that joins on the columns in the join group. 
- 
                              Use either following techniques: - 
                                    Graphical SQL Monitor Report SQL Monitor reports are available in Enterprise Manager. In SQL*Plus, you can use DBMS_SQL_MONITOR.REPORT_SQL_MONITORto generate a SQL Monitor report as follows:SET TRIMSPOOL ON SET TRIM ON SET PAGES 0 SET LINESIZE 1000 SET LONG 1000000 SET LONGCHUNKSIZE 1000000 SPOOL /tmp/long_sql.htm SELECT DBMS_SQL_MONITOR.REPORT_SQL_MONITOR( sql_id => :b_sqlid, report_level => 'ALL', TYPE => 'active') FROM DUAL; SPOOL OFFAccess the report in a browser, and then click the binoculars icon on the hash join to view the join group statistics. 
- 
                                    Command-Line Query Use the DBMS_SQLTUNE.REPORT_SQL_MONITOR_XMLfunction in a query, as shown in the following example:SELECT encoding_hj.rowsource_id row_source_id, CASE WHEN encoding_hj.encodings_observed IS NULL AND encoding_hj.encodings_leveraged IS NOT NULL THEN 'join group was leveraged on ' || encoding_hj.encodings_leveraged || ' processes' ELSE 'join group was NOT leveraged' END columnar_encoding_usage_info FROM (SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR_XML(session_id=>-1,sql_id=>:b_sqlid). EXTRACT(q'#//operation[@name='HASH JOIN' and @parent_id]#') xmldata FROM DUAL) hj_operation_data, XMLTABLE('/operation' PASSING hj_operation_data.xmldata COLUMNS "ROWSOURCE_ID" NUMBER PATH '@id', "ENCODINGS_LEVERAGED" NUMBER PATH 'rwsstats/stat[@id="9"]', "ENCODINGS_OBSERVED" NUMBER PATH 'rwsstats/stat[@id="10"]') encoding_hj;
 
- 
                                    
8.2.6.1 Monitoring Join Groups Using a SQL Monitor Report: Example
Your goal is to use a graphical SQL Monitor report to determine whether a query leveraged a join group.
In this example, you create a join group on the prod_id columns of sh.products and sh.sales tables, and then join these tables on this column. You grant the sh account administrative privileges. 
                           
Example 8-3 Monitoring a Join Group Using a SQL Monitor Report
- 
                                 In SQL*Plus, log in to the database as user sh.
- 
                                 Create a SQL*Plus variable to store the SQL ID as follows: VAR b_sqlid VARCHAR2(13)
- 
                                 Apply the INMEMORYattribute to thesh.productsandsh.salestables as follows:ALTER TABLE sales NO INMEMORY; ALTER TABLE products NO INMEMORY; ALTER TABLE sales INMEMORY MEMCOMPRESS FOR QUERY; ALTER TABLE products INMEMORY MEMCOMPRESS FOR QUERY;
- 
                                 Create a join group on prod_id:CREATE INMEMORY JOIN GROUP jgrp_products_sales (products(prod_id), sales(prod_id));
- 
                                 Scan the tables to populate them in the IM column store: SELECT /*+ FULL(s) */ COUNT(*) FROM sales s; SELECT /*+ FULL(p) */ COUNT(*) FROM products p;
- 
                                 Execute a query that joins on the prod_idcolumn, and then aggregates product sales:SELECT /*+ USE_HASH(sales) LEADING(products sales) MONITOR */ products.prod_id, products.prod_category_id, SUM(sales.amount_sold) FROM products, sales WHERE products.prod_id = sales.prod_id GROUP BY products.prod_category_id, products.prod_id;
- 
                                 Generate an HTML-based SQL Monitor report by using DBMS_SQLTUNE.REPORT_SQL_MONITOR.For example, create a SQL script with the following contents, and run it in SQL*Plus: SET TRIMSPOOL ON SET TRIM ON SET PAGES 0 SET LINESIZE 1000 SET LONG 1000000 SET LONGCHUNKSIZE 1000000 SPOOL /tmp/jg_report.htm SELECT DBMS_SQL_MONITOR.REPORT_SQL_MONITOR( sql_id => :b_sqlid, report_level => 'ALL', TYPE => 'active') FROM DUAL; SPOOL OFF
- 
                                 Open the HTML report in a browser. The following sample report shows the execution plan for the join. The binoculars in the hash join open a window that shows additional statistics. Figure 8-4 Monitored SQL Execution Details Page  
 Description of "Figure 8-4 Monitored SQL Execution Details Page"
- 
                                 Click the binoculars icon to open a window that shows join group statistics. The following sample window shows the statistics: Because Columnar Encodings Leveragedis a positive value andColumnar Encodings Observedis not present, the join group was leveraged.
- 
                                 Optionally, clean up after the example: DROP INMEMORY JOIN GROUP jgrp_products_sales; ALTER TABLE sales NO INMEMORY; ALTER TABLE products NO INMEMORY;
See Also:
- 
                                    Oracle Database PL/SQL Packages and Types Reference to learn about the DBMS_SQLTUNE.REPORT_SQL_MONITOR_XMLfunction
- 
                                    Oracle Database Reference to learn about the V$SESSIONview
8.2.6.2 Monitoring Join Groups from the Command Line: Example
Your goal is to use command-line tools to determine whether a query leveraged a join group.
In this example, you create a join group on the prod_id columns of sh.products and sh.sales tables, and then join these tables on this column. You grant the sh account administrative privileges. 
                           
Example 8-4 Monitoring a Join Group from the Command Line
- 
                                 Log in to the database as sh.
- 
                                 Create a SQL*Plus variable to store the SQL ID as follows: VAR b_sqlid VARCHAR2(13)
- 
                                 Apply the INMEMORYattribute to thesh.productsandsh.salestables as follows:ALTER TABLE sales NO INMEMORY; ALTER TABLE products NO INMEMORY; ALTER TABLE sales INMEMORY MEMCOMPRESS FOR QUERY; ALTER TABLE products INMEMORY MEMCOMPRESS FOR QUERY;
- 
                                 Create a join group on prod_id:CREATE INMEMORY JOIN GROUP jgrp_products_sales (products(prod_id), sales(prod_id));
- 
                                 Scan the tables to populate them in the IM column store: SELECT /*+ FULL(s) */ COUNT(*) FROM sales s; SELECT /*+ FULL(p) */ COUNT(*) FROM products p;
- 
                                 Execute a query that joins on the prod_idcolumn, and then aggregates product sales:SELECT /*+ USE_HASH(sales) LEADING(products sales) MONITOR */ products.prod_id, products.prod_category_id, SUM(sales.amount_sold) FROM products, sales WHERE products.prod_id = sales.prod_id GROUP BY products.prod_category_id, products.prod_id;
- 
                                 Obtain the SQL ID of the preceding aggregation query: BEGIN SELECT PREV_SQL_ID INTO :b_sqlid FROM V$SESSION WHERE SID=USERENV('SID'); END;
- 
                                 Use DBMS_SQLTUNE.REPORT_SQL_MONITOR_XMLto determine whether the database used the join group.For example, execute the following query: COL row_source_id FORMAT 999 COL columnar_encoding_usage_info FORMAT A40 SELECT encoding_hj.rowsource_id row_source_id, CASE WHEN encoding_hj.encodings_observed IS NULL AND encoding_hj.encodings_leveraged IS NOT NULL THEN 'join group was leveraged on ' || encoding_hj.encodings_leveraged || ' processes' ELSE 'join group was NOT leveraged' END columnar_encoding_usage_info FROM (SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR_XML(session_id=>-1,sql_id=>:b_sqlid). EXTRACT(q'#//operation[@name='HASH JOIN' and @parent_id]#') xmldata FROM DUAL ) hj_operation_data, XMLTABLE('/operation' PASSING hj_operation_data.xmldata COLUMNS "ROWSOURCE_ID" NUMBER PATH '@id', "ENCODINGS_LEVERAGED" NUMBER PATH 'rwsstats/stat[@id="9"]', "ENCODINGS_OBSERVED" NUMBER PATH 'rwsstats/stat[@id="10"]' ) encoding_hj;The following sample output shows that the join group was leveraged in the query: ROW_SOURCE_ID COLUMNAR_ENCODING_USAGE_INFO ------------- ---------------------------------------- 2 join group was leveraged on 1 processes
- 
                                 Optionally, clean up after the example: DROP INMEMORY JOIN GROUP jgrp_products_sales; ALTER TABLE sales NO INMEMORY; ALTER TABLE products NO INMEMORY;
See Also:
- 
                                    Oracle Database PL/SQL Packages and Types Reference to learn about the DBMS_SQLTUNE.REPORT_SQL_MONITOR_XMLfunction
- 
                                    Oracle Database Reference to learn about the V$SESSIONview
8.3 Optimizing Joins Using In-Memory Deep Vectorization
In-Memory deep vectorization can optimize complex SQL operators by pipelining the physical operators inside each SQL operator and vectorizing them using SIMD techniques. This feature is enabled by default.
8.3.1 About In-Memory Deep Vectorization
In-Memory deep vectorization is a SIMD-based framework that supports vectorization for higher-level query operators in the query plan. The framework includes optimizations such as SIMD, hardware acceleration, and pipelined execution.
The In-Memory vectorized joins feature is key to the deep vectorization framework. Using SIMD vector processing, the framework optimizes aspects of hash joins such as hashing, building, probing, and gathering. This optimization can improve the performance of join processing by 100% or more.
The In-Memory vectorized joins feature is transparent to the user, requiring no plan changes. The optimization is enabled by default, but you can disable it by setting the INMEMORY_DEEP_VECTORIZATION initialization parameter to false.
                     
See Also:
- 
                              "CPU Architecture: SIMD Vector Processing" to learn more about SIMD vectorization 
- 
                              Oracle Database SQL Tuning Guide to learn more about hash joins 
8.3.2 How In-Memory Deep Vectorization Works
The In-Memory deep vectorization framework deconstructs high-level, complex SQL operators such as hash joins into smaller kernel-sized units.
The deconstructed kernels are suitable for SIMD vectorization techniques. The database executes the kernels in a pipelined fashion to accelerate the overall operation.
See Also:
"CPU Architecture: SIMD Vector Processing" to learn more about SIMD vectorization
8.3.3 How an In-Memory Vectorized Join Works
The vectorized joins feature is a key aspect of the In-Memory deep vectorization framework.
An In-Memory vectorized join works as follows:
- 
                           At run time, the database decides whether a hash join would benefit from In-Memory deep vectorization. If so, the database proceeds to the next step. Note: Because selection of the vectorized joins operation occurs at runtime, the execution plan does not show the optimization. 
- 
                           The database pushes down join processing to the scan operators for evaluation directly on In-Memory columnar data formats. 
- 
                           The operation uses a SIMD-optimized hash table data structure instead of a traditional hash table. 
- 
                           The database determines matched rows from the left and right side of the join, and sends them back to the parent SQL operator using vectorization techniques. 
An In-Memory vectorized join may leverage In-Memory features such as the following:
- 
                           Join groups If a join group is declared, then join processing with deep vectorization may be significantly faster. 
- 
                           IM dynamic scans Join processing is parallelized further in scan operators using lightweight threads. 
- 
                           Aggregation pushdown Aggregations on top of joins are vectorized with the join operation. 
- 
                           In-Memory columnar compression formats The vectorized joins feature heavily leverages columnar data formats. 
8.3.4 User Interface for Deep Vectorization
The deep vectorization framework is enabled when the INMEMORY_DEEP_VECTORIZATION initialization parameter is true, which is the default value.
                     
You can use SQL Monitor to determine whether a query used a vectorized join. In a SQL Monitor report, click the binoculars icon next to the HASH JOIN operation in the Information column. If DeepVec Hash Joins has the value 1, then the query used deep vectorization; otherwise, the query did not use it.
                        
8.3.5 In-Memory Vectorized Join: Example
This example indicates how a hash join benefits from deep vectorization.
In this example, a customers and orders table exist, and INMEMORY_DEEP_VECTORIZATION is currently set to false. Your goal is to count the orders within a single nation, making use of the deep vectorization optimizations if possible.
                        
- 
                              Log in to the database as administrative user DB1.
- 
                              Using ALTER SESSION, set theINMEMORY_DEEP_VECTORIZATIONinitialization parameter toENABLE.
- 
                              Join the tpch.customerandtpch.orderstables as follows, filtering on the value21in thetpch.customer.c_nationkeycolumn:SELECT /*+monitor */ COUNT(*) FROM tpch.customer c, tpch.orders o WHERE c.c_custkey = o.o_custkey AND c.c_nationkey = 21
- 
                              To create a SQL Monitor report in HTML, pass the SQL ID to DBMS_SQL_MONITOR.REPORT_SQL_MONITOR.See "Monitoring Join Group Usage" for an example showing how to generate the report. 
- 
                              Open the SQL Monitor report in a browser. The overview section of the report appears below. 
- 
                              In the Details section of the report, find the HASH JOINoperation, and then click the binoculars icon.
- 
                              In the Other Information window, look for DeepVec Hash Joins. Because the value is1, the database used an In-Memory vectorized join.Note that DeepVec Hash Join Flagsis an internal statistic that indicates which optimizations deep vectorization employed.





