In-Memory JSON Data

29 In-Memory JSON Data

A column of JSON data can be stored in the In-Memory Column Store (IM column store) to improve query performance.

29.1 Overview of In-Memory JSON Data

You can populate JSON data into the In-Memory Column Store (IM column store), to improve the performance of ad hoc and full-text queries.

Using the IM column store for JSON data is especially useful for ad hoc analytical queries that scan a large number of small JSON documents.

If a JSON column is of data type JSON then you can also use the IM column store to provide support for full-text search. (JSON type is available only if database initialization parameter compatible is at least 20.)

Note:

An alternative to placing a JSON column in the IM column store is to create a JSON search index on the column. This provides support for both ad hoc queries and full-text search.

If a JSON search index is defined for a JSON column (of any data type), and that column is also populated into the IM column store, then the search index, not the IM column store, is used for queries of that column.

Unlike the case for using the IM column store to support full-text search, JSON search index support is available for any JSON column, not just a column of data type JSON.

The IM column store is supported only for JSON documents smaller than 32,767 bytes. If you have a mixture of document sizes, those documents that are larger than 32,767 bytes are processed without the In-Memory optimization. For better performance, consider breaking up documents larger than 32,767 bytes into smaller documents.

The IM column store is an optional SGA pool that stores copies of tables and partitions in a special columnar format optimized for rapid scans. The IM column store supplements the row-based storage in the database buffer cache. You do not need to load the same object into both the IM column store and the buffer cache. The two caches are kept transactionally consistent. The database transparently sends online transaction processing (OLTP) queries (such as primary-key lookups) to the buffer cache and analytic and reporting queries to the IM column store.

You can think of the use of JSON data in memory as improving the performance of SQL/JSON path access. SQL functions and conditions json_table, json_query, json_value, json_exists, and json_textcontains all accept a SQL/JSON path argument, and they can all benefit from loading JSON data into the IM column store.

Once JSON documents have been loaded into memory, any subsequent path-based operations on them use the In-Memory representation, which avoids the overhead associated with reading and parsing the on-disk format.

If queried JSON data is populated into the IM column store, and if there are function-based indexes that can apply to that data, the optimizer chooses whether to use an index or to scan the data in memory. In general, if index probing results in few documents then a functional index can be preferred by the optimizer. In practice this means that the optimizer can prefer a functional index for very selective queries or DML statements.

On the other hand, if index probing results in many documents then the optimizer might choose to scan the data in memory, by scanning the function-based index expression as a virtual-column expression.

Ad hoc queries, that is, queries that are not used frequently to target a given SQL/JSON path expression, benefit in a general way from populating JSON data into the IM column store, by quickly scanning the data. But if you have some frequently used queries then you can often further improve their performance in these ways:

Creating virtual columns that project scalar values (not under an array) from a column of JSON data and loading those virtual columns into the IM column store.
Creating a materialized view on a frequently queried json_table expression and loading the view into the IM column store.

However, if you have a function-based index that projects a scalar value using function json_value then you need not explicitly create a virtual column to project it. As mentioned above, in this case the function-based index expression is automatically loaded into the IM column store as a virtual column. The optimizer can choose, based on estimated cost, whether to scan the function-based index in the usual manner or to scan the index expression as a virtual-column expression.

Note:

The advantages of a virtual column over a materialized view are that you can build an index on it and you can obtain statistics on it for the optimizer.
Virtual columns, like columns in general, are subject to the 1000-column limit for a given table.

Note:

A table with one or more columns of JSON data type has an additional, hidden virtual column for each such column. It has a system-generated name, which starts with SYS_IME_OSON_. As it is virtual, it does not use any space.

This hidden column is used when data is loaded into the IM column store, to optimize in-memory performance. It's not listed when you use a describe command, and it's not affected by a SELECT * query. It is however listed when you query dictionary views such as USER_TAB_COLS.

Prerequisites For Using JSON Data In Memory

To be able to take advantage of the IM column store for JSON data, the following must all be true:

Database compatibility is 12.2.0.0 or higher. For full-text support it must be 20 or higher.
The value set for max_string_size in the Oracle instance start-up configuration file must be 'extended'.
Sufficient SGA memory must be configured for the IM column store.
A DBA has specified that the tablespace, table, or materialized view that contains the JSON columns is eligible for population into the IM column store, using keyword INMEMORY in a CREATE or ALTER statement.
Initialization parameters are set as follows:
- IMMEMORY_EXPRESSIONS_USAGE is STATIC_ONLY or ENABLE.
  
  ENABLE allows In-Memory materialization of dynamic expressions, if used in conjunction with PL/SQL procedure DBMS_INMEMORY.ime_capture_expressions.
- IMMEMORY_VIRTUAL_COLUMNS is ENABLE, meaning that the IM column store populates all virtual columns. (The default value is MANUAL.)
The columns storing the JSON data must be known to contain well-formed JSON data. This is the case if the column is of JSON data type or it has an is json check constraint.

You can check the value of each initialization parameter using command SHOW PARAMETER. (You must be logged in as database user SYS or equivalent for this.) For example:

SHOW PARAMETER INMEMORY_VIRTUAL_COLUMNS

Related Topics

29.2 Populating JSON Data Into the In-Memory Column Store

Use ALTER TABLE … INMEMORY to populate a column of JSON data, or a table with such a column, into the In-Memory Column Store (IM column store), to improve the performance of JSON queries.

You specify that a table with one or more columns of JSON data is to be populated into the IM column store, by marking the table as INMEMORY. Example 29-1 illustrates this.

A column is guaranteed to contain only well-formed JSON data if (1) it is of data type JSON or (2) it is of type VARCHAR2, CLOB, or BLOB and it has an is json check constraint. (Database initialization parameter compatible must be at least 20 to use data type JSON.)

The IM column store is used for queries of documents that are smaller than 32,767 bytes. Queries of documents that are larger than that do not benefit from the IM column store.

Note:

If a JSON column in a table that is to be populated into the IM column store was created using a database that did not have a compatibility setting of at least 12.2 or did not have max_string_size set to extended (this is the case prior to Oracle Database 12c Release 2 (12.2.0.1), for instance) then you must first run script rdbms/admin/utlimcjson.sql. It prepares all existing tables that have JSON columns to take advantage of the In-Memory JSON processing that was added in Release 12.2.0.1. See Upgrading Tables With JSON Data For Use With the In-Memory Column Store.

After you have marked a table that has JSON columns as INMEMORY, an In-Memory virtual column is added to it for each JSON column. The corresponding virtual column is used for queries of a given JSON column. The virtual column contains the same JSON data as the corresponding JSON column, but in OSON format, regardless of the data type of the JSON column (VARCHAR2, CLOB, BLOB, or JSON type). OSON is Oracle's optimized binary JSON format for fast query and update in both Oracle Database server and Oracle Database clients.

Populating JSON data into the IM column store using ALTER TABLE … INMEMORY provides support for ad hoc structural queries, that is, queries that you might not anticipate or use regularly.

If a column is of data type JSON then you can populate it into the IM column store using ALTER TABLE … INMEMORY TEXT, to provide support for full-text search. (Using ALTER TABLE … INMEMORY both with and without keyword TEXT for the same JSON column provides support for both ad hoc and full-text queries.)

Note:

If a JSON search index is defined for a JSON column (of any data type) that is populated into the IM Column Store then the search index, not the IM Column Store, is used for queries of that column.

29.3 Upgrading Tables With JSON Data For Use With the In-Memory Column Store

A table with JSON columns created using a database that did not have a compatibility setting of at least 12.2 or did not have max_string_size = extended must first be upgraded, before it can be populated into the In-Memory Column Store (IM column store). To do this, run script rdbms/admin/utlimcjson.sql.

Script rdbms/admin/utlimcjson.sql upgrades all existing tables that have JSON columns so they can be populated into the IM column store. To use it, all of the following must be true:

Database parameter compatible must be set to 12.2.0.0 or higher.
Database parameter max_string_size must be set to extended.
The JSON columns being upgraded must be known to contain well-formed JSON data. This is the case for a column of data type JSON^Foot 1 or a non-JSON type column that has an is json check constraint defined on it.

Related Topics

Overview of In-Memory JSON Data

Parent topic: In-Memory JSON Data

Footnote Legend

Footnote 1: Database initialization parameter compatible must be at least 20 to use data type JSON.