28 Indexes for JSON Data
You can index scalar values in your JSON data using function-based indexes. In addition, you can define a JSON search index, which is useful for both ad hoc structural queries and full-text queries.
- Overview of Indexing JSON Data
You can index particular scalar values within your JSON data using function-based indexes. You can index JSON data in a general way using a JSON search index, for ad hoc structural queries and full-text queries. - How To Tell Whether a Function-Based Index for JSON Data Is Picked Up
Whether or not a particular index is picked up for a given query is determined by the optimizer. To determine whether a given query picks up a given function-based index, look for the index name in the execution plan for the query. - Creating Bitmap Indexes for JSON_VALUE
You can create a bitmap index for SQL/JSON functionjson_value
. A bitmap index can be appropriate whenever your queries target only a small set of JSON values. - Creating B-Tree Indexes for JSON_VALUE
You can create a B-tree function-based index for SQL/JSON functionjson_value
. You can use the standard syntax for this, explicitly specifyingjson_value
, or you can use dot-notation syntax with an item method. Indexes created in either of these ways can be used with both dot-notation queries andjson_value
queries. - Using a JSON_VALUE Function-Based Index with JSON_TABLE Queries
An index created usingjson_value
withERROR ON ERROR
can be used for a query involvingjson_table
. In this case the index acts as a constraint on the indexed path, to ensure that only one (non-null
) scalar JSON value is projected for each item in the JSON collection. - Using a JSON_VALUE Function-Based Index with JSON_EXISTS Queries
An index created using SQL/JSON functionjson_value
withERROR ON ERROR
can be used for a query involving SQL/JSON conditionjson_exists
. - Data Type Considerations for JSON_VALUE Indexing and Querying
For a function-based index created using SQL/JSON functionjson_value
to be picked up for a given query, the data type returned byjson_value
in the query must match the type specified in the index. - Creating Multivalue Function-Based Indexes for JSON_EXISTS
For JSON data that is stored asJSON
data type you can use a multivalue function-based index for SQL/JSON conditionjson_exists
. Such an index targets scalar JSON values, either individually or within a JSON array. - Using a Multivalue Function-Based Index
Ajson_exists
query in aWHERE
clause can pick up a multivalue function-based index if (and only if) the data that it targets matches the scalar types specified in the index. - Indexing Multiple JSON Fields Using a Composite B-Tree Index
To index multiple fields of a JSON object you can create a composite B-tree index using multiple path expressions with SQL/JSON functionjson_value
or dot-notation syntax. - JSON Search Index for Ad Hoc Queries and Full-Text Search
A JSON search index is a general index. It can improve the performance of both (1) ad hoc structural queries, that is, queries that you might not anticipate or use regularly, and (2) full-text search. It is an Oracle Text index that is designed specifically for use with JSON data.
Parent topic: Performance Tuning for JSON
28.1 Overview of Indexing JSON Data
You can index particular scalar values within your JSON data using function-based indexes. You can index JSON data in a general way using a JSON search index, for ad hoc structural queries and full-text queries.
As always, function-based indexing is appropriate for queries that target particular functions, which in the context of SQL/JSON functions means particular SQL/JSON path expressions. This indexing is not very helpful for queries that are ad hoc, that is, arbitrary. Define a function-based index if you know that you will often query a particular path expression.
Regardless of the SQL data type you use to store JSON data, you can use a
B-tree or bitmap function-based index for SQL/JSON function
json_value
queries. Such an index targets a single scalar
JSON value. A bitmap index can be appropriate wherever the number of possible values for
the function is small. For example, you can use a bitmap index for
json_value
if the values targeted are expected to be few.
For JSON data that is stored as JSON
type you can use a
multivalue function-based index for SQL/JSON condition
json_exists
. Such an index targets scalar JSON values,
either individually or (especially) as elements of a JSON array.
Although a multivalue index can index a single scalar value, if you expect a path expression to target such a value then it is more performant to use a B-tree or bitmap index. Use a multivalue index especially to index a path expression that you expect to target an array of scalar values.
SQL/JSON path expressions that contain predicates can be used in queries that pick up a function-based index. But a path expression that you use to define a function-based index cannot contain predicates.
If you query in an ad hoc manner then define a JSON search index. This is a general index, not targeted to any specific path expression. It is appropriate for structural queries, such as looking for a JSON field with a particular value, and for full-text queries using Oracle SQL condition json_textcontains
, such as looking for a particular word among various string values.
You can of course define both function-based indexes and a JSON search index for the same JSON column.
A JSON search index is an Oracle Text (full-text) index designed specifically for use with JSON data.
Note:
Oracle recommends that you use AL32UTF8 as the database character set. Automatic character-set conversion can take place when creating or applying an index. Such conversion can be lossy, which can mean that some data that you might expect to be returned by a query is not returned. See Character Sets and Character Encoding for JSON Data.
28.2 How To Tell Whether a Function-Based Index for JSON Data Is Picked Up
Whether or not a particular index is picked up for a given query is determined by the optimizer. To determine whether a given query picks up a given function-based index, look for the index name in the execution plan for the query.
For example:
-
Given the index defined in Example 28-3, an execution plan for each of the queries in these examples references an index scan with index
po_num_id1
: Example 28-5, Example 28-6, Example 28-7, Example 28-8, and Example 28-10 -
Given the index defined in Example 28-14, an execution plan for the queries in examples Example 28-17 and Example 28-18 references an index scan with index
mvi_1
.
When a multivalue index is picked up, the execution plan also shows
(MULTI VALUE)
for the index range scan, and the filter used in
the plan is JSON_QUERY
, not JSON_EXISTS2
. If the
execution plan does not use a multivalue index for a given
json_exists
query, then the filter is
JSON_EXISTS2
.
Parent topic: Indexes for JSON Data
28.3 Creating Bitmap Indexes for JSON_VALUE
You can create a bitmap index for SQL/JSON function json_value
. A bitmap index can be appropriate whenever your queries target only a small set of JSON values.
Example 28-1 Creating a Bitmap Index for JSON_VALUE
This is an appropriate index to create provided there are only a few possible values for field CostCenter
in your data.
CREATE BITMAP INDEX cost_ctr_idx ON j_purchaseorder
(json_value(po_document, '$.CostCenter'));
Parent topic: Indexes for JSON Data
28.4 Creating B-Tree Indexes for JSON_VALUE
You can create a B-tree function-based index for SQL/JSON function json_value
. You can use the standard syntax for this, explicitly specifying json_value
, or you can use dot-notation syntax with an item method. Indexes created in either of these ways can be used with both dot-notation queries and json_value
queries.
Example 28-3 creates a function-based index for
json_value
on field PONumber
of the object that is in
column po_document
of table j_purchaseorder
. The object is
passed as the path-expression context item.
The use of ERROR ON
ERROR
here means that if the data contains a record that has no
PONumber
field, has more than one
PONumber
field, or has a PONumber
field with a
non-number value then index creation fails. And if the index exists then trying to
insert such a record fails.
An alternative is to create an index using the dot-notation syntax described in Simple Dot-Notation Access to JSON Data, applying an item method to the targeted data. Example 28-2 illustrates this.
The indexes created in
both Example 28-3 and Example 28-2 can be picked up for either a query that uses dot-notation
syntax or a query that uses json_value
.
If you want to allow indexing of data that might be missing the field targeted
by a json_value
expression, then use a NULL ON EMPTY
clause, together with an ERROR ON ERROR
clause. Example 28-4 illustrates this.
Oracle
recommends that you create a function-based index for json_value
using one of the following forms. In each case the index can be used in both dot-notation
and json_value
queries that lead to a scalar result of the specified JSON
data type.
-
Dot-notation syntax, with an item method applied to the value to be indexed. The indexed values are only scalars of the data type specified by the item method.
-
A
json_value
expression that specifies aRETURNING
data type. It can optionally useERROR ON ERROR
andNULL ON EMPTY
. The indexed values are only scalars of the data type specified by theRETURNING
clause.
Indexes created in either of these ways can thus be used with both dot-notation
queries and json_value
queries.
Example 28-2 Creating a Function-Based Index for a JSON Field: Dot Notation
Item method number()
causes the
index to be of numeric type. Always apply an item method to the targeted data when you use
dot notation to create a function-based
index.
CREATE UNIQUE INDEX po_num_idx1 ON j_purchaseorder po
(po.po_document.PONumber.number());
Example 28-3 Creating a Function-Based Index for a JSON Field: JSON_VALUE
Item method number()
causes the index to be of numeric type.
Alternatively you can instead use clause RETURNING
NUMBER
.
CREATE UNIQUE INDEX po_num_idx2 ON j_purchaseorder
(json_value(po_document, '$.PONumber.number()'
ERROR ON ERROR));
Example 28-4 Specifying NULL ON EMPTY for a JSON_VALUE Function-Based Index
Clause RETURNING VARCHAR2(200)
causes the index to be a SQL string of maximum length 200 characters. You could use item
method string()
in the path expression instead, but in that case the
default return type of VARCHAR2(4000)
is used.
Because
of clause NULL ON EMPTY
, index po_ref_idx1
can index JSON
documents that have no Reference
field.
CREATE UNIQUE INDEX po_ref_idx1 ON j_purchaseorder
(json_value(po_document, '$.Reference'
RETURNING VARCHAR2(200) ERROR ON ERROR
NULL ON EMPTY));
Parent topic: Indexes for JSON Data
28.5 Using a JSON_VALUE Function-Based Index with JSON_TABLE Queries
An index created using json_value
with ERROR ON ERROR
can be used for a query involving json_table
. In this case the index acts as a constraint on the indexed path, to ensure that only one (non-null
) scalar JSON value is projected for each item in the JSON collection.
For the index to be used in this way each of these conditions must hold:
-
The query
WHERE
clause refers to a column projected byjson_table
. -
The data type of that column matches the data type used in the index definition.
-
The effective SQL/JSON path that targets that column matches the indexed path expression.
The query in Example 28-5 thus makes use of the index created in Example 28-3.
Note:
A function-based index created using a json_value
expression
or dot notation can be picked up for a corresponding occurrence in a query
WHERE
clause only if the occurrence is used in a SQL comparison
condition, such as >=
. In particular, it is not picked up for an
occurrence used in condition IS NULL
or IS NOT
NULL
.
See Oracle Database SQL Language Reference for information about SQL comparison conditions.
Example 28-5 Use of a JSON_VALUE Function-Based Index with a JSON_TABLE Query
The index can be picked up because the column SQL type,
NUMBER(5)
, matches the type used in the
index.
SELECT jt.*
FROM j_purchaseorder po,
json_table(po.po_document, '$'
COLUMNS po_number NUMBER(5) PATH '$.PONumber',
reference VARCHAR2(30 CHAR) PATH '$.Reference',
requestor VARCHAR2(32 CHAR) PATH '$.Requestor',
userid VARCHAR2(10 CHAR) PATH '$.User',
costcenter VARCHAR2(16 CHAR) PATH '$.CostCenter') jt
WHERE po_number = 1600;
Parent topic: Indexes for JSON Data
28.6 Using a JSON_VALUE Function-Based Index with JSON_EXISTS Queries
An index created using SQL/JSON function
json_value
with ERROR ON ERROR
can be used for a query
involving SQL/JSON condition json_exists
.
In order for a json_value
function-based index to be picked up for
one of the comparisons of the query, the type of that comparison must be the same as
the returning SQL data type for the index. The SQL data types used are those
mentioned for item methods double()
, float()
,
number()
,
string()
,timestamp()
, date()
,
dateWithTime()
, dsInterval()
, and
ymInterval()
— see SQL/JSON Path Expression Item Methods.
For example, if the index returns a number
then the comparison type must also be number. If the query filter expression
contains more than one comparison that matches a json_value
index,
the optimizer chooses one of the indexes.
The type of a comparison is determined as follows:
-
If the SQL data types of the two comparison terms (sides of the comparison) are different then the type of the comparison is unknown, and the index is not picked up. Otherwise, the types are the same, and this type is the type of the comparison.
-
If a comparison term is of SQL data type string (a text literal) then the type of the comparison is the type of the other comparison term.
-
If a comparison term is a path expression with a function step whose item method imposes a SQL match type then that is also the type of that comparison term. The item methods that impose a SQL match type are
double()
,float()
,number()
,string()
,timestamp()
,date()
,dateWithTime()
,dsInterval()
, andymInterval()
. -
If a comparison term is a path expression with no such function step then its type is SQL string (text literal).
Example 28-3 creates a function-based index for
json_value
on field PONumber
. The index
indexes NUMBER
values.
Each of the queries
Example 28-6, Example 28-7, and Example 28-8 can make use of this index when evaluating its
json_exists
condition. Each of these queries uses a comparison
that involves a simple path expression that is relative to the absolute path
expression $.PONumber
. The relative simple path expression in each
case targets the current filter item, @
, but in the case of Example 28-8 it transforms (casts) the matching data to SQL
data type NUMBER
.
Example 28-6 JSON_EXISTS Query Targeting Field Compared to Literal Number
This query makes use of the index because:
-
One comparison term is a path expression with no function step, so its type is SQL string (text literal).
-
Because one comparison term is of type string, the comparison has the type of the other term, which is number (the other term is a numeral).
-
The type of the (lone) comparison is the same as the type returned by the index: number.
SELECT count(*) FROM j_purchaseorder
WHERE json_exists(po_document, '$.PONumber?(@ > 1500)');
Example 28-7 JSON_EXISTS Query Targeting Field Compared to Variable Value
This query can make use of the index because:
-
One comparison term is a path expression with no function step, so its type is SQL string (text literal).
-
Because one comparison term is of type string, the comparison has the type of the other term, which is number (the other term is a variable that is bound to a number).
-
The type of the (lone) comparison is the same as the type returned by the index: number.
SELECT count(*) FROM j_purchaseorder
WHERE json_exists(po_document, '$.PONumber?(@ > $d)'
PASSING 1500 AS "d");
Example 28-8 JSON_EXISTS Query Targeting Field Cast to Number Compared to Variable Value
This query can make use of the index because:
-
One comparison term is a path expression with a function step whose item method (
number()
) transforms the matching data to a number, so the type of that comparison term is SQL number. -
The other comparison term is a numeral, which has SQL type number. The types of the comparison terms match, so the comparison has this same type, number.
-
The type of the (lone) comparison is the same as the type returned by the index: number.
SELECT count(*) FROM j_purchaseorder
WHERE json_exists(po_document, '$.PONumber?(@.number() > $d)'
PASSING 1500 AS "d");
Example 28-9 JSON_EXISTS Query Targeting a Conjunction of Field Comparisons
Just as for Example 28-6, this query can make use of the index on field
PONumber
. If a json_value
index is also
defined for field Reference
then the optimizer chooses which index
to use for this
query.
SELECT count(*) FROM j_purchaseorder
WHERE json_exists(po_document,
'$?(@.PONumber > 1500
&& @.Reference == "ABULL-20140421")');
28.7 Data Type Considerations for JSON_VALUE Indexing and Querying
For a function-based index created using SQL/JSON function
json_value
to be picked up for a given query, the data type returned by
json_value
in the query must match the type specified in the
index.
When RETURNING DATE
is used with json_value
, the same
time-handling behavior (truncation or preservation) must be used in both the index and the
query, for the index to be picked up. That is, either RETURNING DATE PRESERVE
TIME
must be used in both, or RETURNING DATE TRUNCATE TIME
(or
RETURNING DATE
, since truncation is the default behavior) must be used in
both.
By default, SQL/JSON function json_value
returns a VARCHAR2
value. When you create a function-based index using json_value
, unless you
use a RETURNING
clause or an item method to specify a different return data
type, the index is not picked up for a query that expects a non-VARCHAR2
value.
For example, in the query of Example 28-10, json_value
uses RETURNING NUMBER
. The index created in Example 28-3 can be picked up for this query, because the indexed json_value
expression specifies a return type of NUMBER
. Without keywords RETURNING NUMBER
in the index the return type it specifies would be VARCHAR2(4000)
(the default) — the index would not be picked up for such a query.
Similarly, the index created in Example 28-2 can be picked up for the query because it uses item method number()
, which also imposes a return type of NUMBER
.
Now consider the queries in Example 28-11 and Example 28-12, which use json_value
without a
RETURNING
clause, so that the value returned is of type
VARCHAR2
.
In Example 28-11, SQL function to_number
explicitly converts
the VARCHAR2
value returned by json_value
to a number.
Similarly, in Example 28-12, comparison condition > (greater-than) implicitly converts
the value to a number.
Neither of the indexes of Example 28-3 and Example 28-2 is picked up for either of these queries. The queries might return the right results in each case, because of type-casting, but the indexes cannot be used to evaluate the queries.
Consider also what happens if some of the data cannot be converted to a
particular data type. For example, given the queries in Example 28-10, Example 28-11, and Example 28-12, what happens to a PONumber
value such as
"alpha"
?
For Example 28-11 and Example 28-12, the query stops in error because of the attempt to cast the
value to a number. For Example 28-10, however, because the default error handling behavior is
NULL ON ERROR
, the non-number value "alpha"
is simply
filtered out. The value is indexed, but it is ignored for the query.
Similarly, if the query used, say, DEFAULT '1000' ON ERROR
, that
is, if it specified a numeric default value, then no error would be raised for the value
"alpha"
: the default value of 1000
would be used.
Note:
For a function-based index based on SQL/JSON function
json_value
to be picked up for a given query, the same return
data type and handling (error, empty, and mismatch) must be used in both the index
and the query.
This means that if you change the return type or handling in a query, so that it no longer matches what is specified in the index, then you must rebuild any persistent objects that depend on that query pattern. (The same applies to materialized views, partitions, check constraints and PL/SQL subprograms that depend on that pattern.)
Example 28-10 JSON_VALUE Query with Explicit RETURNING NUMBER
SELECT count(*) FROM j_purchaseorder po
WHERE json_value(po_document, '$.PONumber' RETURNING NUMBER) > 1500;
Example 28-11 JSON_VALUE Query with Explicit Numerical Conversion
SELECT count(*) FROM j_purchaseorder po
WHERE to_number(json_value(po_document, '$.PONumber')) > 1500;
Example 28-12 JSON_VALUE Query with Implicit Numerical Conversion
SELECT count(*) FROM j_purchaseorder po
WHERE json_value(po_document, '$.PONumber') > 1500;
Parent topic: Indexes for JSON Data
28.8 Creating Multivalue Function-Based Indexes for JSON_EXISTS
For JSON data that is stored as JSON
data type you can use
a multivalue function-based index for SQL/JSON condition json_exists
. Such
an index targets scalar JSON values, either individually or within a JSON array.
The main use of a multivalue index is to index scalar values within arrays. This includes scalar array elements, but also scalar field values of object array elements.
A multivalue index can also index a single scalar value, but for queries that target a single value it is generally more performant to use a B-tree or bitmap index.
In a query, you use json_exists
in the
WHERE
clause of a SELECT
statement. Condition
json_exists
returns true if the data it targets matches the
SQL/JSON path expression (or equivalent simple dot-notation syntax) in the query.
Otherwise it returns false. It is common for the path expression to include a
predicate — matching then requires that the targeted data satisfy the predicate.
You create a multivalue index using CREATE INDEX
with
keyword MULTIVALUE
, and using either the syntax of SQL/JSON
function json_table
or simple dot-notation that you use in queries
to specify the path to the indexed data. (However, you cannot use a SQL
NESTED
clause in place of
json_table
— a compile-time error is raised if you do
that.)
You can create a composite function-based index, to index more
than one virtual column, that is, more than one JSON field. A composite index acts
like a set of function-based indexes. When used to query, you use function
json_table
to project specified JSON field values as virtual
columns of SQL scalar values. Similarly, when used to define an index, the field
values that json_table
specifies are indexed as a composite
function-based index.
When using json_table
syntax to create a multivalue
index you must use these error-handling clauses: ERROR ON ERROR NULL
ON EMPTY NULL ON MISMATCH
; otherwise, a query compile-time error is
raised. When using simple dot-notation syntax without json_table
,
the behavior of these clauses is provided implicitly.
When using json_table
syntax you can use a FOR
ORDINALITY
clause, to enable use of the index for queries that target
specific array positions. (See COLUMNS Clause of SQL/JSON Function JSON_TABLE.)
For a multivalue index to be picked up by a query, the index must specify the SQL type of the data to be indexed, and the SQL type for the query result must match the type specified by the index.
If you create a non-composite multivalue index, that is, without using
json_table
syntax, then the index specification must
include a data-type conversion item method (other than
binary()
and dateWithTime()
), to indicate the
SQL data type. See SQL/JSON Path Expression Item Methods for information about the data-type conversion item methods.
If the index uses an item method with "only" in its name then only queries that use that same item method can pick up the index. Otherwise (with a non-"only" method or with no method), any query that targets a scalar value (possibly as an array element) that can be converted to the type indicated by the item method can pick up the index.
For example, a multivalue index that uses item method
numberOnly()
can only be picked up for a query that also uses
numberOnly()
. But an index that uses number()
,
or that uses no item method, can be picked up for a query that matches any scalar
(such as the string "3.14"
) that can be converted to a number.
If you create a composite multivalue index then the
json_table
virtual column type specifies the SQL type to use.
This means that queries of data that can be converted to the specified SQL
type can pick up the index.
However, just as in the non-composite index case, you can use a data-type conversion item method with "only" in its name, to override (further constrain) the specified column type. You use the item method in the column path expression.
For example, if the column type is specified as NUMBER
then queries
with matching data (such as the string "3.14"
) that can be
converted to a number can pick up the index. But if the column path expression uses
item method numberOnly()
then only queries that also use
numberOnly()
can pick up the index.
You can create more than one multivalue index for a given target. For
example, you can create one index for a field month
that uses item
method number()
and another for the same field that uses item
method string()
.
The following are not allowed, as ways to create a multivalue index:
-
You cannot specify sibling nested arrays in the
json_table
expression used to create a composite multivalue index. An error is raised if you try. You can index multiple arrays, but they cannot be siblings, that is, they cannot have the same parent field. -
Using a SQL
NESTED
clause (see SQL NESTED Clause Instead of JSON_TABLE).
A type-error mismatch between the type of a scalar JSON value and the
corresponding scalar SQL data type of a json_table
virtual column
can be because of type incompatibility, as put forth in Table 16-2, or because the SQL data type is too constraining — too small to
store the data.
Error-handling ERROR ON ERROR NULL ON EMPTY NULL ON
MISMATCH
returns SQL NULL
for the first kind of
mismatch, but it raises an error for the second kind. For example, type
incompatibility is tolerated when creating an index with SQL type
NUMBER
for JSON string data, but an error is raised if you try
to create an index using SQL type VARCHAR(2)
for data that has a
JSON string value of "hello"
, because the data has more than two
characters.
Example 28-13 Table PARTS_TAB, for Multivalue Index Examples
Table parts_tab
, with JSON
data type
column jparts
, is used in multivalue index examples here. The JSON
data includes field subparts
whose value is an array with scalar
elements.
CREATE TABLE parts_tab (id NUMBER, jparts JSON);
INSERT INTO parts_tab VALUES
(1, '{"parts" : [{"partno" : 3, "subparts" : [510, 580, 520]},
{"partno" : 4, "subparts" : 730}]}');
INSERT INTO parts_tab VALUES
(2, '{"parts" : [{"partno" : 7, "subparts" : [410, 420, 410]},
{"partno" : 4, "subparts" : [710, 730, 730]}]}');
Example 28-14 Creating a Multivalue Index for JSON_EXISTS
The multivalue index created here indexes the value of field
subparts
. The table alias (t
in this case) is
required when using simple dot notation syntax.
If the subparts
value targeted by a query is an
array then the index can be picked up for any array elements that are
numbers. If the value is a scalar then the index can be picked up if the
scalar is a number.
Given the data in table parts_tab
, a
subparts
field in each of the objects of array
parts
in the first row (which has id
1
) is indexed: the field in the first object because its array
value has elements that are numbers (510
, 580
, and
520
) the field in the second object because its value is a
number (730
).
If item method number()
were used in the index
definition, instead of numberOnly()
, then non-number scalar values
(such as the string "730"
) that can be converted to numbers would
also be indexed.
CREATE MULTIVALUE INDEX mvi ON parts_tab t
(t.jparts.parts.subparts.numberOnly());
Example 28-15 Creating a Composite Multivalue Index for JSON_EXISTS
This example shows two equivalent ways to create a composite multivalue
index that targets both field partno
and field
subparts
. The composite index acts like a set of two
function-based indexes that target those two fields.
The first query uses json_table
syntax with a SQL/JSON
path expression for the row pattern, $.parts[*]
. The second uses
simple dot notation for the row pattern. Otherwise, the code is the same for both.
As must always be the case for multivalue index creation using
json_table
, the error handling is specified as ERROR ON
ERROR NULL ON EMPTY NULL ON MISMATCH
.
Column PARTNUM
is given SQL data type
NUMBER(10)
here, which means that, for the index to be used for
a query that targets field partno
, the value of that field must be
convertible to that data type.
-
If type conversion is impossible because the types are generally incompatible, as put forth in Table 16-2, then the
NULL ON MISMATCH
error handler causes SQLNULL
to be returned. An example of this would be apartno
string value of"hello"
for the SQLpartNum
column of typeNUMBER(10)
. -
If, on the other hand, the SQL data type storage is too constraining then an error is raised — the index is not created. An example of this would be a
partno
string with more than 10 characters, such as"1234567890123"
.
CREATE MULTIVALUE INDEX cmvi_1 ON parts_tab
(json_table(jparts, '$.parts[*]'
ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH
COLUMNS (partNum NUMBER(10) PATH '$.partno',
NESTED
PATH '$.subparts[*]'
COLUMNS (subpartNum NUMBER(20) PATH '$'))));
CREATE MULTIVALUE INDEX cmvi_1 ON parts_tab t
(t.jparts.parts[*]
ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH
COLUMNS (partNum NUMBER(10) PATH '$.partno',
NESTED subparts[*]
COLUMNS (subpartNum NUMBER(20) PATH '$'))));
Example 28-16 Creating a Composite Multivalue Index That Can Target Array Positions
The code in this example is like that in Example 28-15, except that it also specifies virtual column SEQ
for ordinality. That means that values in the column just before it,
SUBPARTNUM
, can be accessed by way of their (one-based)
positions in array subparts
. (The SQL data type of a FOR
ORDINALITY
column is always NUMBER
.)
As always, at most one entry in a COLUMNS
clause can be
a column name followed by FOR ORDINALITY
, which specifies a column
of generated row numbers (SQL data type NUMBER
), starting with one.
Otherwise, an error is raised when creating the index.
In addition to that general rule for json_table
syntax:
-
When
json_table
is used to create a multivalue index, theFOR ORDINALITY
column must be the last column of json_table. (This is not required whenjson_table
is used in queries; it applies only to index creation.) -
In order for a multivalue index created using
json_table
to be picked up for a given query, the query must include a predicate on the JSON field corresponding to the first virtual column of thejson_table
expression.
In order for a query that targets array elements by their position
to pick up a multivalue index for array positions, the index column for those array
elements must be the one immediately before the FOR
ORDINALITY
column
(The code here uses simple dot notation for the row pattern; if it instead used a SQL/JSON path expression for the row pattern, the rest of the code would be the same.)
CREATE MULTIVALUE INDEX cmvi_2 ON parts_tab t
(t.jparts.parts[*]
ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH
COLUMNS (partNum NUMBER(10) PATH '$.partno',
NESTED subparts[*]
COLUMNS (subpartNum NUMBER(20) PATH '$',
seq FOR ORDINALITY))));
28.9 Using a Multivalue Function-Based Index
A json_exists
query in a WHERE
clause can
pick up a multivalue function-based index if (and only if) the data that it targets matches
the scalar types specified in the index.
A multivalue function-based index for SQL/JSON condition
json_exists
targets scalar JSON values, either individually or
as elements of a JSON array. You can define a multivalue index only for JSON data
that is stored as JSON
data type.
Condition json_exists
returns true if the data it
targets matches the SQL/JSON path expression (or equivalent simple dot-notation
syntax) in the query. Otherwise it returns false. It is common for the path
expression to include a predicate — matching then requires that the targeted data
satisfy the predicate.
A multivalue index that is defined using a data-type conversion item
method (such as numberOnly()
) that has "only" in its name can be
picked up only by json_exist
queries that also use that same item
method. That is, the query must use the same item method explicitly. See
Creating Multivalue Function-Based Indexes for JSON_EXISTS for more information.
A multivalue index defined using no item method, or using a data-type
conversion item method (such as number()
) that does not have
"only" in its name, can be picked up by a query that targets a scalar value
(possibly as an array element) that can be converted to the type indicated by
the item method. See SQL/JSON Path Expression Item Methods for information about the data-type conversion item methods.
The examples here use SQL/JSON condition json_exists
in
a WHERE
clause to check for a subparts
field value
that matches 730
. They are discussed in terms of whether they can
pick up multivalue indexes mvi
, cmvi_1
, and
cmvi_2
, which are defined in Creating Multivalue Function-Based Indexes for JSON_EXISTS. Conversion of JSON scalar values to SQL scalar values is specified in Table 16-2.
Example 28-17 JSON_EXISTS Query With Item Method numberOnly()
This example uses item method numberOnly()
in a
WHERE
clause. The query can pick up index mvi
when the path expression targets either a numeric
subparts
value of 730
(e.g. subparts :
730
) or an array subparts
value with one or more
numeric elements of 730
(e.g. subparts:[630,
730, 690, 730]
). It cannot pick up index mvi
for targeted string values of "730"
(e.g.
subparts:"730"
or subparts:["630", "730", 690,
"730"]
).
If index mvi
had instead been defined used item method
number()
, then this query could pick up the index for a numeric
subparts
value of 730
, a string
subparts
value of "730"
, or an array
subparts
value with numeric elements of 730
or
string elements of "730"
.
SELECT count(*) FROM parts_tab
WHERE json_exists(jparts, '$.parts.subparts?(@.numberOnly() == 730)');
Example 28-18 JSON_EXISTS Query Without Item Method numberOnly()
These two queries do not use item method
numberOnly()
. The first uses method number()
,
which converts the targeted data to a number, if possible. The second does no type
conversion of the targeted data.
Index mvi
cannot be picked up by either of these queries, even if the targeted data is
the number 730
. For the index to be picked up, a query must
use numberOnly()
, because the index is defined using
numberOnly()
.
SELECT count(*) FROM parts_tab t
WHERE json_exists(jparts, '$.parts.subparts?(@.number() == 730)');
SELECT count(*) FROM parts_tab t
WHERE json_exists(jparts, '$.parts.subparts?(@ == 730)');
Example 28-19 JSON_EXISTS Query Checking Multiple Fields
The predicate in this query specifies the existence of a partno
field that matches the SQL NUMBER
value 4
(possibly by conversion from a JSON string), and a field subparts
that matches the number 730
.
The query can pick up either of the indexes cmvi_1
or
cmvi_2
. Both rows of the data match these indexes, because each
row has a parts.partno
value that matches the number
4
and a parts.subparts
value that matches the
number 730
. For the subparts
match, the first row
has a subparts
value of 730
, and the second row
has a subparts
value that is an array with a value of
730
.
SELECT a FROM parts_tab
WHERE json_exists(jparts,'$.parts[*]?(@.partno == 4 &&
@.subparts == 730)');
Example 28-20 JSON_EXISTS Query Checking Array Element Position
This example is similar to Example 28-19, but in addition to requiring that field partno
match the number 4
, the predicate here requires that the value of
field subparts
match an array of at least two elements, and
that the second element of the array match the number 730
.
This query can pick up index cmvi_2
, including for
positional predicate [1]
. Index cmvi_2
specifies
virtual column subpartNum
, which corresponds to JSON field
subparts
, as the penultimate column, just before the final,
FOR ORDINALITY
, column.
This query could also pick up index cmvi_1
, but that index has no
FOR ORDINALITY
column, so making use of it would require an
extra step, to evaluate the array-position condition, [1]
. Using
index cmvi_2
requires no such extra step, so it is more performant
for such queries.
SELECT a FROM parts_tab
WHERE json_exists(jparts,'$.parts[*]?(@.partno == 4 &&
@.subparts[1] == 730)');
28.10 Indexing Multiple JSON Fields Using a Composite B-Tree Index
To index multiple fields of a JSON object you can create a composite B-tree
index using multiple path expressions with SQL/JSON function json_value
or
dot-notation syntax.
Example 28-21 illustrates this. A SQL query that references the corresponding JSON data (object fields) picks up the composite index. Example 28-22 illustrates this.
Alternatively, you can create virtual columns for the JSON object fields you want to index, and then create a composite B-tree index on those virtual columns. In that case a SQL query that references either the virtual columns or the corresponding JSON data (object fields) picks up the composite index. The query performance is the same in both cases.
The data does not depend logically on any indexes that are implemented to improve query performance. If you want this independence from implementation to be reflected in your code, then query the data directly (not virtual columns). Doing that ensures that the query behaves the same with or without the index — the index serves only to improve performance.
Example 28-21 Creating a Composite B-tree Index For JSON Object Fields
CREATE INDEX user_cost_ctr_idx ON
j_purchaseorder(json_value(po_document, '$.User'
RETURNING VARCHAR2(20),
json_value(po_document, '$.CostCenter'
RETURNING VARCHAR2(6)));
Example 28-22 Querying JSON Data Indexed With a Composite B-tree Index
SELECT po_document FROM j_purchaseorder
WHERE json_value(po_document, '$.User') = 'ABULL'
AND json_value(po_document, '$.CostCenter') = 'A50';
Parent topic: Indexes for JSON Data
28.11 JSON Search Index for Ad Hoc Queries and Full-Text Search
A JSON search index is a general index. It can improve the performance of both (1) ad hoc structural queries, that is, queries that you might not anticipate or use regularly, and (2) full-text search. It is an Oracle Text index that is designed specifically for use with JSON data.
Full-text querying of JSON data is covered in Full-Text Search Queries. The present topic covers the creation and maintenance of JSON search indexes, which are required for full-text search and are also useful for ad hoc queries. Examples of ad hoc queries that are supported by a JSON search index are presented here.
Create a JSON search index for queries that involve full-text search. Create a JSON search index also for queries that aren't particularly expected or used regularly — that is, ad hoc queries. But to index queries for which you know the query pattern ahead of time, it's generally advisable to use a function-based index that targets such a specific pattern. If both function-based and JSON search indexes are applicable to given a query, it is the function-based index that's used.
For JSON data stored as JSON
type, an alternative to creating
and maintaining a JSON search index is to populate the JSON column into the In-Memory Column
Store (IM column store) — see In-Memory JSON Data.
Note:
If you created a JSON search
index using Oracle Database 12c Release 1 (12.1.0.2) then Oracle recommends that you
drop that index and create a new search index for use with later releases,
using CREATE SEARCH INDEX
as shown here.
Note:
You must rebuild any JSON search indexes and Oracle Text indexes created prior to Oracle Database 18c if they index JSON data that contains object fields with names longer than 64 bytes. Otherwise, such fields might not be searchable until they are reindexed. See Oracle Database Upgrade Guide for more information.
You create a JSON search index using CREATE SEARCH INDEX
with
the keywords FOR JSON
. Example 28-23 illustrates this.
The column on which you create a JSON search index can be of data type
JSON
, VARCHAR2
, CLOB
, or
BLOB
. It must be known to contain only well-formed JSON data, which means
that it is either of type JSON
or it has an is json
check
constraint. CREATE SEARCH INDEX
raises an error if the column is not known
to contain JSON data.
If the name of your JSON search index is present in the execution plan for your query, then you know that the index was in fact picked up for that query. You will see a line similar to that shown in Example 28-25.
You can specify a PARAMETERS
clause when creating a
search index, to override the default settings of certain configurable options. By default
(no PARAMETERS
clause), the index is synchronized on commit, and both text
and numeric ranges are indexed.
If your queries that make use of a JSON search index involve only full-text
search or string-equality search, and never involve string-range search or numeric or
temporal search (equality or range), then you can save some index maintenance time and some
disk space by specifying TEXT
for parameter
SEARCH_ON
. The default value of SEARCH_ON
is
TEXT_VALUE
, which means index numeric ranges as well as text.
Also by default, the search index created records and maintains persistent
data-guide information, which requires some maintenance overhead. You can inhibit this
support for persistent data-guide information by specifying DATAGUIDE
OFF
in the PARAMETERS
clause.
A JSON search index is maintained asynchronously. Until it is synchronized, the index is not used for data that has been modified or newly inserted. An index can improve query performance, but the act of synchronizing it with the data affects performance negatively while it occurs. In particular, it can negatively affect DML operations.
There are essentially three ways to synchronize a JSON search index. Each is typically appropriate for a different use case.
-
Synchronize on commit.
This is appropriate when commits are infrequent and it is important that the committed changes be immediately visible to other operations (such as queries). (A stale index can result in uncommitted changes not being visible.) Example 28-23 creates a search index that is synchronized on commit.
-
Synchronize periodically at some interval of time.
For online transaction-processing (OLTP) applications, which require fast and reliable transaction handling with high throughput, and which typically commit each operation, periodic index synchronization is often appropriate. In this case, the synchronization interval is generally greater than the time between commits, and it is not essential that the result of each commit be immediately visible to other operations. Example 28-24 creates a search index that is synchronized each second.
-
Synchronize on demand, for example at a time when database load is reduced.
You generally do this infrequently — the index is synchronized less often than with on-commit or interval synchronizing. This method is typically appropriate when DML performance is particularly important.
If you need to invoke procedures in package CTX_DDL
, such as
CTX_DDL.sync_index
to manually sync the index, then you need privilege
CTXAPP
. To create the index with a synchronization interval, as
opposed to having the index be synchronized on commit, then you need privilege
CREATE JOB
.
Note:
To alter a JSON search index j_s_idx
, you use ALTER
INDEX j_s_idx REBUILD ...
(not
ALTER SEARCH INDEX j_s_idx ...
).
Example 28-23 Creating a JSON Search Index That Is Synchronized On Commit
Synchronization on commit is the default behavior, but you can explicitly
specify it using PARAMETERS ('SYNC (ON
COMMIT)'
.
CREATE SEARCH INDEX po_search_idx ON j_purchaseorder (po_document)
FOR JSON;
Example 28-24 Creating a JSON Search Index That Is Synchronized Each Second
CREATE SEARCH INDEX po_search_1_sec_idx ON j_purchaseorder (po_document)
FOR JSON
PARAMETERS('SYNC (EVERY "FREQ=SECONDLY; INTERVAL=1")')
Example 28-25 Execution Plan Indication that a JSON Search Index Is Used
|* 2| DOMAIN INDEX | PO_SEARCH_IDX | | | 4 (0)
Ad Hoc Queries of JSON Data
Example 28-26 shows some non full-text queries of JSON data that also make use of the JSON search index created in Example 28-23.
Example 28-26 Some Ad Hoc JSON Queries
This query selects documents that contain a shipping instructions address that includes a country.
SELECT po_document FROM j_purchaseorder
WHERE json_exists(po_document,
'$.ShippingInstructions.Address.country');
This query selects documents that contain user AKHOO
where
there are more than 8 items ordered. It takes advantage of numeric-range
indexing.
SELECT po_document FROM j_purchaseorder
WHERE json_exists(po_document, '$?(@.User == "AKHOO"
&& @.LineItems.Quantity > 8)');
This query selects documents where the user is AKHOO
. It uses
json_value
instead of json_exists
in the
WHERE
clause.
SELECT po_document FROM j_purchaseorder
WHERE json_value(po_document, '$.User') = 'AKHOO';
Related Topics
See Also:
-
Oracle Text Reference for information about the
PARAMETERS
clause forCREATE SEARCH INDEX
-
Oracle Text Reference for information about the
PARAMETERS
clause forALTER INDEX
...REBUILD
-
CREATE INDEX in Oracle Text Reference for information about synchronizing a JSON search index
-
Oracle Text Application Developer's Guide for guidance about optimizing and tuning the performance of a JSON search index
Parent topic: Indexes for JSON Data