Using SQL commands
An index can be created using the CREATE INDEX
command.
Create a single field index:
CREATE INDEX fixedschema_conf ON baggageInfo(confNo)
The above is an example of a single-column fixed schema index. The index
is created on the confNo
field having string
data
type in the baggageInfo
table.
Create a composite index:
CREATE INDEX compindex_namephone ON baggageInfo(fullName,contactPhone)
Note:
You can have one or more fields of this index as fixed schema columns.Create a JSON index:
An index is called a JSON index if at least one of the fields is inside JSON data. As JSON is schema-less, the data type of an indexed JSON field may be different across rows. When creating an index on JSON fields, if you are unsure what data type to expect for the JSON field, you may use the anyAtomic data type. Alternatively, you can specify one of the Oracle NoSQL Database atomic data types. You do that by declaring a data type using the AS keyword next to every index path into the JSON field.
CREATE INDEX jsonindex_tagnum ON baggageInfo(bagInfo[].tagnum as INTEGER)
The above is an example of a JSON index. The index is created on the
tagnum
field present in the baginfo
JSON field
in the baggageInfo
table. Notice that you provide a data type for
the tagnum
field while creating the index.
The creation of a JSON index will fail if the associated table contains any rows with data that violate the declared data type. Similarly, after creating a JSON index, an insert/update operation will fail if the new row does not conform to the declared data type in the JSON index.
CREATE INDEX jsonindex_routing ON baggageInfo(bagInfo[].routing as ANYATOMIC)
- Numbers
- String
- boolean
However, this advantage is offset by space and CPU costs. It is because numeric values of any kind in the indexed field will be cast to Number before being stored in the index. This cast takes CPU time, and the resulting storage for the number will be larger than the original storage for the number.
Create an Index on JSON Collection Table
Indexing the fields in the JSON collection table is similar to creating JSON indexes. You must specify the name (along with the path expression) and ANYATOMIC for the type definition, or, for strongly typed indexes, you can specify the JSON type of the fields being indexed.
If you are indexing a top-level JSON field in the document, the field name is the path expression. If the element is deeply nested in a JSON object, you specify the complete path name. In either case, the data type for every index must be specified. It is recommended to use ANYATOMIC in the JSON collection tables for more flexibility.
CREATE INDEX idx_ntfy_cty on storeAcct (address.city as ANYATOMIC, notify as ANYATOMIC)
In the composite index above, the notify
field is a
top-level field of the storeAcct
table, which can be indexed by
specifying the field name as the path. The city
field is nested in
the address
field and must be indexed using the path
expression.
Note:
If you are creating an index on a nested JSON field, the field must be present in all the rows of the table. Otherwise, an error is displayed.Create a simple index:
An index is called a simple index if, for each row of data in the table, there is one entry created in the index. The index will return a single value that is of atomic data type or any special value (SQL NULL, JSON NULL, EMPTY). Essentially, the index paths of a simple index must not return an array or map or a nested data type.
CREATE INDEX simpleindex_arrival ON baggageInfo(bagInfo[].lastSeenTimeGmt as ANYATOMIC,
bagInfo[].bagArrivalDate as ANYATOMIC, bagInfo[].lastSeenTimeStation as ANYATOMIC)
The above is an example of a simple index created on a JSON document in
a JSON field. The index is created on the lastSeenTimeGmt
and
bagArrivalDate
and lastSeenTimeStation
, all
from the bagInfo
JSON document in the info JSON field in the
baggageInfo
table. If the evaluation of a simple index path
returns an empty result, the special value EMPTY is used as an index entry. In the
above example, If there is no lastSeenTimeGmt
or
bagArrivalDate
or lastSeenTimeStation
entry in
the bagInfo
JSON document, or if there is no
bagInfo
JSON array, then the special value EMPTY is indexed.
Create a multikey index:
An index is called a multikey index if, for each row of data in the table, there are multiple entries created in the index. In a multikey index, there is at least one index path that uses an array or a nested data type. In a multikey index, for each table row, index entries are created on all the elements in arrays that are being indexed.
CREATE INDEX multikeyindex1 ON stream_acct (acct_data.contentStreamed[].seriesInfo[] AS ANYATOMIC)
The index is created on the seriesInfo[]
array in the
stream_acct table. Here, all the elements in the seriesInfo[]
array
in each row of the stream_acct
table will be indexed.
Example 2: Nested multikey index: Create an index on the episode details array of the streaming account application.
CREATE INDEX multikeyindex2 ON stream_acct (
acct_data.contentStreamed[].seriesInfo[].episodes[] AS ANYATOMIC)
The above is an example of a nested multikey index where the field is
present in an array that is present inside another array. The index is created on
the episodes[]
array in the seriesInfo[]
array in
the acct_data
JSON of the stream_acct
table.
Example 3: Composite multikey index:
CREATE INDEX multikeyindex3 ON stream_acct (acct_data.country AS ANYATOMIC,
acct_data.contentStreamed[].seriesInfo[].episodes[] AS ANYATOMIC)
The above is an example of a composite multikey index having one
multikey index path and one simple index path. The index is created on the
country
field and episodes[]
array in the
acct_data
JSON column of the stream_acct
table.
See Specifications & Restrictions on Multikey index to learn about restrictions on multikey index.
Create an index with NO NULLS clause
CREATE INDEX nonull_phone ON baggageInfo (contactPhone) WITH NO NULLS
- The above query creates an index on the phone number of the passengers. If some passengers do not have a phone number then those fields will not be part of the index.
- The indexes that are created with the WITH NO NULLS clause may be useful when the data contain a lot of NULL and/or EMPTY values on the indexed fields. It will reduce the time and space overhead during indexing.
- However, the use of such indexes by queries is restricted. If an index is created with the WITH NO NULLS clause, IS NULL, and NOT EXISTS predicates cannot be used as index predicates for that index.
- In fact, such an index can be used by a query only if the query has an index predicate for each of the indexed fields.
Create an index with unique keys per row
CREATE INDEX idx_showid ON
stream_acct(acct_data.contentStreamed[].showId AS INTEGER)
WITH UNIQUE KEYS PER ROW
In the above query, an index is created on showId
and
there cannot be duplicate showId
for a single
contentStreamed
array. This informs the query processor that
for any streaming user, the contentStreamed
array cannot contain
two or more shows with the same show id. The restriction is necessary because if
duplicate show ids existed, they wouldn’t be included in the index. If you insert a
row with the same showId
two or more items in a single
contentStreamed
array, an error is thrown and the insert
operation is not successful.
Optimization in the query run time :
When you create an index with unique keys per row, the index would
contain fewer entries than the number of elements in the
contentStreamed
array. You could write an efficient query to
use this index. The use of such an index by the query would yield fewer results from
the FROM clause than if the index was not used.
Examples of creating indexes on functions:
BaggageInfo
table by their latest modification
time:CREATE INDEX idx_modtime ON BaggageInfo(modification_time())
modification_time
as the filter
condition.SELECT * FROM BaggageInfo $u WHERE
modification_time($u) > "2019-08-01T10:45:00"
This query returns all the rows whose most recent modification time is
after 2019-08-01T10:45:00. It uses the idx_modtime
index defined
above. You can verify this by viewing the query plan using the show
query
command.
BaggageInfo
table on the length of the routing
field.CREATE INDEX idx_routlen ON BaggageInfo (length(bagInfo[].routing as string))
length
as
the filter
condition.SELECT * from BaggageInfo $bag where length($bag.bagInfo[].routing) > 10
This query returns all the rows whose length of the routing field is
greater than 10. It uses the idx_routlen
index defined above. You
can verify this by viewing the query plan using the show query
command.
Example 3: Using a multi-key index path
stream_acct
table by the id of the shows they watch and the
year and month of the dates when the show was
watched.CREATE INDEX idx_showid_year_month ON
stream_acct(acct_data.contentStreamed[].showId AS INTEGER,
substring(acct_data.contentStreamed[].seriesInfo[].episodes[].date AS STRING,0, 4),
substring(acct_data.contentStreamed[].seriesInfo[].episodes[].date AS STRING,5, 2))
SELECT count(*) FROM stream_acct s1 WHERE EXISTS
s1.acct_data.contentStreamed[$element.showId = 16].seriesInfo.
episodes[substring($element.date, 0, 4) = "2022"]
idx_showid_year_month
.
You can verify this by viewing the query plan using the show query
command.show query SELECT count(*) FROM stream_acct s1 WHERE EXISTS
> s1.acct_data.contentStreamed[$element.showId = 16].seriesInfo.episodes[substring($element.date, 0, 4) = "2022"]
{
"iterator kind" : "GROUP",
"input variable" : "$gb-1",
"input iterator" :
{
"iterator kind" : "RECEIVE",
"distribution kind" : "ALL_SHARDS",
"distinct by fields at positions" : [ 1 ],
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "stream_acct",
"row variable" : "$$s1",
"index used" : "idx_showid_year_month",
"covering index" : true,
"index row variable" : "$$s1_idx",
"index scans" : [
{
"equality conditions" : {"acct_data.contentStreamed[].showId":16,"substring#acct_data.contentStreamed[].seriesInfo[].episodes[].date@,0,4":"2022"},
"range conditions" : {}
}
]
},
"FROM variable" : "$$s1_idx",
"SELECT expressions" : [
{
"field name" : "Column_1",
"field expression" :
{
"iterator kind" : "CONST",
"value" : 1
}
},
{
"field name" : "acct_id_gen",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "#acct_id",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$s1_idx"
}
}
}
]
}
},
"grouping expressions" : [
],
"aggregate functions" : [
{
"iterator kind" : "FUNC_COUNT_STAR"
}
]
}