CREATE INDEX

To add an index definition to the store, use a CREATE INDEX statement. It can be used to create simple indexes and multi-key indexes. It can also be used to create JSON indexes.

Indexable Field Types

Fields can be indexed only if they are declared to be one of the following types. For all complex types (arrays, maps, and records), the field can be indexed if the ultimate target of the index is a scalar datatype. So a complex type that contains a nested complex type (such as an array of records, for example) can be indexed if the index's target is a scalar datatype contained by the embedded record.

  • Integer

  • Long

  • Number

  • Float

  • Double

  • Json

    Note that there are some differences and restrictions on indexing Json data versus other data types. See JSON Indexes for more information.

  • String

  • Enum

  • Array

    In the case of arrays, the field can be indexed only if the array contains values that are of one of the other indexable scalar types. For example, you can create an index on an array of Integers. You can also create an index on a specific record in an array of records. Only one array can participate in an index, otherwise the size of the index can grow exponentially because there is an index entry for each array entry.

  • Maps

    As is the case with Arrays, you can index a map if the map contains scalar types, or if the map contains a record that contains scalar types.

  • Records

    Like Arrays and Maps, you can index fields in an embedded record if the field contains scalar data.

Simple Indexes

An index is simple if it does not index any maps or arrays. To create a simple index:

CREATE INDEX [IF NOT EXISTS] index-name ON table-name (path_list)         

where:

  • IF NOT EXISTS is optional, and it causes the CREATE INDEX statement to be ignored if an index by that name currently exists. If this phrase is not specified, and an index using the specified name does currently exist, then the CREATE INDEX statement will fail with an error.

  • index-name is the name of the index you want to create.

  • table-name is the name of the table that you want to index.

  • path_list is a comma-separated list of one or more name_paths. A name_path refers to an element of a table. Normally these are schema fields that is, field names that appear in the CREATE TABLE expression used to create the associated table.

    However, if the table contains a record, then the name_path may be record keys that use dot-notation to identify a record field. For example:

    CREATE TABLE example (
            id INTEGER,
            myRecord RECORD(field_one STRING, field_two STRING),
            PRIMARY KEY (id)
        ) 

    An index can then be created on field_one by using the name_path of myRecord.field_one. See Indexing Embedded Records for a more detailed explanation of indexing records.

For example, if table Users has a field called lastName, then you can index that field with the following statement:

CREATE INDEX surnameIndex ON Users (lastName)

Note that depending on the amount of data in your store, creating indexes can take a long time. This is because index creation requires Oracle NoSQL Database to examine all the data in the store.

Multi-Key Indexes

Multi-key indexes are used to index all the elements of an array. They are also used to index all of the elements and/or values of a map.

For each table row, a multi-key index contains as many entries as the number of elements/entries in the array/map that is being indexed (although duplicate entries are not represented in the index). To avoid an explosion in the number of index entries, only one array/map may be contained in a single multi-key index.

To create a multi-key index, use one of the following forms:

CREATE INDEX [IF NOT EXISTS] index-name ON table-name (name-path.keys())

or

CREATE INDEX [IF NOT EXISTS] index-name ON table-name (name-path.values())

or

CREATE INDEX [IF NOT EXISTS] index-name ON table-name \
(name-path.keys(),name-path.values())

or

CREATE INDEX [IF NOT EXISTS] index-name ON table-name (name-path[])

The syntax shown, above, is identical to that described in Simple Indexes, with the following additions:

  • .keys()

    The index is created on the keys in a map. If used, name-path must be a map.

  • .values()

    The index is created on the values in a map. If used, name-path must be a map.

  • []

    The index is created on an array. If used, name-path must be array.

For each of the previously identified forms, a comma-seperated list of name-paths may be provided. Some restrictions apply.

Multi-Key Index Restrictions

The following restrictions apply to multi-key indexes:

  • There is at least one name-path that uses a multi-key step (.keys(), .values(), or []). Any such path is called a multi-key path, and the associated index field a multi-key field. The index definition may contain more than one multi-key path, but all multi-key paths must use the same name-path before their multi-key step.

  • Any non-multi-key paths must be simple paths.

  • The combined path specified for the index must contain at least one map and/or array. These must contain indexable atomic items, or record items, or map items. That is, an index of an array of arrays is not supported, nor is an index of maps containing arrays.

    For example, given the following table definition:

    create table Foo (
    id INTEGER,
    complex1 RECORD(mapField MAP(ARRAY(MAP(INTEGER)))),
    complex2 RECORD(matrix ARRAY(ARRAY(RECORD(a LONG, b LONG))) 
    primary key(id)
    ) 

    The path expression complex2.matrix[] is not valid, because the result of this path expression is a sequence of arrays, not atomic items. Neither is complex2.matrix[][].a valid, because you cannot index arrays inside other arrays (in fact this path will raise a syntax error, because the syntax allows at most one [] per index path).

    On the other hand, the path complex1.mapField.someKey[].someOtherKey is valid. In this case, the path complex1.mapField.someKey specifies an array containing maps, which is valid. Notice that in this index path, someKey and someOtherKey are map-entry keys. So, although we are indexing arrays that are contained inside maps, and the arrays being indexed contain maps, the path is valid, because it is selecting specific entries from the map, rather than indexing all the map entries in addition to all the array entries.

  • If the index is indexing an array-valued field:

    • If the array contains indexable atomic items:

      • There must be a single multi-key index path of the form M[] (without any name_path following after the []). Again, this implies that you cannot index more than one array in the same index.

      • For each table row (R), a number of index entries are created as follows:

        The simple index paths (if any) are computed on R.

        Then, M[] is computed (as if it were a query path expression), returning either NULL, or EMPTY, or all the elements of the array returned by M.

        Finally, for each value (V) returned by M[], an index entry is created whose field values are V and the values of the simple paths.

      • Any duplicate index entries (having equal field values and the same primary key) created by the above process are eliminated.

    • If the array contains records or maps:

      • All of the multi-key paths must be of the form M[].name_path. Each name_path appearing after M[] in the multi-key index path must return at most one indexable atomic item.

      • For each table row (R), a number of index entries are created as follows:

        The simple index paths (if any) are computed on R.

        Then, M[] is computed (as if it were a query path expression), returning either NULL, or EMPTY, or all the elements of the array returned by M.

        Next, for each value (V) returned by M[], one index entry is created as follows:

        The elements contained in each V are computed. This returns a single indexable atomic item (which may be the NULL or EMPTY item). An index entry is created for each of these, whose field values are the values of the simple index paths plus the values found for element contained in V.

      • Any duplicate index entries (having equal field values and the same primary key) created by the above process are eliminated.

    • If the index is indexing a map-valued field, the index may be indexing only map keys, or only map elements, or both keys and elements. In all cases, the definition of map indexes can be given in terms of array indexes, by viewing maps as arrays containing records with 2 fields: a field with name “key” and value a map key, and a field named “element” and value the corresponding map element (that is, MAP(T) is viewed as ARRAY(RECORD(key STRING, element T))). Then, the 2 valid kinds for map indexes are:

      1. A single multi-key index path using a keys() step. Using the array view of maps, M.keys() is equivalent to M[].key.

      2. One or more multi-key index paths, all using a .values() step. If Ri is an value contained in the map, then each of these has the form M.values().Ri. Using the array view of maps, each M.values().Ri path is equivalent to M[].element.Ri.

JSON Indexes

An index is a JSON index if it indexes at least one field that is contained inside JSON data.

Because JSON is schema-less, it is possible for JSON data to differ in type across table rows. However, when indexing JSON data, the data type must be consistent across table rows or the index creation will fail. Further, once or more JSON indexes have been created, any attempt to write data of an incorrect type will fail.

Indexing JSON data and working with JSON indexes is performed in much the same way as indexing non-JSON data. To create the index, specify a path to the JSON field using dot notation.

When creating JSON indexes, you must specify the data's type, using the AS keyword. The data type must be atomic, and cannot be a float. That is, only integer, long, double, number, string, and boolean are supported types for JSON indexes. Note that arrays and maps can be indexed so long as they contain these atomic values.

CREATE INDEX [IF NOT EXISTS] index-name ON table-name \
(JSONRow.JSONField AS data_type)

When creating a multi-key index on a JSON map, a type must not be given for the .keys() expression because the type will always be String. However, a type declaration is required for the .values() expression. Beyond that, all the constraints described in Multi-Key Index Restrictions also apply to a JSON multi-keyed index.

CREATE INDEX [IF NOT EXISTS] index-name ON table-name \
(JSONRow.JSONField.keys(),\
    JSONRow.JSONField.values() AS data_type)

For an example of using JSON indexes, see Indexing JSON Fields.

For additional examples of using JSON indexes, see Indexing JSON Data in the SQL Beginner's Guide.