Using SQL commands

An index can be created using the CREATE INDEX command.

Create a single field index:

Example: Create an index on passengers reservation code.
CREATE INDEX fixedschema_conf ON baggageInfo(confNo)

The above is an example of a single-column fixed schema index. The index is created on the confNo field having string data type in the baggageInfo table.

Create a composite index:

Example : Create an index on the full name and phone number of passengers.
CREATE INDEX compindex_namephone ON baggageInfo(fullName,contactPhone)
The above is an example of a composite index. The index is created on two fields in the baggageInfo schema, on full name and the contact phone number.

Note:

You can have one or more fields of this index as fixed schema columns.

Create a JSON index:

An index is called a JSON index if at least one of the fields is inside JSON data. As JSON is schema-less, the data type of an indexed JSON field may be different across rows. When creating an index on JSON fields, if you are unsure what data type to expect for the JSON field, you may use the anyAtomic data type. Alternatively, you can specify one of the Oracle NoSQL Database atomic data types. You do that by declaring a data type using the AS keyword next to every index path into the JSON field.

Example 1: Create an index on the tag number of passengers bags.
CREATE INDEX jsonindex_tagnum ON baggageInfo(bagInfo[].tagnum as INTEGER)

The above is an example of a JSON index. The index is created on the tagnum field present in the baginfo JSON field in the baggageInfo table. Notice that you provide a data type for the tagnum field while creating the index.

The creation of a JSON index will fail if the associated table contains any rows with data that violate the declared data type. Similarly, after creating a JSON index, an insert/update operation will fail if the new row does not conform to the declared data type in the JSON index.

Example 2: Create an index on the route of passengers.
CREATE INDEX jsonindex_routing ON baggageInfo(bagInfo[].routing as ANYATOMIC)
Declaring a JSON index path as anyAtomic has the advantage of allowing the indexed JSON field to have values of various data types. The index entries are sorted in ascending order. When these values are stored in the index, they are sorted as follows:
  • Numbers
  • String
  • boolean

However, this advantage is offset by space and CPU costs. It is because numeric values of any kind in the indexed field will be cast to Number before being stored in the index. This cast takes CPU time, and the resulting storage for the number will be larger than the original storage for the number.

Create an Index on JSON Collection Table

Indexing the fields in the JSON collection table is similar to creating JSON indexes. You must specify the name (along with the path expression) and ANYATOMIC for the type definition, or, for strongly typed indexes, you can specify the JSON type of the fields being indexed.

If you are indexing a top-level JSON field in the document, the field name is the path expression. If the element is deeply nested in a JSON object, you specify the complete path name. In either case, the data type for every index must be specified. It is recommended to use ANYATOMIC in the JSON collection tables for more flexibility.

Example : Create a composite index on the JSON collection table created for a shopping application.
CREATE INDEX idx_ntfy_cty on storeAcct (address.city as ANYATOMIC, notify as ANYATOMIC)

In the composite index above, the notify field is a top-level field of the storeAcct table, which can be indexed by specifying the field name as the path. The city field is nested in the address field and must be indexed using the path expression.

Note:

If you are creating an index on a nested JSON field, the field must be present in all the rows of the table. Otherwise, an error is displayed.

Create a simple index:

An index is called a simple index if, for each row of data in the table, there is one entry created in the index. The index will return a single value that is of atomic data type or any special value (SQL NULL, JSON NULL, EMPTY). Essentially, the index paths of a simple index must not return an array or map or a nested data type.

Example: Create an index on three fields, when the bag was last seen, the last seen station and the arrival date and time.
CREATE INDEX simpleindex_arrival ON baggageInfo(bagInfo[].lastSeenTimeGmt as ANYATOMIC,
bagInfo[].bagArrivalDate as ANYATOMIC, bagInfo[].lastSeenTimeStation as ANYATOMIC)

The above is an example of a simple index created on a JSON document in a JSON field. The index is created on the lastSeenTimeGmt and bagArrivalDate and lastSeenTimeStation, all from the bagInfo JSON document in the info JSON field in the baggageInfo table. If the evaluation of a simple index path returns an empty result, the special value EMPTY is used as an index entry. In the above example, If there is no lastSeenTimeGmt or bagArrivalDate or lastSeenTimeStation entry in the bagInfo JSON document, or if there is no bagInfo JSON array, then the special value EMPTY is indexed.

Create a multikey index:

An index is called a multikey index if, for each row of data in the table, there are multiple entries created in the index. In a multikey index, there is at least one index path that uses an array or a nested data type. In a multikey index, for each table row, index entries are created on all the elements in arrays that are being indexed.

Example 1: Multikey index: Create an index on the series info array of the streaming account application.
CREATE INDEX multikeyindex1 ON stream_acct (acct_data.contentStreamed[].seriesInfo[] AS ANYATOMIC)

The index is created on the seriesInfo[] array in the stream_acct table. Here, all the elements in the seriesInfo[] array in each row of the stream_acct table will be indexed.

Example 2: Nested multikey index: Create an index on the episode details array of the streaming account application.

An index is a nested multikey index if it is created on a field that is present inside an array which in turn is present inside another array.
CREATE INDEX multikeyindex2 ON stream_acct (
    acct_data.contentStreamed[].seriesInfo[].episodes[]  AS ANYATOMIC)

The above is an example of a nested multikey index where the field is present in an array that is present inside another array. The index is created on the episodes[] array in the seriesInfo[] array in the acct_data JSON of the stream_acct table.

Example 3: Composite multikey index:

An index is called a composite multikey index if it is created on more than one field, and at least one of those fields is multikey. A composite multikey index may have a combination of multikey index paths and simple index paths.
CREATE INDEX multikeyindex3 ON stream_acct (acct_data.country AS ANYATOMIC,
acct_data.contentStreamed[].seriesInfo[].episodes[]  AS ANYATOMIC)

The above is an example of a composite multikey index having one multikey index path and one simple index path. The index is created on the country field and episodes[]array in the acct_data JSON column of the stream_acct table.

See Specifications & Restrictions on Multikey index to learn about restrictions on multikey index.

Create an index with NO NULLS clause

You can create an index with the optional WITH NO NULLS clause. In that case, the rows with NULL and/or EMPTY values on the indexed fields will not be indexed.
CREATE INDEX nonull_phone ON baggageInfo (contactPhone) WITH NO NULLS
  • The above query creates an index on the phone number of the passengers. If some passengers do not have a phone number then those fields will not be part of the index.
  • The indexes that are created with the WITH NO NULLS clause may be useful when the data contain a lot of NULL and/or EMPTY values on the indexed fields. It will reduce the time and space overhead during indexing.
  • However, the use of such indexes by queries is restricted. If an index is created with the WITH NO NULLS clause, IS NULL, and NOT EXISTS predicates cannot be used as index predicates for that index.
  • In fact, such an index can be used by a query only if the query has an index predicate for each of the indexed fields.

Create an index with unique keys per row

You can create an index with unique keys per row property.
CREATE INDEX idx_showid ON 
stream_acct(acct_data.contentStreamed[].showId AS INTEGER)
WITH UNIQUE KEYS PER ROW

In the above query, an index is created on showId and there cannot be duplicate showId for a single contentStreamed array. This informs the query processor that for any streaming user, the contentStreamed array cannot contain two or more shows with the same show id. The restriction is necessary because if duplicate show ids existed, they wouldn’t be included in the index. If you insert a row with the same showId two or more items in a single contentStreamed array, an error is thrown and the insert operation is not successful.

Optimization in the query run time :

When you create an index with unique keys per row, the index would contain fewer entries than the number of elements in the contentStreamed array. You could write an efficient query to use this index. The use of such an index by the query would yield fewer results from the FROM clause than if the index was not used.

Examples of creating indexes on functions:

Example 1: Create an index which indexes the rows of the BaggageInfo table by their latest modification time:
CREATE INDEX idx_modtime ON BaggageInfo(modification_time())
This index will be used in a query which has modification_time as the filter condition.
SELECT * FROM BaggageInfo $u WHERE 
modification_time($u) > "2019-08-01T10:45:00"

This query returns all the rows whose most recent modification time is after 2019-08-01T10:45:00. It uses the idx_modtime index defined above. You can verify this by viewing the query plan using the show query command.

Example 2: Create an index which indexes the rows of the BaggageInfo table on the length of the routing field.
CREATE INDEX idx_routlen ON BaggageInfo (length(bagInfo[].routing as string))
This index will be used in a query which has length as the filter condition.
SELECT * from BaggageInfo $bag where length($bag.bagInfo[].routing) > 10

This query returns all the rows whose length of the routing field is greater than 10. It uses the idx_routlen index defined above. You can verify this by viewing the query plan using the show query command.

Example 3: Using a multi-key index path

In the following example, you index the users in the stream_acct table by the id of the shows they watch and the year and month of the dates when the show was watched.
CREATE INDEX idx_showid_year_month ON 
stream_acct(acct_data.contentStreamed[].showId AS INTEGER,
substring(acct_data.contentStreamed[].seriesInfo[].episodes[].date AS STRING,0, 4),
substring(acct_data.contentStreamed[].seriesInfo[].episodes[].date AS STRING,5, 2))
An example of a query using this index is shown below. The query counts the number of users who watched any episode of show 16 in the year 2022.
SELECT count(*) FROM stream_acct s1 WHERE EXISTS 
s1.acct_data.contentStreamed[$element.showId = 16].seriesInfo.
episodes[substring($element.date, 0, 4) = "2022"]
This query will use the index idx_showid_year_month. You can verify this by viewing the query plan using the show query command.
show query SELECT count(*) FROM stream_acct s1 WHERE EXISTS
> s1.acct_data.contentStreamed[$element.showId = 16].seriesInfo.episodes[substring($element.date, 0, 4) = "2022"]

{
  "iterator kind" : "GROUP",
  "input variable" : "$gb-1",
  "input iterator" :
  {
    "iterator kind" : "RECEIVE",
    "distribution kind" : "ALL_SHARDS",
    "distinct by fields at positions" : [ 1 ],
    "input iterator" :
    {
      "iterator kind" : "SELECT",
      "FROM" :
      {
        "iterator kind" : "TABLE",
        "target table" : "stream_acct",
        "row variable" : "$$s1",
        "index used" : "idx_showid_year_month",
        "covering index" : true,
        "index row variable" : "$$s1_idx",
        "index scans" : [
          {
            "equality conditions" : {"acct_data.contentStreamed[].showId":16,"substring#acct_data.contentStreamed[].seriesInfo[].episodes[].date@,0,4":"2022"},
            "range conditions" : {}
          }
        ]
      },
      "FROM variable" : "$$s1_idx",
      "SELECT expressions" : [
        {
          "field name" : "Column_1",
          "field expression" :
          {
            "iterator kind" : "CONST",
            "value" : 1
          }
        },
        {
          "field name" : "acct_id_gen",
          "field expression" :
          {
            "iterator kind" : "FIELD_STEP",
            "field name" : "#acct_id",
            "input iterator" :
            {
              "iterator kind" : "VAR_REF",
              "variable" : "$$s1_idx"
            }
          }
        }
      ]
    }
  },
  "grouping expressions" : [

  ],
  "aggregate functions" : [
    {
      "iterator kind" : "FUNC_COUNT_STAR"
    }
  ]
}