Using APIs to create an index

You can create an index for a NoSQL table using SQL commands or using TableRequest API.

Related Topics

Using SQL commands

An index can be created using the CREATE INDEX command.

Create a single field index:

Example: Create an index on passengers reservation code.
CREATE INDEX fixedschema_conf ON baggageInfo(confNo)

The above is an example of a single-column fixed schema index. The index is created on the confNo field having string data type in the baggageInfo table.

Create a composite index:

Example : Create an index on the full name and phone number of passengers.
CREATE INDEX compindex_namephone ON baggageInfo(fullName,contactPhone)
The above is an example of a composite index. The index is created on two fields in the baggageInfo schema, on full name and the contact phone number.

Note:

You can have one or more fields of this index as fixed schema columns.

Create a JSON index:

An index is called a JSON index if at least one of the fields is inside JSON data. As JSON is schema-less, the data type of an indexed JSON field may be different across rows. When creating an index on JSON fields, if you are unsure what data type to expect for the JSON field, you may use the anyAtomic data type. Alternatively, you can specify one of the Oracle NoSQL Database atomic data types. You do that by declaring a data type using the AS keyword next to every index path into the JSON field.

Example 1: Create an index on the tag number of passengers bags.
CREATE INDEX jsonindex_tagnum ON baggageInfo(bagInfo[].tagnum as INTEGER)

The above is an example of a JSON index. The index is created on the tagnum field present in the baginfo JSON field in the baggageInfo table. Notice that you provide a data type for the tagnum field while creating the index.

The creation of a JSON index will fail if the associated table contains any rows with data that violate the declared data type. Similarly, after creating a JSON index, an insert/update operation will fail if the new row does not conform to the declared data type in the JSON index.

Example 2: Create an index on the route of passengers.
CREATE INDEX jsonindex_routing ON baggageInfo(bagInfo[].routing as ANYATOMIC)
Declaring a JSON index path as anyAtomic has the advantage of allowing the indexed JSON field to have values of various data types. The index entries are sorted in ascending order. When these values are stored in the index, they are sorted as follows:
  • Numbers
  • String
  • boolean

However, this advantage is offset by space and CPU costs. It is because numeric values of any kind in the indexed field will be cast to Number before being stored in the index. This cast takes CPU time, and the resulting storage for the number will be larger than the original storage for the number.

Create a simple index:

An index is called a simple index if, for each row of data in the table, there is one entry created in the index. The index will return a single value that is of atomic data type or any special value (SQL NULL, JSON NULL, EMPTY). Essentially, the index paths of a simple index must not return an array or map or a nested data type.

Example: Create an index on three fields, when the bag was last seen, the last seen station and the arrival date and time.
CREATE INDEX simpleindex_arrival ON baggageInfo(bagInfo[].lastSeenTimeGmt as ANYATOMIC,
bagInfo[].bagArrivalDate as ANYATOMIC, bagInfo[].lastSeenTimeStation as ANYATOMIC)

The above is an example of a simple index created on a JSON document in a JSON field. The index is created on the lastSeenTimeGmt and bagArrivalDate and lastSeenTimeStation, all from the bagInfo JSON document in the info JSON field in the baggageInfo table. If the evaluation of a simple index path returns an empty result, the special value EMPTY is used as an index entry. In the above example, If there is no lastSeenTimeGmt or bagArrivalDate or lastSeenTimeStation entry in the bagInfo JSON document, or if there is no bagInfo JSON array, then the special value EMPTY is indexed.

Create a multikey index:

An index is called a multikey index if, for each row of data in the table, there are multiple entries created in the index. In a multikey index, there is at least one index path that uses an array or a nested data type. In a multikey index, for each table row, index entries are created on all the elements in arrays that are being indexed.

Example 1: Multikey index: Create an index on the series info array of the streaming account application.
CREATE INDEX multikeyindex1 ON stream_acct (acct_data.contentStreamed[].seriesInfo[] AS ANYATOMIC)

The index is created on the seriesInfo[] array in the stream_acct table. Here, all the elements in the seriesInfo[] array in each row of the stream_acct table will be indexed.

Example 2: Nested multikey index: Create an index on the episode details array of the streaming account application.

An index is a nested multikey index if it is created on a field that is present inside an array which in turn is present inside another array.
CREATE INDEX multikeyindex2 ON stream_acct (
    acct_data.contentStreamed[].seriesInfo[].episodes[]  AS ANYATOMIC)

The above is an example of a nested multikey index where the field is present in an array that is present inside another array. The index is created on the episodes[] array in the seriesInfo[] array in the acct_data JSON of the stream_acct table.

Example 3: Composite multikey index:

An index is called a composite multikey index if it is created on more than one field, and at least one of those fields is multikey. A composite multikey index may have a combination of multikey index paths and simple index paths.
CREATE INDEX multikeyindex3 ON stream_acct (acct_data.country AS ANYATOMIC,
acct_data.contentStreamed[].seriesInfo[].episodes[]  AS ANYATOMIC)

The above is an example of a composite multikey index having one multikey index path and one simple index path. The index is created on the country field and episodes[]array in the acct_data JSON column of the stream_acct table.

See Specifications & Restrictions on Multikey index to learn about restrictions on multikey index.

Create an index with NO NULLS clause

You can create an index with the optional WITH NO NULLS clause. In that case, the rows with NULL and/or EMPTY values on the indexed fields will not be indexed.
CREATE INDEX nonull_phone ON baggageInfo (contactPhone) WITH NO NULLS
  • The above query creates an index on the phone number of the passengers. If some passengers do not have a phone number then those fields will not be part of the index.
  • The indexes that are created with the WITH NO NULLS clause may be useful when the data contain a lot of NULL and/or EMPTY values on the indexed fields. It will reduce the time and space overhead during indexing.
  • However, the use of such indexes by queries is restricted. If an index is created with the WITH NO NULLS clause, IS NULL, and NOT EXISTS predicates cannot be used as index predicates for that index.
  • In fact, such an index can be used by a query only if the query has an index predicate for each of the indexed fields.

Create an index with unique keys per row

You can create an index with unique keys per row property.
CREATE INDEX idx_showid ON 
stream_acct(acct_data.contentStreamed[].showId AS INTEGER)
WITH UNIQUE KEYS PER ROW

In the above query, an index is created on showId and there cannot be duplicate showId for a single contentStreamed array. This informs the query processor that for any streaming user, the contentStreamed array cannot contain two or more shows with the same show id. The restriction is necessary because if duplicate show ids existed, they wouldn’t be included in the index. If you insert a row with the same showId two or more items in a single contentStreamed array, an error is thrown and the insert operation is not successful.

Optimization in the query run time :

When you create an index with unique keys per row, the index would contain fewer entries than the number of elements in the contentStreamed array. You could write an efficient query to use this index. The use of such an index by the query would yield fewer results from the FROM clause than if the index was not used.

Examples of creating indexes on functions:

Example 1: Create an index which indexes the rows of the BaggageInfo table by their latest modification time:
CREATE INDEX idx_modtime ON BaggageInfo(modification_time())
This index will be used in a query which has modification_time as the filter condition.
SELECT * FROM BaggageInfo $u WHERE 
modification_time($u) > "2019-08-01T10:45:00"

This query returns all the rows whose most recent modification time is after 2019-08-01T10:45:00. It uses the idx_modtime index defined above. You can verify this by viewing the query plan using the show query command.

Example 2: Create an index which indexes the rows of the BaggageInfo table on the length of the routing field.
CREATE INDEX idx_routlen ON BaggageInfo (length(bagInfo[].routing as string))
This index will be used in a query which has length as the filter condition.
SELECT * from BaggageInfo $bag where length($bag.bagInfo[].routing) > 10

This query returns all the rows whose length of the routing field is greater than 10. It uses the idx_routlen index defined above. You can verify this by viewing the query plan using the show query command.

Example 3: Using a multi-key index path

In the following example, you index the users in the stream_acct table by the id of the shows they watch and the year and month of the dates when the show was watched.
CREATE INDEX idx_showid_year_month ON 
stream_acct(acct_data.contentStreamed[].showId AS INTEGER,
substring(acct_data.contentStreamed[].seriesInfo[].episodes[].date AS STRING,0, 4),
substring(acct_data.contentStreamed[].seriesInfo[].episodes[].date AS STRING,5, 2))
An example of a query using this index is shown below. The query counts the number of users who watched any episode of show 16 in the year 2022.
SELECT count(*) FROM stream_acct s1 WHERE EXISTS 
s1.acct_data.contentStreamed[$element.showId = 16].seriesInfo.
episodes[substring($element.date, 0, 4) = "2022"]
This query will use the index idx_showid_year_month. You can verify this by viewing the query plan using the show query command.
show query SELECT count(*) FROM stream_acct s1 WHERE EXISTS
> s1.acct_data.contentStreamed[$element.showId = 16].seriesInfo.episodes[substring($element.date, 0, 4) = "2022"]

{
  "iterator kind" : "GROUP",
  "input variable" : "$gb-1",
  "input iterator" :
  {
    "iterator kind" : "RECEIVE",
    "distribution kind" : "ALL_SHARDS",
    "distinct by fields at positions" : [ 1 ],
    "input iterator" :
    {
      "iterator kind" : "SELECT",
      "FROM" :
      {
        "iterator kind" : "TABLE",
        "target table" : "stream_acct",
        "row variable" : "$$s1",
        "index used" : "idx_showid_year_month",
        "covering index" : true,
        "index row variable" : "$$s1_idx",
        "index scans" : [
          {
            "equality conditions" : {"acct_data.contentStreamed[].showId":16,"substring#acct_data.contentStreamed[].seriesInfo[].episodes[].date@,0,4":"2022"},
            "range conditions" : {}
          }
        ]
      },
      "FROM variable" : "$$s1_idx",
      "SELECT expressions" : [
        {
          "field name" : "Column_1",
          "field expression" :
          {
            "iterator kind" : "CONST",
            "value" : 1
          }
        },
        {
          "field name" : "acct_id_gen",
          "field expression" :
          {
            "iterator kind" : "FIELD_STEP",
            "field name" : "#acct_id",
            "input iterator" :
            {
              "iterator kind" : "VAR_REF",
              "variable" : "$$s1_idx"
            }
          }
        }
      ]
    }
  },
  "grouping expressions" : [

  ],
  "aggregate functions" : [
    {
      "iterator kind" : "FUNC_COUNT_STAR"
    }
  ]
}

Using TableRequest API

You can use TableRequest API to create an index on a NoSQL table.

The TableRequest class is used to create an index on a table. Execution of operations specified by this request is asynchronous. These are potentially long-running operations. TableResult is returned from TableRequest operations and it encapsulates the state of the table. See Oracle NoSQL Java SDK API Reference for more details on the TableRequest class and its methods.

Download the full code Indexes.java from the examples here.

/**
* Create an index acct_episodes in the stream_acct table
*/
private static void crtIndex(NoSQLHandle handle) throws Exception {
   String createIndexDDL = "CREATE INDEX acct_episodes ON " + tableName +
                           "(acct_data.contentStreamed[].seriesInfo[].episodes[]  AS ANYATOMIC)";
   TableRequest treq = new TableRequest().setStatement(createIndexDDL);
   TableResult tres = handle.tableRequest(treq);
   tres.waitForCompletion(handle, 60000, /* wait 60 sec */
          1000); /* delay ms for poll */
   System.out.println("Index acct_episodes on " + tableName + " is created");
}

The borneo.TableRequest class is used to create an index on a table. All calls to borneo.NoSQLHandle.table_request() are asynchronous so it is necessary to check the result and call borneo.TableResult.wait_for_completion() to wait for the operation to complete. See Oracle NoSQL Python SDK API Reference for more details on table_request and its methods.

Download the full code Indexes.py from the examples here.

#create an index
def create_index(handle):
   statement = '''CREATE INDEX acct_episodes ON stream_acct (acct_data.contentStreamed[].seriesInfo[].episodes[]  AS ANYATOMIC)'''
   request = TableRequest().set_statement(statement)
   table_result = handle.do_table_request(request, 40000, 3000)
   table_result.wait_for_completion(handle, 40000, 3000)
   print('Index acct_episodes on the table stream_acct is created')

The TableRequest class is used to create an index on a table. Execution of operations specified by TableRequest is asynchronous. These are potentially long-running operations. This request is used as the input of a Client.DoTableRequest() operation, which returns a TableResult that can be used to poll until the table reaches the desired state. See Oracle NoSQL Go SDK API Reference for more details on the various methods of the TableRequest class.

Download the full code Indexes.go from the examples here.

//create an index on a table
func createIndex(client *nosqldb.Client, err error, tableName string)(){
   stmt := fmt.Sprintf("CREATE INDEX acct_episodes ON %s "+
		"(acct_data.contentStreamed[].seriesInfo[].episodes[]  AS ANYATOMIC)",tableName)
   tableReq := &nosqldb.TableRequest{
		Statement: stmt,
   }
   tableRes, err := client.DoTableRequest(tableReq)
   if err != nil {
      fmt.Printf("cannot initiate CREATE INDEX request: %v\n", err)
      return
   }
   // The create index request is asynchronous, wait for index creation to complete.
   _, err = tableRes.WaitForCompletion(client, 60*time.Second, time.Second)
   if err != nil {
      fmt.Printf("Error finishing CREATE INDEX request: %v\n", err)
      return
   }
   fmt.Println("Created Index acct_episodes on table ", tableName)
   return
}

You can create an index on a table using the tableDDL method. This method is asynchronous and it returns a Promise of TableResult. The TableResult is a plain JavaScript object that encapsulates the state of the table. For method details, see NoSQLClient class.

Download the full JavaScript code Indexes.js from the examples here and the full TypeScript code Indexes.ts from the examples here.

//creates an index
async function createIndex(handle) {
   const crtindDDL = `CREATE INDEX acct_episodes ON ${TABLE_NAME}(acct_data.contentStreamed[].seriesInfo[].episodes[]  AS ANYATOMIC)`;
   let res =  await handle.tableDDL(crtindDDL);
   console.log('Index acct_episodes is created on table:' + TABLE_NAME);
}

To create an index on a table use either of the methods ExecuteTableDDLAsync or ExecuteTableDDLWithCompletionAsync. Both these methods return Task<TableResult>. TableResult instance contains status of DDL operation such as TableState and table schema. See Oracle NoSQL Dotnet SDK API Reference for more details on these methods.

Download the full code Indexes.cs from the examples here.
// Creates an index on a table
private static async Task createIndex(NoSQLClient client){
   var sql =
      $@"CREATE INDEX acct_episodes ON {TableName}(acct_data.contentStreamed[].seriesInfo[].episodes[]  AS ANYATOMIC)";
   var tableResult = await client.ExecuteTableDDLAsync(sql);
   // Wait for the operation completion
   await tableResult.WaitForCompletionAsync();
   Console.WriteLine(" Index acct_episodes is created on table Table {0}",
                tableResult.TableName);
}