Create and Manage Indexes

An index is a database structure that enables you to retrieve data from database tables efficiently. Indexes provide fast access to the rows of a table when the key(s) you are searching for is contained in the index.

An index is an ordered map in which each row of the data is called an entry. An index can be created on atomic data types, arrays, maps, JSON, and GeoJSON data.. An index can store the following special values:
  • NULL
  • EMPTY
  • json null (It is applicable only for JSON indexes)

If you want to follow along with the examples, download the scripts baggageschema_loaddata.sql and acctstream_loaddata.sql and run it as shown below. This script creates the table used in the example and loads data into the table.

Start your KVSTORE or KVLite and open the SQL.shell.
java -jar lib/kvstore.jar kvlite -secure-config disable
java -jar lib/sql.jar -helper-hosts localhost:5000 -store kvstore
Using the load command, run the scripts.
load -file baggageschema_loaddata.sql
load -file acctstream_loaddata.sql

Classification of Indexes

Indexes can be classified based on fields, schema, entries, or a combination of them.


Description of images_types.png follows
Description of the illustration images_types.png

Single Field Index: An index is called a single field index if it is created on only one field of a table.

Composite Index: An index is called a composite index if it is created on more than one field of a table

Fixed Schema Index: An index is called a fixed schema index if all the fields that are indexed are strongly typed data.

Note:

A data type is called precise if it is not one of the wild card types. Items that have precise types are said to be strongly typed.

Schema-less Index (JSON Index): An index is called a JSON index if at least one of the fields is JSON data or fields inside JSON data.

Simple Index: An index is called a simple index if for each row of data in the table, there is one entry created in the index.

Multikey Index: An index is called a multikey index if for each row of data in the table, there are multiple entries created in the index.

Creating Indexes

An index can be created using the CREATE INDEX command.

Create a single field index:

Example: Create an index on passengers reservation code.
CREATE INDEX fixedschema_conf ON baggageInfo(confNo)

The above is an example of a single-column fixed schema index. The index is created on the confNo field having string data type in the baggageInfo table.

Create a composite index:

Example : Create an index on the full name and phone number of passengers.
CREATE INDEX compindex_namephone ON baggageInfo(fullName,contactPhone)
The above is an example of a composite index. The index is created on two fields in the baggageInfo schema, on full name and the contact phone number.

Note:

You can have one or more fields of this index as fixed schema columns.

Create a JSON index:

An index is called a JSON index if at least one of the fields is inside JSON data. As JSON is schema-less, the data type of an indexed JSON field may be different across rows. When creating an index on JSON fields, if you are unsure what data type to expect for the JSON field, you may use the anyAtomic data type. Alternatively, you can specify one of the Oracle NoSQL Database atomic data types. You do that by declaring a data type using the AS keyword next to every index path into the JSON field.

Example 1: Create an index on the tag number of passengers bags.
CREATE INDEX jsonindex_tagnum ON baggageInfo(bagInfo[].tagnum as INTEGER)

The above is an example of a JSON index. The index is created on the tagnum field present in the baginfo JSON field in the baggageInfo table. Notice that you provide a data type for the tagnum field while creating the index.

The creation of a JSON index will fail if the associated table contains any rows with data that violate the declared data type. Similarly, after creating a JSON index, an insert/update operation will fail if the new row does not conform to the declared data type in the JSON index.

Example 2: Create an index on the route of passengers.
CREATE INDEX jsonindex_routing ON baggageInfo(bagInfo[].routing as ANYATOMIC)
Declaring a JSON index path as anyAtomic has the advantage of allowing the indexed JSON field to have values of various data types. The index entries are sorted in ascending order. When these values are stored in the index, they are sorted as follows:
  • Numbers
  • String
  • boolean

However, this advantage is offset by space and CPU costs. It is because numeric values of any kind in the indexed field will be cast to Number before being stored in the index. This cast takes CPU time, and the resulting storage for the number will be larger than the original storage for the number.

Create a simple index:

An index is called a simple index if, for each row of data in the table, there is one entry created in the index. The index will return a single value that is of atomic data type or any special value (SQL NULL, JSON NULL, EMPTY). Essentially, the index paths of a simple index must not return an array or map or a nested data type.

Example: Create an index on three fields, when the bag was last seen, the last seen station and the arrival date and time.
CREATE INDEX simpleindex_arrival ON baggageInfo(bagInfo[].lastSeenTimeGmt as ANYATOMIC,
bagInfo[].bagArrivalDate as ANYATOMIC, bagInfo[].lastSeenTimeStation as ANYATOMIC)

The above is an example of a simple index created on a JSON document in a JSON field. The index is created on the lastSeenTimeGmt and bagArrivalDate and lastSeenTimeStation, all from the bagInfo JSON document in the info JSON field in the baggageInfo table. If the evaluation of a simple index path returns an empty result, the special value EMPTY is used as an index entry. In the above example, If there is no lastSeenTimeGmt or bagArrivalDate or lastSeenTimeStation entry in the bagInfo JSON document, or if there is no bagInfo JSON array, then the special value EMPTY is indexed.

Create a multikey index:

An index is called a multikey index if, for each row of data in the table, there are multiple entries created in the index. In a multikey index, there is at least one index path that uses an array or a nested data type. In a multikey index, for each table row, index entries are created on all the elements in arrays that are being indexed.

Example 1: Multikey index: Create an index on the series info array of the streaming account application.
CREATE INDEX multikeyindex1 ON stream_acct (acct_data.seriesInfo[] AS ANYATOMIC);

The index is created on the seriesInfo[] array in the stream_acct table. Here, all the elements in the seriesInfo[] array in each row of the stream_acct table will be indexed.

Example 2: Nested multikey index: Create an index on the episode details array of the streaming account application.

An index is a nested multikey index if it is created on a field that is present inside an array which in turn is present inside another array.
CREATE INDEX multikeyindex2 ON stream_acct (
    acct_data.seriesInfo[].episodes[]  AS ANYATOMIC);

The above is an example of a nested multikey index where the field is present in an array that is present inside another array. The index is created on the episodes[] array in the seriesInfo[] array in the acct_data JSON of the stream_acct table.

Example 3: Composite multikey index:

An index is called a composite multikey index if it is created on more than one field, and at least one of those fields is multikey. A composite multikey index may have a combination of multikey index paths and simple index paths.
CREATE INDEX multikeyindex3 ON stream_acct (acct_data.country AS ANYATOMIC,
acct_data.seriesInfo[].episodes[]  AS ANYATOMIC);

The above is an example of a composite multikey index having one multikey index path and one simple index path. The index is created on the country field and episodes[]array in the acct_data JSON column of the stream_acct table.

See Specifications & Restrictions on Multikey index to learn about restrictions on multikey index.

Create an index with NO NULLS clause

You can create an index with the optional WITH NO NULLS clause. In that case, the rows with NULL and/or EMPTY values on the indexed fields will not be indexed.
CREATE INDEX nonull_phone ON baggageInfo (contactPhone) WITH NO NULLS;
  • The above query creates an index on the phone number of the passengers. If some passengers do not have a phone number then those fields will not be part of the index.
  • The indexes that are created with the WITH NO NULLS clause may be useful when the data contain a lot of NULL and/or EMPTY values on the indexed fields. It will reduce the time and space overhead during indexing.
  • However, the use of such indexes by queries is restricted. If an index is created with the WITH NO NULLS clause, IS NULL, and NOT EXISTS predicates cannot be used as index predicates for that index.
  • In fact, such an index can be used by a query only if the query has an index predicate for each of the indexed fields.

Create an index with unique keys per row

You can create an index with unique keys per row property.
CREATE INDEX idx_showid ON 
stream_acct(acct_data.contentStreamed[].showId AS INTEGER)
WITH UNIQUE KEYS PER ROW;

In the above query, an index is created on showId and there cannot be duplicate showId for a single contentStreamed array. This informs the query processor that for any streaming user, the contentStreamed array cannot contain two or more shows with the same show id. The restriction is necessary because if duplicate show ids existed, they wouldn’t be included in the index. If you insert a row with the same showId two or more items in a single contentStreamed array, an error is thrown and the insert operation is not successful.

Optimization in the query run time :

When you create an index with unique keys per row, the index would contain fewer entries than the number of elements in the contentStreamed array. You could write an efficient query to use this index. The use of such an index by the query would yield fewer results from the FROM clause than if the index was not used.

View Index

You can view the indexes in your database.

SHOW INDEXES

The SHOW INDEXES statement provides the list of indexes present in the specified table. If you want the output to be in JSON format, you can specify the optional AS JSON.

Example 1: List indexes on the BaggageInfo table.
SHOW INDEXES ON baggageInfo

indexes
  jsonindex_routing
  jsonindex_tagnum
  simpleindex_arrival
  nonull_phone
Example 2: List indexes on the BaggageInfo table in JSON format.
SHOW AS JSON INDEXES ON baggageInfo
{"indexes"  :
["jsonindex_routing","jsonindex_tagnum","simpleindex_arrival"]}

DESCRIBE INDEX

The DESCRIBE INDEX statement defines the specified index on a table. If you want the output to be in JSON format, you can specify the optional AS JSON.

The description for the index contains the following information:
  • Name of the table on which the index is defined.
  • Name of the index.
  • Type of index. Whether the index is primary index or secondary index.
  • Whether the index is multikey? If the index is multikey then 'Y' is displayed. Otherwise, 'N' is displayed.
  • List of fields on which the index is defined.
  • The declared type of the index.
  • Description of the index.
Example 1: Describe the index multikeyindex3.
DESCRIBE INDEX multikeyindex3 ON stream_acct; 
+-------------+----------------+-----------+----------+-----------------------------------+--------------+-------------+ 
table       |  name          | type      | multiKey | fields                            | declaredType | description
+-------------+----------------+-----------+----------+-----------------------------------+--------------+-------------+
stream_acct | multikeyindex3 | SECONDARY | Y        | acct_data.country                 | ANY_ATOMIC   |                                   | |             |  
            |                |           |          | acct_data.seriesInfo[].episodes[] | ANY_ATOMIC   |            
+-------------+----------------+-----------+----------+-----------------------------------+--------------+-------------+

Example 2: Describe the index multikeyindex2 in JSON format.

DESCRIBE AS JSON INDEX multikeyindex2 ON stream_acct;
{
  "name" : "multikeyindex2",
  "type" :  "secondary",  
  "fields" :["acct_data.seriesInfo[].episodes[]"],  
  "types" : ["ANY_ATOMIC"],  
  "withNoNulls" : false, 
  "withUniqueKeysPerRow" : false
}

Drop Index

You can drop an index from your database when you no longer need it.

The DROP INDEX removes the specified index from the database. If an index with the given name does not exist, then the statement fails, and an error is reported. If the optional IF EXISTS clause is used in the DROP INDEX statement, and if an index with the same name does not exist, then the statement will not execute, and no error is reported.

Example: Drop the index multikeyindex1.
DROP INDEX multikeyindex1 ON stream_acct