CREATE INDEX Statement

The create index statement generates a new index on the specified columns in the given table.

Syntax

create_index_statement ::= 
   CREATE INDEX [IF NOT EXISTS] index_name 
   ON table_name "(" path_list ")" [WITH NO NULLS][WITH UNIQUE KEYS PER ROW][comment]

index_name ::= id
path_list ::= index_path ("," index_path)*
index_path ::= 
   name_path [path_type] | 
   multikey_path_prefix [.name_path] [path_type]
name_path ::= field_name ("." field_name)*
field_name ::= id | DSTRING
multikey_path_prefix ::= 
    field_name ( ("." field_name) | ("[" "]") | ("." VALUES"("")" ) )*
    ( ("[" "]") | ("." VALUES"(""")" ) | ("." KEYS"(""")") )
path_type ::= AS 
   (INTEGER | LONG | DOUBLE | STRING | 
   BOOLEAN | NUMBER | ANYATOMIC |POINT | GEOMETRY)

Semantics

The index name is unique to a table. If an index with the same name already exists in a table, then the statement will fail and report an error. For example, you can have only one index named idx_income1 in the UserInfo table.

The index name is specific to a table. You can use the same index name in multiple tables. For example, you can use the same index name idx_income1 in the UserInfo and Users3 tables.

The index specification is unique to a table. If an index with the same specification already exists in a table, then the statement will fail and report an error.

For example, if you have the following idx_income1 index on UserInfo table,
CREATE INDEX idx_income1 ON UserInfo (info.income AS ANYATOMIC)
then the following statement will throw an error that the idx_income2 index is a duplicate of an existing index with another name. In this case, even though the index names are different, the index specifications are the same.
CREATE INDEX idx_income2 ON UserInfo(info.income AS ANYATOMIC)

If the optional IF NOT EXISTS clause is specified in the CREATE INDEX statement, and if an index with the same name exists, then the statement will not execute and will not report an error.

If the optional WITH NO NULLS clause is specified in the CREATE INDEX statement, then the rows with NULL and/or EMPTY values on the indexed fields will not be indexed.

The indexes that are created with the WITH NO NULLS clause may be useful when the data contain a lot of NULL and/or EMPTY values on the indexed fields. It will reduce the time and space overhead during indexing. However, the use of such indexes by queries is restricted. For more information, see Using Indexes for Query Optimization section.

If the optional WITH UNIQUE KEYS PER ROW clause is used, then there will not be any duplicates among the index keys generated from a row. This property applies to multikey indexes only and is used in optimizing queries that perform unnesting. You could write an efficient query to use this index. The use of such an index by any query would yield fewer results from the FROM clause than if the index was not used.
CREATE INDEX idx_phones ON UserInfo(info.phones[].number AS INTEGER)
WITH UNIQUE KEYS PER ROW

If the optional COMMENT is specified, then this becomes part of the index metadata and is not interpreted. The "comment" will be displayed in the output of the DESCRIBE statement.

If JSON data is indexed, you must specify a data type using the AS keyword next to every index path into the JSON data. For all other typed data, you should not specify the data type, as the data type will be inferred from the table schema.

The index entries are automatically updated when rows are inserted, deleted, or updated in the specified table.

Note: The maximum number of index keys generated per row is 10000. An IllegalArgumentException will be raised during indexing if the number of index keys generated per is row exceeds 10000.