Using APIs to create an index
You can create an index for a NoSQL table using SQL commands or using
TableRequest
API.
Related Topics
Using SQL commands
An index can be created using the CREATE INDEX
command.
Create a single field index:
CREATE INDEX fixedschema_conf ON baggageInfo(confNo)
The above is an example of a single-column fixed schema index. The index
is created on the confNo
field having string
data
type in the baggageInfo
table.
Create a composite index:
CREATE INDEX compindex_namephone ON baggageInfo(fullName,contactPhone)
Note:
You can have one or more fields of this index as fixed schema columns.Create a JSON index:
An index is called a JSON index if at least one of the fields is inside JSON data. As JSON is schema-less, the data type of an indexed JSON field may be different across rows. When creating an index on JSON fields, if you are unsure what data type to expect for the JSON field, you may use the anyAtomic data type. Alternatively, you can specify one of the Oracle NoSQL Database atomic data types. You do that by declaring a data type using the AS keyword next to every index path into the JSON field.
CREATE INDEX jsonindex_tagnum ON baggageInfo(bagInfo[].tagnum as INTEGER)
The above is an example of a JSON index. The index is created on the
tagnum
field present in the baginfo
JSON field
in the baggageInfo
table. Notice that you provide a data type for
the tagnum
field while creating the index.
The creation of a JSON index will fail if the associated table contains any rows with data that violate the declared data type. Similarly, after creating a JSON index, an insert/update operation will fail if the new row does not conform to the declared data type in the JSON index.
CREATE INDEX jsonindex_routing ON baggageInfo(bagInfo[].routing as ANYATOMIC)
- Numbers
- String
- boolean
However, this advantage is offset by space and CPU costs. It is because numeric values of any kind in the indexed field will be cast to Number before being stored in the index. This cast takes CPU time, and the resulting storage for the number will be larger than the original storage for the number.
Create a simple index:
An index is called a simple index if, for each row of data in the table, there is one entry created in the index. The index will return a single value that is of atomic data type or any special value (SQL NULL, JSON NULL, EMPTY). Essentially, the index paths of a simple index must not return an array or map or a nested data type.
CREATE INDEX simpleindex_arrival ON baggageInfo(bagInfo[].lastSeenTimeGmt as ANYATOMIC,
bagInfo[].bagArrivalDate as ANYATOMIC, bagInfo[].lastSeenTimeStation as ANYATOMIC)
The above is an example of a simple index created on a JSON document in
a JSON field. The index is created on the lastSeenTimeGmt
and
bagArrivalDate
and lastSeenTimeStation
, all
from the bagInfo
JSON document in the info JSON field in the
baggageInfo
table. If the evaluation of a simple index path
returns an empty result, the special value EMPTY is used as an index entry. In the
above example, If there is no lastSeenTimeGmt
or
bagArrivalDate
or lastSeenTimeStation
entry in
the bagInfo
JSON document, or if there is no
bagInfo
JSON array, then the special value EMPTY is indexed.
Create a multikey index:
An index is called a multikey index if, for each row of data in the table, there are multiple entries created in the index. In a multikey index, there is at least one index path that uses an array or a nested data type. In a multikey index, for each table row, index entries are created on all the elements in arrays that are being indexed.
CREATE INDEX multikeyindex1 ON stream_acct (acct_data.contentStreamed[].seriesInfo[] AS ANYATOMIC)
The index is created on the seriesInfo[]
array in the
stream_acct table. Here, all the elements in the seriesInfo[]
array
in each row of the stream_acct
table will be indexed.
Example 2: Nested multikey index: Create an index on the episode details array of the streaming account application.
CREATE INDEX multikeyindex2 ON stream_acct (
acct_data.contentStreamed[].seriesInfo[].episodes[] AS ANYATOMIC)
The above is an example of a nested multikey index where the field is
present in an array that is present inside another array. The index is created on
the episodes[]
array in the seriesInfo[]
array in
the acct_data
JSON of the stream_acct
table.
Example 3: Composite multikey index:
CREATE INDEX multikeyindex3 ON stream_acct (acct_data.country AS ANYATOMIC,
acct_data.contentStreamed[].seriesInfo[].episodes[] AS ANYATOMIC)
The above is an example of a composite multikey index having one
multikey index path and one simple index path. The index is created on the
country
field and episodes[]
array in the
acct_data
JSON column of the stream_acct
table.
See Specifications & Restrictions on Multikey index to learn about restrictions on multikey index.
Create an index with NO NULLS clause
CREATE INDEX nonull_phone ON baggageInfo (contactPhone) WITH NO NULLS
- The above query creates an index on the phone number of the passengers. If some passengers do not have a phone number then those fields will not be part of the index.
- The indexes that are created with the WITH NO NULLS clause may be useful when the data contain a lot of NULL and/or EMPTY values on the indexed fields. It will reduce the time and space overhead during indexing.
- However, the use of such indexes by queries is restricted. If an index is created with the WITH NO NULLS clause, IS NULL, and NOT EXISTS predicates cannot be used as index predicates for that index.
- In fact, such an index can be used by a query only if the query has an index predicate for each of the indexed fields.
Create an index with unique keys per row
CREATE INDEX idx_showid ON
stream_acct(acct_data.contentStreamed[].showId AS INTEGER)
WITH UNIQUE KEYS PER ROW
In the above query, an index is created on showId
and
there cannot be duplicate showId
for a single
contentStreamed
array. This informs the query processor that
for any streaming user, the contentStreamed
array cannot contain
two or more shows with the same show id. The restriction is necessary because if
duplicate show ids existed, they wouldn’t be included in the index. If you insert a
row with the same showId
two or more items in a single
contentStreamed
array, an error is thrown and the insert
operation is not successful.
Optimization in the query run time :
When you create an index with unique keys per row, the index would
contain fewer entries than the number of elements in the
contentStreamed
array. You could write an efficient query to
use this index. The use of such an index by the query would yield fewer results from
the FROM clause than if the index was not used.
Examples of creating indexes on functions:
BaggageInfo
table by their latest modification
time:CREATE INDEX idx_modtime ON BaggageInfo(modification_time())
modification_time
as the filter
condition.SELECT * FROM BaggageInfo $u WHERE
modification_time($u) > "2019-08-01T10:45:00"
This query returns all the rows whose most recent modification time is
after 2019-08-01T10:45:00. It uses the idx_modtime
index defined
above. You can verify this by viewing the query plan using the show
query
command.
BaggageInfo
table on the length of the routing
field.CREATE INDEX idx_routlen ON BaggageInfo (length(bagInfo[].routing as string))
length
as
the filter
condition.SELECT * from BaggageInfo $bag where length($bag.bagInfo[].routing) > 10
This query returns all the rows whose length of the routing field is
greater than 10. It uses the idx_routlen
index defined above. You
can verify this by viewing the query plan using the show query
command.
Example 3: Using a multi-key index path
stream_acct
table by the id of the shows they watch and the
year and month of the dates when the show was
watched.CREATE INDEX idx_showid_year_month ON
stream_acct(acct_data.contentStreamed[].showId AS INTEGER,
substring(acct_data.contentStreamed[].seriesInfo[].episodes[].date AS STRING,0, 4),
substring(acct_data.contentStreamed[].seriesInfo[].episodes[].date AS STRING,5, 2))
SELECT count(*) FROM stream_acct s1 WHERE EXISTS
s1.acct_data.contentStreamed[$element.showId = 16].seriesInfo.
episodes[substring($element.date, 0, 4) = "2022"]
idx_showid_year_month
.
You can verify this by viewing the query plan using the show query
command.show query SELECT count(*) FROM stream_acct s1 WHERE EXISTS
> s1.acct_data.contentStreamed[$element.showId = 16].seriesInfo.episodes[substring($element.date, 0, 4) = "2022"]
{
"iterator kind" : "GROUP",
"input variable" : "$gb-1",
"input iterator" :
{
"iterator kind" : "RECEIVE",
"distribution kind" : "ALL_SHARDS",
"distinct by fields at positions" : [ 1 ],
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "stream_acct",
"row variable" : "$$s1",
"index used" : "idx_showid_year_month",
"covering index" : true,
"index row variable" : "$$s1_idx",
"index scans" : [
{
"equality conditions" : {"acct_data.contentStreamed[].showId":16,"substring#acct_data.contentStreamed[].seriesInfo[].episodes[].date@,0,4":"2022"},
"range conditions" : {}
}
]
},
"FROM variable" : "$$s1_idx",
"SELECT expressions" : [
{
"field name" : "Column_1",
"field expression" :
{
"iterator kind" : "CONST",
"value" : 1
}
},
{
"field name" : "acct_id_gen",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "#acct_id",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$s1_idx"
}
}
}
]
}
},
"grouping expressions" : [
],
"aggregate functions" : [
{
"iterator kind" : "FUNC_COUNT_STAR"
}
]
}
Using TableRequest API
You can use TableRequest API to create an index on a NoSQL table.
The TableRequest
class is used to create an index
on a table. Execution of operations specified by this request is asynchronous.
These are potentially long-running operations. TableResult
is
returned from TableRequest
operations and it encapsulates the
state of the table. See Oracle NoSQL Java SDK API Reference for
more details on the TableRequest
class and its methods.
/**
* Create an index acct_episodes in the stream_acct table
*/
private static void crtIndex(NoSQLHandle handle) throws Exception {
String createIndexDDL = "CREATE INDEX acct_episodes ON " + tableName +
"(acct_data.contentStreamed[].seriesInfo[].episodes[] AS ANYATOMIC)";
TableRequest treq = new TableRequest().setStatement(createIndexDDL);
TableResult tres = handle.tableRequest(treq);
tres.waitForCompletion(handle, 60000, /* wait 60 sec */
1000); /* delay ms for poll */
System.out.println("Index acct_episodes on " + tableName + " is created");
}
The borneo.TableRequest
class is used to create an
index on a table. All calls to
borneo.NoSQLHandle.table_request()
are asynchronous so it
is necessary to check the result and call
borneo.TableResult.wait_for_completion()
to wait for the
operation to complete. See Oracle NoSQL Python SDK API Reference
for more details on table_request
and its methods.
#create an index
def create_index(handle):
statement = '''CREATE INDEX acct_episodes ON stream_acct (acct_data.contentStreamed[].seriesInfo[].episodes[] AS ANYATOMIC)'''
request = TableRequest().set_statement(statement)
table_result = handle.do_table_request(request, 40000, 3000)
table_result.wait_for_completion(handle, 40000, 3000)
print('Index acct_episodes on the table stream_acct is created')
The TableRequest
class is used to create an index
on a table. Execution of operations specified by TableRequest
is asynchronous. These are potentially long-running operations. This request is
used as the input of a Client.DoTableRequest()
operation, which
returns a TableResult
that can be used to poll until the table
reaches the desired state. See Oracle NoSQL Go SDK API Reference for
more details on the various methods of the TableRequest
class.
//create an index on a table
func createIndex(client *nosqldb.Client, err error, tableName string)(){
stmt := fmt.Sprintf("CREATE INDEX acct_episodes ON %s "+
"(acct_data.contentStreamed[].seriesInfo[].episodes[] AS ANYATOMIC)",tableName)
tableReq := &nosqldb.TableRequest{
Statement: stmt,
}
tableRes, err := client.DoTableRequest(tableReq)
if err != nil {
fmt.Printf("cannot initiate CREATE INDEX request: %v\n", err)
return
}
// The create index request is asynchronous, wait for index creation to complete.
_, err = tableRes.WaitForCompletion(client, 60*time.Second, time.Second)
if err != nil {
fmt.Printf("Error finishing CREATE INDEX request: %v\n", err)
return
}
fmt.Println("Created Index acct_episodes on table ", tableName)
return
}
You can create an index on a table using the
tableDDL
method. This method is asynchronous and it returns
a Promise of TableResult
. The TableResult
is a
plain JavaScript object that encapsulates the state of the table. For method
details, see NoSQLClient class.
//creates an index
async function createIndex(handle) {
const crtindDDL = `CREATE INDEX acct_episodes ON ${TABLE_NAME}(acct_data.contentStreamed[].seriesInfo[].episodes[] AS ANYATOMIC)`;
let res = await handle.tableDDL(crtindDDL);
console.log('Index acct_episodes is created on table:' + TABLE_NAME);
}
To create an index on a table use either of the methods
ExecuteTableDDLAsync
or
ExecuteTableDDLWithCompletionAsync
. Both these methods
return Task<TableResult>
. TableResult
instance contains status of DDL operation such as TableState and table schema.
See Oracle NoSQL Dotnet SDK API Reference for more details on
these methods.
// Creates an index on a table
private static async Task createIndex(NoSQLClient client){
var sql =
$@"CREATE INDEX acct_episodes ON {TableName}(acct_data.contentStreamed[].seriesInfo[].episodes[] AS ANYATOMIC)";
var tableResult = await client.ExecuteTableDDLAsync(sql);
// Wait for the operation completion
await tableResult.WaitForCompletionAsync();
Console.WriteLine(" Index acct_episodes is created on table Table {0}",
tableResult.TableName);
}