Indexing Arrays

You can create an index on an array field (or a field of an array nested inside another array).

Be aware that indexing an array potentially results in multiple index entries for each row, which can lead to very large indexes.

To create the index, first create the table:

CREATE TABLE myArrayTable (
    uid INTEGER,
    testArray ARRAY(STRING),
    PRIMARY KEY(uid)
) 

Once the table has been added to the store, create the index. Be sure to use [] with the field name to indicate that it is an array:

CREATE INDEX arrayFieldIndex ON myArrayTable (testArray[])

An array field can be indexed only if the array's elements are of one of the other indexable types.

An index on an array is a multikey index. An index is called a multikey index if multiple index entries can be created for a single row of data in the table. A multikey index has at least one index path that uses [] steps. Any such index path is called a multikey index path.

In a multikey index, for each table row, index entries are created for all the elements of the arrays that are being indexed. If evaluating the index path on a row returns an empty result, the special value EMPTY is used as the index entry. Any duplicate index entries for the row are then eliminated.
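
The entry-generation rules above can be modeled in plain Java. The following is only an illustrative sketch, not how the store implements its indexes: the class name MultikeyIndexSketch, the helper methods, and the "<EMPTY>" placeholder string are all hypothetical, and the sketch assumes that an array with no elements is one case that produces an empty result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class MultikeyIndexSketch {
    // Hypothetical stand-in for the special EMPTY index entry.
    static final String EMPTY = "<EMPTY>";

    // Returns the distinct index entries that one row contributes
    // for its indexed array field.
    static Set<String> entriesFor(List<String> array) {
        // A set drops duplicate elements, mirroring duplicate elimination.
        Set<String> entries = new LinkedHashSet<>(array);
        if (entries.isEmpty()) {
            // Assumption: an array with no elements yields the EMPTY entry.
            entries.add(EMPTY);
        }
        return entries;
    }

    // Builds a toy index that maps each entry to the uids of the rows
    // that produced it.
    static Map<String, List<Integer>> buildIndex(Map<Integer, List<String>> rows) {
        Map<String, List<Integer>> index = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> row : rows.entrySet()) {
            for (String entry : entriesFor(row.getValue())) {
                index.computeIfAbsent(entry, k -> new ArrayList<>()).add(row.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> rows = new TreeMap<>();
        rows.put(1, Arrays.asList("One", "Two", "Three"));
        rows.put(2, Arrays.asList("One", "Three", "One"));
        rows.put(3, Arrays.<String>asList());
        System.out.println(buildIndex(rows));
    }
}
```

In this sketch the row with uid 2 contributes two entries rather than three, because its repeated "One" is collapsed, and the row with uid 3 contributes only the placeholder entry. The total number of entries grows with the number of distinct array elements per row, which is why array indexes can become large.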

To retrieve data using an array index, you first retrieve the index by name, and then create an instance of IndexKey that you will use to perform the index lookup:

Index arrayIndex = myTable.getIndex("arrayFieldIndex");
IndexKey indexKey = arrayIndex.createIndexKey(); 

Next, use the IndexKey.put() method to assign the array field name and the value you want to look up to the IndexKey that you created:

indexKey.put("testArray[]", "One"); 

When you perform the index lookup, the only records returned are those whose array contains at least one item matching the value set for the IndexKey object. For example, if you have individual records that contain arrays like this:

Record 1: ["One", "Two", "Three"]
Record 2: ["Two", "Three", "One"]
Record 3: ["One", "Three", "One"]
Record 4: ["Two", "Three", "Four"] 

and you then perform an array lookup on the array value "One", then Records 1 through 3 are returned, but not Record 4.
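
The matching rule in this example can be checked with a small plain-Java model. This is a sketch of the expected result only, not of the lookup mechanism: it scans the records directly instead of using an index, and the class ArrayLookupSketch and its methods are hypothetical names that are not part of the driver API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ArrayLookupSketch {
    // Returns the numbers of the records whose array contains the
    // lookup value. A record is listed once, however many of its
    // elements match.
    static List<Integer> lookup(Map<Integer, List<String>> records, String value) {
        List<Integer> matches = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> record : records.entrySet()) {
            if (record.getValue().contains(value)) {
                matches.add(record.getKey());
            }
        }
        return matches;
    }

    // The four example records from the text, keyed by record number.
    static Map<Integer, List<String>> sampleRecords() {
        Map<Integer, List<String>> records = new LinkedHashMap<>();
        records.put(1, Arrays.asList("One", "Two", "Three"));
        records.put(2, Arrays.asList("Two", "Three", "One"));
        records.put(3, Arrays.asList("One", "Three", "One"));
        records.put(4, Arrays.asList("Two", "Three", "Four"));
        return records;
    }

    public static void main(String[] args) {
        System.out.println(lookup(sampleRecords(), "One")); // prints [1, 2, 3]
    }
}
```

The position of "One" within the array does not matter, and Record 3 is reported once even though it contains "One" twice.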

After that, you retrieve the matching table rows and iterate over them in the same way you would for any other index type. For example:

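// Look up the rows that match the index key. The two null arguments
// mean no MultiRowOptions and default TableIteratorOptions.
TableIterator<Row> iter = tableH.tableIterator(indexKey, null, null);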
System.out.println("Results for Array value 'One' : ");
try {
    while (iter.hasNext()) {
        Row rowRet = iter.next();
        int uid = rowRet.get("uid").asInteger().get();
        System.out.println("uid: " + uid);
        ArrayValue avRet = rowRet.get("testArray").asArray();
        for (FieldValue fv: avRet.toList()) {
            System.out.println(fv.asString().get());
        }
    }
} finally {
    if (iter != null) {
        iter.close();
    }
}