3 Overview of SODA Indexing

The performance of SODA QBEs can sometimes be improved by using indexes. You define a SODA index with an index specification, which is a JSON object that specifies how particular QBE patterns are to be indexed for quicker matching.

Suppose that you often use a query such as {"dateField" : {"$date" : DATE-STRING}}, where DATE-STRING is a string in an ISO 8601 format supported by SODA. Here, item method $date transforms DATE-STRING to a SQL value of data type DATE. You can typically improve the performance of queries on a field such as "dateField" by creating a B-tree index for it.

Or suppose that you want to query spatial data in a GeoJSON geometry object. You can improve the performance of such queries by creating a SODA spatial index for that data.

Or suppose that you want to be able to perform full-text queries using QBE operator $contains. You can enable such queries by creating a JSON search index for your data.

Or suppose that you want to perform metadata queries on a JSON data guide, which is a summary of the structural and type information about a set of JSON documents. You can create a JSON search index that holds and automatically updates such data-guide information.

In all such cases you specify the index you want by creating a SODA index specification and then using it to create the specified index.

Each SODA implementation that supports indexing provides a way to create an index. They all use a SODA index specification to define the index to be created. For example, with SODA for REST you use an HTTP POST request, passing URI argument action=index, and providing the index specification in the POST body.

Note:

  • To create a B-tree index you need Oracle Database Release 12c (12.2.0.1) or later.

    To create a B-tree index that indexes a DATE or a TIMESTAMP value you need Oracle Database Release 18c (18.1) or later.

  • To create a spatial index you need Oracle Database Release 12c (12.2.0.1) or later.

  • To create a search index you need Oracle Database Release 12c (12.2.0.1) or later.

Example 3-1 Specifying a B-Tree Index

This example specifies a B-tree non-unique index for numeric field address.zip.

{ "name"   : "ZIPCODE_IDX",
  "fields" : [ { "path"     : "address.zip",
                 "datatype" : "number",
                 "order"    : "asc" } ] }

This indexes the field at path address.zip in Example 2-1 and Example 2-2.

Example 2-3 has no such field, so that document is skipped during indexing.

You can specify that the index requires that all indexed fields be scalar by including scalarRequired : true in the index specification:

{ "name"   : "ZIPCODE_IDX",
  "fields" : [ { "path"     : "address.zip",
                 "datatype" : "number",
                 "order"    : "asc" } ],
  "scalarRequired" : true }

If a specification includes scalarRequired : true, and if the collection contains a document that is missing one or more of the specified fields (just address.zip in this case), or if any of them has a non-scalar value, then an error is raised when creating the index. In addition, if such an index exists when you try to write a document that lacks that one of the indexed fields then an error is raised for the write operation.

Regardless of the value of scalarRequired, an error is raised if you try to write a document that has the targeted field but with a value that is not convertible to the specified data type. For example, for the index defined in Example 3-1, if a document contains field address.zip, but the field value is not convertible to a number, then an error is raised. This would be the case, for instance, for a zip field whose value is an object.

Example 3-2 Specifying a Spatial Index

This example specifies an Oracle Spatial and Graph index named LOCATION_LONG_LAT_IDX, which indexes the GeoJSON geometry object that is the value of field location in your documents:

{ "name"    : "LOCATION_LONG_LAT_IDX",
  "spatial" : "location" }

This index specification applies to all documents that have a location field whose value is a GeoJSON geometry object, and only to such documents. Here’s an example of an object with such a location field, whose value is a geometry object of type Point:

{ "location" : { "type" : "Point",
                 "coordinates" : [ 33.7243, 118.1579 ] } }

That location value is indexed, because its value is a GeoJSON geometry object.

Because neither scalarRequired : true nor lax : true is specified in the index specification, a document that has no location field is silently skipped (not indexed) during indexing.

And if the collection that is queried has a document with an object such as one of the following, whose location values are not GeoJSON geometry objects, then an error is raised during indexing.

{ "location" : [ 33.7243, 118.1579 ] }
{ "location" : "1600 Pennsylvania Ave NW, Washington, DC 20500" }

In addition, if such an index exists, and you try to write a document that has location field with such a non-geometry value, then an error is raised for the write operation.

You can specify that the index requires that all indexed fields be scalar by including scalarRequired : true in the index specification:

{ "name"    : "LOCATION_LONG_LAT_IDX",
  "spatial" : "location",
  "scalarRequired" : true }

With scalarRequired : true, if the collection contains a document that has no location field, then an error is raised when creating the index. In addition, if such an index exists, and you try to write a document that lacks the indexed field (location), then an error is raised for the write operation. (An error is still also raised, for index creation or a write operation, for a location field whose value is not a geometry object.)

Alternatively you can specify that the index does not require indexed fields to be present and have GeoJSON geometry values by including lax : true in the index specification:

{ "name"    : "LOCATION_LONG_LAT_IDX",
  "spatial" : "location",
  "lax" : true }

With lax : true, no error is raised for a document that lacks a location field or for a document with a location field value (such as {"location" : [33.7243, 118.1579]}) that is not a GeoJSON geometry object. The index simply ignores such documents.

Note:

If you have created a SODA spatial index for a field whose value is a GeoJSON geometry object, and if you use a QBE that targets that field, then the index can be picked up for the QBE only if both index and QBE specify the same error-handling behavior for that field. Both must specify the same one of these:

  • scalarRequired : true

  • lax : true

  • Neither scalarRequired : true nor lax : true

Example 3-3 Specifying a JSON Search Index

This example specifies a JSON search index. The index does both of these things:

  • Enables you to perform ad hoc full-word and full-number queries on your JSON documents.

  • Automatically accumulates and updates aggregate structural and type information about your JSON documents: a JSON data guide.

{ "name" : "SEARCH_AND_DATA_GUIDE_IDX" }

This index specification is equivalent. It just makes explicit the default values.

{ "name"      : "SEARCH_AND_DATA_GUIDE_IDX",
  "dataguide" : "on",
  "search_on" : "text_value" }

(To specify a search index without data-guide support, just set field dataguide to "off".)

See Also: