21 Full-Text Search Queries

You can use Oracle SQL condition json_textcontains in a CASE expression or the WHERE clause of a SELECT statement to perform a full-text search of JSON data. You can use PL/SQL procedure CTX_QUERY.result_set to perform facet search over JSON data.

21.1 Oracle SQL Condition JSON_TEXTCONTAINS

You can use Oracle SQL condition json_textcontains in a CASE expression or the WHERE clause of a SELECT statement to perform a full-text search of JSON data.

Oracle Text technology underlies condition json_textcontains. This condition acts like SQL function contains when the latter uses parameter INPATH. The syntax of the search-pattern argument of json_textcontains is the same as that of SQL function contains. This means, for instance, that you can query for text that is near some other text, or query use fuzzy pattern-matching. If the search-pattern argument contains a character or a word that is reserved with respect to Oracle Text search then you must escape that character or word.

To be able to use condition json_textcontains you must first do one of the following; otherwise, an error is raised when you use json_textcontains. (You cannot do both — an error is raised if you try.)

  • Create a JSON search index for the JSON column.

  • Store the column of JSON data to be queried in the In-Memory Column Store (IM column store), specifying keyword TEXT. The column must of data type JSON; otherwise an error is raised. (JSON type is available only if database initialization parameter compatible is at least 20.)

Note:

Oracle SQL function json_textcontains provides powerful full-text search of JSON data. If you need only simple string pattern-matching then you can instead use a path-expression filter condition with any of these pattern-matching comparisons: has substring, starts with, like, like_regex, or eq_regex.

Example 21-1 shows a full-text query that finds purchase-order documents that contain the keyword Magic in any of the line-item part descriptions.

See Also:

Example 21-1 Full-Text Query of JSON Data with JSON_TEXTCONTAINS

SELECT po_document FROM j_purchaseorder
  WHERE json_textcontains(po_document,
                          '$.LineItems.Part.Description',
                          'Magic');

21.2 JSON Facet Search with PL/SQL Procedure CTX_QUERY.RESULT_SET

If you have created a JSON search index then you can also use PL/SQL procedure CTX_QUERY.result_set to perform facet search over JSON data. This search is optimized to produce various kinds of search hits all at once, rather than, for example, using multiple separate queries with SQL function contains.

To search using procedure CTX_QUERY.result_set you pass it a result set descriptor (RSD), which specifies (as a JSON object with predefined operator fields $query, $search, and $facet) the JSON values you want to find from your indexed JSON data, and how you want them grouped or aggregated. The values you can retrieve and act on are either JSON scalars or JSON arrays of scalars.

(Operator-field $query is also used in SODA query-by-example (QBE) queries. You can use operator $contains in the value of field $query for full-text matching similar to that provided by Oracle SQL condition json_textcontains.)

The RSD fields serve as an ordered template, specifying what to include in the output result set. (In addition to the found JSON data, a result set typically includes a list of search-hit rowids and some counts.)

A $facet field value is a JSON array of facet objects, each of which defines JSON data located at a particular path and perhaps satisfying some conditions, and perhaps an aggregation operation to apply to that data.

You can aggregate facet data using operators $count, $min, $max, $avg, and $sum. For example, $sum returns the sum of the targeted data values. You can apply an aggregation operator to all scalar values targeted by a path, or you can apply it separately to buckets of such values, defined by different ranges of values.

Finally, you can obtain the counts of occurrences of distinct values at a given path, using operator $uniqueCount.

For example, consider this $facet value:

[{"$uniqueCount" : "zebra.name"},
 {"$sum"         : {"path"  : "zebra.price",                   
                    "bucket : [{"$lt"  : 3000},
                               {"$gte" : 3000}]},
 {"$avg"         : "zebra.rating"}]

When query results are returned, the value of field $facet in the output is an array of three objects, with these fields:

  • zebra.name — The number of occurrences of each zebra name.

  • zebra.price — The sum of zebra prices, in two buckets: prices less than 3000 and prices at least 3000.

  • zebra.rating — The average of all zebra ratings. (Zebras with no rating are ignored.)


[{"zebra.name"   : [{"value":"Zigs",
                     "$uniqueCount:2},                  
                    {"value":"Zigzag",
                     "$uniqueCount:1},             
                    {"value":"Storm",
                     "$uniqueCount:1}]},
 {"zebra.price"  : [{"value":1000,
                     "$uniqueCount:2},                  
                    {"value":3000,
                     "$uniqueCount:2},                  
                    {"value":2000,
                     "$uniqueCount:1}]},
 {"zebra.rating" : {"$avg":4.66666666666666666667}}]

See Also:

RESULT_SET in Oracle Text Reference