13 Performing Ubiquitous Search with DBMS_SEARCH APIs

Starting with Oracle Database 23ai, you can use the DBMS_SEARCH PL/SQL package for indexing of multiple schema objects in a single index.

13.1 About Ubiquitous Search Index

Ubiquitous database search enables full-text and range-based search across multiple objects within an entire schema.

Overview

A ubiquitous search index (or DBMS_SEARCH index) is a JSON search index with predefined set of preferences and settings enabled for performing full text search on multiple columns, tables, or views. You use the DBMS_SEARCH PL/SQL package to create, manage, and query these indexes.

You can create a DBMS_SEARCH index on multiple columns of a table or multiple columns from different tables in a given schema. You can add data sources, that is tables and views, into this index (without the need to materialize the views). All the columns in the specified sources are indexed and available for a full-text or range-based search.

Using DBMS_SEARCH APIs, you can perform these tasks:
  • Create index:

    The DBMS_SEARCH.CREATE_INDEX API allows you to create a DBMS_SEARCH index.

    By default, this index is created with key indexing preferences, such as BASIC_WORDLIST to allow wildcard search and SEARCH_ON to allow both the full-text and range-search queries. These indexes are asynchronously maintained in the background at predefined intervals, and thus you do not need to explicitly run the SYNC_INDEX and OPTIMIZE_INDEX operations on such indexes.

  • Manage data sources:

    The DBMS_SEARCH.ADD_SOURCE API allows you to add data sources, such as specific tables or views, from different schemas to this index.

    The DBMS_SEARCH.REMOVE_SOURCE API allows you to remove a table or view and all its associated data from the index.

  • View virtual indexed document:

    The DBMS_SEARCH.GET_DOCUMENT API allows you to view a virtual indexed JSON document, which displays metadata values as indexed for each row of your indexed data sources.

  • Query multiple objects:

    The DBMS_SEARCH.FIND API allows you to retrieve a hitlist of all documents based on the specified filter conditions.

    You can perform full-text or range-based queries within a single data source or across multiple sources using the same index.

Index Tables: DATA and METADATA Columns

A DBMS_SEARCH index stores all supported SQL data types (including Object Type columns) in JSON objects, except for the XMLTYPE and LONG data types. You cannot add a table or view as a data source to the index if it has a column with the XMLTYPE or LONG data type.

The DBMS_SEARCH.CREATE_INDEX procedure creates an index table named INDEX_NAME. This index table contains the following two JSON columns:
  • DATA column:

    A DBMS_SEARCH index, also named INDEX_NAME, is created on the DATA column. Note that the index name matches your index table name.

    This is an empty column, and is a placeholder for querying the DBMS_SEARCH index. You add your data sources into this column. You can run PL/SQL queries on the DATA column using the CONTAINS(), JSON_TEXTCONTAINS(), and JSON_EXISTS operators.

    Note:

    All the data sources (such as table, view, or each table in the view definition) that are added to the DBMS_SEARCH index must include at least one Primary Key column. Each table that is part of a view source having a foreign key must also have the Foreign Key constraint, referencing the relevant primary keys defined on the table.
  • METADATA column:

    The METADATA column helps the DBMS_SEARCH index to uniquely identify each row of the table or view that is indexed.

    The METADATA column contains entries of the form:
    {"OWNER":"TABLE_OWNER",
     "SOURCE":"TABLE_NAME", 
     "KEY":"{PrimaryKey_COLUMN_i":PrimaryKey_VALUE_i}}

    Each JSON key of the METADATA column, that is, OWNER, SOURCE and KEY, is also a separate virtual column in the INDEX_NAME table.

    Note that the INDEX_NAME table is partitioned by OWNER and SOURCE. The OWNER and SOURCE metadata values uniquely identify a particular data source from which the row is picked. When querying a particular data source, you can add a WHERE clause condition on the OWNER and SOURCE virtual columns to restrict your query search to a specific partition of that source using partition pruning.

JSON Data Representation

The INDEX_NAME index stores a JSON representation of the following form for each indexed row of the table or view:
{"OWNER":
  {
   "TABLE_NAME":{"COLUMN1_NAME":"COLUMN1_VALUE",…}
  }
}
The METADATA column of the INDEX_NAME table stores corresponding JSON entries of the following form for each indexed row:
{"OWNER":"TABLE_OWNER",
 "SOURCE":"TABLE_NAME", 
 "KEY":"{PrimaryKey_COLUMN_i":PrimaryKey_VALUE_i}}

For example:

If a user named SCOTT owns the EMPLOYEES table:
EMPLOYEES(EMPL_ID NUMBER PRIMARY KEY, NAME VARCHAR2(200), ADDRESS JSON)
and the EMPLOYEES table includes the following row:
(1,'Jones Smith','{"STREET":"Oracle Parkway","CITY":"Redwood City"}')
then, that row is indexed as follows:
{"SCOTT":"EMPLOYEES": 
  { 
   "EMPL_ID":1, 
   "NAME":"Jones Smith", 
   "ADDRESS":{
              "STREET":"Oracle Parkway", 
              "CITY":"Redwood City"
             }
  }
}
The METADATA column of the INDEX_NAME table stores corresponding JSON entries for the row as:
{"OWNER":"SCOTT","SOURCE":"EMPLOYEES","KEY":{"ID": 1}}

where, OWNER specifies the table owner, SOURCE specifies the table name, and KEY specifies the primary key.

Related Topics

13.2 Example: Searching Multiple Objects using Ubiquitous Search Index

This example lists the steps involved in a simple ubiquitous database search scenario.

Assume that as SCOTT, you want to build a text query application that enables users to query documents across the PRODUCTS and CUSTOMERS tables. Instead of creating multiple indexes or manually defining the USER_DATASTORE and MULTI_COLUMN_DATASTORE procedures along with materialized views, you can use the DBMS_SEARCH PL/SQL package to automatically synthesize documents from both the tables into a ubiquitous search index. You can then perform a full-text and range-based search across the PRODUCTS and CUSTOMERS tables using the same index.

This involves the following steps:

  1. Create the PRODUCTS and CUSTOMERS tables:
    CREATE TABLE PRODUCTS(
        id                NUMBER  primary key,
        price             NUMBER,
        description       VARCHAR2(2000),
        in_stock          BOOLEAN,
        long_description  CLOB);
    
    CREATE TABLE CUSTOMERS(
        id                NUMBER  primary key,
        firstname         VARCHAR2(200),
        lastname          VARCHAR2(200),
        middlename        VARCHAR2(200),
        address           JSON,
        added             DATE);
  2. Populate the tables with data:
    INSERT INTO PRODUCTS values 
        (1, 10, 'simple widget', FALSE, 'a low-cost, high-quality widget');
    
    INSERT INTO PRODUCTS values 
        (2, 2000, 'shiny thing', TRUE, 'a useful product, but very expensive');
    
    INSERT INTO CUSTOMERS values 
        (1, 'Robert', 'Smith', 'Englebert Toddwangle', '{ 
            "street": "1230 East 32nd Street", 
            "city": "Frogmortem", 
            "state": "CA", 
            "country": "USA", 
            "zip": 12352}', 
        SYSDATE);
    
    INSERT INTO CUSTOMERS values 
        (99, 'John', 'Doe', NULL, '{
            "street": "123 High Street", 
            "city": "Richmond", 
            "county": "Greater London", 
            "country": "United Kingdom",
            "postcode": "AB12 6DU"}', 
        SYSDATE);
  3. Create a ubiquitous search index named MYINDEX using the DBMS_SEARCH.CREATE_INDEX procedure:
    exec DBMS_SEARCH.CREATE_INDEX('MYINDEX');
  4. Add the PRODUCTS and CUSTOMERS tables as data sources into your index using the DBMS_SEARCH.ADD_SOURCE procedure:
    exec DBMS_SEARCH.ADD_SOURCE('MYINDEX','PRODUCTS');
    exec DBMS_SEARCH.ADD_SOURCE('MYINDEX','CUSTOMERS');

    This procedure combines contents from all columns of the PRODUCTS and CUSTOMERS tables into a single index table named MYINDEX (which matches your index name).

    The MYINDEX table's METADATA column stores a JSON representation of the following form for each indexed row:
    METADATA
    --------------------------------------------------------------------------------
    {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":1}}
    {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":2}}
    {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":1}}
    {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":99}}
  5. View the virtual indexed document using the DBMS_SEARCH.GET_DOCUMENT procedure:
    SELECT DBMS_SEARCH.GET_DOCUMENT('MYINDEX',METADATA) from MYINDEX;
    The output returns a JSON document with combined metadata values, as indexed in the MYINDEX index for each row of the PRODUCTS and CUSTOMERS tables:
    DBMS_SEARCH.GET_DOCUMENT('MYINDEX',METADATA)
    --------------------------------------------------------------------------------
    {
      "SCOTT" :
      {
        "PRODUCTS" :
        {
          "ID" : 1,
          "PRICE" : 10,
          "DESCRIPTION" : "simple widget",
          "IN_STOCK" : false,
          "LONG_DESCRIPTION" : "a low-cost, high-quality widget"
        }
      }
    }
    
    {
      "SCOTT" :
      {
        "PRODUCTS" :
        {
          "ID" : 2,
          "PRICE" : 2000,
          "DESCRIPTION" : "shiny thing",
          "IN_STOCK" : true,
          "LONG_DESCRIPTION" : "a useful product, but very expensive"
        }
      }
    }
    
    {
      "SCOTT" :
      {
        "CUSTOMERS" :
        {
          "ID" : 1,
          "FIRSTNAME" : "Robert",
          "LASTNAME" : "Smith",
          "MIDDLENAME" : "Englebert Toddwangle",
          "ADDRESS" :
          {
    	"street" : "1230 East 32nd Street",
    	"city" : "Frogmortem",
    	"state" : "CA",
    	"country" : "USA",
    	"zip" : 12352
          },
          "ADDED" : "2023-08-08T15:29:39"
        }
      }
    }
    
    {
      "SCOTT" :
      {
        "CUSTOMERS" :
        {
          "ID" : 99,
          "FIRSTNAME" : "John",
          "LASTNAME" : "Doe",
          "MIDDLENAME" : null,
          "ADDRESS" :
          {
    	"street" : "123 High Street",
    	"city" : "Richmond",
    	"county" : "Greater London",
    	"country" : "United Kingdom",
    	"postcode" : "AB12 6DU"
          },
          "ADDED" : "2023-08-08T15:29:39"
        }
      }
    }
  6. Run queries against your combined index, as follows:

    Using CONTAINS:

    To search all fields, in the SELECT statement, specify the query in the WHERE clause with the CONTAINS operator.
    • Search a specific source (uses partition pruning):
      SELECT METADATA from MYINDEX  
          WHERE CONTAINS(data,'shiny or street')>0
      And OWNER = 'SCOTT' and SOURCE = 'PRODUCTS';
      Output:
      METADATA
      --------------------------------------------------------------------------------
      {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":2}}
    • Search across sources:
      SELECT METADATA from MYINDEX  
          WHERE CONTAINS(data,'shiny or street')>0;
      
      Output:
      METADATA
      --------------------------------------------------------------------------------
      {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":2}}
      {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":1}}
      {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":99}}
    • Use the SCORE operator with CONTAINS to obtain a relevance score:
      SELECT METADATA, score(1) as search_score from MYINDEX  
          WHERE CONTAINS(data,'shiny or street',1)>0;
      Output:
      METADATA
      --------------------------------------------------------------------------------
      SEARCH_SCORE
      ------------
      {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":2}}   4
      
      {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":1}}  3
      
      {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":99}} 3

    Using JSON_TEXTCONTAINS:

    To search for JSON data, in the SELECT statement, specify the query in the WHERE clause with the JSON_TEXTCONTAINS operator. Specify JSON path expression to indicate the fields where you want to search for the specified text.
    • Search a specific source:
      SELECT METADATA from MYINDEX
          WHERE JSON_TEXTCONTAINS(
              data,'$.SCOTT.CUSTOMERS.ADDRESS.county','london');
      Output:
      METADATA
      --------------------------------------------------------------------------------
      {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":99}}
      
    • Use JSON dot-notation to fetch individual primary key value:
      SELECT p.METADATA.KEY from MYINDEX p
          WHERE JSON_TEXTCONTAINS(
              p.data,'$.SCOTT.CUSTOMERS.ADDRESS.county','london');
      Output:
      KEY
      --------------------------------------------------------------------------------
      {"ID":99}
      
    • Use JOIN to fetch multiple primary key values:

      This query joins rows of the CUSTOMERS table according to the equijoin condition:
      SELECT c.firstname, c.lastname from MYINDEX p, CUSTOMERS c
          WHERE JSON_TEXTCONTAINS(
              p.data,'$.SCOTT.CUSTOMERS.ADDRESS.county','london')
              and
              c.id = p.METADATA.KEY.ID;
      
      Output:
      FIRSTNAME
      --------------------------------------------------------------------------------
      LASTNAME
      --------------------------------------------------------------------------------
      John
      Doe
      

    Using JSON_EXISTS:

    To search for a particular value within JSON data, in the SELECT statement, specify the query in the WHERE clause with the JSON_EXISTS operator.
    • Use JSON_EXISTS on number type:
      SELECT METADATA from MYINDEX
      Where json_exists(data,'$.SCOTT.PRODUCTS?(@.PRICE <= 100)');
      Output:
      METADATA
      --------------------------------------------------------------------------------
      {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":1}}
      
    • Use JSON_EXISTS on string type:
      SELECT METADATA from MYINDEX
      Where json_exists(data,'$.SCOTT.CUSTOMERS?(@.FIRSTNAME == "John")');
      Output:
      METADATA
      --------------------------------------------------------------------------------
      {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":99}}