13 Performing Ubiquitous Search with DBMS_SEARCH APIs
Starting with Oracle Database 23ai, you can use the DBMS_SEARCH PL/SQL
package for indexing of multiple schema objects in a single index.
13.1 About Ubiquitous Search Index
Ubiquitous database search enables full-text and range-based search across multiple objects within an entire schema.
Overview
A ubiquitous search index (or DBMS_SEARCH
index) is a JSON search index with predefined set of preferences and settings enabled for performing full text search on multiple columns, tables, or views. You use the DBMS_SEARCH PL/SQL
package to create, manage, and query these indexes.
You can create a DBMS_SEARCH
index on multiple columns of a table or multiple columns from different tables in a given schema. You can add data sources, that is tables and views, into this index (without the need to materialize the views). All the columns in the specified sources are indexed and available for a full-text or range-based search.
DBMS_SEARCH
APIs, you can perform these tasks:
-
Create index:
The
DBMS_SEARCH.CREATE_INDEX
API allows you to create aDBMS_SEARCH
index.By default, this index is created with key indexing preferences, such as
BASIC_WORDLIST
to allow wildcard search andSEARCH_ON
to allow both the full-text and range-search queries. These indexes are asynchronously maintained in the background at predefined intervals, and thus you do not need to explicitly run theSYNC_INDEX
andOPTIMIZE_INDEX
operations on such indexes. -
Manage data sources:
The
DBMS_SEARCH.ADD_SOURCE
API allows you to add data sources, such as specific tables or views, from different schemas to this index.The
DBMS_SEARCH.REMOVE_SOURCE
API allows you to remove a table or view and all its associated data from the index. -
View virtual indexed document:
The
DBMS_SEARCH.GET_DOCUMENT
API allows you to view a virtual indexed JSON document, which displays metadata values as indexed for each row of your indexed data sources. -
Query multiple objects:
The
DBMS_SEARCH.FIND
API allows you to retrieve a hitlist of all documents based on the specified filter conditions.You can perform full-text or range-based queries within a single data source or across multiple sources using the same index.
Index Tables: DATA and METADATA Columns
A DBMS_SEARCH
index stores all supported SQL data types (including Object Type
columns) in JSON objects, except for the XMLTYPE
and LONG
data types. You cannot add a table or view as a data source to the index if it has a column with the XMLTYPE
or LONG
data type.
DBMS_SEARCH.CREATE_INDEX
procedure creates an index table named INDEX_NAME
. This index table contains the following two JSON columns:
-
DATA column:
A
DBMS_SEARCH
index, also namedINDEX_NAME
, is created on theDATA
column. Note that the index name matches your index table name.This is an empty column, and is a placeholder for querying the
DBMS_SEARCH
index. You add your data sources into this column. You can run PL/SQL queries on theDATA
column using theCONTAINS()
,JSON_TEXTCONTAINS()
, andJSON_EXISTS
operators.Note:
All the data sources (such as table, view, or each table in the view definition) that are added to theDBMS_SEARCH
index must include at least onePrimary Key
column. Each table that is part of a view source having a foreign key must also have theForeign Key
constraint, referencing the relevant primary keys defined on the table. -
METADATA column:
The
METADATA
column helps theDBMS_SEARCH
index to uniquely identify each row of the table or view that is indexed.TheMETADATA
column contains entries of the form:{"OWNER":"TABLE_OWNER", "SOURCE":"TABLE_NAME", "KEY":"{PrimaryKey_COLUMN_i":PrimaryKey_VALUE_i}}
Each JSON key of the
METADATA
column, that is,OWNER
,SOURCE
andKEY
, is also a separate virtual column in theINDEX_NAME
table.Note that the
INDEX_NAME
table is partitioned byOWNER
andSOURCE
. TheOWNER
andSOURCE
metadata values uniquely identify a particular data source from which the row is picked. When querying a particular data source, you can add aWHERE
clause condition on theOWNER
andSOURCE
virtual columns to restrict your query search to a specific partition of that source using partition pruning.
JSON Data Representation
INDEX_NAME
index stores a JSON representation of the following form for each indexed row of the table or view:{"OWNER":
{
"TABLE_NAME":{"COLUMN1_NAME":"COLUMN1_VALUE",…}
}
}
METADATA
column of the INDEX_NAME
table stores corresponding JSON entries of the following form for each indexed row:{"OWNER":"TABLE_OWNER",
"SOURCE":"TABLE_NAME",
"KEY":"{PrimaryKey_COLUMN_i":PrimaryKey_VALUE_i}}
For example:
SCOTT
owns the EMPLOYEES
table:EMPLOYEES(EMPL_ID NUMBER PRIMARY KEY, NAME VARCHAR2(200), ADDRESS JSON)
EMPLOYEES
table includes the following row:(1,'Jones Smith','{"STREET":"Oracle Parkway","CITY":"Redwood City"}')
{"SCOTT":"EMPLOYEES":
{
"EMPL_ID":1,
"NAME":"Jones Smith",
"ADDRESS":{
"STREET":"Oracle Parkway",
"CITY":"Redwood City"
}
}
}
METADATA
column of the INDEX_NAME
table stores corresponding JSON entries for the row as:{"OWNER":"SCOTT","SOURCE":"EMPLOYEES","KEY":{"ID": 1}}
where, OWNER
specifies the table owner, SOURCE
specifies the table name, and KEY
specifies the primary key.
Related Topics
13.2 Example: Searching Multiple Objects using Ubiquitous Search Index
This example lists the steps involved in a simple ubiquitous database search scenario.
Assume that as SCOTT
, you want to build a text query application that enables users to query documents across the PRODUCTS
and CUSTOMERS
tables. Instead of creating multiple indexes or manually defining the USER_DATASTORE
and MULTI_COLUMN_DATASTORE
procedures along with materialized views, you can use the DBMS_SEARCH PL/SQL
package to automatically synthesize documents from both the tables into a ubiquitous search index. You can then perform a full-text and range-based search across the PRODUCTS
and CUSTOMERS
tables using the same index.
This involves the following steps:
-
Create the
PRODUCTS
andCUSTOMERS
tables:CREATE TABLE PRODUCTS( id NUMBER primary key, price NUMBER, description VARCHAR2(2000), in_stock BOOLEAN, long_description CLOB); CREATE TABLE CUSTOMERS( id NUMBER primary key, firstname VARCHAR2(200), lastname VARCHAR2(200), middlename VARCHAR2(200), address JSON, added DATE);
-
Populate the tables with data:
INSERT INTO PRODUCTS values (1, 10, 'simple widget', FALSE, 'a low-cost, high-quality widget'); INSERT INTO PRODUCTS values (2, 2000, 'shiny thing', TRUE, 'a useful product, but very expensive'); INSERT INTO CUSTOMERS values (1, 'Robert', 'Smith', 'Englebert Toddwangle', '{ "street": "1230 East 32nd Street", "city": "Frogmortem", "state": "CA", "country": "USA", "zip": 12352}', SYSDATE); INSERT INTO CUSTOMERS values (99, 'John', 'Doe', NULL, '{ "street": "123 High Street", "city": "Richmond", "county": "Greater London", "country": "United Kingdom", "postcode": "AB12 6DU"}', SYSDATE);
-
Create a ubiquitous search index named
MYINDEX
using theDBMS_SEARCH.CREATE_INDEX
procedure:exec DBMS_SEARCH.CREATE_INDEX('MYINDEX');
-
Add the
PRODUCTS
andCUSTOMERS
tables as data sources into your index using theDBMS_SEARCH.ADD_SOURCE
procedure:exec DBMS_SEARCH.ADD_SOURCE('MYINDEX','PRODUCTS'); exec DBMS_SEARCH.ADD_SOURCE('MYINDEX','CUSTOMERS');
This procedure combines contents from all columns of the
PRODUCTS
andCUSTOMERS
tables into a single index table namedMYINDEX
(which matches your index name).TheMYINDEX
table'sMETADATA
column stores a JSON representation of the following form for each indexed row:METADATA -------------------------------------------------------------------------------- {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":1}} {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":2}} {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":1}} {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":99}}
-
View the virtual indexed document using the
DBMS_SEARCH.GET_DOCUMENT
procedure:SELECT DBMS_SEARCH.GET_DOCUMENT('MYINDEX',METADATA) from MYINDEX;
The output returns a JSON document with combined metadata values, as indexed in theMYINDEX
index for each row of thePRODUCTS
andCUSTOMERS
tables:DBMS_SEARCH.GET_DOCUMENT('MYINDEX',METADATA) -------------------------------------------------------------------------------- { "SCOTT" : { "PRODUCTS" : { "ID" : 1, "PRICE" : 10, "DESCRIPTION" : "simple widget", "IN_STOCK" : false, "LONG_DESCRIPTION" : "a low-cost, high-quality widget" } } } { "SCOTT" : { "PRODUCTS" : { "ID" : 2, "PRICE" : 2000, "DESCRIPTION" : "shiny thing", "IN_STOCK" : true, "LONG_DESCRIPTION" : "a useful product, but very expensive" } } } { "SCOTT" : { "CUSTOMERS" : { "ID" : 1, "FIRSTNAME" : "Robert", "LASTNAME" : "Smith", "MIDDLENAME" : "Englebert Toddwangle", "ADDRESS" : { "street" : "1230 East 32nd Street", "city" : "Frogmortem", "state" : "CA", "country" : "USA", "zip" : 12352 }, "ADDED" : "2023-08-08T15:29:39" } } } { "SCOTT" : { "CUSTOMERS" : { "ID" : 99, "FIRSTNAME" : "John", "LASTNAME" : "Doe", "MIDDLENAME" : null, "ADDRESS" : { "street" : "123 High Street", "city" : "Richmond", "county" : "Greater London", "country" : "United Kingdom", "postcode" : "AB12 6DU" }, "ADDED" : "2023-08-08T15:29:39" } } }
-
Run queries against your combined index, as follows:
Using CONTAINS:
To search all fields, in theSELECT
statement, specify the query in theWHERE
clause with theCONTAINS
operator.-
Search a specific source (uses partition pruning):
SELECT METADATA from MYINDEX WHERE CONTAINS(data,'shiny or street')>0 And OWNER = 'SCOTT' and SOURCE = 'PRODUCTS';
Output:METADATA -------------------------------------------------------------------------------- {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":2}}
-
Search across sources:
SELECT METADATA from MYINDEX WHERE CONTAINS(data,'shiny or street')>0;
Output:METADATA -------------------------------------------------------------------------------- {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":2}} {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":1}} {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":99}}
-
Use the
SCORE
operator withCONTAINS
to obtain a relevance score:SELECT METADATA, score(1) as search_score from MYINDEX WHERE CONTAINS(data,'shiny or street',1)>0;
Output:METADATA -------------------------------------------------------------------------------- SEARCH_SCORE ------------ {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":2}} 4 {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":1}} 3 {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":99}} 3
Using JSON_TEXTCONTAINS:
To search for JSON data, in theSELECT
statement, specify the query in theWHERE
clause with theJSON_TEXTCONTAINS
operator. Specify JSON path expression to indicate the fields where you want to search for the specified text.-
Search a specific source:
SELECT METADATA from MYINDEX WHERE JSON_TEXTCONTAINS( data,'$.SCOTT.CUSTOMERS.ADDRESS.county','london');
Output:METADATA -------------------------------------------------------------------------------- {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":99}}
-
Use JSON dot-notation to fetch individual primary key value:
SELECT p.METADATA.KEY from MYINDEX p WHERE JSON_TEXTCONTAINS( p.data,'$.SCOTT.CUSTOMERS.ADDRESS.county','london');
Output:KEY -------------------------------------------------------------------------------- {"ID":99}
-
Use
JOIN
to fetch multiple primary key values:This query joins rows of theCUSTOMERS
table according to theequijoin
condition:SELECT c.firstname, c.lastname from MYINDEX p, CUSTOMERS c WHERE JSON_TEXTCONTAINS( p.data,'$.SCOTT.CUSTOMERS.ADDRESS.county','london') and c.id = p.METADATA.KEY.ID;
Output:FIRSTNAME -------------------------------------------------------------------------------- LASTNAME -------------------------------------------------------------------------------- John Doe
Using JSON_EXISTS:
To search for a particular value within JSON data, in theSELECT
statement, specify the query in theWHERE
clause with theJSON_EXISTS
operator.-
Use
JSON_EXISTS
onnumber
type:SELECT METADATA from MYINDEX Where json_exists(data,'$.SCOTT.PRODUCTS?(@.PRICE <= 100)');
Output:METADATA -------------------------------------------------------------------------------- {"OWNER":"SCOTT","SOURCE":"PRODUCTS","KEY":{"ID":1}}
-
Use
JSON_EXISTS
onstring
type:SELECT METADATA from MYINDEX Where json_exists(data,'$.SCOTT.CUSTOMERS?(@.FIRSTNAME == "John")');
Output:METADATA -------------------------------------------------------------------------------- {"OWNER":"SCOTT","SOURCE":"CUSTOMERS","KEY":{"ID":99}}
-
Related Topics