Creating Hybrid Vector Indexes using DBMS_SEARCH API
The DBMS_SEARCH
package provides procedures and functions
for creating, managing, and querying hybrid vector indexes.
The CREATE_INDEX procedure can be used to create a hybrid vector index.
DBMS_SEARCH.CREATE_INDEX(
index_name VARCHAR2,
tablespace VARCHAR2 DEFAULT NULL,
datatype VARCHAR2 DEFAULT NULL,
lexer VARCHAR2 DEFAULT NULL,
stoplist VARCHAR2 DEFAULT NULL,
wordlist VARCHAR2 DEFAULT NULL,
vectorizer VARCHAR2 DEFAULT NULL
);
The syntax contains parameters that can define both text as well as vector
search preferences. The parameters lexer
, stoplist
,
and wordlist
allow you to specify the text preferences for the hybrid
vector index and the parameter vectorizer
can be used to specify the
vector preference for the hybrid search. When using the vectorizer
parameter, you must first use the DBMS_VECTOR_CHAIN.CREATE_PREFERENCE
PL/SQL function to create a vectorizer preference. For more information, see CREATE_PREFERENCE. After creating a vectorizer
preference, you can use the vectorizer parameter to pass the preference name. You can
define the lexer
, stoplist
and
wordlist
using CREATE_PREFERENCE and CREATE_STOPLIST functions.
Example:
exec CTX_DDL.CREATE_PREFERENCE('my_lexer','BASIC_LEXER');
exec CTX_DDL.CREATE_STOPLIST('my_stoplist','BASIC_STOPLIST');
exec CTX_DDL.CREATE_PREFERENCE('my_wordlist','BASIC_WORDLIST');
begin
DBMS_VECTOR_CHAIN.CREATE_PREFERENCE(
PREF_NAME => 'my_vectorizer_spec',
PREF_TYPE => DBMS_VECTOR_CHAIN.VECTORIZER,
PARAMS => json(
'{"vector_idxtype" : "ivf",
"model" : "ALL_MINILM_L12",
"by" : "words",
"max" : "100",
"overlap" : "10",
"split" : "recursively",
"language" : "english"
}'));
end;
/
begin
DBMS_SEARCH.CREATE_INDEX(
index_name => 'my_hybrid_idx',
datatype => 'JSON',
lexer => 'my_lexer',
stoplist => 'my_stoplist',
wordlist => 'my_wordlist',
vectorizer => 'my_vectorizer_spec');
end;
/
The DBMS_SEARCH
package also contains a procedure, ADD_SOURCE, that allows you to add one or more data sources
(tables, views, or duality views) from different schemas to the
DBMS_SEARCH
index.
DBMS_SEARCH.ADD_SOURCE (
index_name VARCHAR2,
source_name VARCHAR2,
memory VARCHAR2 DEFAULT NULL,
parallel_degree NUMBER DEFAULT NULL);
exec DBMS_SEARCH.ADD_SOURCE('my_hybrid_idx','MYVIEW');
The DBMS_SEARCH
package also contains procedures that allow
you to remove a data source and all its associated data from the index, remove the index
and all its associated data from the database, return a virtual indexed JSON document
for the specified source metadata, retrieve a hitlist, and produce an aggregation of
JSON documents based on the specified filter conditions. Refer to DBMS_SEARCH Package for more details about the procedure,
syntax, usage, restrictions, and so on.
Parent topic: Manage Hybrid Vector Indexes