Creating Hybrid Vector Indexes using DBMS_SEARCH API

The DBMS_SEARCH package provides procedures and functions for creating, managing, and querying hybrid vector indexes.

The CREATE_INDEX procedure can be used to create a hybrid vector index.

Syntax:
DBMS_SEARCH.CREATE_INDEX(
    index_name       VARCHAR2,
    tablespace       VARCHAR2 DEFAULT NULL,
    datatype         VARCHAR2 DEFAULT NULL,
    lexer            VARCHAR2 DEFAULT NULL,
    stoplist         VARCHAR2 DEFAULT NULL,
    wordlist         VARCHAR2 DEFAULT NULL,
    vectorizer       VARCHAR2 DEFAULT NULL
);

The syntax contains parameters that can define both text as well as vector search preferences. The parameters lexer, stoplist, and wordlist allow you to specify the text preferences for the hybrid vector index and the parameter vectorizer can be used to specify the vector preference for the hybrid search. When using the vectorizer parameter, you must first use the DBMS_VECTOR_CHAIN.CREATE_PREFERENCE PL/SQL function to create a vectorizer preference. For more information, see CREATE_PREFERENCE. After creating a vectorizer preference, you can use the vectorizer parameter to pass the preference name. You can define the lexer, stoplist and wordlist using CREATE_PREFERENCE and CREATE_STOPLIST functions.

Example:

The following example also displays the sample code to define lexer, stoplist, wordlist and vectorizer.

 exec CTX_DDL.CREATE_PREFERENCE('my_lexer','BASIC_LEXER');
 exec CTX_DDL.CREATE_STOPLIST('my_stoplist','BASIC_STOPLIST');
 exec CTX_DDL.CREATE_PREFERENCE('my_wordlist','BASIC_WORDLIST');

begin
    DBMS_VECTOR_CHAIN.CREATE_PREFERENCE(
       PREF_NAME => 'my_vectorizer_spec',
       PREF_TYPE =>  DBMS_VECTOR_CHAIN.VECTORIZER,
       PARAMS    => json(
                         '{"vector_idxtype" : "ivf",
                           "model"          : "ALL_MINILM_L12",
                           "by"             : "words",
                           "max"            : "100",
                           "overlap"        : "10",
                           "split"          : "recursively",
                           "language"       : "english"
                          }'));
 end;
 /

 begin
    DBMS_SEARCH.CREATE_INDEX(
       index_name => 'my_hybrid_idx',
       datatype   => 'JSON',
       lexer      => 'my_lexer',
       stoplist   => 'my_stoplist',
       wordlist   => 'my_wordlist',
       vectorizer => 'my_vectorizer_spec');
 end;
/

The DBMS_SEARCH package also contains a procedure, ADD_SOURCE, that allows you to add one or more data sources (tables, views, or duality views) from different schemas to the DBMS_SEARCH index.

Syntax:
DBMS_SEARCH.ADD_SOURCE (
index_name      VARCHAR2,
source_name     VARCHAR2,
memory          VARCHAR2 DEFAULT NULL,
parallel_degree NUMBER DEFAULT NULL);
Example:
exec DBMS_SEARCH.ADD_SOURCE('my_hybrid_idx','MYVIEW');

The DBMS_SEARCH package also contains procedures that allow you to remove a data source and all its associated data from the index, remove the index and all its associated data from the database, return a virtual indexed JSON document for the specified source metadata, retrieve a hitlist, and produce an aggregation of JSON documents based on the specified filter conditions. Refer to DBMS_SEARCH Package for more details about the procedure, syntax, usage, restrictions, and so on.