SEARCH
Use the DBMS_HYBRID_VECTOR.SEARCH
PL/SQL function to run textual queries, vector similarity queries, or hybrid queries against hybrid vector indexes.
Purpose
To search by vectors and keywords. This function lets you perform the following tasks:
-
Facilitate a combined (hybrid) query of textual documents and vectorized chunks:
You can query a hybrid vector index in multiple vector and keyword search combinations called search modes, as described in Understand Hybrid Search. This API accepts a JSON specification for all query parameters.
-
Fuse and reorder the search results:
The search results of a hybrid query are fused into a unified result set as
CLOB
using the specified fusion set operator, and reordered by a combined score using the specified scoring algorithm. -
Run a default query for a simplified search experience:
The minimum input parameters required are the
hybrid_index_name
andsearch_text
. The same text string is used to query against a vectorized chunk index and a text document index.
Syntax
DBMS_HYBRID_VECTOR.SEARCH(
json(
'{ "hybrid_index_name": "<hybrid_vector_index_name>",
"search_text" : "<query string for keyword-and-semantic search>",
"search_fusion" : "INTERSECT | UNION | TEXT_ONLY | VECTOR_ONLY | MINUS_TEXT | MINUS_VECTOR | RERANK",
"search_scorer" : "RRF | RSF",
"vector":
{
"search_text" : "<query string for semantic search>",
"search_vector" : "[<vector_embedding>]",
"search_mode" : "DOCUMENT | CHUNK",
"aggregator" : "MAX | AVG | MEDIAN | BONUSMAX | WINAVG | ADJBOOST | MAXAVGMED",
"result_max" : <maximum number of vector results>,
"score_weight" : <score_weight>,
"rank_penalty" : <rank_penalty>,
"inpath" : ["<valid JSON paths>"]
},
"text":
{
"contains" : "<query string for keyword search>",
"search_text" : "<alternative text to use to construct a contains query automatically>",
"score_weight" : <score_weight>,
"rank_penalty" : <rank_penalty>,
"result_max: : <maximum number of document results>
"inpath" : ["<valid JSON paths>"]
},
"return":
{
"topN" : "<topN_value>",
"values" : ["rowid | score | vector_score | text_score | vector_rank | text_rank | chunk_text | chunk_id | paths"],
"format" : "JSON | XML"
}
}'
)
)
Note:
This API supports two constructs of search. One where you specify a singlesearch_text
field for both semantic search and keyword search (default setting). Another where you specify separate search_text
and contains
query fields using vector
and text
sub-elements for semantic search and keyword search, respectively. You cannot use both of these search constructs in one query.
hybrid_index_name
Specify the name of the hybrid vector index to use.
For information on how to create a hybrid vector index if not already created, see Manage Hybrid Vector Indexes.
search_text
Specify a search text string (your query input) for both semantic search and keyword search.
The same text string is used for a keyword query on document text index (by converting the search_text
into a CONTAINS ACCUM
operator syntax) and a semantic query on vectorized chunk index (by vectorizing or embedding the search_text
for a VECTOR_DISTANCE
search).
For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH(
json('{ "hybrid_index_name" : "my_hybrid_idx",
"search_text" : "C, Python"
}'))
FROM DUAL;
search_fusion
Specify a fusion sort operator to define what you want to retain from the combined set of keyword-and-semantic search results.
Note:
This search fusion operation is applicable only to non-pure hybrid search cases. Vector-only and text-only searches do not fuse any results.Parameter | Description |
---|---|
|
Returns only the rows that are common to both text search results and vector search results. Score condition: |
|
Combines all distinct rows from both text search results and vector search results. Score condition: |
|
Returns all distinct rows from text search results plus the ones that are common to both text search results and vector search results. Thus, the fused results contain the text search results that appear in text search, including those that appear in both. Score condition: |
|
Returns all distinct rows from vector search results plus the ones that are common to both text search results and vector search results. Thus, the fused results contain the vector search results that appear in vector search, including those that appear in both. Score condition: |
|
Returns all distinct rows from vector search results minus the ones that are common to both text search results and vector search results. Thus, the fused results contain the vector search results that appear in vector search, excluding those that appear in both. Score condition: |
|
Returns all distinct rows from text search results minus the ones that are common to both text search results and vector search results. Thus, the fused results contain the text search results that appear in text search, excluding those that appear in both. Score condition: |
|
Returns all distinct rows from text search ordered by the aggregated vector score of their respective vectors. There is no core condition for this field since there is only a single search (text) and then use of their returned document’s vectors. |
For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH(
json('{ "hybrid_index_name" : "my_hybrid_idx",
"search_fusion" : "UNION",
"vector":
{ "search_text" : "leadership experience" },
"text":
{ "contains" : "C and Python" }
}'))
FROM DUAL;
search_scorer
Specify a method to evaluate the combined "fusion" search scores from both keyword and semantic search results.
-
RSF
(default) to use the Relative Score Fusion (RSF) algorithm -
RRF
to use the Reciprocal Rank Fusion (RRF) algorithm
For a deeper understanding of how these algorithms work in hybrid search modes, see Understand Hybrid Search.
For example:
With a single search text string for hybrid search:
SELECT DBMS_HYBRID_VECTOR.SEARCH(
json(
'{ "hybrid_index_name" : "my_hybrid_idx",
"search_text" : "C, Python",
"search_scorer" : "rsf"
}'))
FROM DUAL;
With separate vector and text search strings:
SELECT DBMS_HYBRID_VECTOR.SEARCH(
json(
'{ "hybrid_index_name" : "my_hybrid_idx",
"search_scorer" : "rsf",
"vector":
{ "search_text" : "leadership experience" },
"text":
{ "contains" : "C and Python" }
}'))
FROM DUAL;
vector
Specify query parameters for semantic search against the vector index part of your hybrid vector index:
-
search_text
: Search text string (query text).This string is converted into a query vector (embedding), and is used in a
VECTOR_DISTANCE
query to search against the vectorized chunk index.For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json('{ "hybrid_index_name" : "my_hybrid_idx", "vector": { "search_text" : "C, Python } }')) FROM DUAL;
-
search_vector
: Vector embedding (query vector).This embedding is directly used in a
VECTOR_DISTANCE
query to search against the vectorized chunk index.Note:
The vector embedding that you pass here must be generated using the same embedding model used for semantic search by the specified hybrid vector index.For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json('{ "hybrid_index_name" : "my_hybrid_idx", "vector": { "search_vector" : vector_serialize( vector_embedding(doc_model using "C, Python, Database" as data) RETURNING CLOB) }}')) FROM DUAL;
-
search_mode
: Document or chunk search mode in which you want to query the hybrid vector index:Parameter Description DOCUMENT
(default)Returns document-level results. In document mode, the result of your search is a list of document IDs from the base table corresponding to the list of best documents identified.
CHUNK
Returns chunk-level results. In chunk mode, the result of your search is a list of chunk identifiers and associated document IDs from the base table corresponding to the list of best chunks identified, regardless of whether the chunks come from the same document or different documents.
The content from these chunk texts can be used as input for LLMs to formulate responses.
For example, semantic search in chunk mode:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json( '{ "hybrid_index_name" : "my_hybrid_idx", "vector": { "search_text" : "leadership experience", "search_mode" : "CHUNK" } }')) FROM DUAL;
-
aggregator
: Aggregate function to apply for ranking the vector scores for each document inDOCUMENT SEARCH_MODE
.Parameter Description MAX
(default)Standard database aggregate function that selects the top chunk score as the result score.
AVG
Standard database aggregate function that sums the chunk scores and divides by the count.
MEDIAN
Standard database aggregate function that computes the middle value or an interpolated value of the sorted scores.
BONUSAVG
This function combines the maximum chunk score with the remainder multiplied by the average score of the other top scores.
WINAVG
This function computes the maximum average of the rolling window (of size
windowSize
) of chunk scores.ADJBOOST
This function computes the average "boosted" chunk score. The chunk scores are boosted with the
BOOSTFACTOR
multiplied by average score of the surrounding chunk's scores (if they exist).MAXAVGMED
This function computes a weighted sum of the
MAX
,AVGN
, andMEDN
values.For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json( '{ "hybrid_index_name" : "my_hybrid_idx", "vector": { "search_text" : "leadership experience", "search_mode" : "DOCUMENT", "aggregator" : "AVG" } }')) FROM DUAL;
-
result_max
: The maximum number of vector results in distance order to fetch (approx) from the vector index.Value : Any positive integer greater then
0
(zero)Default: If the field is not specified, by default, the maximum is computed based on topN.
For Example:SELECT DBMS_HYBRID_VECTOR.SEARCH( json( '{ "hybrid_index_name" : "my_hybrid_idx", "vector": { "search_text" : "leadership experience", "search_mode" : "DOCUMENT", "aggregator" : "MAX", "score_weight" : 5, "result_max" : 100 } }')) FROM DUAL;
-
score_weight
: Relative weight (degree of importance or preference) to assign to the semanticVECTOR_DISTANCE
query. This value is used when combining the results of RSF ranking.Value: Any positive integer greater than
0
(zero)Default:
10
(implies 10 times more importance to vector query than text query)For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json( '{ "hybrid_index_name" : "my_hybrid_idx", "vector": { "search_text" : "leadership experience", "search_mode" : "DOCUMENT", "aggregator" : "MAX", "score_weight" : 5 } }')) FROM DUAL;
-
rank_penalty
: Penalty (denominator in RRF, represented as1/(rank+penality
) to assign to vector query. This can help in balancing the relevance score by reducing the importance of unnecessary or repetitive words in a document. This value is used when combining the results of RRF ranking.Value:
0
(zero) or any positive integerDefault:
1
For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json( '{ "hybrid_index_name" : "my_hybrid_idx", "search_scorer" : "rrf", "vector": { "search_text" : "leadership experience", "search_mode" : "DOCUMENT", "aggregator" : "MAX", "score_weight" : 5, "rank_penalty" : 2 } }')) FROM DUAL;
-
inpath
: Valid JSON pathsvector.inpath
uses the vectorizer paths as you have in the document. Providing this parameter will restrict the search to the paths specified in this field. Accepts an array of paths in valid JSON format - ($.a.b.c.d
).The list of paths match against
VECTORIZER
index path lists to form a query constraint on the vector index search. Simple wild cards on the paths are supported such as $.main.*.For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json( '{ "hybrid_index_name" : "my_hybrid_idx", "search_scorer" : "rrf", "vector": { "search_text" : "leadership experience", "search_mode" : "DOCUMENT", "aggregator" : "MAX", "score_weight" : 5, "rank_penalty" : 2, "inpath" : ["$.person.*", "$.product.*"] } }')) FROM DUAL;
text
Specify query parameters for keyword search against the Oracle Text index part of your hybrid vector index:
-
contains
: Search text string (query text).This string is converted into an Oracle Text
CONTAINS
query operator syntax for keyword search.You can use
CONTAINS
query operators to specify query expressions for full-text search, such as OR (|
), AND (&
), STEM ($
), MINUS (-
), and so on. For a complete list of all such operators to use, see Oracle Text Reference.For example:
With a text contains string for pure keyword search:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json('{ "hybrid_index_name" : "my_hybrid_idx", "text": { "contains" : "C and Python" } }')) FROM DUAL;
With separate search texts using
vector
andtext
sub-elements for hybrid search. One search text or a vector embedding to run aVECTOR_DISTANCE
query for semantic search. A second search text to run aCONTAINS
query for keyword search. This query conducts two separate keyword and semantic queries, where keyword scores and semantic scores are combined:SELECT DBMS_HYBRID_VECTOR.SEARCH( json('{ "hybrid_index_name" : "my_hybrid_idx", "vector": { "search_text" : "leadership experience" }, "text": { "contains" : "C and Python" } }')) FROM DUAL;
-
search_text
: The alternative search text to use to construct a contains query automatically.SELECT DBMS_HYBRID_VECTOR.SEARCH( json('{ "hybrid_index_name" : "my_hybrid_idx", "text": { "contains" : "C and Python", "search_text" : "data science skills" } }')) FROM DUAL;
-
score_weight
: Relative weight (degree of importance or preference) to assign to the textCONTAINS
query. This value is used when combining the results of RSF ranking.Value: Any positive integer greater than
0
(zero)Default:
1
(implies neutral weight)For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json( '{ "hybrid_index_name" : "my_hybrid_idx", "text": { "contains" : "C and Python", "score_weight" : 1 } }')) FROM DUAL;
-
rank_penalty
: Penalty (denominator in RRF, represented as1/(rank+penality
) to assign to keyword query.This can help in balancing the relevance score by reducing the importance of unnecessary or repetitive words in a document. This value is used when combining the results of RRF ranking.
Value:
0
(zero) or any positive integerDefault:
5
For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json( '{ "hybrid_index_name" : "my_hybrid_idx", "text": { "contains" : "C and Python", "rank_penalty" : 5 } }')) FROM DUAL;
-
inpath
: Valid JSON pathsProviding this parameter will restrict the search to the paths specified in this field. Accepts an array of paths in valid JSON format - (
$.a.b.c.d
).For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json( '{ "hybrid_index_name" : "my_hybrid_idx", "text": { "contains" : "C and Python", "rank_penalty" : 5, "inpath" : ["$.person.*","$.product.*"] } }')) FROM DUAL;
-
result_max
: The maximum number of document results (ordered by score) to retrieve from the document index. If not provided, the maximum is computed based on the topN.For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH( json( '{ "hybrid_index_name" : "my_hybrid_idx", "text": { "contains" : "C and Python", "rank_penalty" : 5, "inpath" : ["$.person.*","$.product.*"], "result_max" : 100 } }')) FROM DUAL;
return
Specify which fields to appear in the result set:
Parameter | Description |
---|---|
|
Maximum number of best-matched results to be returned Value: Any integer greater than Default: |
|
Return attributes for the search results. Values for scores range between 100 (best) to 0 (worse).
Default: All the above return attributes EXCEPT |
|
Format of the results as:
|
For example:
SELECT DBMS_HYBRID_VECTOR.SEARCH(
json(
'{ "hybrid_index_name" : "my_hybrid_idx",
"search_text" : "C, Python",
"return":
{
"values" : [ "rowid", "score", "paths ],
"topN" : 3,
"format" : "JSON"
}
}'))
FROM DUAL;
Complete Example With All Query Parameters
The following example shows a hybrid search query that performs separate text and vector searches against my_hybrid_idx
. This query specifies the search_text
for vector search using the vector_distance
function as prioritize teamwork and leadership experience
and the keyword for text search using the contains
operator as C and Python
. The search mode is DOCUMENT
to return the search results as topN documents.
SELECT JSON_SERIALIZE(
DBMS_HYBRID_VECTOR.SEARCH(
json(
'{ "hybrid_index_name" : "my_hybrid_idx",
"search_fusion" : "INTERSECT",
"search_scorer" : "rsf",
"vector":
{
"search_text" : "prioritize teamwork and leadership experience",
"search_mode" : "DOCUMENT",
"score_weight" : 10,
"rank_penalty" : 1,
"aggregator" : "MAX",
"inpath" : ["$.main.body", "$.main.summary"]
},
"text":
{
"contains" : "C and Python",
"score_weight" : 1,
"rank_penalty" : 5,
"inpath" : ["$.main.body"]
},
"return":
{
"format" : "JSON",
"topN" : 3,
"values" : [ "rowid", "score", "vector_score",
"text_score", "vector_rank",
"text_rank", "chunk_text", "chunk_id", "paths" ]
}
}'
)
) pretty)
FROM DUAL;
The top 3 rows are ordered by relevance, with higher scores indicating a better match. All the return attributes are shown by default:
[
{
"rowid" : "AAAR9jAABAAAQeaAAA",
"score" : 58.64,
"vector_score" : 61,
"text_score" : 35,
"vector_rank" : 1,
"text_rank" : 2,
"chunk_text" : "Candidate 1: C Master. Optimizes low-level system (i.e. Database)
performance with C. Strong leadership skills in guiding teams to
deliver complex projects.",
"chunk_id" : "1",
"paths" : ["$.main.body","$.main.summary"]
},
{
"rowid" : "AAAR9jAABAAAQeaAAB",
"score" : 56.86,
"vector_score" : 55.75,
"text_score" : 68,
"vector_rank" : 3,
"text_rank" : 1,
"chunk_text" : "Candidate 3: Full-Stack Developer. Skilled in Database, C, HTML,
JavaScript, and Python with experience in building responsive web
applications. Thrives in collaborative team environments.",
"chunk_id" : "1",
"paths" : ["$.main.body", "$.main.summary"]
},
{
"rowid" : "AAAR9jAABAAAQeaAAD",
"score" : 51.67,
"vector_score" : 56.64,
"text_score" : 2,
"vector_rank" : 2,
"text_rank" : 3,
"chunk_text" : "Candidate 2: Database Administrator (DBA). Maintains and secures
enterprise database (Oracle, MySql, SQL Server). Passionate about
data integrity and optimization. Strong mentor for junior DBA(s).",
"chunk_id" : "1",
"paths" : ["$.main.body", "$.main.summary"]
}
]
End-to-end example:
To see how to create a hybrid vector index and explore all types of queries against the index, see Query Hybrid Vector Indexes End-to-End Example.
Related Topics
Parent topic: DBMS_HYBRID_VECTOR