Convert File to Embeddings
You can directly extract vector embeddings from a PDF document, using a single-step statement.
Perform a file-to-text-to-chunks-to-embeddings transformation (using a declared embedding model) by calling a set of DBMS_VECTOR_CHAIN.UTL functions in a single CREATE TABLE statement.
This statement creates a relational table (doc_chunks) from unstructured text chunks and the corresponding vector embeddings:
CREATE TABLE doc_chunks AS
(SELECT dt.id doc_id, et.embed_id, et.embed_data, to_vector(et.embed_vector) embed_vector
 FROM
   documentation_tab dt,
   dbms_vector_chain.utl_to_embeddings(
     dbms_vector_chain.utl_to_chunks(dbms_vector_chain.utl_to_text(dt.data), json('{"normalize":"all"}')),
     json('{"provider":"database", "model":"doc_model"}')) t,
   JSON_TABLE(t.column_value, '$[*]'
     COLUMNS (embed_id     NUMBER         PATH '$.embed_id',
              embed_data   VARCHAR2(4000) PATH '$.embed_data',
              embed_vector CLOB           PATH '$.embed_vector')) et
);
Note that each function consumes the output of the previous one, so the order of the chain matters. First, utl_to_text converts the contents of the dt.data column to plain text. That text is passed as input to utl_to_chunks, and the resulting chunks are then passed as input to utl_to_embeddings.
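To see why the order matters, the stages can be run one at a time before they are combined. The following is a minimal sketch that reuses the documentation_tab table and its id and data columns from the statement above. It assumes that data holds the source document in a form utl_to_text accepts, and that the rows returned by utl_to_chunks are exposed through column_value in the same way as the utl_to_embeddings rows in the statement above.

```sql
-- Stage 1: file to text. Converts each stored document to plain text.
SELECT dt.id,
       dbms_vector_chain.utl_to_text(dt.data) AS doc_text
FROM   documentation_tab dt;

-- Stage 2: text to chunks. Each returned row describes one chunk of one document.
SELECT dt.id,
       ct.column_value AS chunk_json
FROM   documentation_tab dt,
       dbms_vector_chain.utl_to_chunks(
         dbms_vector_chain.utl_to_text(dt.data),
         json('{"normalize":"all"}')) ct;
```

Wrapping the stage 2 call in utl_to_embeddings, as in the CREATE TABLE statement, adds the final chunks-to-embeddings stage.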
For a complete example, run SQL Quick Start Using a Vector Embedding Model Uploaded into the Database, where you embed two Oracle Database Documentation books in the doc_chunks table and perform similarity searches using vector indexes.
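As a rough illustration of the kind of similarity search that doc_chunks supports, the query below ranks the stored chunks against a search phrase. It is a sketch under stated assumptions, not the quick start's own code: it assumes the doc_model model named above is loaded in the database, it uses the VECTOR_EMBEDDING and VECTOR_DISTANCE SQL functions with a COSINE metric chosen only for illustration, and the search phrase is hypothetical. It performs an exact search, so it does not require a vector index.

```sql
-- Embed a (hypothetical) search phrase with the same model used for the chunks,
-- then return the four chunks closest to it by cosine distance.
SELECT doc_id, embed_id, embed_data
FROM   doc_chunks
ORDER  BY vector_distance(
            embed_vector,
            vector_embedding(doc_model USING 'different methods of backup and recovery' AS data),
            COSINE)
FETCH FIRST 4 ROWS ONLY;
```

Embedding the query text with the same model that produced the stored vectors keeps both sides in the same vector space, which is what makes the distances comparable.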