About SQL Functions to Generate Embeddings

Use the Vector Utility SQL functions to perform parallel or on-the-fly chunking and embedding operations within the database. The SQL functions supplied for vector generation are VECTOR_CHUNKS and VECTOR_EMBEDDING.

Vector Utility SQL functions are intended for direct, quick interaction with data, using pure SQL.

VECTOR_CHUNKS

Use the VECTOR_CHUNKS SQL function to split plain text into chunks (pieces made up of words, sentences, or paragraphs) in preparation for generating embeddings, which you can then use with a vector index.

For example, you can use this function to build a standalone text chunking system that breaks down a large PDF document, once it has been converted to plain text, into smaller yet semantically meaningful chunks. You can experiment with your chunks by running parallel chunking operations, inspecting each chunk's text, amending the chunking results as needed, and then proceeding to other data transformation stages.

To generate chunks, this function uses an in-house chunking implementation that is built into Oracle Database.
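The chunking workflow described above can be sketched as follows. This is a minimal illustration rather than a definitive recipe: the table documentation_tab, its columns, and the sample text are hypothetical, and the chunking parameters (BY, MAX, OVERLAP, SPLIT BY, LANGUAGE, NORMALIZE) are only one possible combination that you would tune for your own data.

```sql
-- Hypothetical table that holds plain-text documents.
CREATE TABLE documentation_tab (
  id   NUMBER,
  text CLOB
);

INSERT INTO documentation_tab (id, text)
VALUES (1, 'Oracle AI Vector Search lets you search data by meaning. '
        || 'Text is first split into chunks. '
        || 'Each chunk is then converted into a vector embedding.');
COMMIT;

-- Split each document into chunks of at most 10 words and inspect the result.
-- The parameter values are illustrative assumptions.
SELECT d.id           AS doc_id,
       c.chunk_offset AS chunk_pos,
       c.chunk_length AS chunk_len,
       c.chunk_text   AS chunk_txt
FROM   documentation_tab d,
       VECTOR_CHUNKS(d.text
                     BY words
                     MAX 10
                     OVERLAP 0
                     SPLIT BY sentence
                     LANGUAGE american
                     NORMALIZE all) c;
```

Rerunning the query with different MAX, OVERLAP, or SPLIT BY values is one way to compare chunking results before moving on to embedding generation.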

For detailed information on this function, see VECTOR_CHUNKS.

VECTOR_EMBEDDING

Use the VECTOR_EMBEDDING function to generate a single vector embedding for input data of different types.

For example, you can use this function in information-retrieval applications or chatbots, where you want to generate a query vector on the fly from a user's natural language text input string. You can then query a vector field with this query vector for a fast similarity search.
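The query pattern described above might look like the following sketch. It assumes that an ONNX embedding model has already been loaded into the database under the hypothetical name doc_model, and that a hypothetical table doc_chunks stores embeddings produced by the same model. The bind variable :user_query, the COSINE metric, and the row limit are likewise illustrative choices.

```sql
-- Hypothetical table that holds chunk text and its precomputed embedding.
CREATE TABLE doc_chunks (
  chunk_id   NUMBER PRIMARY KEY,
  chunk_text VARCHAR2(4000),
  embedding  VECTOR
);

-- Embed the user's text on the fly and return the five closest chunks.
-- :user_query is a bind variable supplied by the application (assumption).
SELECT chunk_id,
       chunk_text
FROM   doc_chunks
ORDER  BY VECTOR_DISTANCE(
            embedding,
            VECTOR_EMBEDDING(doc_model USING :user_query AS data),
            COSINE)
FETCH FIRST 5 ROWS ONLY;
```

The stored embeddings and the query vector need to come from the same model for the distances to be meaningful.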

To generate an embedding, this function uses a vector embedding model (in ONNX format) that you load into the database.
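As a rough sketch of that prerequisite, the following loads an ONNX file with DBMS_VECTOR.LOAD_ONNX_MODEL and then generates one embedding. The directory object DM_DUMP, the file name my_embedding_model.onnx, and the model name doc_model are hypothetical, and the JSON metadata is an assumption: the tensor names in it must match the inputs and outputs of the model you actually load.

```sql
-- Load an ONNX embedding model from a database directory object.
-- Directory, file, and model names are placeholders; the metadata
-- (input and output tensor names) depends on the specific model.
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    'DM_DUMP',
    'my_embedding_model.onnx',
    'doc_model',
    JSON('{"function" : "embedding",
           "embeddingOutput" : "embedding",
           "input" : {"input" : ["DATA"]}}'));
END;
/

-- Generate a single embedding with the loaded model.
SELECT VECTOR_EMBEDDING(doc_model USING 'hello' AS data) AS embedding
FROM   dual;
```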

Note:

To generate embeddings by using third-party vector embedding models, you can use Vector Utility PL/SQL packages. These packages let you work with both embedding models stored in the database and third-party models (by calling third-party REST APIs).
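For comparison, a call through the PL/SQL package route might look like the sketch below, which uses DBMS_VECTOR.UTL_TO_EMBEDDING with a model stored in the database. The model name doc_model is hypothetical. Third-party providers take additional parameters in the same JSON argument (for example, credential and endpoint details), which are not shown here; see the package documentation for the exact keys.

```sql
-- Generate an embedding through the DBMS_VECTOR package,
-- using an in-database model (hypothetical name doc_model).
SELECT DBMS_VECTOR.UTL_TO_EMBEDDING(
         'hello',
         JSON('{"provider" : "database", "model" : "doc_model"}')) AS embedding
FROM   dual;
```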

For detailed information on this function, see VECTOR_EMBEDDING.