3.2.6 GenAI Retrieve Task

Use a GenAI Retrieve Task to generate natural language responses by using retrieval augmented generation (RAG). The task retrieves relevant context from a vector table in Oracle Database and sends that context, along with the user query, to a large language model (LLM) to generate the response.

Retrieval augmented generation (RAG) is an approach developed to address the limitations of LLMs. It combines the strengths of pretrained language models with the ability to retrieve recent and accurate information from a dataset or database in real time while a response is generated. For more information about RAG, see About Retrieval Augmented Generation in Oracle AI Vector Search User's Guide.

A GenAI Retrieve Task uses two model configurations:

An LLM profile and model to generate the final natural language response.
An embedding profile and embedding model to convert the query into a vector and search the vector table.
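
The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how MicroTx Workflows implements the task: the `embed` function and its `VOCAB` list are a toy stand-in for a real embedding model, the in-memory `rows` list stands in for the vector table, and `build_prompt` only assembles the text that would be sent to the LLM.

```python
import math

# Toy vocabulary standing in for a real embedding model (assumption for this sketch).
VOCAB = ["refund", "shipping", "support", "days"]

def embed(text):
    """Convert text into a vector by counting vocabulary terms."""
    lowered = text.lower()
    return [float(lowered.count(term)) for term in VOCAB]

def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0

def retrieve(query, rows, top_k=5):
    """Embed the query and return the top_k closest chunks."""
    query_vector = embed(query)
    ranked = sorted(rows, key=lambda row: cosine_distance(query_vector, row["vector"]))
    return [row["chunk"] for row in ranked[:top_k]]

def build_prompt(query, chunks):
    """Combine the retrieved context and the user query into one LLM prompt."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"

# In-memory stand-in for the vector table populated during ingestion.
documents = [
    "Refunds are issued within 14 days.",
    "Shipping takes 3 to 5 business days.",
    "Support is available on weekdays.",
]
rows = [{"chunk": text, "vector": embed(text)} for text in documents]

query = "How long does shipping take?"
context = retrieve(query, rows, top_k=1)
prompt = build_prompt(query, context)
print(prompt)
```

In this sketch the same `embed` function is used for both the stored chunks and the query, which mirrors the requirement that the task use the same embedding model as ingestion.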

Prerequisites

Before you begin, complete the following tasks:

Create an LLM definition for response generation. See Create an LLM Definition.
Create an LLM definition for the embedding model. The embedding model must match the one used when the data was ingested.
Create a database profile for the Oracle Database instance that contains the vector table. See Create a Database Profile.
Use this task after you have ingested content into a vector table, typically by using a GenAI Ingestion task. For example, you can ingest policy documents, product manuals, or customer records, and then use a GenAI Retrieve task to answer questions based on that ingested content. This task must use the same embedding model, dimensions, index type, and distance type as the ingestion setup for the vector table.
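
Because the retrieval settings must line up with the ingestion settings, it can help to compare them before you run the workflow. The following is a minimal sketch of such a check. The setting names in `MUST_MATCH` and the model name `example-embed-v1` are hypothetical labels chosen for this example; they are not product fields.

```python
# Hypothetical setting names; they label the values to compare, not product fields.
MUST_MATCH = ("embeddingModel", "dimensions", "indexType", "distanceType")

def find_mismatches(ingestion, retrieval):
    """Return {setting: (ingestion_value, retrieval_value)} for each difference."""
    mismatches = {}
    for key in MUST_MATCH:
        # Compare as strings so that 512 and "512" are treated as equal.
        left, right = str(ingestion.get(key)), str(retrieval.get(key))
        if left != right:
            mismatches[key] = (left, right)
    return mismatches

ingestion = {"embeddingModel": "example-embed-v1", "dimensions": 512,
             "indexType": "HNSW", "distanceType": "COSINE"}
retrieval = {"embeddingModel": "example-embed-v1", "dimensions": "512",
             "indexType": "HNSW", "distanceType": "DOT"}

print(find_mismatches(ingestion, retrieval))
```

Here only the distance type differs, so the check reports that single setting and leaves the matching ones out.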

To add a GenAI Retrieve Task

Navigate to the Task tab in a workflow and view all the tasks that you can add using the Workflow Builder. See Access the Task Tab in Workflow Builder.
In the More Tasks dialog box, click GenAI Retrieve Task to add it to the workflow.
Click the task that you have added in the left pane. The Task tab in the right pane displays details about the task, such as its name and parameters. Next, let's provide details for the task.
In the Task Details group, enter the following information.
- Task Name: Mandatory. Enter a unique name for the task. The name must be between 1 and 128 characters in length. It can contain alphanumeric characters, underscores (_), and hyphens (-), but no spaces or other special characters.
- Task Reference: Mandatory. Enter a value to refer to the task within a workflow definition. This value must be unique within a workflow. The task reference must be between 1 and 128 characters in length. It can contain alphanumeric characters, underscores (_), and hyphens (-), but no spaces or other special characters.
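
The naming rule for Task Name and Task Reference can be expressed as a regular expression, which is useful if you generate workflow definitions from a script. The following sketch assumes that "alphanumeric" means ASCII letters and digits; the helper name `is_valid_name` is hypothetical.

```python
import re

# 1 to 128 characters: ASCII letters, digits, underscore, or hyphen (assumed reading of the rule).
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,128}")

def is_valid_name(value):
    """Return True if the whole value satisfies the naming rule."""
    return NAME_PATTERN.fullmatch(value) is not None

print(is_valid_name("genai_retrieve_ref"))  # no spaces or special characters
print(is_valid_name("genai retrieve"))      # contains a space
```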
In the GenAI Retrieve Parameters group, provide the following information.
- LLM Profile: Mandatory. Select the LLM definition used to generate the final response.
- Model Name: Mandatory. Select the model used to generate the response.
- Embedding Profile Name: Mandatory. Select the LLM definition that contains the embedding model. Use the same embedding profile that was used to ingest the data, unless the new model produces compatible embeddings.
- Embedding Model: Mandatory. Select the embedding model used to convert the query into a vector. This should match the model used during ingestion.
- Data Store Profile: Mandatory. Select the database profile for the Oracle Database instance that contains the vector table.
- Table Name: Mandatory. Enter the table name that contains the vector embeddings.
- Query: Mandatory. Enter the natural language query. You can enter static text or use workflow expressions, such as ${workflow.input.query}.
- Temperature: Optional. Controls randomness of the generated response. Lower values generate more deterministic responses. Higher values generate more varied responses. The default value is 0.2.
- Max Tokens: Optional. Sets the maximum number of tokens that the model can generate. The default value is 1024.
- TopK (LLM): Optional. Controls token sampling for the response generation model. In JSON, use top_k. The default value is 40.
- Dimensions: Optional. Enter the vector dimension count. The default value is 512. This value must match the vector table definition and the embeddings generated during ingestion.
- TopK (RAG): Optional. Sets the number of matching chunks retrieved from the vector table and passed as context to the LLM. In JSON, use topK. The default value is 5.
- RAG Type: Optional. Select the retrieval mode. Supported values are Naive and Advanced. The default value is Naive (naive in JSON). In Advanced mode, the task rewrites the query and compresses the retrieved documents before generating the response.
- Index Type: Optional. Select the vector index type. Use the same index configuration as the vector table. Supported values are HNSW (Hierarchical Navigable Small World), IVF (Inverted File Flat), and NONE. The default value is HNSW.
- Distance Type: Optional. Select the distance metric used for vector similarity search. Use the same distance metric that was used during ingestion. Supported values are COSINE, DOT, EUCLIDEAN, MANHATTAN, and EUCLIDEAN_SQUARED. The default value is COSINE.
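
To see why the distance type must match the one used during ingestion, it helps to compare what each metric computes for the same pair of vectors. The following sketch uses the standard mathematical definitions. The function name `distance` is hypothetical, and negating the dot product so that a smaller value means a closer match is an assumption of this sketch.

```python
import math

def distance(a, b, metric="COSINE"):
    """Compute the distance between two equal-length vectors for a named metric."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    diffs = [x - y for x, y in zip(a, b)]
    if metric == "COSINE":
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return 1.0 - dot / norm
    if metric == "DOT":
        return -dot  # negated so that smaller means closer (assumption)
    if metric == "EUCLIDEAN":
        return math.sqrt(sum(d * d for d in diffs))
    if metric == "EUCLIDEAN_SQUARED":
        return sum(d * d for d in diffs)
    if metric == "MANHATTAN":
        return sum(abs(d) for d in diffs)
    raise ValueError(f"Unsupported metric: {metric}")

a, b = [1.0, 2.0], [4.0, 6.0]
for name in ("COSINE", "DOT", "EUCLIDEAN", "EUCLIDEAN_SQUARED", "MANHATTAN"):
    print(name, distance(a, b, name))
```

The same two vectors produce very different values and, across a whole table, potentially different rankings, so a query that uses a different metric than the one chosen at ingestion may retrieve different chunks.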
Click Save to save the changes to the workflow.

MicroTx Workflows displays the changes in JSON code.
Review all the changes, and then click Confirm Save to save the changes.

If you do not want to save the changes, click Cancel, and then click Reset to undo all the changes that you have made since the workflow was last saved.

Example

When you enter information in the Task tab, the corresponding task code is updated in the JSON tab. The following sample shows the JSON for a GenAI Retrieve Task with sample values.

{
    "name": "genai_retrieve",
    "taskReferenceName": "genai_retrieve_ref",
    "inputParameters": {
        "llmProfile": {
            "name": "openai-dev",
            "model": "gpt-4o-mini"
        },
        "maxTokens": 4000,
        "embeddingModelProfile": {
            "name": "oci-cohere-embedding",
            "model": "cohere.embed-multilingual-image-v3.0"
        },
        "dataStoreProfile": "oracle-atp",
        "tableName": "test_vectors",
        "topK": 5,
        "query": "${workflow.input.query}",
        "ragType": "naive",
        "dimensions": "512",
        "indexType": "HNSW",
        "distanceType": "COSINE"
    },
    "type": "GENAI_RETRIEVE"
}
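
If you edit the JSON directly, a quick check that the mandatory parameters are present can catch mistakes before you save. The following sketch parses a shortened version of the sample above and reports missing entries. The key names are taken from the sample JSON; the helper `missing_parameters` is hypothetical and does not reflect any validation that MicroTx Workflows performs.

```python
import json

# Mandatory parameters, using the key names that appear in the sample JSON.
REQUIRED = ("llmProfile", "embeddingModelProfile", "dataStoreProfile", "tableName", "query")
PROFILE_KEYS = ("llmProfile", "embeddingModelProfile")

def missing_parameters(task):
    """Return the mandatory input parameters that are absent or empty."""
    params = task.get("inputParameters", {})
    missing = [key for key in REQUIRED if not params.get(key)]
    for key in PROFILE_KEYS:
        profile = params.get(key)
        # Each profile object needs both a profile name and a model.
        if isinstance(profile, dict) and profile:
            missing += [f"{key}.{field}" for field in ("name", "model")
                        if not profile.get(field)]
    return missing

task = json.loads("""
{
    "name": "genai_retrieve",
    "taskReferenceName": "genai_retrieve_ref",
    "inputParameters": {
        "llmProfile": {"name": "openai-dev", "model": "gpt-4o-mini"},
        "embeddingModelProfile": {
            "name": "oci-cohere-embedding",
            "model": "cohere.embed-multilingual-image-v3.0"
        },
        "dataStoreProfile": "oracle-atp",
        "tableName": "test_vectors",
        "query": "${workflow.input.query}"
    },
    "type": "GENAI_RETRIEVE"
}
""")

print(missing_parameters(task))
```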