LLM Management

Large Language Models (LLMs) are used for tasks like text generation, summarization, question answering, and supporting MCP server tools.

Agent Factory supports a diverse selection of LLMs from multiple providers. Use the LLM Management screen to set up and manage both generative models and embedding models within the application. See LLM Preferences for the preferred order of choosing generative and embedding models.

Generative Models

OCI Generative AI

The Agent Factory accepts all generative models available through the OCI Generative AI Service. However, for best results Oracle recommends using the following pre-trained models in combination with the Agent Factory:

xai.grok-4
xai.grok-4-fast-reasoning
xai.grok-4-fast-non-reasoning
openai.gpt-5

Note: Using Google Vertex AI models is not recommended (e.g google.gemini-2.5-pro), as they do not support tool use.

Setup OCI Generative AI Service

Below are the steps to configure xai.grok-4-fast-reasoning pre-trained model from OCI Generative AI Service.

Prerequisites: Setup OCI Generative AI Service. See Getting Started with Generative AI.

Step 1: Click on Model Management on the left side navigation menu.

Click on Model Management

Step 2: Click on Add configuration button placed on the top-right corner.

Select Add Configuration

Step 3: A form will open, under Model type pick Generative model.

Choose Generative Model

Step 4: Give your LLM configuration a preferred name under Configuration name, avoid whitespaces since they are not allowed.

Configuration Name

Step 5: From list of LLM providers pick OCI GenAI.

LLM Providers

Step 6: For Mode, choose API Key if you have a secret token and want to use it to access the LLM. See API Keys. Choose Instance principals otherwise.

With API Key

Step 7: In the Model ID field, enter xai.grok-4-fast-reasoning

Enter Model ID

Step 8: Fill Endpoint and Compartment ID.

Fill OCI credentials

Step 9: If you chose API Key mode in Step 6, fill User, Tenancy, Finger Print, and Region fields with your own credentials from OCI Generative AI Service. Upload your private API key file under Key File. You can request it from your tenancy owner or by logging into your tenancy with your user. See Set Up API Authentication for OCI.

Upload API Key File

Step 10: Click on Test connection to validate the credentials are correct.

Step 11: A success message Connection successful will appear on screen and the Save Configuration button will be enabled, click on it to finalize the process.

Save configuration

Ollama

Agent Factory has support for Ollama so you can access the LLMs running locally on your machine.

Setup Ollama in Your Machine (Linux OCI VM)

Step 1: Open a terminal window in your system and start a new bash shell with root privileges.

sudo bash

Step 2 (Optional): If you are working behind a proxy please set the appropriate proxies.

Step 3: Download and install ollama as instructed by the official site.

Step 4: Pull the llama3.2 model to your local machine.

ollama pull llama3.2

Step 5: Edit the ollama service so Agent Factory’s container can access to it.

systemctl edit ollama

Step 6: Once the editor opens, paste the below two lines and close by using Ctrl + O, Enter, Ctrl + X

[Service]

Environment="OLLAMA_HOST=0.0.0.0:11434"

Step 7: Refresh systemd and restart ollama so the changes are visible.

systemctl daemon-reexec

systemctl daemon-reload

systemctl restart ollama

Step 8 (Optional): In a separate terminal verify ollama service is running by running the following command:

ollama run llama3.2

Adding Ollama Model to Agent Factory

Below are the steps to configure the locally hosted Llama 3.2 model from Ollama that was set up in the previous section.

Step 1: Click LLM management on the left side navigation menu.

Click on LLM Management

Step 2: Click Add configuration button placed on the top-right corner.

Select Add Configuration

Step 3: A form will open, under Model type choose Generative model.

Choose Generative Model

Step 4: Give your LLM configuration a preferred name under Configuration name, avoid whitespaces since they are not allowed.

Give your configuration a name

Step 5: From list of LLM providers pick Ollama.

Pick Ollama as provider

Step 6: Enter llama3.2 as Model ID.

Enter Model ID

Step 7: Enter http://host.containers.internal as URL.

Enter URL

Step 8: Enter 11434 as Port.

Enter URL

Step 9: Click on Test connection to validate the credentials are correct.

Test connection

Step 10: A success message “Connection successful” will appear on screen and the Save Configuration button will be enabled, click on it to finalize the process.

Save configuration

OpenAI

The following models are currently supported:

gpt-4o
gpt-4o-mini

Adding OpenAI Model to Agent Factory

Step 1: Click on LLM management on the left side navigation menu.

Click on LLM Management

Step 2: Click on Add configuration button placed on the top-right corner.

Select Add Configuration

Step 3: A form will open, under Model type pick Generative model.

Choose Generative Model

Step 4: Give your LLM configuration a preferred name under Configuration name, avoid whitespaces since they are not allowed.

Add Configuration name

Step 5: From the list of LLM providers pick OpenAI.

Pick OpenAI

Step 6: Select gpt-4o as the Model ID

Select Model ID

Step 7: Enter your API Key

Enter your API key

Step 8: Click on Test connection to validate the credentials are correct.

Test the connection

Step 9: A success message “Connection successful” will appear on screen and the Save Configuration button will be enabled, click on it to finalize the process.

Save the configuration

vLLM

You can connect to any self-hosted model endpoint.

These are the required fields you need to configure a vLLM:

Model ID: The model identifier/path that the vLLM server is serving (often a filesystem path or a registry-style name).
URL: Host/DNS clients use to reach the server.
Port: The port where the HTTP service is exposed.

Gemini

The following Gemini LLM models are currently supported:

gemini-3.1-pro-preview
gemini-3-pro-preview
gemini-3-flash-preview
gemini-2.5-pro
gemini-2.5-flash
gemini-2.5-flash-lite
gemini-2.0-flash
gemini-2.0-flash-lite

Adding Gemini Model to Agent Factory

You can configure Gemini LLM in Agent Factory using the following authentication methods:

Google Service Account Authentication
API Key authentication

Authentication Compatibility Notice: Knowledge Agent supports both Gemini authentication methods: Service Account (JSON) and API key authentication. However, Agent Builder and other prebuilt agents currently support only Gemini API key authentication. If you plan to use Agent Builder or any prebuilt agents, please ensure your Gemini configuration is created using the API key authentication method.

Configuring Gemini LLM (Google Service Account Authentication)

Prerequisites:

Before creating a Gemini configuration ensure:

A Google Cloud Project exists.
Gemini (Vertex AI) API is enabled in that project.
A Service Account has been created.
A JSON key file has been generated and downloaded securely.

See Google Cloud Service Account Setup Guide for detailed instructions on how to create and configure a Service Account.

Steps to Create a Gemini LLM Configuration Using Service Account Authentication:

Step 1: Go to Model Management from the left navigation menu.

Step 2: Click on Add configuration button placed on the top-right corner.

Add Configuration

Step 3: A form will open, under Model type pick Generative model.

Step 4: Give your LLM configuration a preferred name under Configuration name, avoid whitespaces since they are not allowed.

Step 5: From list of LLM providers pick Gemini.

Gemini Configuration

Step 6: Select from authentication method the Service account option.

Step 7: Enter the Model ID.

Recommended commonly used Gemini LLM models:

gemini-2.5-flash-lite
gemini-2.5-flash
gemini-flash-latest

Step 8: Enter the Region.

Most commonly used regions:

us-central1
us-east1
europe-west4

Step 9: Upload the Google Service Account JSON file.

Upload the JSON file by:

Clicking on the Upload file component
Or drag and drop the file into the dashed upload area

Note: The Service Account must belong to a project where Vertex AI (Gemini) is enabled.

Gemini Service Account

Step 10: Click on Test connection to validate the credentials are correct and connection is possible. Wait a few seconds while a validation request is sent to the Gemini LLM.

Gemini Connection

Step 11: A success message Connection successful will appear on screen and the Save Configuration button will be enabled; click on it to finalize the process.

Once completed, the new configuration will appear in the LLM Configurations table.

Gemini Configuration

Configuring Gemini LLM (API Key Authentication)

Prerequisites:

Before configuring Gemini using API key authentication:

A Gemini API key must be created using either:
- Google AI Studio, or
- Google Cloud Console
The API key must belong to a Google Cloud project.
Vertex AI (Gemini) API must be enabled for that project.

Ensure the API key is not restricted in a way that blocks requests from Agent Factory.

Steps to Create a Gemini LLM Configuration Using API Key Authentication:

Step 1: Go to Model Management from the left navigation menu.

Step 2: Click on Add configuration button placed on the top-right corner.

Step 3: A form will open, under Model type pick Generative model.

Step 4: Give your LLM configuration a preferred name under Configuration name, avoid whitespaces since they are not allowed.

Step 5: From list of LLM providers pick Gemini.

Gemini Configuration

Step 6: Select from authentication method the API Key option.

Step 7: Enter the Model ID
Recommended commonly used Gemini LLM models:

gemini-2.5-flash-lite
gemini-2.5-flash
gemini-flash-latest

Step 8: Paste your Gemini API key into the field provided.

Gemini API Key

Step 9: Click on Test connection to validate the credentials are correct and connection is possible.
Wait a few seconds while a validation request is sent to the Gemini LLM.

Gemini Connection

Step 10: A success message Connection successful will appear on screen and the Save Configuration button will be enabled, click on it to finalize the process.
Once completed, the new configuration will appear in the LLM Configurations table.

Embedding Models

Agent Factory supports the following embedding models. Use these models for transforming text into numerical vectors, enabling semantic search, and retrieval-augmented generation (RAG). Agent Factory include out-of-the-box support for several high-performing embedding models, while also allowing you to bring your preferred models hosted on the OCI Generative AI service, or served via Ollama or vLLM endpoints.

Local Models

The following pre-trained sentence transformer model is bundled with the application and run locally.

multilingual-e5-base (768 dimensions)

Configure a Local Embedding Model

Below are the steps to configure a local embedding model multilingual-e5-base, which is available in Agent Factory out of the box.

Note: If you plan to use local embedding models, ensure the machine running the application has access to GPUs. Otherwise, embedding-related processes—such as Knowledge Agent ingestion—may take a significant amount of time.

Step 1: Click on LLM management on the left side navigation menu.

Click on LLM Management

Step 2: Click on Add configuration button placed on the top-right corner.

Select Add Configuration

Step 3: A form will open, under Model type choose Embedding model.

Pick Embedding Model

Step 4: Give your configuration a name, avoid using whitespaces since they are not allowed.

Give your config a name

Step 5: For Embedding provider pick Local.

Pick local

Step 6: Click on Model ID and pick multilingual-e5-base from the list.

Select a Model ID

Step 7: Verify the connection by clicking on Test connection.

Select a Model ID

Step 8: Save the new configuration.

Select a Model ID

OCI Generative AI

The following Cohere embedding models from OCI Gen AI are supported:

cohere.embed-v4.0
cohere.embed-multilingual-v3.0
cohere.embed-multilingual-light-v3.0
cohere.embed-english-v3.0

Configure an OCI Generative AI embedding model

Below are the steps to configure a local embedding model cohere.embed-v4.0 which is available through OCI Generative AI service using a Fingerprint based authentication.

Step 1: Click on LLM management on the left side navigation menu.

Click on LLM Management

Step 2: Click on Add configuration button placed on the top-right corner.

Select Add Configuration

Step 3: A form will open, under Model type pick Embedding model.

Pick Model Type

Step 4: Give your configuration a name, avoid using whitespaces since they are not allowed.

Enter Configuration Name

Step 5: Pick OCI GenAI as Embedding provider.

Step 6: For Mode, choose API Key if you have a secret token and want to use it to access the LLM. See API Keys. Choose Instance principals otherwise.

Choose API Key

Step 7: Enter cohere.embed-v4.0 as Model ID.

Enter Model ID

Step 8: Fill in Endpoint and Compartment ID.

Fill OCI credentials

Upload API Key File

Step 10: Click on Test connection to validate the credentials are correct.

Step 11: A success message “Connection successful” will appear on screen and the Save Configuration button will be enabled, click on it to finalize the process.

Save configuration

vLLM/Ollama

You can connect to any self-hosted model endpoint.

These are the required fields you need to configure a vLLM:

Model ID: The model identifier/path that the vLLM server is serving (often a filesystem path or a registry-style name).
URL: Host/DNS clients use to reach the server.
Port: The port where the HTTP service is exposed.

Gemini

The following Gemini Embedding Models are currently supported:

gemini-embedding-001
text-embedding-004
text-multilingual-embedding-002
text-embedding-005
textembedding-gecko@003
textembedding-gecko-multilingual@001

Adding Gemini Embedding Model to Agent Factory

You can configure Gemini Embedding Model in Agent Factory using the following authentication methods:

Google Service Account Authentication
API Key authentication

Configuring Gemini Embedding Model (Google Service Account Authentication)

Prerequisites:

Before creating a Gemini configuration ensure:

A Google Cloud Project exists.
Gemini (Vertex AI) API is enabled in that project.
A Service Account has been created.
A JSON key file has been generated and downloaded securely.

See Google Cloud Service Account Setup Guide for detailed instructions on how to create and configure a Service Account.

Steps to Create a Gemini Embedding Configuration Using Service Account Authentication:

Step 1: Go to Model Management from the left navigation menu.

Step 2: Click on Add configuration button placed on the top-right corner.

Step 3: A form will open, under Model type pick Embedding model.

Step 4: Give your LLM configuration a preferred name under Configuration name, avoid whitespaces since they are not allowed.

Step 5: From list of LLM providers pick Gemini.

Gemini Embedding Model Configuration

Step 6: Select from authentication method the Service account option.

Step 7: Enter the Model ID.

Recommended Gemini embedding models:

gemini-embedding-001
text-embedding-004
text-multilingual-embedding-002
text-embedding-005

Step 8: Enter the Region.

Most commonly used regions:

us-central1
us-east1
europe-west4

Step 9: Upload the Google Service Account JSON file.

Upload the JSON file by:

Clicking on the Upload file component
Or drag and drop the file into the dashed upload area

Note: The Service Account must belong to a project where Vertex AI (Gemini) is enabled.

Gemini Embedding Service Account

Step 10: Click on Test connection to validate the credentials are correct and connection is possible. Wait a few seconds while a validation request is sent to the Gemini LLM.

Gemini Embedding Model Configuration

Step 11: A success message Connection successful will appear on screen and the Save Configuration button will be enabled; click on it to finalize the process.

Gemini Embedding Model Table

Once completed, the new configuration will appear in the LLM Configurations table.

Configuring Gemini Embedding Model (API Key Authentication)

Prerequisites:

Before configuring Gemini using API key authentication:

A Gemini API key must be created using either:
- Google AI Studio, or
- Google Cloud Console
The API key must belong to a Google Cloud project.
Vertex AI (Gemini) API must be enabled for that project.

Ensure the API key is not restricted in a way that blocks requests from Agent Factory.

Steps to Create a Gemini Embedding Configuration Using API Key Authentication:

Step 1: Go to Model Management from the left navigation menu.

Step 2: Click on Add configuration button placed on the top-right corner.

Step 3: A form will open, under Model type pick Embedding model.

Step 4: Give your LLM configuration a preferred name under Configuration name, avoid whitespaces since they are not allowed.

Step 5: From list of LLM providers pick Gemini.

Gemini Embedding Model Configuration

Step 6: Select from authentication method the API Key option.

Step 7: Enter the Model ID
Recommended Gemini embedding models:

gemini-embedding-001
text-embedding-005
text-multilingual-embedding-002
text-embedding-005

Step 8: Paste your Gemini API key into the field provided.

Gemini Embedding Model Configuration

Step 9: Click on Test connection to validate the credentials are correct and request runs embedding.
Wait a few seconds while embedding validation request runs.

Gemini Embedding Model Configuration

Gemini Embedding Model Table