Model Selection in Generative AI Agents

The OCI Generative AI Agents service supports model selection: when you create or edit an agent, you can select a large language model (LLM) to act as the routing LLM for that agent. You can select an on-demand model hosted in the OCI Generative AI service, or create a dedicated AI cluster endpoint for a supported model and use that endpoint.

Model selection gives you the flexibility to optimize your agent's performance for your workload, for example, for agents that use RAG and SQL tools.

How to Select a Model

When creating or editing an agent in the OCI Console, navigate to the Model Selection section under the agent's configuration.

  1. Select the Routing LLM Type:

    • Choose one of the following options:
      • Default: Uses the standard model provided by the Generative AI Agents service (Llama 3.3 70B). This is suitable for general-purpose agents without custom model needs.
      • Generative AI Model: Select from on-demand models hosted directly in the OCI Generative AI service.
      • Generative AI Endpoint: Select from models hosted on dedicated AI cluster endpoints in the OCI Generative AI service.
  2. Select a Specific Model or Endpoint:

    • If you selected Generative AI Model or Generative AI Endpoint, the Console displays a list of available options. Browse the list and select the model or endpoint that you want.
    • The list is populated based on the models and endpoints available in the tenancy and on your IAM permissions.
  3. Update Default Hyperparameters:

    • Review the default values for the model hyperparameters.
    • Optionally update any of those hyperparameters.
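The three steps above can be summarized as a small data model. The following is a minimal sketch only: `RoutingLlmType`, `ModelSelection`, and the `target` and `hyperparameters` fields are hypothetical names chosen for illustration and are not part of the OCI SDK or the Console. The sketch assumes that a model or endpoint must be named whenever the type is not Default.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class RoutingLlmType(Enum):
    """The three Routing LLM Type options from step 1 (hypothetical enum)."""
    DEFAULT = "Default"
    GENERATIVE_AI_MODEL = "Generative AI Model"
    GENERATIVE_AI_ENDPOINT = "Generative AI Endpoint"


@dataclass
class ModelSelection:
    """Hypothetical container for the choices made in steps 1 to 3."""
    llm_type: RoutingLlmType = RoutingLlmType.DEFAULT
    target: Optional[str] = None  # model or endpoint picked in step 2
    hyperparameters: dict = field(default_factory=dict)  # step 3 overrides

    def validate(self) -> None:
        # Assumption: only the Default type can omit a model or endpoint.
        needs_target = self.llm_type is not RoutingLlmType.DEFAULT
        if needs_target and not self.target:
            raise ValueError(
                f"{self.llm_type.value} requires a model or endpoint"
            )


# Illustrative values only; "example-model" is not a real model name.
selection = ModelSelection(
    RoutingLlmType.GENERATIVE_AI_MODEL,
    target="example-model",
    hyperparameters={"temperature": 0.2},
)
selection.validate()
```

Keeping the type, the target, and the overrides together mirrors the order of the Console steps, so a missing model or endpoint is caught before any hyperparameters are considered.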
Tip

  • You can use these models when you add RAG, SQL, and Agent tools.
  • To see more information about these models, enable trace when you create an endpoint for the agent.
  • If the list isn't populated, see the examples on the User Access to Model Selection in Agents page to find the policy that you need to add.

Supported Models

Supported Models and their Hyperparameters
Models that you can select Hyperparameters that you can update

  • Maximum output tokens
  • Temperature
  • Top p
  • Top k
  • Frequency penalty
  • Presence penalty
  • Seed

  • Maximum output tokens
  • Temperature
  • Top p
  • Frequency penalty
  • Presence penalty

  • Maximum output tokens
  • Temperature
  • Top p
  • Frequency penalty
  • Presence penalty
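Because the updatable hyperparameters differ by model, it can help to check a set of overrides before applying them. The following is a minimal sketch under stated assumptions: the set labels and the snake_case key names are hypothetical, and only the two distinct hyperparameter sets that appear in the table above are encoded. Which set applies depends on the model that you select.

```python
# Hypothetical labels for the two distinct hyperparameter sets in the table.
UPDATABLE_HYPERPARAMETERS = {
    "with_top_k_and_seed": {
        "max_output_tokens", "temperature", "top_p", "top_k",
        "frequency_penalty", "presence_penalty", "seed",
    },
    "without_top_k_and_seed": {
        "max_output_tokens", "temperature", "top_p",
        "frequency_penalty", "presence_penalty",
    },
}


def rejected_overrides(parameter_set: str, overrides: dict) -> list[str]:
    """Return the override names that the given set does not allow."""
    allowed = UPDATABLE_HYPERPARAMETERS[parameter_set]
    return sorted(name for name in overrides if name not in allowed)


# Illustrative values only.
overrides = {"temperature": 0.3, "top_k": 40, "seed": 7}
print(rejected_overrides("with_top_k_and_seed", overrides))     # []
print(rejected_overrides("without_top_k_and_seed", overrides))  # ['seed', 'top_k']
```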

Supported Regions

The following regions are supported:

  • Brazil East (Sao Paulo)
  • Germany Central (Frankfurt)
  • Japan Central (Osaka)
  • UK South (London)
  • US East (Ashburn)
  • US Midwest (Chicago)
  • US West (Phoenix)

To confirm the regions in which you can use a model with the model selection feature for agents, perform the following steps.

  1. From the Generative AI documentation's Models by region page, select one of the supported models listed in the Supported Models section.
  2. Select a region that appears both in the model’s available regions and in the preceding list of seven supported regions.
  3. Verify whether the model is available in the mode that you need (on-demand or dedicated).

    For access to models in the dedicated mode, only public endpoints are supported.
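The region check described above amounts to a set intersection followed by a mode filter. The following is a minimal sketch, assuming that you have copied a model's per-mode availability from the Models by region page by hand. The `usable_regions` function and the `example_availability` data are hypothetical and do not describe any real model.

```python
# The seven regions listed in the Supported Regions section.
AGENT_MODEL_SELECTION_REGIONS = {
    "Brazil East (Sao Paulo)",
    "Germany Central (Frankfurt)",
    "Japan Central (Osaka)",
    "UK South (London)",
    "US East (Ashburn)",
    "US Midwest (Chicago)",
    "US West (Phoenix)",
}


def usable_regions(model_regions_by_mode: dict, mode: str) -> list[str]:
    """Regions where the model is offered in `mode` and agents support it."""
    model_regions = set(model_regions_by_mode.get(mode, ()))
    return sorted(AGENT_MODEL_SELECTION_REGIONS & model_regions)


# Hypothetical availability data, for illustration only.
example_availability = {
    "on-demand": [
        "US Midwest (Chicago)",
        "Germany Central (Frankfurt)",
        "Example Region (Not Supported)",
    ],
    "dedicated": ["US Midwest (Chicago)"],
}

print(usable_regions(example_availability, "on-demand"))
# ['Germany Central (Frankfurt)', 'US Midwest (Chicago)']
print(usable_regions(example_availability, "dedicated"))
# ['US Midwest (Chicago)']
```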