OCI Chat Completions API

Use the OCI Chat Completions API if you already have application code built around the Chat Completions API or if you need a simpler stateless chat interface. For new applications, we recommend that you use the OCI Responses API. If you need OCI-managed multi-turn conversation history, use the Responses API together with the Conversations API.

Note

The OCI Chat Completions API uses the same format as the OpenAI Chat Completions API with the OCI OpenAI-compatible endpoint. For syntax and request details, see the OpenAI Chat Completions API documentation.
Important

We recommend that you use the OCI Responses API so you can use the latest tool support and the Conversations API.

Use the Responses API when you need:

  • Supported tools
  • Multi-step workflows
  • Structured outputs
  • OCI-managed conversation history
  • A primary interface for agentic workflows

Use the Chat Completions API when you:

  • Already have application code built around /chat/completions
  • Need a simpler stateless chat interface
  • Don't need Responses API features such as supported tools or OCI-managed conversation state

Supported API Endpoint

Base URL Endpoint Path Authentication
https://inference.generativeai.${region}.oci.oraclecloud.com/openai/v1 /chat/completions API key or IAM session

Replace ${region} with a supported region such as us-chicago-1.

Although the request format is OpenAI-compatible, authentication uses OCI credentials, requests are routed through OCI Generative AI inference endpoints, and resources and execution remain in OCI.

Tip

For steps to perform before using this API, see the QuickStart.

Responses API or Chat Completions?

Dimension OCI Responses API OCI Chat API using Chat Completions API
Primary use Unified API for model interaction and agentic capabilities API for model interaction
Best fit Interactive chat, agentic workloads, and long-running tasks Interactive chatbots and text completion
Orchestration Built-in multi-step reasoning and multiple tool calls Single-step inference or generation; multi-step flows require external orchestration
Context management Stateful by default, with optional stateless usage Stateless only where the client manages conversation history
Tool support Built-in tools such as File Search, Code Interpreter, and remote MCP Limited to client-side tools through function calling
Multimodal support Supports text and image inputs and text outputs Primarily text, with multimodal support depending on the model
Streaming Event-based streaming with fine-grained events Token-based streaming
Structured output Native structured outputs and JSON schema enforcement JSON mode is supported, but is less composable
File and vector integration Can be used with the Files API and Vector Stores API Requires separate orchestration
Extensibility Designed for tool-enabled workflows, memory, and containers Designed primarily for chat applications