OCI Chat Completions API
Use the OCI Chat Completions API if you already have application code built around the Chat Completions API or if you need a simpler stateless chat interface. For new applications, we recommend that you use the OCI Responses API. If you need OCI-managed multi-turn conversation history, use the Responses API together with the Conversations API.
Note
The OCI Chat Completions API uses the same format as the OpenAI Chat Completions API with the OCI OpenAI-compatible endpoint. For syntax and request details, see the OpenAI Chat Completions API documentation.
The OCI Chat Completions API uses the same format as the OpenAI Chat Completions API with the OCI OpenAI-compatible endpoint. For syntax and request details, see the OpenAI Chat Completions API documentation.
Important
We recommend that you use the OCI Responses API so you can use the latest tool support and the Conversations API.
We recommend that you use the OCI Responses API so you can use the latest tool support and the Conversations API.
Use the Responses API when you need:
- Supported tools
- Multi-step workflows
- Structured outputs
- OCI-managed conversation history
- A primary interface for agentic workflows
Use the Chat Completions API when you:
- Already have application code built around
/chat/completions - Need a simpler stateless chat interface
- Don't need Responses API features such as supported tools or OCI-managed conversation state
Supported API Endpoint
| Base URL | Endpoint Path | Authentication |
|---|---|---|
https://inference.generativeai.${region}.oci.oraclecloud.com/openai/v1 |
/chat/completions |
API key or IAM session |
Replace ${region} with a supported region such as us-chicago-1.
Although the request format is OpenAI-compatible, authentication uses OCI credentials, requests are routed through OCI Generative AI inference endpoints, and resources and execution remain in OCI.
Responses API or Chat Completions?
| Dimension | OCI Responses API | OCI Chat API using Chat Completions API |
|---|---|---|
| Primary use | Unified API for model interaction and agentic capabilities | API for model interaction |
| Best fit | Interactive chat, agentic workloads, and long-running tasks | Interactive chatbots and text completion |
| Orchestration | Built-in multi-step reasoning and multiple tool calls | Single-step inference or generation; multi-step flows require external orchestration |
| Context management | Stateful by default, with optional stateless usage | Stateless only where the client manages conversation history |
| Tool support | Built-in tools such as File Search, Code Interpreter, and remote MCP | Limited to client-side tools through function calling |
| Multimodal support | Supports text and image inputs and text outputs | Primarily text, with multimodal support depending on the model |
| Streaming | Event-based streaming with fine-grained events | Token-based streaming |
| Structured output | Native structured outputs and JSON schema enforcement | JSON mode is supported, but is less composable |
| File and vector integration | Can be used with the Files API and Vector Stores API | Requires separate orchestration |
| Extensibility | Designed for tool-enabled workflows, memory, and containers | Designed primarily for chat applications |