Use Meta Llama 4 in OCI Generative AI
- Services: Generative AI
- Release Date: May 14, 2025
OCI Generative AI now supports two Meta Llama 4 models, Scout and Maverick. Both use a Mixture of Experts (MoE) architecture, in which only a subset of the model's parameters is active for each token, so inference stays efficient relative to total model size. The Llama 4 series is optimized for multimodal understanding, multilingual tasks, coding, tool calling, and agentic systems in enterprise AI applications.
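To make the MoE idea concrete, here is a minimal, self-contained sketch of top-k expert routing. It is a toy illustration only, not Meta's implementation: the function names (`route_top_k`, `moe_forward`), the scalar "experts", and the random router scores are all hypothetical. The point it shows is that only the selected experts are evaluated for a given input.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k=1):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so they sum to 1 over the selected experts.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_scores, k=1):
    """Evaluate only the selected experts and combine their outputs."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


# Hypothetical setup: 16 toy "experts", each scaling its input differently.
NUM_EXPERTS = 16
experts = [lambda x, s=i + 1: s * x for i in range(NUM_EXPERTS)]

random.seed(0)
scores = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_top_k(scores, k=1)
output = moe_forward(2.0, experts, scores, k=1)
print(f"experts evaluated: {len(selected)} of {NUM_EXPERTS}, output = {output}")
```

In a real MoE model the experts are feed-forward sub-networks and the router scores are computed from each token's hidden state, but the same selection step is what keeps the active parameter count well below the total.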
- Key Highlights
- Multimodal Capabilities: Both models are natively multimodal, capable of processing and integrating various data types, including text and images.
- Multilingual Support: Trained on data encompassing 200 languages, with fine-tuning support for 12 languages including Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese. Image understanding is limited to English.
- Efficient Deployment: Llama 4 Scout is designed for accessibility with a smaller GPU footprint.
- Knowledge Cutoff: August 2024
- Usage Restrictions: The Llama 4 Acceptable Use Policy restricts use of these models in the European Union (EU).
- Availability: Both models are available for on-demand inferencing and dedicated hosting.
 
- Available Regions
- US Midwest (Chicago) (on-demand and dedicated AI clusters)
- Brazil East (Sao Paulo) (dedicated AI clusters)
- Japan Central (Osaka) (dedicated AI clusters)
- UK South (London) (dedicated AI clusters)
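The region list above can be captured as a small lookup table when you need to choose a region by serving mode. This is only a sketch: the `AVAILABILITY` mapping, the mode labels, and the `regions_for` helper are hypothetical names, and the data simply mirrors the list in this release note rather than querying OCI.

```python
# Availability as listed in this release note (illustrative structure, not an OCI API).
AVAILABILITY = {
    "US Midwest (Chicago)": {"on-demand", "dedicated"},
    "Brazil East (Sao Paulo)": {"dedicated"},
    "Japan Central (Osaka)": {"dedicated"},
    "UK South (London)": {"dedicated"},
}


def regions_for(mode):
    """Return the regions, sorted by name, that offer the given serving mode."""
    return sorted(region for region, modes in AVAILABILITY.items() if mode in modes)


print(regions_for("on-demand"))
print(regions_for("dedicated"))
```

At this release, on-demand inferencing resolves to a single region, so workloads elsewhere would need a dedicated AI cluster.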
 
- Meta Llama 4 Scout
- Architecture: Features 17 billion active parameters within a total of about 109 billion parameters, using 16 experts.
- Context Window: Supports a context length of 192k tokens.
- Deployment: Designed for efficient operation on a small GPU footprint.
- Performance: Outperforms earlier Llama models across many benchmarks.
 
- Meta Llama 4 Maverick
- Architecture: Like Meta Llama 4 Scout, this model has 17 billion active parameters, but within a larger total of about 400 billion parameters, using 128 experts.
- Context Window: Supports a context length of 512k tokens.
- Performance: Matches advanced models in coding and reasoning tasks.
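The figures quoted for the two models can be compared with a little arithmetic. The sketch below is illustrative only: `ModelSpec` and its methods are hypothetical names, "k" is assumed to mean 1,000 tokens, and the `fits` check assumes that prompt and output tokens share the same context window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    active_params_b: float  # billions of parameters active per token
    total_params_b: float   # approximate total parameters, in billions
    experts: int
    context_tokens: int     # assumes "k" = 1,000 tokens

    @property
    def active_fraction(self):
        """Share of the total parameters that is active per token."""
        return self.active_params_b / self.total_params_b

    def fits(self, prompt_tokens, max_output_tokens):
        """Assumption: prompt and output tokens count against one window."""
        return prompt_tokens + max_output_tokens <= self.context_tokens


# Values taken from the model descriptions in this release note.
SCOUT = ModelSpec("Meta Llama 4 Scout", 17, 109, 16, 192_000)
MAVERICK = ModelSpec("Meta Llama 4 Maverick", 17, 400, 128, 512_000)

for spec in (SCOUT, MAVERICK):
    print(f"{spec.name}: {spec.active_fraction:.2%} of parameters active per token")

# A hypothetical 200,000-token prompt with 4,000 output tokens.
print("Scout fits:", SCOUT.fits(200_000, 4_000))
print("Maverick fits:", MAVERICK.fits(200_000, 4_000))
```

Both models activate the same 17 billion parameters per token, so Maverick's larger total shows up as a much smaller active fraction.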
 
Important Note: Before you use these models, review Meta's Llama 4 Acceptable Use Policy.
For a list of offered models and their regions, see Pretrained Foundational Models in Generative AI. For information about the service, see the Generative AI documentation.