Enable secure and scalable self-service platforms for generative AI and LLMs within OCI
Empower data scientists, developers, and IT teams to independently build, test, and deploy advanced AI models, while ensuring enterprise-level governance and infrastructure optimization.
This solution delivers the core capabilities needed to support generative artificial intelligence (AI) and large language models (LLMs) initiatives in a self-service model—combining secure access, scalable infrastructure, and enterprise-grade governance on Oracle Cloud Infrastructure (OCI).
Use cases and supported services:
- Bring Your Own Large Language Model (BYOLLM) / code security validation
Third-party models (for example, Hugging Face) are deployed in the isolated "playground" environment with GPU acceleration and undergo automated security validation. OCI Functions, Oracle Identity Cloud Service (IDCS), and OCI Identity and Access Management (IAM) policies are used for inspection, access control, and secure execution.
- Data science playground
Data science playground is a flexible and scalable environment designed for data science experimentation. Powered by advanced GPU infrastructure, it offers seamless integration with Oracle Database 23ai and optimized vector and object storage for document management and embeddings, ideal for rapid prototyping and efficient scaling of AI projects.
- Multi-modal AI
OCI supports multi-modal models by integrating text, voice, and image inputs. These models are hosted on high-performance GPU instances.
- Speech-to-text
OCI Speech is Oracle's speech-to-text service that converts audio to text with high accuracy. Integrated into OCI, it supports multiple languages, real-time and batch transcription, and offers advanced features such as speaker diarization, word-level confidence, and offensive language filtering. It also connects seamlessly with other OCI services for scalable, real-time processing.
- Retrieval-augmented generation
OCI provides a comprehensive Retrieval-Augmented Generation (RAG) solution by integrating Oracle Database 23ai and OCI Object Storage with generative AI services. Data is transformed into vector embeddings and stored in Oracle Autonomous Database to enable efficient semantic search. The generated responses are enriched with relevant, up-to-date information. RAG workflows are orchestrated through OCI Connector Hub, supporting event-driven execution, automated data ingestion, and real-time scalability.
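The retrieve-then-augment flow described above can be sketched in a few lines of stdlib-only Python. This is an illustrative stand-in, not the OCI implementation: the hard-coded vectors replace a real embedding model, and in the architecture the vector store rows would live in Oracle Database 23ai.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, store, k=2):
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(store, key=lambda d: cosine_similarity(query_vec, d["vec"]),
                    reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    """Augment the user question with retrieved context before calling an LLM."""
    context = "\n".join(f"- {p['text']}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Toy vector store; in the architecture these rows live in Oracle Database 23ai.
store = [
    {"text": "OKE runs containerized inference services.", "vec": [0.9, 0.1, 0.0]},
    {"text": "Object Storage holds model artifacts.",      "vec": [0.1, 0.9, 0.0]},
    {"text": "FastConnect links on-premises networks.",    "vec": [0.0, 0.1, 0.9]},
]

query_vec = [0.85, 0.15, 0.05]   # stand-in for embed("Where do models run?")
passages = retrieve(query_vec, store, k=1)
prompt = build_prompt("Where do models run?", passages)
```

The same pattern scales up unchanged: only the embedding model, the store, and the final LLM call are swapped for production services.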
- Vector database
Oracle Database 23ai offers native vector database capabilities through the VECTOR data type, enabling the storage of embeddings and semantic search using standard SQL. It supports vector indexing, ONNX-based or external embedding generation, and precision control for similarity queries. Optimized for Exadata, it eliminates the need for separate vector stores, supporting use cases like RAG, recommendations, and generative AI within a unified Oracle environment.
- OCI Generative AI agents
These agents are powered by the OCI Generative AI service or third-party models running on OCI bare metal GPU infrastructure.
- Speed and performance
To ensure high throughput and performance, bare metal instances (for example, A100, H200, B200, and GB200) are used for both training and inference of large models, supporting rapid experimentation and production-grade workloads.
Architecture
This architecture illustrates how Oracle Cloud Infrastructure (OCI) supports end-to-end generative AI workflows across development, integration, and user interaction.
Flow A: Integration
- Customer applications
- Oracle Integration
- OCI Object Storage (buckets)
- OCI Events detection
- OCI Streaming and OCI Connector Hub
- OCI Functions (logic execution)
- Processing layer (inference on GPUs)
- Data layer (Oracle Database 23ai and buckets)
Flow B: User interaction
- End-user interfaces (APEX)
- Applications (OCI GenAI Agents, OCI Speech, Oracle Digital Assistant)
- Processing layer (inference on GPUs)
- Data layer (Oracle Database 23ai and buckets)
Flow C: Development and sandbox
- External model sources
- Code security validation
- Development and testing
- Automation pipeline to production
The following diagram illustrates this reference architecture.
Architecture overview by functional domains
- Development and training (self-service workspace)
The architecture is structured under a centralized compartment for LLM operations:
- Data Science provides an integrated workspace for model development, Jupyter notebooks, and pre-built ML frameworks. Includes quick action tools for model deployment and job execution.
- Model deployment hosts virtual machines (VMs) for model testing and deployment. Users can validate models here before moving them into production.
- Playground is a GPU-accelerated environment (Flex VMs, A10, A100, LS40) offering isolated and high-performance compute resources for custom and third-party models (for example, Hugging Face). It serves as the experimentation zone for Bring Your Own LLM (BYOLLM) workflows.
- Application and function layer
- OCI Speech and language APIs offer ready-to-consume services for transcription, NLU, and entity extraction.
- OCI Functions is used for real-time transcription, NLP, and serverless execution of AI pipelines.
- APEX front-end and monitoring tools provide interfaces for user interaction, analytics, and governance.
- OCI GenAI Agents and Digital Assistant enable conversational experiences using enterprise data and integrated LLMs.
- Processing (production layer)
- OCI Kubernetes Engine (OKE) supports containerized deployment of production models and inference services.
- OCI Generative AI provides API-based access to Oracle-hosted or custom, fine-tuned LLMs, supporting secure and scalable enterprise use cases.
- GPU infrastructure (H100 and RDMA support)
- Bare metal GPU instances (H100 with RDMA) enable multi-node, distributed training and inference with high-throughput, low-latency communication, ideal for massive LLM workloads.
- Optimized for Kubernetes and NVIDIA Multi-Instance GPU (MIG) technology, this setup enables GPU orchestration and dynamic resource sharing, allowing fractional GPU allocation and multi-user scheduling across teams.
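The fractional GPU allocation described above can be illustrated with a toy first-fit packer. The job names and slice counts are hypothetical, and this is a simplified stand-in for what the MIG-aware Kubernetes scheduler actually does with resource requests.

```python
# Each MIG-enabled A100 exposes up to 7 slices (the "g" in 1g.5gb, 2g.10gb, ...).
SLICES_PER_GPU = 7

def allocate(jobs, num_gpus):
    """First-fit packing of (name, slices) jobs onto MIG-partitioned GPUs.

    Returns a mapping of job name -> GPU index, or raises if no GPU has
    enough free slices. A real scheduler also matches concrete MIG profiles;
    this sketch only tracks slice counts.
    """
    free = [SLICES_PER_GPU] * num_gpus
    placement = {}
    for name, slices in jobs:
        for gpu, capacity in enumerate(free):
            if capacity >= slices:
                free[gpu] -= slices
                placement[name] = gpu
                break
        else:
            raise RuntimeError(f"no GPU has {slices} free slices for {name}")
    return placement

# Hypothetical workloads: two light inference jobs, a fine-tune, a medium model.
jobs = [("infer-a", 1), ("finetune", 3), ("infer-b", 1), ("medium", 2)]
placement = allocate(jobs, num_gpus=1)   # 1 + 3 + 1 + 2 = 7 slices: fits one GPU
```

The point of the sketch is the packing math: four independent, hardware-isolated workloads share a single physical GPU with no slice left idle.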
- Data and knowledge layer
- Oracle Database 23ai, enhanced with support for vector and semantic search, acts as the retrieval layer for Retrieval-Augmented Generation (RAG) workflows.
- OCI Object Storage buckets store unstructured data, embeddings, documents, and model artifacts.
- MLOps (production model pipeline)
- The architecture includes a CI/CD pipeline for promoting models from the playground environment to production, currently represented by OCI DevOps: OCI's native, fully-managed continuous integration and continuous delivery (CI/CD) service for automating the deployment of machine learning models from experimentation to production. It provides:
- Integrated build pipelines with Git.
- Automated deployment to VMs or containers.
- Native integration with OCI Artifacts Registry, OCI Functions, and OCI API Gateway.
- Integration and security layer
- OCI Object Storage buckets act as the central storage for models, training data, inference outputs, and embeddings.
- OCI Events, OCI Streaming, and OCI Connector Hub enable event-driven orchestration and service integration across the environment.
- Oracle Identity Cloud Service, IAM policies, OCI Logging, and security lists provide robust governance, authentication, access control, and compliance capabilities across all OCI services.
- Oracle Integration is a pre-built middleware platform that enables secure and seamless integration between on-premises systems and cloud services, supporting real-time data synchronization, API orchestration, and process automation across heterogeneous applications.
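The event-driven orchestration pattern used by OCI Events, OCI Streaming, and OCI Connector Hub (source, optional processing task, target) can be sketched in-process. The event shapes and the ingestion task below are illustrative; only the `com.oraclecloud.objectstorage.*` event-type prefix mirrors the real OCI Events naming.

```python
def run_connector(events, task=None):
    """Apply an optional task to each event; drop events the task rejects.

    Mirrors the Connector Hub pattern: source -> (optional function) -> target.
    Returns the payloads that would be delivered to the target service.
    """
    delivered = []
    for event in events:
        payload = task(event) if task else event
        if payload is not None:          # a task may filter an event out
            delivered.append(payload)
    return delivered

# Hypothetical source: Object Storage emits events as documents land in a bucket.
events = [
    {"type": "com.oraclecloud.objectstorage.createobject", "name": "doc1.pdf"},
    {"type": "com.oraclecloud.objectstorage.deleteobject", "name": "old.pdf"},
]

def ingest_only_new_objects(event):
    """Task: keep only create events and extract the object name for ingestion."""
    if event["type"].endswith("createobject"):
        return {"ingest": event["name"]}
    return None

delivered = run_connector(events, task=ingest_only_new_objects)
```

In the actual architecture the task would be an OCI Function and the target a stream or database, but the filter-transform-deliver contract is the same.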
The architecture has the following components:
- Availability domains
Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain shouldn't affect the other availability domains in the region.
- Bare metal
Oracle’s bare metal servers provide isolation, visibility, and control by using dedicated compute instances. The servers support applications that require high core counts, large amounts of memory, and high bandwidth. They can scale up to 192 cores, 2.3 TB of RAM, and up to 1 PB of block storage. Customers can build cloud environments on Oracle’s bare metal servers with significant performance improvements over other public clouds and on-premises data centers.
- Compartment
Compartments are cross-regional logical partitions within an OCI tenancy. Use compartments to organize, control access, and set usage quotas for your Oracle Cloud resources. In a given compartment, you define policies that control access and set privileges for resources.
- Connector Hub
Oracle Cloud Infrastructure Connector Hub is a message bus platform that orchestrates data movement between services on OCI. You can use connectors to move data from a source service to a target service. Connectors also enable you to optionally specify a task (such as a function) to perform on the data before it is delivered to the target service.
You can use OCI Connector Hub to quickly build a logging aggregation framework for security information and event management (SIEM) systems.
- Dynamic routing gateway (DRG)
The DRG is a virtual router that provides a path for private network traffic between VCNs in the same region, between a VCN and a network outside the region, such as a VCN in another OCI region, an on-premises network, or a network in another cloud provider.
- FastConnect
Oracle Cloud Infrastructure FastConnect creates a dedicated, private connection between your data center and OCI. FastConnect provides higher-bandwidth options and a more reliable networking experience when compared with internet-based connections.
- High-performance computing
High-performance computing is designed for workloads that require cluster networking and high-speed processor cores for massively parallel workloads.
- Internet gateway
An internet gateway allows traffic between the public subnets in a VCN and the public internet.
- On-premises network
This is a local network used by your organization.
- Region
An OCI region is a localized geographic area that contains one or more data centers, hosting availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).
- Route table
Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.
- Security list
For each subnet, you can create security rules that specify the source, destination, and type of traffic that is allowed in and out of the subnet.
- Service gateway
A service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and does not traverse the internet.
- Tenancy
A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for OCI. You can create, organize, and administer your resources on OCI within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.
- Virtual cloud network (VCN) and subnet
A VCN is a customizable, software-defined network that you set up in an OCI region. Like traditional data center networks, VCNs give you control over your network environment. A VCN can have multiple non-overlapping classless inter-domain routing (CIDR) blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.
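The non-overlap constraint on subnet CIDR blocks can be checked before provisioning with Python's stdlib `ipaddress` module. The addresses below are illustrative, not a recommended layout.

```python
import ipaddress

def validate_subnets(vcn_cidr, subnet_cidrs):
    """Check that every subnet fits inside the VCN and that none overlap."""
    vcn = ipaddress.ip_network(vcn_cidr)
    subnets = [ipaddress.ip_network(c) for c in subnet_cidrs]
    for s in subnets:
        if not s.subnet_of(vcn):
            raise ValueError(f"{s} is outside VCN {vcn}")
    for i, a in enumerate(subnets):
        for b in subnets[i + 1:]:
            if a.overlaps(b):
                raise ValueError(f"{a} overlaps {b}")
    return True

# Illustrative plan: one public and one private subnet in a /16 VCN.
ok = validate_subnets("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"])
```

Running the same check against an overlapping pair (for example, `10.0.1.0/24` and `10.0.1.128/25`) raises a `ValueError` instead of returning.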
- Oracle Database 23ai
Oracle Database 23ai release has a focus on AI and developer productivity. It brings AI to your data with the addition of AI Vector Search to Oracle’s converged database. This capability combined with new unified development paradigms and mission-critical capabilities makes it simple for developers and data professionals to power apps, application development, and mission-critical workloads with AI.
- Events
Services on OCI emit events, which are structured messages that describe the changes in resources. Events are emitted for create, read, update, or delete (CRUD) operations, resource lifecycle state changes, and system events that affect cloud resources.
- Logging
Oracle Cloud Infrastructure Logging is a highly-scalable and fully-managed service that provides access to the following types of logs from your resources in the cloud:
- Audit logs: Logs related to events produced by OCI Audit.
- Service logs: Logs published by individual services such as OCI API Gateway, OCI Events, OCI Functions, OCI Load Balancing, OCI Object Storage, and VCN flow logs.
- Custom logs: Logs that contain diagnostic information from custom applications, other cloud providers, or an on-premises environment.
- Monitoring
Oracle Cloud Infrastructure Monitoring actively and passively monitors your cloud resources, and uses alarms to notify you when metrics meet specified triggers.
- OCI Registry
Oracle Cloud Infrastructure Registry is an Oracle-managed service that enables you to simplify your development-to-production workflow. Registry makes it easy for you to store, share, and manage development artifacts, like Docker images.
- Speech
Oracle Cloud Infrastructure Speech harnesses the power of spoken language, enabling you to easily convert media files containing human speech into highly accurate text transcriptions. You can access it by using the Console, REST API, CLI, or SDK.
- Workflow
Oracle Cloud Infrastructure Workflow is a serverless workflow engine with a graphical flow designer for developers and architects. It accelerates the creation, running, and orchestration of OCI services such as OCI Functions or AI/ML.
- APEX Service
Oracle APEX Application Development is a low-code development platform that enables you to build scalable, feature-rich, secure, enterprise apps that can be deployed anywhere that Oracle Database is installed. You don't need to be an expert in a vast array of technologies to deliver sophisticated solutions. APEX Service includes built-in features such as user interface themes, navigational controls, form handlers, and flexible reports that accelerate the application development process.
- API Gateway
Oracle Cloud Infrastructure API Gateway enables you to publish APIs with private endpoints that are accessible from within your network, and which you can expose to the public internet if required. The endpoints support API validation, request and response transformation, CORS, authentication and authorization, and request limiting.
- OCI Block Volumes
With Oracle Cloud Infrastructure Block Volumes, you can create, attach, connect, and move storage volumes, and change volume performance to meet your storage, performance, and application requirements. After you attach and connect a volume to an instance, you can use the volume like a regular hard drive. You can also disconnect a volume and attach it to another instance without losing data.
- Compute
With Oracle Cloud Infrastructure Compute, you can provision and manage compute hosts in the cloud. You can launch compute instances with shapes that meet your resource requirements for CPU, memory, network bandwidth, and storage. After creating a compute instance, you can access it securely, restart it, attach and detach volumes, and terminate it when you no longer need it.
- Data Science
Oracle Cloud Infrastructure Data Science is a fully-managed, serverless platform that data science teams can use to build, train, and manage machine learning (ML) models on OCI. It can easily integrate with other OCI services such as Oracle Autonomous Data Warehouse, Oracle Cloud Infrastructure Object Storage, and more. You can build and evaluate high-quality machine learning models that increase business flexibility by putting enterprise-trusted data to work quickly, and you can support data-driven business objectives with easier deployment of ML models. Data Science enables data scientists and machine learning engineers to use packages from the Anaconda Repository for free.
The Data Science Jobs feature enables data scientists to define and run repeatable machine learning tasks on a fully-managed infrastructure.
The Data Science Model Deployment feature allows data scientists to deploy trained models as fully-managed HTTP endpoints that can provide predictions in real time, infusing intelligence into processes and applications, and allowing the business to react to relevant events as they occur.
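The shape of such a real-time scoring endpoint (JSON request in, JSON prediction out) can be sketched without the managed service itself. The averaging "model" below is a trivial stand-in for a loaded model artifact, and the handler signature is illustrative.

```python
import json

def model_predict(features):
    """Stand-in model: real deployments load a serialized model artifact."""
    return {"score": sum(features) / len(features)}

def handler(request_body: bytes):
    """Minimal predict handler: (status, body) pair mirroring a scoring endpoint."""
    try:
        payload = json.loads(request_body)
        features = payload["features"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return 400, json.dumps({"error": "expected {'features': [...]}"}).encode()
    return 200, json.dumps(model_predict(features)).encode()

status, body = handler(b'{"features": [1.0, 2.0, 3.0]}')
```

The managed service wraps exactly this contract in a fully-managed HTTPS endpoint, so application code only ever sees the JSON interface.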
- DevOps
Oracle Cloud Infrastructure DevOps (developer operations) is a complete continuous integration/continuous delivery (CI/CD) platform for developers to simplify and automate their software development lifecycle. OCI DevOps enables developers and operators to collaboratively develop, build, test, and deploy software. Developers and operators get visibility across the full development lifecycle with a history of source commit through build, test, and deploy phases.
- Functions
Oracle Cloud Infrastructure Functions is a fully-managed, multitenant, highly scalable, on-demand, Functions-as-a-Service (FaaS) platform. It is powered by the Fn Project open source engine. OCI Functions enables you to deploy your code, and either call it directly or trigger it in response to events. OCI Functions uses Docker containers hosted in Oracle Cloud Infrastructure Registry.
- Identity and Access Management
Oracle Cloud Infrastructure Identity and Access Management (IAM) provides user access control for OCI and Oracle Cloud Applications. The IAM API and the user interface enable you to manage identity domains and the resources within them. Each OCI IAM identity domain represents a standalone identity and access management solution or a different user population.
- Integration
Oracle Integration is a fully-managed, preconfigured environment that allows you to integrate cloud and on-premises applications, automate business processes, and develop visual applications. It uses an SFTP-compliant file server to store and retrieve files and allows you to exchange documents with business-to-business trading partners by using a portfolio of hundreds of adapters and recipes to connect with Oracle and third-party applications.
- Kubernetes cluster
A Kubernetes cluster is a set of machines that run containerized applications. Kubernetes provides a portable, extensible, open source platform for managing containerized workloads and services in those nodes. A Kubernetes cluster is formed of worker nodes and control plane nodes.
- Kubernetes control plane
A Kubernetes control plane manages the resources for the worker nodes and pods within a Kubernetes cluster. The control plane components detect and respond to events, perform scheduling, and move cluster resources.
The following are the control plane components:
- kube-apiserver: Runs the Kubernetes API server.
- etcd: Distributed key-value store for all cluster data.
- kube-scheduler: Determines which node new unassigned pods will run on.
- kube-controller-manager: Runs controller processes.
- cloud-controller-manager: Links your cluster with cloud-specific API.
- OCI Kubernetes Engine
Oracle Cloud Infrastructure Kubernetes Engine (OCI Kubernetes Engine or OKE) is a fully-managed, scalable, and highly available service that you can use to deploy your containerized applications to the cloud. You specify the compute resources that your applications require, and OKE provisions them on OCI in an existing tenancy. OKE uses Kubernetes to automate the deployment, scaling, and management of containerized applications across clusters of hosts.
- Kubernetes worker node
A Kubernetes worker node is a worker machine that runs containerized applications within a Kubernetes cluster. Every cluster has at least one worker node.
- Object Storage
OCI Object Storage provides access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store data directly from the internet or from within the cloud platform. You can scale storage without experiencing any degradation in performance or service reliability.
Use standard storage for "hot" storage that you need to access quickly, immediately, and frequently. Use archive storage for "cold" storage that you retain for long periods of time and seldom or rarely access.
- Streaming
Oracle Cloud Infrastructure Streaming provides a fully-managed, scalable, and durable storage solution for ingesting continuous, high-volume streams of data that you can access and process in real time. You can use OCI Streaming for ingesting high-volume data, such as application logs, operational telemetry, web click-stream data; or for other use cases where data is produced and processed continually and sequentially in a publish-subscribe messaging model.
- Audit
The Oracle Cloud Infrastructure Audit service automatically records calls to all supported OCI public application programming interface (API) endpoints as log events. All OCI services support logging by Oracle Cloud Infrastructure Audit.
- Generative AI
Oracle Cloud Infrastructure Generative AI is a fully-managed OCI service that provides a set of state-of-the-art, customizable, large language models (LLMs) that cover a wide range of use cases for text generation, summarization, semantic search, and more. Use the playground to try out the ready-to-use pretrained models, or create and host your own fine-tuned custom models based on your own data on dedicated AI clusters.
- Load balancer
Oracle Cloud Infrastructure Load Balancing provides automated traffic distribution from a single entry point to multiple servers.
- Network address translation (NAT) gateway
A NAT gateway enables private resources in a VCN to access hosts on the internet, without exposing those resources to incoming internet connections.
- Digital Assistant
Oracle Digital Assistant is a platform that allows you to create and deploy digital assistants for your users. With Oracle Digital Assistant, you can create AI-driven interfaces (or chatbots) for business applications through text, chat, and voice interfaces. Each digital assistant has a collection of one or more specialized skills to help users complete a variety of tasks in natural language conversations. For example, an individual digital assistant might have skills that focus on specific types of tasks such as tracking inventory, submitting time cards, and creating expense reports.
- Policy
An Oracle Cloud Infrastructure Identity and Access Management policy specifies who can access which resources, and how. Access is granted at the group and compartment level, which means you can write a policy that gives a group a specific type of access within a specific compartment or to the tenancy.
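IAM policy statements follow a fixed grammar ("Allow group ... to <verb> <resource> in compartment ..."), with verbs ordered so that each one includes the permissions of those before it. A toy evaluator makes the semantics concrete; the group, resource, and compartment names are illustrative, and real policy evaluation involves far more (conditions, dynamic groups, resource families).

```python
import re

# OCI policy statements follow a fixed grammar, for example:
#   Allow group DataScientists to use data-science-family in compartment llm-dev
STATEMENT = re.compile(
    r"Allow group (?P<group>\S+) to (?P<verb>inspect|read|use|manage) "
    r"(?P<resource>\S+) in compartment (?P<compartment>\S+)",
    re.IGNORECASE,
)

# Verbs are cumulative: manage > use > read > inspect.
VERB_RANK = {"inspect": 0, "read": 1, "use": 2, "manage": 3}

def is_allowed(policies, group, verb, resource, compartment):
    """Toy check: does any statement grant `group` at least `verb` on `resource`?"""
    for statement in policies:
        m = STATEMENT.match(statement)
        if not m:
            continue
        if (m["group"] == group and m["resource"] == resource
                and m["compartment"] == compartment
                and VERB_RANK[m["verb"].lower()] >= VERB_RANK[verb]):
            return True
    return False

policies = [
    "Allow group DataScientists to use data-science-family in compartment llm-dev",
]
```

With this policy, `is_allowed` grants the group `read` access (implied by `use`) but denies `manage`, which mirrors how the cumulative verb model limits what a self-service team can do in its own compartment.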
- Security zone
Security zones implement key Oracle security best practices by enforcing policies for an entire compartment, such as encrypting data and preventing public access to networks. A security zone is associated with a compartment of the same name and includes security zone policies (a recipe) that applies to the compartment and its sub-compartments. You can't add or move a standard compartment to a security zone compartment.
Recommendations
Ensure the success, scalability, and sustainability of the enterprise AI platform built on OCI, with a focus on self-service LLM development, MLOps, GPU optimization, and enterprise-grade integration.
- Platform and strategy governance
- Create a dedicated cross-functional center of excellence for AI to govern:
- Best practices in LLM training and deployment
- Resource allocation and quota management
- Security, compliance, and ethical AI usage
- Avoid uncontrolled resource sprawl by enabling quotas and tagging policies in OCI to ensure traceability and cost accountability across departments and teams.
- GPU resource efficiency and scheduling
- Use NVIDIA MIG to optimize GPU usage. Assign fractional GPUs per job or user to increase utilization and lower costs.
- GPU fractionation solution:
- Multi-instance GPU (MIG) is a feature available on NVIDIA A100 and H100 GPUs that enables the partitioning of a single physical GPU into multiple, hardware-isolated instances (or slices), known as GPU instances.
Each instance functions as an independent GPU with its own:
- Dedicated memory
- Compute cores
- Cache and memory bandwidth
This allows teams to run multiple AI workloads concurrently on a single GPU with predictable performance and hardware-level isolation.
The OCI Kubernetes Engine (OKE) is configured to support MIG-aware scheduling, allowing:
- Each pod to request a specific MIG instance (for example, 1/7th of an A100).
- The Kubernetes scheduler to intelligently allocate available GPU slices based on requests.
- MIG instances to be exposed via the NVIDIA device plug-in and node-feature-discovery, ensuring they are discoverable and schedulable by OKE.
- MIG-enabled GPUs (for example, A100 or H100) are deployed on OCI bare metal instances or as OKE worker nodes.
- OKE handles containerized AI workloads with MIG-aware scheduling.
The MIG profiles commonly used on an A100 (40 GB) are:

MIG profile   Slice fraction   Dedicated memory   Suitable for
1g.5gb        1/7              5 GB               Lightweight inference, testing
2g.10gb       2/7              10 GB              Fine-tuning smaller models
3g.20gb       3/7              20 GB              Medium-sized models
7g.40gb      Full GPU          40 GB              Full-scale training

- Use OCI Monitoring to avoid bottlenecks in high-demand phases (for example, model training sprints).
- Model lifecycle and automation
- Standardize CI/CD by deploying models via OCI DevOps pipelines integrated with Git and Container Registry to automate:
- Model packaging
- Testing and validation
- Deployment to OKE or Functions
- Include rollback and validation steps by incorporating A/B testing, canary deployments, and rollback logic to avoid regressions in model behavior.
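The canary-with-rollback logic recommended above reduces to a small decision function: shift more traffic to the new model version while its error rate stays healthy, and roll back the moment it regresses. The thresholds and increments below are illustrative, not recommendations.

```python
def canary_step(traffic_to_canary, canary_error_rate,
                max_error_rate=0.02, increment=0.25):
    """One evaluation step of a canary rollout.

    Returns the new canary traffic share, or -1.0 to signal rollback.
    Threshold and increment values are hypothetical.
    """
    if canary_error_rate > max_error_rate:
        return -1.0                       # regression detected: roll back
    return min(1.0, traffic_to_canary + increment)

# Simulated rollout of a new model version with healthy canary metrics.
share = 0.05
history = []
for error_rate in [0.010, 0.015, 0.012]:
    share = canary_step(share, error_rate)
    history.append(share)
```

In a pipeline, each step would be driven by real metrics (from OCI Monitoring, for instance) and the rollback branch would redeploy the previous model artifact.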
- Data architecture
- Deploy Oracle Database 23ai for storing embeddings and enabling semantic search via Oracle AI Vector Search. Combine it with:
- OCI Object Storage for documents
- OCI Functions for retrieval orchestration
- Maintain vector freshness by recomputing and updating embeddings regularly when source documents change to ensure RAG output accuracy.
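One common way to implement the freshness check above is to store a content hash alongside each embedding and recompute only the vectors whose source text has changed. This is a hedged stdlib sketch; the document names and index layout are illustrative.

```python
import hashlib

def fingerprint(text: str) -> str:
    """Stable content hash used to detect when a document has changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def stale_documents(documents, index):
    """Return names whose current content no longer matches the hash stored
    alongside their embedding, i.e. whose vectors must be recomputed."""
    return [name for name, text in documents.items()
            if index.get(name) != fingerprint(text)]

# Current source documents vs. the hashes their embeddings were computed from.
documents = {
    "faq.md": "How do I reset my password? (updated)",
    "intro.md": "Welcome.",
}
index = {
    "faq.md": fingerprint("How do I reset my password?"),   # now outdated
    "intro.md": fingerprint("Welcome."),                    # still current
}

to_refresh = stale_documents(documents, index)
```

Only the changed document is re-embedded, which keeps RAG answers current without paying to recompute the entire corpus on every update.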
- Deploy Oracle Database 23ai for storing embeddings and enabling semantic search via Oracle AI
Vector Search. Combine it with:
- Security, compliance, and observability
- Enforce IAM-based segmentation by using OCI IAM policies, compartments, and groups to define clear boundaries between development, testing, and production environments.
- Log and audit every critical action by enabling OCI Logging, OCI Monitoring, and OCI Audit logs for all critical components (OKE, Functions, Storage, GPU nodes).
- Multicloud and hybrid integration
- Use OCI FastConnect, service gateway, and private endpoints to ensure high-speed, secure integration with on-premises and third-party AI services (for example, Azure OpenAI, AWS Bedrock).
- Avoid public internet exposure for sensitive workloads. Use private subnets, NAT gateways, and service-to-service authentication whenever possible.
- Self-service enablement for developers
- Provide curated starter templates and APIs by offering a set of OCI Functions, OCI DevOps pipelines, and notebook templates to on-board new users quickly and safely into the self-service environment.
- Balance autonomy with safeguards by empowering users while maintaining control through policies, quotas, and shared best practices for responsible model development.
Considerations
Consider the following points when deploying this reference architecture.
- Performance
- Deploy high-performance GPU instances, such as A100, H100, H200, B200, and GB200, tailored to specific workload requirements, including training, inference, and large-scale distributed AI.
- Leverage RDMA-enabled GPU clusters for high bandwidth, low-latency distributed workloads.
- Continuously monitor resource usage to proactively mitigate contention.
- Security
- Implement compartmentalization and private subnets to isolate different operational environments.
- Enforce stringent access controls using OCI IAM and IDCS.
- Maintain comprehensive logging and audit trails for all significant operations.
- Availability
- Distribute critical resources across multiple fault domains to ensure fault tolerance.
- Utilize OCI Kubernetes Engine (OKE) with auto-scaling to maintain elasticity.
- Validate backup, recovery, and data replication strategies to achieve business continuity objectives.
- Cost
- Maximize GPU utilization efficiency via NVIDIA MIG fractionalization.
- Implement tiered storage strategies, leveraging OCI Object Storage lifecycle policies.
- Use project-level tagging and budget quotas to maintain financial accountability.
- Integration and deployment
- Standardize CI/CD workflows with OCI DevOps to streamline and automate the model lifecycle.
- Ensure consistent multicloud integration practices using OCI FastConnect and dynamic routing gateways (DRGs) for secure data flows.
- Data management
- Regularly manage and refresh semantic embeddings in Oracle Database 23ai for accurate retrieval.
- Categorize storage appropriately by data usage patterns (standard versus archive).
- User adoption and management
- Provide structured onboarding resources to accelerate self-service adoption.
- Continuously evaluate self-service environments and adjust policies to balance user freedom with operational governance.
Explore More
Learn more about how OCI empowers scalable, secure, and enterprise-ready generative AI solutions.
Review these additional resources:
- Artificial intelligence:
- Generative AI Agents
- Generative AI Agents in Oracle Cloud Infrastructure Documentation
- AI Solutions Hub
- Multi AI Agents with Oracle Digital Assistant (video)
- What Is Retrieval-Augmented Generation (RAG)?
- Deploy multicloud generative AI retrieval augmented generation (RAG)
- NVIDIA MIG User Guide
- Oracle Cloud Infrastructure:
- Oracle Cloud Infrastructure Documentation
- OCI Speech
- OCI OKE RDMA (GitHub)
- Well-architected framework for Oracle Cloud Infrastructure
- Oracle Cloud Cost Estimator
- FastConnect Overview in Oracle Cloud Infrastructure Documentation
- Networking Overview in Oracle Cloud Infrastructure Documentation
- Security Overview in Oracle Cloud Infrastructure Documentation
- Overview of Object Storage in Oracle Cloud Infrastructure Documentation
- GPU Shapes in Oracle Cloud Infrastructure Documentation
- Cloud Adoption Framework
- Deploy multicloud inbound and outbound private network connectivity
- Oracle Integration:
- Oracle Integration 3 in Oracle Cloud Infrastructure Documentation
- Using Integrations in Oracle Integration 3 – Design Best Practices
- Using Integrations in Oracle Integration 3 – About the Connectivity Agent