Deploy Multicloud Generative AI Retrieval Augmented Generation (RAG)

Oracle Cloud Infrastructure Generative AI is a fully managed service for seamlessly integrating large language models (LLM) into a wide range of use cases, including writing assistance, summarization, analysis, and chat.

Use OCI Generative AI and Oracle Integration in a multicloud solution, such as one that spans Oracle Cloud Infrastructure and Microsoft Azure, to query relevant organizational data and use it to provide highly contextualized answers.

In a RAG architecture, Oracle Integration can play the role of a data orchestrator, ensuring that all relevant data sources are available for retrieval. Then, Oracle Cloud Infrastructure Generative AI Agents take over to leverage that data to provide contextually rich answers.

OCI GenAI Agents process that data to provide a contextual response by embedding the retrieved documents and querying the large language model (LLM) with that context to enhance the generated response.

So, while both are involved in the data lifecycle, their roles are distinct but complementary in building a multicloud RAG architecture.

This multicloud RAG approach provides:

  • Multicloud flexibility: The architecture integrates multiple cloud platforms (OCI and Azure), making it adaptable to the data landscape within enterprises.
  • High-performance connectivity: Oracle Interconnect for Microsoft Azure ensures fast, secure, and reliable data transfer between cloud environments.
  • Dynamic content generation: The agent pulls the most current information from disparate sources, which helps keep LLM responses accurate and relevant.
  • Embedded document search: By using embeddings and semantic search, OCI GenAI Agents can provide deeper insights based on context rather than just keyword matches.

Architecture

This multicloud solution sources data from both Microsoft Azure and Oracle Cloud Infrastructure (OCI), enabling Oracle Cloud Infrastructure Generative AI Agents to access a broader range of up-to-date information.

OCI GenAI Agents and Oracle Integration together support the retrieve, augment, and generate steps of RAG to provide highly contextualized results.

OCI GenAI Agents specifically focus on using generative AI to respond to user queries by retrieving relevant information from knowledge bases or documents to generate answers. The agent provides enriched, context-aware responses by leveraging advanced AI techniques, embeddings, and document chunking to understand and generate relevant content:

  • Retrieve: Extract relevant data from the knowledge sources, usually through advanced hybrid search, combining lexical and semantic search.
  • Augment: Use the retrieved data to provide context for a query, ensuring that the generative AI model has the necessary information.
  • Generate: Use large language models (LLMs) to generate contextual responses to user questions, often enhanced by the data retrieved in the previous steps.
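
The Retrieve step mentions hybrid search that combines lexical and semantic results. The source does not say how OCI GenAI Agents merge the two result lists, so the following is only a minimal sketch of one common fusion technique, reciprocal rank fusion. The function name, the sample document IDs, and the constant k=60 are assumptions chosen for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not an OCI setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results from a keyword search and a vector search.
lexical = ["doc_a", "doc_b", "doc_c"]
semantic = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents found by only one search method are still kept as candidates for the Augment step.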

Oracle Integration, on the other hand, provides integration services that connect various applications and systems, allowing for orchestration of data flows across multiple environments:

  • Retrieve: Facilitates data retrieval from different sources by using connectivity agents to privately connect to various data sources or services (database, REST APIs, cloud storage, and so on) on Azure or other hyperscalers.
  • Orchestrate/Augment: Orchestrates workflows and integrates data from multiple sources, augmenting processes by enriching data through preconfigured or dynamic transformations.
  • Manage Data Flow: Unlike the RAG agent, Oracle Integration is not focused on generating responses from data but rather on enabling the smooth movement and transformation of data between systems and applications, ensuring that all the relevant data is available for different services.

Functional Area | OCI GenAI Agents | Oracle Integration
Purpose | Designed to provide AI-driven responses by retrieving data, augmenting it, and using an LLM for generating responses. | Designed to integrate and orchestrate data across multiple applications, providing seamless data connectivity but without the LLM-driven generation capabilities.
Data Handling | Uses data to generate natural language responses in a context-aware manner. | Handles data flow between applications, acting as a bridge between systems without generating content in the same way an LLM does.
Generative Capabilities | Has generative AI capabilities and uses LLMs to generate conversational responses or other output. | Does not have generative AI capabilities and is used to connect, retrieve, and transform data across services.

The following diagram illustrates the data flow through the architecture:



multicloud-genai-rag-process-oracle.zip

  1. The user interacts with either Oracle Digital Assistant or OCI GenAI Agents, depending on the implementation, to deliver user queries and prompts.
  2. Oracle Integration orchestrates calls among different components: pulling from data sources, handling document ingestion, and passing user prompts downstream.
  3. Data sources include:
    • Oracle Interconnect for Microsoft Azure provides a high-bandwidth link between OCI and Azure for document repositories, Oracle Database@Azure, and so on.
    • Local file repositories provide on-premises or local files for ingestion.
    • OCI Services, such as Oracle Fusion Cloud Enterprise Resource Planning.
    • Oracle Database@Azure in a delegated subnet for data sharing across Oracle-managed services on Azure.
  4. The document ingestion, chunking, and embedding process can be implemented in different ways:
    1. Oracle Integration (using embedded JavaScript or custom libraries) performs chunking and calls OCI Generative AI to embed.
    2. OCI Functions receives documents, chunks them, then calls OCI Generative AI for embeddings.
    3. Oracle Autonomous Database 23ai performs chunking and embedding using vector functionality.

    In each case, the result is a set of text chunks with their vector embeddings, managed entirely within the multicloud environment.

  5. Vectors and chunks are stored in Oracle Autonomous Database 23ai:
    • The typical approach is to store embeddings in the vector index of Oracle Autonomous Database 23ai.
    • The chunk text itself can also be stored directly in a database CLOB (for quick retrieval), or as references that point to the chunk text in OCI Object Storage or in Azure Data Lake.
    • OCI Object Storage can store the original documents if needed, but you don’t necessarily need to keep embeddings there if you’re querying the vector store in the database.
  6. When the user asks a question, OCI GenAI Agents (or the Digital Assistant) calls Oracle Autonomous Database 23ai to perform a vector similarity search using the user prompt’s embedding to identify the best matching chunks based on vector similarity scores.
  7. OCI Generative AI generates embeddings for questions and document chunks, and generates responses using LLMs, providing contextually enriched answers. How chunks are retrieved and passed to the LLM also depends on the implementation:
    • If chunk text is stored in the database, it can be retrieved directly.
    • If only references are stored, the system fetches the actual chunk content from OCI Object Storage, Azure Data Lake, or another repository.
    • The relevant chunks are then fed to the LLM in OCI Generative AI along with the user’s original prompt to produce a contextually enriched response.
  8. The final answer is returned either by the Oracle Digital Assistant or by the OCI GenAI Agents interface, depending on the front end to which the user is connected.
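
Steps 4 through 7 above can be sketched end to end in a few lines of Python. This is a toy illustration under stated assumptions, not the Oracle implementation: `toy_embed` is a word-count stand-in for the embedding model in OCI Generative AI, the in-memory `store` list stands in for the vector store in Oracle Autonomous Database 23ai, the `object_store` dictionary stands in for OCI Object Storage or Azure Data Lake, and all function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter

def chunk_text(text, max_words=40, overlap=10):
    """Step 4: split text into overlapping word windows."""
    words = text.split()
    step = max_words - overlap  # assumes overlap < max_words
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks

def toy_embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k(question, store, k=1):
    """Step 6: rank stored chunks by similarity to the question."""
    q = toy_embed(question)
    return sorted(store, key=lambda row: cosine(q, row["vector"]), reverse=True)[:k]

def resolve_text(row, object_store):
    """Step 7: read chunk text inline, or fetch it through its reference."""
    return row["text"] if row["text"] is not None else object_store[row["ref"]]

def build_prompt(question, chunks):
    """Step 7: combine retrieved chunks with the user's original prompt."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical sample documents.
documents = {
    "policy": "Expense reports must be submitted within 30 days of travel.",
    "invoices": "Invoices are paid on net 45 terms after approval.",
}

store, object_store = [], {}
for doc_id, text in documents.items():
    for i, chunk in enumerate(chunk_text(text)):
        row = {"id": f"{doc_id}-{i}", "vector": toy_embed(chunk), "text": None, "ref": None}
        if doc_id == "policy":
            # Step 5, reference variant: keep the text outside the table.
            row["ref"] = f"chunks/{row['id']}"
            object_store[row["ref"]] = chunk
        else:
            # Step 5, inline variant: keep the text next to the vector.
            row["text"] = chunk
        store.append(row)

question = "When must expense reports be submitted?"
hits = top_k(question, store)
prompt = build_prompt(question, [resolve_text(r, object_store) for r in hits])
print(prompt)
```

In a real deployment, the embedding call, the similarity search, and the final LLM call would each go to the corresponding managed service. The sketch only shows how the pieces hand data to one another, including the choice between inline chunk text and stored references.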

The following diagram illustrates the architecture:



multicloud-genai-rag-architecture-oracle.zip

Microsoft Azure provides the following components:
  • Microsoft Azure region

    An Azure region is a geographical area in which one or more physical Azure data centers, called availability zones, reside. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

    Azure and OCI regions are localized geographic areas. For Oracle Database@Azure, an Azure region is connected to an OCI region, with availability zones (AZs) in Azure connected to availability domains (ADs) in OCI. Azure and OCI region pairs are selected to minimize distance and latency.

  • Microsoft Azure availability zone

    An availability zone is a physically separate data center within a region that is designed to be highly available and fault tolerant. Availability zones are close enough to have low-latency connections to other availability zones.

  • Microsoft Azure Virtual Network

    Microsoft Azure Virtual Network (VNet) is the fundamental building block for a private network in Azure. VNet enables many types of Azure resources, such as Azure virtual machines (VM), to securely communicate with each other, the internet, and with on-premises networks.

  • Microsoft Azure Delegated Subnet

    Subnet delegation allows you to inject a managed service, specifically a platform-as-a-service (PaaS) service, directly into your virtual network. A delegated subnet can be a home for an externally managed service inside of your virtual network so that the external service acts as a virtual network resource, even though it is an external PaaS service.

  • Microsoft Azure Data Lake Storage

    Data Lake Storage is a cloud-based, enterprise data lake solution. It's engineered to store massive amounts of data in any format, and to facilitate big data analytical workloads. You use it to capture data of any type and ingestion speed in a single location for easy access and analysis using various frameworks.

  • Microsoft Azure Synapse Analytics

    Azure Synapse Analytics combines a centralized service for data storage and processing with an extensible, linked-service architecture that enables you to integrate commonly used data stores, processing platforms, and visualization tools.

Oracle Cloud Infrastructure provides the following components:

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Availability domain

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain shouldn't affect the other availability domains in the region.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Route table

    Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.

  • Security list

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • Generative AI

    Oracle Cloud Infrastructure Generative AI is a fully managed OCI service that provides a set of state-of-the-art, customizable, large language models (LLMs) that cover a wide range of use cases for text generation, summarization, semantic search, and more. Use the playground to try out the ready-to-use pretrained models, or create and host your own fine-tuned custom models based on your own data on dedicated AI clusters.

  • Integration

    Oracle Integration is a fully managed, preconfigured environment that allows you to integrate cloud and on-premises applications, automate business processes, and develop visual applications. It uses an SFTP-compliant file server to store and retrieve files and allows you to exchange documents with business-to-business trading partners by using a portfolio of hundreds of adapters and recipes to connect with Oracle and third-party applications.

  • Object storage

    OCI Object Storage provides quick access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store data directly from the internet or from within the cloud platform. You can scale storage without experiencing any degradation in performance or service reliability.

    Use standard storage for "hot" data that you need to access quickly and frequently. Use archive storage for "cold" data that you retain for long periods of time and rarely access.

  • Functions

    Oracle Cloud Infrastructure Functions is a fully managed, multitenant, highly scalable, on-demand, Functions-as-a-Service (FaaS) platform. It is powered by the Fn Project open source engine. OCI Functions enables you to deploy your code, and either call it directly or trigger it in response to events. OCI Functions uses Docker containers hosted in Oracle Cloud Infrastructure Registry.

  • Analytics

    Oracle Analytics Cloud is a scalable and secure public cloud service that empowers business analysts with modern, AI-powered, self-service analytics capabilities for data preparation, visualization, enterprise reporting, augmented analysis, and natural language processing and generation. With Oracle Analytics Cloud, you also get flexible service management capabilities, including fast setup, easy scaling and patching, and automated lifecycle management.

  • Digital Assistant

    Oracle Digital Assistant is a platform that allows you to create and deploy digital assistants for your users. With Oracle Digital Assistant, you can create AI-driven interfaces (or chatbots) for business applications through text, chat, and voice interfaces. Each digital assistant has a collection of one or more specialized skills to help users complete a variety of tasks in natural language conversations. For example, an individual digital assistant might have skills that focus on specific types of tasks such as tracking inventory, submitting time cards, and creating expense reports.

  • Autonomous Database

    Oracle Autonomous Database is a fully managed, preconfigured database environment that you can use for transaction processing and data warehousing workloads. You do not need to configure or manage any hardware, or install any software. Oracle Cloud Infrastructure handles creating, backing up, patching, upgrading, and tuning the database.
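
The VCN description above notes that each subnet is a contiguous address range that must not overlap with other subnets in the VCN. When planning a network layout, a quick local check can catch mistakes before any resources are created. The following is a minimal sketch that uses only the Python standard library `ipaddress` module. The function name and the CIDR values are hypothetical, the sketch assumes IPv4 throughout, and it does not call any OCI API.

```python
import ipaddress
from itertools import combinations

def validate_subnets(vcn_cidrs, subnet_cidrs):
    """Return a list of problems found in a planned subnet layout."""
    vcn_blocks = [ipaddress.ip_network(c) for c in vcn_cidrs]
    subnets = [ipaddress.ip_network(c) for c in subnet_cidrs]
    problems = []
    # Every subnet must fit inside one of the VCN CIDR blocks.
    for subnet in subnets:
        if not any(subnet.subnet_of(block) for block in vcn_blocks):
            problems.append(f"{subnet} is outside the VCN CIDR blocks")
    # No two subnets may share addresses.
    for first, second in combinations(subnets, 2):
        if first.overlaps(second):
            problems.append(f"{first} overlaps {second}")
    return problems

# Hypothetical plan: one VCN block and three proposed subnets.
plan = validate_subnets(
    ["10.0.0.0/16"],
    ["10.0.1.0/24", "10.0.1.128/25", "192.168.0.0/24"],
)
for problem in plan:
    print(problem)
```

An empty list means the planned subnets fit inside the VCN and do not collide with one another.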

Acknowledgments

  • Authors: Michele Nicosia, Wei Han, Kailas Jawadekar
  • Contributors: Lyudmil Pelov, Juan Carlos Gonzalez Carrero, Robert Lies