Enable a Low-Code Modular LLM App Engine using Oracle Integration and OCI Generative AI

Oracle Integration is a fully managed service that provides a low-code or no-code approach to enterprise connectivity, extension, and automation for quickly modernizing applications, business processes, APIs, and data. With a visual development experience, prebuilt integrations, and embedded best practices, Oracle Integration can orchestrate APIs, applications, systems, and more, and enable AI-powered and human-based custom applications and business flows.

Oracle Cloud Infrastructure Generative AI (OCI Generative AI) is a fully managed service that provides a set of state-of-the-art, customizable large language models (LLMs) that cover a wide range of use cases for text generation, summarization, embedding, and chat completion. You can use the playground, an interface in the Console, to explore the hosted pretrained and custom models without writing a single line of code, or you can create and host your own fine-tuned custom models based on your own data on dedicated AI clusters.

You can combine Oracle Integration and OCI Generative AI with other OCI services by using Oracle Integration's native adapters and visual development approach: Oracle Cloud Infrastructure Streaming (OCI Streaming, an Oracle-managed Kafka service); OCI Document Understanding (a serverless service, accessible through REST API calls, for Optical Character Recognition (OCR), text extraction, key-value extraction, table extraction, and document classification); Oracle Cloud Infrastructure Language (a serverless service, accessible through REST API calls, for text sentiment analysis, named entity recognition, classification, and more); OCI Vision (a serverless service, accessible through REST API calls, for object detection and image classification); Oracle Cloud Infrastructure Data Science (a fully managed, serverless platform that data science teams can use to build, train, and manage machine learning (ML) models); and others. This combination lets you build modular, scalable, maintainable, and secure custom LLM-based applications.

Architecture

This reference architecture provides the necessary considerations and recommendations to enable an AI-based, modular and event-driven LLM App Engine, using:

  • A low-code or no-code approach for the Data Loader and Query Engine flows of your LLM application, using Oracle Integration visual orchestration tools and native adapters for the different Social, Productivity, and Business Data Channels (user input to the LLM App Engine: documents, images, business data, or queries) and Sources (sources of the data used by the LLM App Engine), as well as native adapters to the different OCI services used by the LLM App Engine (OCI Generative AI REST APIs, vector databases or stores, Oracle Cloud Infrastructure Language REST APIs, Oracle Cloud Infrastructure Data Science custom model REST endpoints, and more). This helps you quickly set up your LLM application business flows.
  • An event-driven pattern to decouple the Document, Image, and Business Data Channels and Sources, as well as the Query Channels, from the Data Loader and Query Engine modules of the LLM App Engine, using the OCI Streaming service (an Oracle-managed Kafka service) and Oracle Integration's native adapter for this OCI service (see the sketch after this list). This helps to enable a scalable and performant LLM application.
  • A private connection to 3rd party cloud, on-premises apps, systems, and so on, using the Oracle Integration Connectivity Agent, which is the key enabler for hybrid and multicloud integration architectures, especially in an LLM application where documents, images, business data, and queries from users can come from those systems and you want to keep the transit of documents and data private and secure. This helps to improve the security of the end-to-end LLM flow, keeping the traffic within private networks.
  • The possibility to use native LLM models or fine-tuned custom LLM models in your LLM App as services in OCI (orchestrating OCI Generative AI Model Endpoints or OCI Data Science Model Endpoints using Oracle Integration cloud native adapters).
  • A flexible approach to plug or unplug your own user interface (UI) for your LLM application with the LLM App Engine, or a low-code approach to build the UI using either Visual Builder under Oracle Integration or Oracle APEX.
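
To make the event-driven decoupling concrete, the following minimal sketch publishes a document-ingestion event to an OCI Streaming stream through its Kafka-compatible endpoint. The stream name, bootstrap server, stream pool OCID, and credentials are placeholder assumptions; in the architecture itself this publishing is done by Oracle Integration's OCI Streaming Adapter rather than custom code.

```python
# Minimal sketch (not the Oracle Integration adapter itself): publish an ingestion
# event to OCI Streaming via its Kafka-compatible endpoint using confluent-kafka.
# The endpoint, stream pool OCID, user name, and auth token below are placeholders.
import json
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "cell-1.streaming.us-chicago-1.oci.oraclecloud.com:9092",  # region-specific
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    # OCI Streaming Kafka-compatibility credentials: "<tenancy>/<user>/<stream pool OCID>" plus an auth token.
    "sasl.username": "mytenancy/user@example.com/ocid1.streampool.oc1..aaaa",
    "sasl.password": "<auth-token>",
})

event = {"bucket": "llm-docs", "object": "invoices/invoice-123.pdf", "type": "document"}
producer.produce("DocumentLoaderStream", key="invoice-123", value=json.dumps(event))
producer.flush()  # block until the message is delivered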

The following diagram illustrates this reference architecture.




The Low-Code LLM App Engine is composed of 2 main blocks:

Document, Image and Business Data Loader
Figure oci-generative-ai-llm-data-loader-arch.png illustrates the Document, Image and Business Data Loader block.

  • This block first receives the input documents, images, or business data added by a user via either a Social/Productivity Channel (for example, WhatsApp, Outlook, Gmail, Twitter, and so on), a Business Data Channel (for example, a 3rd party on-premises or cloud web application, a content management system, 3rd party cloud storage like Microsoft Azure Cloud Storage, AWS S3, or Google Cloud Storage, FTP, a file server, or ERP, CX, and HCM on-premises or SaaS applications), a custom Knowledge Search Engine UI (for example, a custom UI built using low-code visual app tools like Visual Builder under Oracle Integration or Oracle APEX under Oracle Database), or directly into an OCI Object Storage bucket, using Oracle Integration visual orchestration flows and native adapters.
  • The documents, images, or business data are then extracted depending on the type of input. For example, for object and text detection in images you can use the OCI Vision service; for document classification and extraction you can use the OCI Document Understanding service; and for specific types of images or documents you can use a serverless function written in your preferred programming language with OCI Functions.
  • The documents, images, or business data can then be processed in one or more of the following ways:
    • Extracting the metadata using the OCI Language service (for example, entity extraction, keyword and key-phrase extraction, sentiment analysis, Personally Identifiable Information (PII) detection and obfuscation, and so on) for further relevant context retrieval (for example, to enable Retrieval Augmented Generation (RAG)).
    • Embedding the data with an LLM model using the OCI Generative AI service for further relevant context retrieval (for example, to enable Retrieval Augmented Generation (RAG)).
    • Summarizing the data with an LLM model using the OCI Generative AI service for further relevant context retrieval (for example, to enable a summary index for search across multiple documents, also called structured hierarchical retrieval).
    • Storing the data in a relational store for further search on structured data (for example, Oracle Database, Oracle Database Cloud Service, Autonomous Database, MySQL, PostgreSQL, and so on).
    • Indexing the data in a vector store for further search on unstructured data (for example, vector, summary, and keyword indexes with OCI AI Vector Search, OCI Search Service with OpenSearch, Qdrant, and so on). A minimal embedding-and-indexing sketch follows this list.

      Note: The new AI vector similarity search feature (AI Vector Search) will be available in Oracle Database 23c starting with release 23.4.
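
As a minimal sketch of the embedding and indexing steps above (normally orchestrated visually in Oracle Integration), the following code embeds text chunks with the OCI Generative AI embedding API through the OCI Python SDK and indexes them into an OpenSearch k-NN index. The compartment OCID, service endpoint, embedding model name, OpenSearch host, credentials, and index mapping are illustrative assumptions.

```python
# Minimal sketch of the embed-and-index step (normally orchestrated by Oracle Integration).
# Compartment OCID, service endpoint, model name, OpenSearch host/credentials, and the
# index mapping below are illustrative assumptions, not fixed values from this architecture.
import oci
from opensearchpy import OpenSearch

COMPARTMENT_ID = "ocid1.compartment.oc1..example"
chunks = ["First chunk of extracted document text...", "Second chunk..."]

config = oci.config.from_file()  # ~/.oci/config
genai = oci.generative_ai_inference.GenerativeAiInferenceClient(
    config,
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
)

# Embed the chunks with an on-demand embedding model (model name is an assumption).
embed_resp = genai.embed_text(
    oci.generative_ai_inference.models.EmbedTextDetails(
        inputs=chunks,
        serving_mode=oci.generative_ai_inference.models.OnDemandServingMode(
            model_id="cohere.embed-multilingual-v3.0"
        ),
        compartment_id=COMPARTMENT_ID,
    )
)
vectors = embed_resp.data.embeddings  # one vector per input chunk

# Index the chunks and their vectors into an OpenSearch k-NN index (vector store).
os_client = OpenSearch(
    hosts=[{"host": "my-opensearch-private-endpoint", "port": 9200}],
    http_auth=("osuser", "<password>"),
    use_ssl=True,
)
if not os_client.indices.exists(index="doc-chunks"):
    os_client.indices.create(
        index="doc-chunks",
        body={
            "settings": {"index": {"knn": True}},
            "mappings": {"properties": {
                "text": {"type": "text"},
                "embedding": {"type": "knn_vector", "dimension": len(vectors[0])},
            }},
        },
    )
for i, (chunk, vec) in enumerate(zip(chunks, vectors)):
    os_client.index(index="doc-chunks", id=str(i), body={"text": chunk, "embedding": vec})
```

The same pattern applies to the other vector stores listed above; only the indexing calls change.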

Query Engine
Figure oci-generative-ai-llm-query-engine-arch.png illustrates the Query Engine block.

  • This block first receives the input query from a user via either a Social/Productivity Channel (for example, WhatsApp, Outlook, Gmail, and so on), a Business Data Channel (for example, a 3rd party on-premises or cloud web application, and so on), or a custom Knowledge Search Engine UI (for example, a custom UI built using low-code visual app tools like Visual Builder under Oracle Integration or Oracle APEX under Oracle Database), using Oracle Integration visual orchestration flows and native adapters.
  • The query then enters the query pipeline in Oracle Integration, where it can be processed in one or more of the following ways:
    • Filtering the Query using OCI Generative AI service, in order to avoid Prompt Injection.
    • Rewriting or transforming the Query using OCI Generative AI service, in order to get better relevant context retrieval.
    • Routing the Query with query engine selectors using OCI Generative AI, in order to determine how to execute the query over your data, either as data summarization or as specific context retrieval.
    • Extracting the conversation history from a Chat History Store using OCI Cache with Redis, for chat completion use cases where chat memory is required.
    • Embedding the Query using OCI Generative AI service, for relevant context retrieval use cases (for example, to enable Retrieval Augmented Generation (RAG), and so on).
    • Routing among the Relevant Context Retrievers using the OCI Generative AI service, in order to determine which data sources to retrieve data from for answering the input query.
    • Retrieving the Relevant Context Data to answer the query from either Vector Stores (for example, OCI AI Vector Search, OCI Search Service with OpenSearch, Qdrant, and so on) for Retrieval Augmented Generation (RAG) use cases, Relational Stores (for example, Oracle Database, Oracle Database Cloud Service, Autonomous Database, MySQL, PostgreSQL, and so on) for search on structured business data, or Social, Productivity and Business Data Sources (for example, Twitter, Outlook, Gmail, ERP/HCM/CX apps, and so on) for on-demand search on business data, all orchestrated by Oracle Integration using native adapters to connect to these data sources (see the sketch after this list).
    • Re-ranking the retrieved Relevant Context Data using a re-rank model deployed and exposed in OCI Data Science, in order to improve the relevance of the results.
    • Generating the final answer to the query using OCI Generative AI service capabilities for Summarization, Generation and Chat Completion.
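
A minimal sketch of the retrieval and generation steps above, assuming the vector index created in the Data Loader sketch: it embeds the user query, runs a k-NN search to retrieve relevant context, and asks an OCI Generative AI generation model to answer from that context. Model names, endpoints, and the exact response structure are assumptions and may vary by OCI SDK version; in the architecture these steps run as Oracle Integration orchestration flows.

```python
# Minimal RAG-style sketch of the query pipeline (embed query -> retrieve context -> generate).
# Endpoints, model names, and the OpenSearch index are illustrative assumptions.
import oci
from opensearchpy import OpenSearch

COMPARTMENT_ID = "ocid1.compartment.oc1..example"
question = "What is the total amount of invoice 123?"

config = oci.config.from_file()
genai = oci.generative_ai_inference.GenerativeAiInferenceClient(
    config,
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
)

# 1. Embed the query with the same embedding model used by the Data Loader.
qvec = genai.embed_text(
    oci.generative_ai_inference.models.EmbedTextDetails(
        inputs=[question],
        serving_mode=oci.generative_ai_inference.models.OnDemandServingMode(
            model_id="cohere.embed-multilingual-v3.0"
        ),
        compartment_id=COMPARTMENT_ID,
    )
).data.embeddings[0]

# 2. Retrieve the most similar chunks from the vector store (OpenSearch k-NN query).
os_client = OpenSearch(hosts=[{"host": "my-opensearch-private-endpoint", "port": 9200}],
                       http_auth=("osuser", "<password>"), use_ssl=True)
hits = os_client.search(
    index="doc-chunks",
    body={"size": 3, "query": {"knn": {"embedding": {"vector": qvec, "k": 3}}}},
)["hits"]["hits"]
context = "\n".join(hit["_source"]["text"] for hit in hits)

# 3. Generate the final answer with the retrieved context in the prompt.
gen_resp = genai.generate_text(
    oci.generative_ai_inference.models.GenerateTextDetails(
        compartment_id=COMPARTMENT_ID,
        serving_mode=oci.generative_ai_inference.models.OnDemandServingMode(
            model_id="cohere.command"
        ),
        inference_request=oci.generative_ai_inference.models.CohereLlmInferenceRequest(
            prompt=f"Answer using only this context:\n{context}\n\nQuestion: {question}",
            max_tokens=300,
            temperature=0.2,
        ),
    )
)
# The response structure may vary by SDK version; inspect gen_resp.data for the generated text.
print(gen_resp.data)
```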

The architecture has the following components:

  • Oracle Integration

    Oracle Integration is a fully managed service and low-code enterprise connectivity, extension and automation platform for quickly modernizing applications, business processes, APIs, and data. Developers and cloud architects can connect SaaS and on-premises applications six times faster with a visual development experience, prebuilt integrations, and embedded best practices. Oracle Integration gives you native access to events in Oracle Cloud ERP, HCM, and CX. Connect app-specific analytic silos to simplify requisition-to-receipt, recruit-to-pay, lead-to-invoice, and other critical processes. Finally, give your IT and business leaders end-to-end visibility.

  • OCI Generative AI

    Oracle Cloud Infrastructure (OCI) Generative AI is a fully managed service, available via an API, that provides versatile language models you can seamlessly integrate into a wide range of use cases, including writing assistance, summarization, and chat. The OCI Generative AI service includes the following foundational models:

    • Generation: Give instructions to generate text or extract information from your text.
    • Summarization: Summarize text with your instructed format, length, and tone.
    • Embedding: Convert text to vector embeddings to use in applications for semantic searches, text classification, or text clustering.
  • OCI Document Understanding

    OCI Document Understanding is an AI service that enables developers to extract text, tables, and other key data from document files through APIs and command line interface tools. With OCI Document Understanding, you can automate tedious business processing tasks with prebuilt AI models and customize document extraction to fit your industry-specific needs.

  • Oracle Cloud Infrastructure Language

    OCI Language is a serverless and multi-tenant service that is accessible using REST API calls. It provides pre-trained models that are frequently retrained and monitored to give you the best results. Language provides you with artificial intelligence and machine learning capabilities to detect the language in your unstructured text. Also, it provides other tools to help you further gain insights into your text.

  • OCI Vision

    OCI Vision is an AI service for performing deep-learning–based image analysis at scale. With prebuilt models available out-of-the-box, developers can easily build image recognition and text recognition into their applications without machine learning (ML) expertise. For industry-specific use cases, developers can automatically train custom Vision models with their own data. These models can be used to detect visual anomalies in manufacturing, extract text from documents to automate business workflows, and tag items in images to count products or shipments. In addition to gaining access to pre-trained models, developers can create custom models without data science expertise or managing custom model infrastructure.

  • Object storage

    Object storage provides quick access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store and then retrieve data directly from the internet or from within the cloud platform. You can seamlessly scale storage without experiencing any degradation in performance or service reliability. Use standard storage for "hot" storage that you need to access quickly, immediately, and frequently. Use archive storage for "cold" storage that you retain for long periods of time and seldom or rarely access.

  • Data Science

    Oracle Cloud Infrastructure Data Science is a fully managed, serverless platform that data science teams can use to build, train, and manage machine learning (ML) models on Oracle Cloud Infrastructure (OCI). It can easily integrate with other OCI services such as Oracle Autonomous Data Warehouse, Oracle Cloud Infrastructure Object Storage, and more. You can build and evaluate high-quality machine learning models that increase business flexibility by putting enterprise-trusted data to work quickly, and you can support data-driven business objectives with easier deployment of ML models.

  • OCI Search Service with OpenSearch

    OCI Search Service with OpenSearch is an insight engine offered as an Oracle managed service. Without any downtime, Oracle automates patching, updating, upgrading, backing up, and resizing the service. Customers can store, search, and analyze large volumes of data quickly and see results in near real time.

  • OCI Cache with Redis

    Oracle Cloud Infrastructure Cache with Redis is a comprehensive, managed in-memory caching solution built on the foundation of open source Redis. This fully managed service accelerates data reads and writes, significantly enhancing application response times and database performance to provide an improved customer experience. In this architecture it can serve as the chat history store for the Query Engine (a chat-history sketch follows this list of components).

  • APEX Service

    Oracle APEX Application Development (APEX) is a low-code development platform that enables you to build scalable, feature-rich, secure, enterprise apps that can be deployed anywhere that Oracle Database is installed. You don't need to be an expert in a vast array of technologies to deliver sophisticated solutions. APEX Service includes built-in features such as user interface themes, navigational controls, form handlers, and flexible reports that accelerate the application development process.

  • Oracle Database 23 (AI Vector Search)

    Oracle Database 23c delivers the most complete and simple converged database for developers looking to build new microservice, graph, document, and relational applications. Oracle has announced the plan to add semantic search capabilities using AI vectors to Oracle Database 23c. The collection of features, called AI Vector Search, includes a new vector data type, vector indexes, and vector search SQL operators that enable the Oracle Database to store the semantic content of documents, images, and other unstructured data as vectors, and to use these to run fast similarity queries. For more information, see the Press Release link in the Explore More section.

  • Streaming

    Oracle Cloud Infrastructure Streaming provides a fully managed, scalable, and durable storage solution for ingesting continuous, high-volume streams of data that you can consume and process in real time. You can use Streaming for ingesting high-volume data, such as application logs, operational telemetry, web click-stream data; or for other use cases where data is produced and processed continually and sequentially in a publish-subscribe messaging model.

  • Events

    Oracle Cloud Infrastructure services emit events, which are structured messages that describe the changes in resources. Events are emitted for create, read, update, or delete (CRUD) operations, resource lifecycle state changes, and system events that affect cloud resources.

  • Functions

    Oracle Cloud Infrastructure Functions is a fully managed, multitenant, highly scalable, on-demand, Functions-as-a-Service (FaaS) platform. It is powered by the Fn Project open source engine. Functions enable you to deploy your code, and either call it directly or trigger it in response to events. Oracle Functions uses Docker containers hosted in Oracle Cloud Infrastructure Registry.

  • API Gateway

    Oracle API Gateway enables you to publish APIs with private endpoints that are accessible from within your network, and which you can expose to the public internet if required. The endpoints support API validation, request and response transformation, CORS, authentication and authorization, and request limiting.

  • Web Application Firewall (WAF)

    Oracle Cloud Infrastructure Web Application Firewall (WAF) is a payment card industry (PCI) compliant, regional-based and edge enforcement service that is attached to an enforcement point, such as a load balancer or a web application domain name. WAF protects applications from malicious and unwanted internet traffic. WAF can protect any internet facing endpoint, providing consistent rule enforcement across a customer's applications.

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Virtual cloud network (VCN) and subnet

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.
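
As an example of how OCI Cache with Redis can act as the chat history store used by the Query Engine, the following minimal sketch keeps a capped, expiring conversation history per session. The endpoint, session key format, history length, and TTL are assumptions; any Redis-compatible client can be used against the OCI Cache with Redis private endpoint.

```python
# Minimal sketch of a chat history store on OCI Cache with Redis (redis-py).
# The private endpoint, TLS settings, session id format, and TTL are assumptions.
import json
import redis

r = redis.Redis(host="my-redis-cluster-private-endpoint", port=6379, ssl=True)

def append_turn(session_id: str, role: str, text: str, max_turns: int = 20) -> None:
    """Store one conversation turn and keep only the most recent turns."""
    key = f"chat:{session_id}"
    r.rpush(key, json.dumps({"role": role, "text": text}))
    r.ltrim(key, -max_turns, -1)   # cap the history length
    r.expire(key, 60 * 60)         # drop idle sessions after one hour

def load_history(session_id: str) -> list[dict]:
    """Return the stored turns, oldest first, for prompt construction."""
    return [json.loads(item) for item in r.lrange(f"chat:{session_id}", 0, -1)]

append_turn("session-42", "user", "What is the total amount of invoice 123?")
append_turn("session-42", "assistant", "Invoice 123 totals 1,250 EUR.")
print(load_history("session-42"))
```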

Recommendations

Use the following recommendations as a starting point. Your requirements might differ from the architecture described here.
  • Maintenance and High Availability

    The reference architecture uses almost exclusively Oracle-managed PaaS services, so with this solution there is no need to install, patch, update, or upgrade software. This applies to Oracle Integration, OCI Generative AI, OCI Document Understanding, OCI Vision, Oracle Cloud Infrastructure Language, Oracle Cloud Infrastructure Data Science, OCI Object Storage, OCI Events, OCI Streaming, OCI Functions, OCI API Gateway, and Oracle Cloud Infrastructure Web Application Firewall.

    The only component that could require attention is the Oracle Integration Connectivity Agent, which is installed on a compute instance to access resources such as the OCI OpenSearch cluster, Autonomous Database, and so on that reside in a private network. Follow the guidelines in the Oracle Integration documentation to keep the Oracle Integration Connectivity Agent easy to maintain and highly available.

  • Scalability and size

    This reference architecture uses PaaS services and is scalable out of the box for most of the services it includes. Note that the OCI OpenSearch cluster and the OCI Cache with Redis cluster do not scale up and down automatically (only manually), so size these components appropriately for your use case.

  • Connectivity

    All connections within OCI should be established through a private network:

    • You can use either the private endpoint option or connectivity agents in Oracle Integration to connect to private OCI services such as OCI Streaming, Oracle Autonomous Database, Oracle Database, Oracle Database Cloud Service, and so on.
    • Oracle Integration connectivity agents that connect to private services, such as an OCI OpenSearch cluster or an OCI Cache with Redis cluster, should be installed on an OCI VM within the same private subnet where these services are deployed.
    • OCI Streaming Kafka streams or topics that you create should be associated with a stream pool deployed with a private endpoint (associated with a private subnet in an OCI VCN). For private streams or topics, such as the Document and Image Extraction Result topics (see the logical block Document, Image and Business Data Loader above) that receive the OCI Document Understanding extraction result file metadata from OCI Events, you can use OCI Functions with OCI Events to deliver messages to private streaming endpoints (see the sketch after these recommendations).
    • Connectivity agents that connect to 3rd party on-premises or cloud services (for example, Azure SQL databases) should be installed on a VM within the same private network where these external services are deployed.
  • Restrict access to an Oracle Integration instance

    Restrict the networks that have access to your Oracle Integration instance by configuring the Oracle Integration allowlist (formerly known as a whitelist). Only users and systems from the specific IP addresses, Classless Inter-Domain Routing (CIDR) blocks, and virtual cloud networks that you specify can access the Oracle Integration instance.

    In this reference architecture, the Oracle Integration allowlist can restrict access to the Oracle Integration instance, allowing only requests initiated by cloud applications deployed on OCI, Oracle SaaS applications, non-Oracle cloud or on-premises web and SaaS applications, and the VCN OCIDs associated with the VMs hosting the Oracle Integration connectivity agents.
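
To illustrate the recommendation about delivering messages to private OCI Streaming endpoints with OCI Functions and OCI Events, the following minimal sketch shows a Python function (using the OCI Functions FDK) that receives an event payload and forwards it to a stream using its resource principal. The stream OCID, messages endpoint, and message key are placeholder assumptions.

```python
# func.py - minimal sketch of an OCI Function that forwards an OCI Events payload
# (for example, Document Understanding result-object metadata) to a private
# OCI Streaming endpoint. The stream OCID and messages endpoint are placeholders.
import base64
import io
import json

import oci
from fdk import response

STREAM_OCID = "ocid1.stream.oc1..example"
MESSAGES_ENDPOINT = "https://cell-1.streaming.us-chicago-1.oci.oraclecloud.com"  # private endpoint in practice

def handler(ctx, data: io.BytesIO = None):
    event = json.loads(data.getvalue())  # event payload emitted by OCI Events

    # The function authenticates with its resource principal (no stored credentials).
    signer = oci.auth.signers.get_resource_principals_signer()
    stream_client = oci.streaming.StreamClient(
        config={}, signer=signer, service_endpoint=MESSAGES_ENDPOINT
    )

    # OCI Streaming expects base64-encoded keys and values.
    entry = oci.streaming.models.PutMessagesDetailsEntry(
        key=base64.b64encode(b"doc-extraction-result").decode(),
        value=base64.b64encode(json.dumps(event).encode()).decode(),
    )
    stream_client.put_messages(
        STREAM_OCID, oci.streaming.models.PutMessagesDetails(messages=[entry])
    )
    return response.Response(ctx, response_data=json.dumps({"forwarded": True}),
                             headers={"Content-Type": "application/json"})
```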

Considerations

Consider the following points when deploying this reference architecture.

  • Security

    Dedicated AI clusters in OCI Generative AI are compute resources that you can use for fine-tuning custom LLM models or for hosting endpoints for custom LLM models. The clusters are dedicated to your models and are not shared with users in other tenancies. Using OCI Generative AI custom models lets you refine the models using your own data. Alternatively, you can use your own data by implementing Retrieval Augmented Generation (RAG), combining techniques such as embedding and indexing with OCI Generative AI on-demand models, vector stores (for example, OCI AI Vector Search, OpenSearch, and so on), and more.

    Use Oracle Cloud Infrastructure Identity and Access Management (OCI IAM) policies to control who can access your cloud resources (for example, Oracle Integration, OCI Language, OCI Vision, the OCI Generative AI service, OCI Streaming, OCI Compute instances, and so on) and what operations can be performed. To protect database passwords or any other secrets, consider using the OCI Vault service.

    The documents and images are stored in private OCI Object Storage buckets. A temporary link with a short life (for example, an Object Storage pre-authenticated request) is created when a user clicks the document within the Knowledge Search Engine UI (see the sketch after these considerations). Use Oracle Cloud Infrastructure Web Application Firewall (WAF) filters and rules to protect Oracle Integration REST-triggered orchestration flows exposed via OCI API Gateway from malicious attacks such as DDoS attacks, SQL injection threats, and so on.

  • Scalability

    When creating OCI Streaming streams or topics, administrators specify the number of streams they plan to use. Streams can be created per business domain (for example, InvoiceStream, PurchaseOrderStream, and so on). Administrators also specify the number of partitions they plan to use per stream or topic. Partitions distribute a stream or topic by splitting messages across multiple nodes, allowing multiple consumers to read from a stream or topic in parallel (in this case, you could have multiple clones of the same consumer integration flow in Oracle Integration, each one reading from a different partition of a stream or topic using the OCI Streaming Adapter as the trigger).

    When creating Oracle Integration instances, administrators specify the number of message packs they plan to use per instance.

  • Resource limits

    Consider the best practices, limits by service, and compartment quotas for your tenancy.
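
As a sketch of the short-lived document links mentioned in the Security consideration, the following code creates an Object Storage pre-authenticated request that grants read-only access to a single object for ten minutes, assuming that a pre-authenticated request is how the temporary link is implemented. The bucket and object names and the expiry window are illustrative.

```python
# Minimal sketch: create a short-lived, read-only pre-authenticated request (PAR)
# for a document stored in a private Object Storage bucket. Bucket/object names,
# region, and the 10-minute expiry are illustrative assumptions.
from datetime import datetime, timedelta, timezone

import oci

config = oci.config.from_file()
object_storage = oci.object_storage.ObjectStorageClient(config)
namespace = object_storage.get_namespace().data

par = object_storage.create_preauthenticated_request(
    namespace_name=namespace,
    bucket_name="llm-docs",
    create_preauthenticated_request_details=oci.object_storage.models.CreatePreauthenticatedRequestDetails(
        name="temp-download-invoice-123",
        object_name="invoices/invoice-123.pdf",
        access_type="ObjectRead",
        time_expires=datetime.now(timezone.utc) + timedelta(minutes=10),
    ),
).data

# The full temporary URL combines the regional Object Storage endpoint and the PAR path.
region = config["region"]
print(f"https://objectstorage.{region}.oraclecloud.com{par.access_uri}")
```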

Acknowledgments

  • Author: Juan Carlos González Carrero
  • Contributors: Bob Peulen, Alexandru Negrea