Implement an API Management Platform for Enterprise AI Models and Services

If your organization has implemented AI, it is likely among the many that struggle to enforce enterprise-level security and standardize processes when building applications on top of private and public AI models.

Consuming AI models poses common challenges for organizations in every industry, including:

  • Integration Complexity: Managing point-to-point integrations between applications and AI models often leads to complexity when organizations want to adopt different models.
  • Security Standardization: Implementing consistent security measures across different AI models proves to be a significant challenge.
  • Access Control: Enforcing role-based access control to AI model APIs based on user roles and responsibilities can be difficult to manage effectively.
  • Monetization: AI models made available to external consumers often lack built-in monetization capabilities.
  • Consumption and Resource Management: Organizations need to set quotas for subscribers to limit consumption of AI models.
  • Throttling: AI model APIs need throttling and rate limiting.
  • Monitoring: Organizations need monitoring and tracking capabilities to visualize the consumption of AI model APIs.

This architecture outlines a solution to help customers leverage the features of Oracle Cloud Infrastructure API Gateway and other OCI services to address these challenges in an AI solution.

Architecture

This architecture uses OCI API Gateway as middleware to manage the point-to-point integration between AI models and other OCI services. Use this architecture for AI use cases that require enterprise-level security flows and process standardization.

Standardized Security

Many foundational AI models and other AI services use different authentication mechanisms such as OAuth 2.0, OpenID, JWT, and so on. OCI API Gateway can help to standardize API authentication to AI models.

Virtualization or Abstraction Layer

Most modern enterprise organizations leverage the latest AI models from different providers that specialize in specific domains, so consuming AI models directly from applications can create point-to-point integration complexity. OCI API Gateway is used as a service virtualization layer to make it easy to switch from one AI model to another.
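As a rough illustration of the virtualization idea, and not of actual OCI API Gateway configuration, the following Python sketch maps a stable consumer-facing path to a swappable backend. The path, the backend URLs, and the helper names are all hypothetical.

```python
# Hypothetical route table: consumers call a stable path, and the
# gateway layer decides which backend model endpoint serves it.
ROUTES = {
    "/ai/summarize": "https://model-a.example.com/v1/summarize",
}


def resolve_backend(path: str) -> str:
    """Return the backend URL currently mapped to a consumer-facing path."""
    try:
        return ROUTES[path]
    except KeyError:
        raise LookupError(f"no route configured for {path}") from None


def switch_backend(path: str, new_backend: str) -> None:
    """Point an existing path at a different model without changing consumers."""
    if path not in ROUTES:
        raise LookupError(f"no route configured for {path}")
    ROUTES[path] = new_backend


before = resolve_backend("/ai/summarize")
switch_backend("/ai/summarize", "https://model-b.example.com/v1/summarize")
after = resolve_backend("/ai/summarize")
```

Consumers keep calling the same path before and after the switch; only the mapping behind it changes.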

OCI API Gateway and OCI Vault can abstract AI model API credentials from consumers, while the AI model credentials are stored in OCI Vault. Consumers access OCI API Gateway endpoints with client credentials generated from confidential applications created for that consumer. OCI API Gateway authenticates users against client credentials, and on successful authentication, OCI API Gateway retrieves the model API credentials from OCI Vault to invoke the backend model API endpoint.
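The credential flow described above can be sketched in plain Python. This is a simplified model under stated assumptions: the two dictionaries stand in for the identity provider and the secret store, every identifier and secret value is made up, and none of the function names correspond to an OCI SDK call.

```python
import hmac

# Stand-ins for the identity provider and the secret store. All values are
# made up for illustration; a real deployment would not hold them in code.
CLIENT_SECRETS = {"client-123": "consumer-secret"}
VAULT_SECRETS = {"model-api-key": "backend-key-never-shown-to-consumers"}


def authenticate(client_id: str, client_secret: str) -> bool:
    """Check consumer credentials using a constant-time comparison."""
    expected = CLIENT_SECRETS.get(client_id)
    if expected is None:
        return False
    return hmac.compare_digest(expected, client_secret)


def build_backend_headers(client_id: str, client_secret: str) -> dict:
    """Authenticate the consumer, then attach the backend credential."""
    if not authenticate(client_id, client_secret):
        raise PermissionError("invalid client credentials")
    api_key = VAULT_SECRETS["model-api-key"]
    return {"Authorization": f"Bearer {api_key}"}


headers = build_backend_headers("client-123", "consumer-secret")
```

The consumer only ever presents its own client credentials; the backend key is looked up after authentication succeeds and is never returned to the caller.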

Access Control and Governance

OCI API Gateway can enforce granular access control on AI model APIs to grant API access based on the domain, role, or responsibility of the consumer. OCI API Gateway deployments enable packaging APIs by domain so that consumers can subscribe to specific deployments. The rate limiting and throttling features of OCI API Gateway help control the usage and performance of the AI models.
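To make the rate limiting idea concrete, here is a minimal token bucket sketch in Python. It is a generic textbook technique chosen for illustration, not a description of how OCI API Gateway implements rate limiting internally; the class name and parameters are assumptions. Time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` per second."""

    def __init__(self, rate: float, capacity: int, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now: float) -> bool:
        # Refill based on elapsed time, capped at the bucket capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=1.0, capacity=2)
# Three requests at t=0 (the third exceeds the burst), then one at t=1.
results = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

With a capacity of 2 and a refill rate of one token per second, the third immediate request is rejected and the request one second later is accepted.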

Cost Control

Usage plans and quotas can be leveraged to effectively control AI model consumption costs. Usage plans enable you to create different plan tiers which can be allocated to consumers based on priority and business value. For companies making third-party AI services available to their teams, usage plans can ensure employee usage is governed and monitored to prevent incurring large costs.
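The relationship between plan tiers, subscribers, and quotas can be sketched as follows. The tier names, quota numbers, and subscriber names are hypothetical, and the in-memory counter is a stand-in for whatever the gateway tracks per billing period.

```python
from collections import defaultdict

# Hypothetical plan tiers: requests allowed per billing period.
PLANS = {"bronze": 2, "gold": 5}
SUBSCRIBERS = {"team-a": "bronze", "team-b": "gold"}

usage = defaultdict(int)


def record_request(subscriber: str) -> bool:
    """Count a request and return False once the plan quota is exhausted."""
    quota = PLANS[SUBSCRIBERS[subscriber]]
    if usage[subscriber] >= quota:
        return False
    usage[subscriber] += 1
    return True


team_a = [record_request("team-a") for _ in range(3)]
team_b = [record_request("team-b") for _ in range(3)]
```

Here the bronze subscriber is cut off at its third request, while the gold subscriber still has headroom.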

OCI Cache with Redis can be integrated with OCI API Gateway to reduce costs and improve performance of AI model APIs by caching frequent requests and offloading the inference requests on models.
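A minimal sketch of the caching idea, assuming identical requests can safely share a response: a dictionary stands in for the Redis-compatible cache, and call_model is a placeholder for the real inference call. All names are hypothetical.

```python
import hashlib
import json

cache = {}  # stands in for a Redis-compatible cache
backend_calls = 0


def cache_key(model: str, payload: dict) -> str:
    """Derive a stable key from the model name and the request body."""
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(f"{model}:{body}".encode("utf-8")).hexdigest()


def call_model(model: str, payload: dict) -> str:
    """Placeholder for the real inference call."""
    global backend_calls
    backend_calls += 1
    return f"{model} response to {payload['prompt']}"


def cached_inference(model: str, payload: dict) -> str:
    """Serve from the cache when possible, otherwise call the model once."""
    key = cache_key(model, payload)
    if key not in cache:
        cache[key] = call_model(model, payload)
    return cache[key]


first = cached_inference("model-a", {"prompt": "hello", "max_tokens": 10})
second = cached_inference("model-a", {"max_tokens": 10, "prompt": "hello"})
```

Sorting the JSON keys means that two requests with the same fields in a different order map to the same cache entry. A real cache would also set an expiry on each entry, which this sketch omits.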

Logging and Monitoring

OCI API Gateway's out-of-the-box reporting dashboard helps businesses gain insights into AI model usage and performance, and identify cost-saving opportunities.

You can stream logs to Oracle Cloud Infrastructure Logging Analytics to troubleshoot issues, monitor AI model consumption behavior, generate custom reports on resource consumption, and make informed decisions about future investments in your organization's AI portfolio. Logs can also be streamed to billing systems if organizations want to monetize fine-tuned AI models.

OCI API Gateway can emit metrics to OCI Monitoring. Usage plan metrics can be used to identify the top-consuming customers, and other dimensions help troubleshoot deployment and OCI API Gateway issues.
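As an illustration of the kind of analysis usage metrics make possible, the sketch below aggregates request counts per subscriber and ranks the top consumers. The sample data and field layout are invented; they do not reflect the actual metric names or dimensions emitted by OCI API Gateway.

```python
from collections import Counter

# Hypothetical samples: (subscriber, requests observed in one interval).
samples = [("team-a", 120), ("team-b", 40), ("team-a", 30), ("team-c", 75)]


def top_consumers(records, n=2):
    """Sum request counts per subscriber and return the n largest."""
    totals = Counter()
    for subscriber, count in records:
        totals[subscriber] += count
    return totals.most_common(n)


top = top_consumers(samples)
```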

Networking

OCI API Gateway can be accessed from the public internet or through a private network connection.

Users and applications from the internet can access OCI API Gateway in a public subnet fronted by an internet gateway.

Users and applications from on-premises can access OCI API Gateway in a private subnet via OCI FastConnect or VPN. Applications on Microsoft Azure or Google Cloud can access OCI API Gateway in a private subnet through the corresponding Oracle Interconnect for Microsoft Azure or Oracle Interconnect for Google Cloud.

The following diagram illustrates the architecture.



api-gateway-ai-architecture.zip

The following diagram illustrates the workflow between OCI API Gateway, AI models, and other Oracle services:



The workflow resembles the following:

  1. AI consumers from the internet connect to AI service APIs through OCI API Gateway. Consumers include web and mobile apps built with any UI technology, such as Oracle Visual Builder, Oracle Analytics Cloud, or a Visual Builder Cloud Service application embedded within Oracle SaaS. On-premises consumers can establish a high-performance, secure tunnel between OCI and on-premises data centers to access AI models without using the internet.
  2. OCI API Gateway is integrated with OCI Identity and Access Management for authentication to achieve standardized security enforcement through OAuth 2.0 and basic authentication.
  3. OCI Vault stores AI model API credentials securely, and can abstract backend API credentials from consumers.
  4. Stream OCI API Gateway logs to OCI Logging to retain them longer, and build reports in Oracle Cloud Infrastructure Logging Analytics to generate insights.
  5. Integrate with OCI Cache with Redis to help reduce costs and improve the performance of AI model APIs by caching frequent requests.
  6. OCI Functions can be used as a wrapper around AI models that don't have REST endpoints. OCI Functions supports implementations in languages such as Python, Java, Node.js, Go, Ruby, and C#.
  7. Integrate OCI API Gateway with AI services directly if the AI service exposes REST endpoints.
  8. Oracle Integration Cloud Service can implement complex transformations or implement orchestration logic before returning the inference output to consumers.
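Step 6 of the workflow, wrapping a model that has no REST endpoint, can be sketched as a small request handler. This is a plain Python sketch under stated assumptions: predict is a placeholder for a model callable only as a local library, handle_request is a hypothetical function, and the actual OCI Functions entry point and packaging are not shown.

```python
import json
from typing import Tuple


def predict(text: str) -> dict:
    """Placeholder for a model that is only callable as a local library."""
    label = "long" if len(text.split()) > 5 else "short"
    return {"label": label}


def error(message: str) -> Tuple[int, bytes]:
    """Build a 400 response with a JSON error body."""
    return 400, json.dumps({"error": message}).encode("utf-8")


def handle_request(body: bytes) -> Tuple[int, bytes]:
    """Translate a JSON request body into a model call and a JSON response."""
    try:
        payload = json.loads(body)
    except ValueError:
        return error("request body is not valid JSON")
    if not isinstance(payload, dict) or not isinstance(payload.get("text"), str):
        return error("expected a JSON object with a string 'text' field")
    return 200, json.dumps(predict(payload["text"])).encode("utf-8")


status, response = handle_request(b'{"text": "classify this"}')
bad_status, _ = handle_request(b"not json")
```

The wrapper gives the gateway a uniform JSON-in, JSON-out contract to route to, regardless of how the underlying model is invoked.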

The architecture has the following components:

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Availability domains

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain shouldn't affect the other availability domains in the region.

  • Fault domains

    A fault domain is a grouping of hardware and infrastructure within an availability domain. Each availability domain has three fault domains with independent power and hardware. When you distribute resources across multiple fault domains, your applications can tolerate physical server failure, system maintenance, and power failures inside a fault domain.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • API Gateway

    Oracle Cloud Infrastructure API Gateway enables you to publish APIs with private endpoints that are accessible from within your network, and which you can expose to the public internet if required. The endpoints support API validation, request and response transformation, CORS, authentication and authorization, and request limiting.

  • Functions

    Oracle Cloud Infrastructure Functions is a fully managed, multitenant, highly scalable, on-demand, Functions-as-a-Service (FaaS) platform. It is powered by the Fn Project open source engine. Functions enable you to deploy your code, and either call it directly or trigger it in response to events. Oracle Functions uses Docker containers hosted in Oracle Cloud Infrastructure Registry.

  • Cache with Redis

    Oracle Cloud Infrastructure Cache with Redis is a comprehensive, managed-in-memory caching solution built on the foundation of open source Redis. This fully managed service accelerates data reads and writes, significantly enhancing application response times and database performance to provide an improved customer experience.

  • Integration

    Oracle Integration is a fully managed service that allows you to integrate your applications, automate processes, gain insight into your business processes, and create visual applications.

  • Vault

    Oracle Cloud Infrastructure Vault enables you to centrally manage the encryption keys that protect your data and the secret credentials that you use to secure access to your resources in the cloud. You can use the Vault service to create and manage vaults, keys, and secrets.

  • Logging

    Logging is a highly scalable and fully managed service that provides access to the following types of logs from your resources in the cloud:
    • Audit logs: Logs related to events emitted by the Audit service.
    • Service logs: Logs emitted by individual services such as API Gateway, Events, Functions, Load Balancing, Object Storage, and VCN flow logs.
    • Custom logs: Logs that contain diagnostic information from custom applications, other cloud providers, or an on-premises environment.

  • Monitoring

    Oracle Cloud Infrastructure Monitoring service actively and passively monitors your cloud resources using metrics to monitor resources and alarms to notify you when these metrics meet alarm-specified triggers.

  • Identity and Access Management (IAM)

    Oracle Cloud Infrastructure Identity and Access Management (IAM) is the access control plane for Oracle Cloud Infrastructure (OCI) and Oracle Cloud Applications. The IAM API and the user interface enable you to manage identity domains and the resources within the identity domain. Each OCI IAM identity domain represents a standalone identity and access management solution or a different user population.

  • Policy

    An Oracle Cloud Infrastructure Identity and Access Management policy specifies who can access which resources, and how. Access is granted at the group and compartment level, which means you can write a policy that gives a group a specific type of access within a specific compartment, or to the tenancy.

  • Vision

    Oracle Cloud Infrastructure Vision is an AI service for performing deep-learning–based image analysis at scale. With prebuilt models available out of the box, developers can easily build image recognition and text recognition into their applications without machine learning (ML) expertise.

  • Generative AI

    Oracle Cloud Infrastructure Generative AI is a fully managed OCI service that provides a set of state-of-the-art, customizable large language models (LLMs) that cover a wide range of use cases for text generation, summarization, semantic search, and more. Use the playground to try out the ready-to-use pretrained models, or create and host your own fine-tuned custom models based on your own data on dedicated AI clusters.

  • Document Understanding

    Oracle Cloud Infrastructure Document Understanding is an AI service for performing deep-learning–based document analysis at scale. With prebuilt models available out of the box, developers can easily build intelligent document processing into their applications without machine learning (ML) expertise.

  • Digital Assistant

    Oracle Digital Assistant is a platform that allows you to create and deploy digital assistants for your users. With Oracle Digital Assistant, you can create AI-driven interfaces (or chatbots) for business applications through text, chat, and voice interfaces. Each digital assistant has a collection of one or more specialized skills, to help users complete a variety of tasks in natural language conversations. For example, an individual digital assistant might have skills that focus on specific types of tasks such as tracking inventory, submitting time cards, and creating expense reports.

  • Oracle Database 23ai

    Oracle Database 23ai brings the power of AI to enterprise data and applications. Oracle AI vector search allows documents, images, and relational data that are stored in mission-critical databases to be easily searched based on their conceptual content.

  • Oracle Autonomous Database Select AI

    Oracle Autonomous Database Select AI enables Oracle Autonomous Database to use generative AI with large language models (LLMs) to convert a user’s input into Oracle SQL. Oracle Autonomous Database Select AI processes the natural language prompt, supplements the prompt with metadata, and then generates and runs a SQL query.

  • Oracle HeatWave Gen AI

    Oracle HeatWave Gen AI with vector store can be used for a retrieval-augmented generation (RAG) implementation to improve the accuracy and performance of AI models.

Considerations

When implementing OCI API Gateway for AI model API management, consider the following:

  • Security

    AI models use large amounts of enterprise data. Governance teams should ensure that this data is protected by enforcing masking, encryption, and access controls.

  • AI Model Terms of Use and Licenses

    Third-party AI models come with their own licenses and agreement terms. AI governance teams should be aware of legal terms of use to ensure compliance when exposing models through OCI API Gateway.

Acknowledgments

  • Author: Subburam Mathuraiveeran
  • Contributors: Wei Han, Robert Wunderlich, Pankhuri Sen