Deploy a Disaster Recovery Solution for an Oracle Cloud Infrastructure API Gateway

Oracle Cloud Infrastructure API Gateway is available in an OCI region governed by service-level agreements (SLAs). This reference architecture describes a cross-region, customer-managed disaster recovery solution for OCI API Gateway.

This disaster recovery solution supports:
  • “Active-passive” topologies, in which only one gateway processes the entire load even though both gateways are expected to be up and running. The active gateway is determined not by its lifecycle state (Active/Deleted/Updating) but by where traffic is directed: it is the gateway that receives the traffic and is enabled to execute all functions.
  • Customer-managed replication of OCI API Gateway categories; for example, API deployments, usage plans, and API subscribers.
If both primary and secondary OCI API Gateways use the same API categories and authorization providers, after the switchover, you can access the APIs the same way across the regions.
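
One way to check that the two gateways stay aligned is to compare an inventory of the replicated resources on each side. The following is a minimal Python sketch, assuming you have already exported each gateway's API deployments as a mapping from path prefix to specification. The function names, the path prefixes, and the dictionary shape are hypothetical and are not part of any OCI SDK.

```python
import hashlib
import json


def fingerprint(spec: dict) -> str:
    """Return a stable hash of a deployment specification."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(primary: dict, secondary: dict) -> dict:
    """Compare two {path_prefix: spec} inventories and report differences."""
    missing = sorted(set(primary) - set(secondary))   # only on the primary
    extra = sorted(set(secondary) - set(primary))     # only on the secondary
    changed = sorted(
        key for key in set(primary) & set(secondary)
        if fingerprint(primary[key]) != fingerprint(secondary[key])
    )
    return {"missing": missing, "extra": extra, "changed": changed}


# Hypothetical inventories exported from each region.
primary_inventory = {
    "/orders": {"routes": [{"path": "/list", "methods": ["GET"]}]},
    "/billing": {"routes": [{"path": "/invoice", "methods": ["POST"]}]},
}
secondary_inventory = {
    "/orders": {"routes": [{"path": "/list", "methods": ["GET", "POST"]}]},
}

drift = find_drift(primary_inventory, secondary_inventory)
print(drift)
```

An empty result in all three lists would suggest that the secondary gateway exposes the same deployments as the primary, which is the condition under which the APIs can be accessed the same way after a switchover.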

Before You Begin

Before you begin configuring disaster recovery for OCI API Gateway, you must:
  • Provision a second OCI API Gateway in a different OCI region.
  • Obtain a custom DNS/host name (in a domain of your choice) and an associated SSL certificate.

Architecture

This reference architecture for OCI API Gateway consists of two OCI API Gateways in two different cloud regions, which are accessed by using a single custom endpoint (URL). To implement a single custom endpoint, you can use an OCI Domain Name System (DNS) zone to resolve the custom endpoint name.

You can use this custom URL as the entry point to the OCI API Gateways; for example, api.mycompany.com. For more information about configuring custom endpoints, see "Managing DNS Service Zones", which you can access from "Explore More", below.

The two OCI API Gateways in the architecture are designated as primary and secondary, and both gateways run concurrently; however, only one of the gateways receives traffic. Initially, the primary gateway receives the traffic flow. If the primary region becomes unavailable, the DNS record can be updated to route the traffic to the secondary region.
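
The routing decision described above can be sketched as a small helper that picks the CNAME target for the custom endpoint. This is an illustrative Python sketch only: the `Gateway` class, the gateway host names, the health flags, and the record dictionary layout are assumptions, and the source describes the DNS update as something you perform yourself rather than an automatic health-driven switch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gateway:
    role: str       # "primary" or "secondary"
    hostname: str   # hypothetical gateway host name
    healthy: bool   # assumed result of an external health check


def select_cname_target(primary: Gateway, secondary: Gateway) -> str:
    """Return the host name the custom endpoint should resolve to."""
    if primary.healthy:
        return primary.hostname
    if secondary.healthy:
        return secondary.hostname
    raise RuntimeError("neither gateway is healthy; leave DNS unchanged")


def build_cname_record(custom_endpoint: str, target: str, ttl: int = 60) -> dict:
    """Describe the CNAME record to publish (hypothetical layout)."""
    return {"domain": custom_endpoint, "rtype": "CNAME", "rdata": target, "ttl": ttl}


primary = Gateway("primary", "gw-primary.example.com", healthy=False)
secondary = Gateway("secondary", "gw-secondary.example.com", healthy=True)

record = build_cname_record(
    "api.mycompany.com", select_cname_target(primary, secondary)
)
print(record)
```

A short TTL is used here on the assumption that clients should pick up the changed record quickly after a switchover; choose a value that suits your own recovery time objective.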

The following diagram illustrates this reference architecture for a public gateway (landing zone pattern for OCI API Gateway):


[Illustration: apigw-oci-customer-managed-dr-topology.png]

apigw-oci-customer-managed-dr-topology-oracle.zip

This architecture has the following components:
  • Tenancy

    A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for Oracle Cloud Infrastructure. You can create, organize, and administer your resources in Oracle Cloud within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Compartment

    Compartments are cross-regional logical partitions within an Oracle Cloud Infrastructure tenancy. Use compartments to organize, control access, and set usage quotas for your Oracle Cloud resources. In a given compartment, you define policies that control access and set privileges for resources.

  • Availability domain

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain shouldn't affect the other availability domains in the region.

  • Virtual cloud network (VCN) and subnet

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Route table

    Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.

  • Security list

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • Oracle Cloud Infrastructure DNS service

    Public DNS zones hold the authoritative DNS records that reside on OCI's name servers. You can create public zones with publicly available domain names reachable on the internet. For more information, see "Overview of DNS", which you can access from "Explore More", below.

  • Internet gateway

    The internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Oracle Cloud Infrastructure Web Application Firewall (WAF)

    WAFs protect applications from malicious and unwanted internet traffic with a cloud-based, PCI-compliant, global web application firewall service. By combining threat intelligence with consistent rule enforcement on OCI Flexible Network Load Balancer, Oracle Cloud Infrastructure Web Application Firewall strengthens defenses and protects internet-facing application servers and internal applications. (This component is optional)

  • Flexible Load Balancer

    A load balancer improves resource utilization by directing requests across application services that operate in parallel. As demand increases, the number of application services can be increased, and the load balancer will use them to balance the processing of requests. (This component is optional)

  • Bastion service

    Oracle Cloud Infrastructure Bastion provides restricted and time-limited secure access to resources that don't have public endpoints and that require strict resource access controls, such as bare metal and virtual machines, Oracle MySQL Database Service, Autonomous Transaction Processing (ATP), Oracle Cloud Infrastructure Kubernetes Engine (OKE), and any other resource that allows Secure Shell Protocol (SSH) access. With OCI Bastion service, you can enable access to private hosts without deploying and maintaining a jump host. In addition, you gain improved security posture with identity-based permissions and a centralized, audited, and time-bound SSH session. OCI Bastion removes the need for a public IP for bastion access, eliminating the hassle and potential attack surface when providing remote access.

  • API Gateway

    Oracle Cloud Infrastructure API Gateway enables you to publish APIs with private endpoints that are accessible from within your network, and which you can expose to the public internet if required. The endpoints support API validation, request and response transformation, CORS, authentication and authorization, and request limiting.

  • Analytics

    Oracle Analytics Cloud is a scalable and secure public cloud service that empowers business analysts with modern, AI-powered, self-service analytics capabilities for data preparation, visualization, enterprise reporting, augmented analysis, and natural language processing and generation. With Oracle Analytics Cloud, you also get flexible service management capabilities, including fast setup, easy scaling and patching, and automated lifecycle management.

  • Identity and Access Management (IAM)

    Oracle Cloud Infrastructure Identity and Access Management (IAM) is the access control plane for Oracle Cloud Infrastructure (OCI) and Oracle Cloud Applications. The IAM API and the user interface enable you to manage identity domains and the resources within the identity domain. Each OCI IAM identity domain represents a standalone identity and access management solution or a different user population.

  • Identity Domain (IDom)

    IAM uses Identity Domains (IDom) to provide identity and access management features such as authentication, single sign-on (SSO), and identity lifecycle management for Oracle Cloud as well as for Oracle and non-Oracle applications, whether SaaS, cloud hosted, or on premises.

  • Policy

    An Oracle Cloud Infrastructure Identity and Access Management policy specifies who can access which resources, and how. Access is granted at the group and compartment level, which means you can write a policy that gives a group a specific type of access within a specific compartment, or to the tenancy.

  • Audit

    The Oracle Cloud Infrastructure Audit service automatically records calls to all supported Oracle Cloud Infrastructure public application programming interface (API) endpoints as log events. All OCI services support logging by Oracle Cloud Infrastructure Audit.

  • Logging

    Logging is a highly scalable and fully managed service that provides access to the following types of logs from your resources in the cloud:
    • Audit logs: Logs related to events emitted by the Audit service.
    • Service logs: Logs emitted by individual services such as API Gateway, Events, Functions, Load Balancing, Object Storage, and VCN flow logs.
    • Custom logs: Logs that contain diagnostic information from custom applications, other cloud providers, or an on-premises environment.
  • Logging Analytics

    Logging Analytics is a machine learning-based cloud service that monitors, aggregates, indexes, and analyzes all log data from on-premises and multicloud environments. It enables users to search, explore, and correlate this data to troubleshoot and resolve problems faster and derive insights to make better operational decisions.

Recommendations

Use the following recommendations as a starting point when deploying a disaster recovery solution for an OCI API Gateway. Your requirements might differ from the architecture described here.
  • VCN

    When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.

    Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.

    After you create a VCN, you can change, add, and remove its CIDR blocks.

    When you design the subnets, consider your traffic flow and security requirements. Attach all the resources within a specific tier or role to the same subnet, which can serve as a security boundary.

    Use regional subnets.

  • Connectivity (Private API Gateway)

    When you deploy resources to OCI, you might start small, with a single connection to your on-premises network. This single connection could be through OCI FastConnect or through IPSec VPN. To plan for redundancy, consider all the components (hardware devices, facilities, circuits, and power) between your on-premises network and OCI. Also consider diversity to ensure that facilities are not shared between the paths.
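
The VCN recommendation above (private, non-overlapping CIDR blocks) can be checked before provisioning. The following is a minimal sketch using Python's standard `ipaddress` module. The network names and CIDR values are hypothetical, and the check covers IPv4 only.

```python
import ipaddress
from itertools import combinations

# The standard private IPv4 address space (RFC 1918).
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def in_private_space(cidr: str) -> bool:
    """Return True if the IPv4 CIDR block lies inside an RFC 1918 range."""
    network = ipaddress.ip_network(cidr)
    return any(network.subnet_of(block) for block in RFC1918_BLOCKS)


def find_overlaps(networks: dict) -> list:
    """Return pairs of network names whose CIDR blocks overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in networks.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(parsed), 2)
        if parsed[a].overlaps(parsed[b])
    ]


# Hypothetical address plan for both regions and an on-premises network.
planned = {
    "primary-vcn": "10.0.0.0/16",
    "secondary-vcn": "10.1.0.0/16",
    "on-premises": "10.0.128.0/20",
}

overlaps = find_overlaps(planned)
print(overlaps)
```

In this example plan, the on-premises range falls inside the primary VCN's block, so the pair is reported and one of the two would need a different range before a private connection is set up.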

Considerations

When deploying a disaster recovery solution for an OCI API Gateway, consider these factors.

  • Use an OCI DNS Management Zone

    Configure DNS records for your OCI API Gateways. You can use an OCI DNS zone to manage DNS records and provide host name resolution for your OCI API Gateways.

    After you've acquired a domain (or a subdomain) for your API gateways, add an OCI DNS zone through the OCI Console or the API. For details on creating an OCI DNS zone and adding a record to it, see "Managing DNS Service Zones", which you can access from "Explore More", below.
    • In the zone, add the OCI API Gateway custom host name as a CNAME record.
    • After you've successfully published the zone changes, update your domain to use the OCI DNS name servers.
  • Security

    Use OCI Identity and Access Management (IAM) policies to control who can access your cloud resources and what operations can be performed. OCI cloud services also rely on IAM policies, for example to allow OCI API Gateway to invoke functions. OCI API Gateway can also control access using OAuth authentication and authorization. Because IAM supports federated authentication and authorization, OCI API Gateway can authenticate against a wide array of services and authentication setups.

  • Performance and cost

    OCI API Gateway supports response caching by integrating with an external cache server (such as a Redis or KeyDB server), which helps avoid unnecessary load on back-end services. When responses are cached, similar requests can be completed by retrieving data from the response cache rather than sending the request to the back-end service. This reduces the load on back-end services, which helps improve performance and reduce costs. OCI API Gateway also caches authorization tokens (based on their time to live, or TTL), reducing the load on the identity provider and improving performance.

  • Availability

    Consider using a high-availability option based on your deployment requirements and your region. The options include distributing resources across multiple availability domains in a region and distributing resources across the fault domains within an availability domain.

  • Monitoring and alerts

    Set up monitoring and alerts on API Gateway metrics.
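
The response-caching behavior described under "Performance and cost" above can be illustrated with a small time-to-live cache. This sketch is an in-memory stand-in written for illustration: it does not use a Redis or KeyDB client, it is not the gateway's own caching implementation, and the class and function names are hypothetical.

```python
import time


class TTLCache:
    """A tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # entry is stale, drop it
            return None
        return value

    def put(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)


def fetch(cache: TTLCache, key: str, backend):
    """Serve from the cache when possible; otherwise call the back end."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    response = backend(key)
    cache.put(key, response)
    return response
```

With a 30-second TTL, for example, repeated requests for the same key within that window would reach the back-end service only once, which is the load reduction the consideration refers to.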

Deploy

The Terraform code oci_apigateway_api is available on GitHub. See "Explore More", below, for a link.

You can deploy this reference architecture on Oracle Cloud Infrastructure by performing the following steps:

  1. Sign in to the Oracle Cloud Infrastructure Console with your Oracle Cloud credentials.
  2. Set up the required secondary site networking infrastructure as shown in the architecture diagram. This includes the following components: VCN, subnet, DRG or internet gateway (for a private or public API gateway, respectively), security list, route table, service gateway, FastConnect/VPN and CPE, and DNS public zone.
  3. Create an OCI API Gateway. To prepare and create an OCI API Gateway instance, see the instructions in "Creating an API Gateway", which you can access from "Explore More", below.
  4. Restrict the networks that have access to your API gateway by configuring the Security List, Routing Table, and Network Security Groups, if required.
  5. Automate deployment synchronization. This step ensures that the stacks for the API deployments, API usage plans, and API subscriptions are synchronized regularly between the primary and the standby instance by using CI/CD.

    Note:

    Every configuration and change to OCI API management resources (API deployments, API usage plans, and API subscriptions) can be saved and managed as a Terraform stack. The stack can then be uploaded to the secondary site or applied to a different site by using OCI DevOps.
  6. Execute the failover tasks.
    • Switch to the disaster recovery environment.
    • Switch from your primary instance to the standby instance during outages by updating the DNS record at your DNS provider or in the OCI DNS zone so that traffic is routed to the API Gateway at the secondary site. You can do this manually from the OCI Console or by using the OCI REST API. After the failover process, the standby/secondary site API Gateway becomes your primary API Gateway and the API Gateway previously designated as primary becomes the new standby API Gateway.
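
The role swap in the failover step above can be sketched as follows. This is a hedged Python illustration, not an OCI API: the `Site` class, the region labels, the host names, and the `update_dns` callback are hypothetical placeholders for whatever mechanism you use to change the DNS record.

```python
from dataclasses import dataclass


@dataclass
class Site:
    region: str             # hypothetical region label
    gateway_hostname: str   # hypothetical gateway host name
    role: str               # "primary" or "standby"


def fail_over(primary: Site, standby: Site, update_dns) -> tuple:
    """Point DNS at the standby site, then swap the two roles.

    update_dns is a caller-supplied function that receives the new
    target host name (for example, a wrapper around a manual runbook
    step or an API call of your choice).
    """
    if primary.role != "primary" or standby.role != "standby":
        raise ValueError("unexpected roles before failover")
    update_dns(standby.gateway_hostname)
    primary.role, standby.role = "standby", "primary"
    return standby, primary  # (new primary, new standby)


dns_updates = []
site_a = Site("region-a", "gw-a.example.com", "primary")
site_b = Site("region-b", "gw-b.example.com", "standby")

new_primary, new_standby = fail_over(site_a, site_b, dns_updates.append)
print(new_primary.region, new_standby.region, dns_updates)
```

Updating DNS before swapping the roles means that if the DNS change raises an error, the recorded roles still reflect where traffic is actually going.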

Acknowledgments

  • Author: Peter Obert
  • Contributor: Robert Wunderlich