Implement disaster recovery with local and regional standbys on Oracle Database@Azure

Ensuring uninterrupted business continuity is critical when designing applications. Achieving it requires a robust disaster recovery strategy designed to restore functionality quickly after a disruption.

For years, organizations have relied on Oracle Exadata Database Service and Oracle's premier disaster recovery technologies to support mission-critical applications, whether on-premises or within Oracle Cloud Infrastructure (OCI). Oracle Exadata Database Service on Dedicated Infrastructure on Oracle Database@Azure brings the same industry-leading performance, feature set, and price parity as Exadata on OCI. It leverages Microsoft Azure's availability zones (AZs) and regions to provide low latency to Azure applications on top of unmatched high availability and disaster recovery capabilities, ensuring seamless operations during maintenance and in the event of a disruption.

Architecture

This architecture shows Oracle Exadata Database Service on Dedicated Infrastructure on Oracle Database@Azure in a disaster recovery topology using two standby databases, a local standby across zones, and a remote standby across regions.

The following diagram illustrates this reference architecture.



Download the diagram: local-regional-standby-dr-db-azure-arch-oracle.zip

The Oracle Database runs in an Exadata virtual machine (VM) cluster in the primary region. For data protection and disaster recovery, Oracle Active Data Guard replicates the database to two standby Exadata VM clusters: the first in the same region but in a different zone (local standby), and the second in a different region (remote standby). A local standby is ideal for failover scenarios, offering zero data loss for local failures while applications continue operating without the performance overhead of communicating with a remote region. A remote standby is typically used for disaster recovery or to offload read-only workloads. With multiple availability zones, you can use Azure's multi-availability-zone application tier deployment to build a reliable solution and replicate the application tier to the standby location.

You can route Active Data Guard traffic through the Azure network. However, this architecture focuses on Active Data Guard network traffic through the OCI network to optimize network throughput and latency.

The Oracle Exadata Database Service on Dedicated Infrastructure on Oracle Database@Azure network is connected to the Exadata client subnet using a Dynamic Routing Gateway (DRG) managed by Oracle. A DRG is also required to create a peer connection between Virtual Cloud Networks (VCNs) in different regions. Because only one DRG is allowed per VCN in OCI, a second VCN acting as a Hub VCN with its own DRG is required to connect the primary and standby VCNs in each region. In this architecture:

  • The primary Exadata VM cluster is deployed in Region 1, zone 1 in VCN1 with CIDR 10.10.0.0/16.
  • The hub VCN in the primary Region 1 is HubVCN1 with CIDR 10.11.0.0/16.
  • The first standby Exadata VM cluster is deployed in Region 1, zone 2 in VCN2 with CIDR 10.20.0.0/16.
  • The hub VCN for the first standby is the same as the hub VCN for the primary database, HubVCN1, because it resides in the same region.
  • The second standby Exadata VM cluster is deployed in Region 2 in VCN3 with CIDR 10.30.0.0/16.
  • The hub VCN in the remote standby Region 2 is HubVCN3 with CIDR 10.33.0.0/16.

No subnet is required in the hub VCNs to enable transit routing, so these VCNs can use very small CIDR ranges. The VCNs on the OCI child site are created after the Oracle Exadata Database Service on Dedicated Infrastructure VM clusters on Oracle Database@Azure are created for the primary and standby databases.
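Because peered VCNs must not have overlapping address spaces, it's worth validating the CIDR plan before creating any VCNs. The following sketch checks the addresses used in this architecture with Python's standard ipaddress module; the dictionary keys are descriptive labels, not OCI resource names.

```python
# Sanity-check that the VCN CIDR plan has no overlaps, a prerequisite for
# peering the VCNs so Data Guard can ship redo between them.
from ipaddress import ip_network
from itertools import combinations

cidr_plan = {
    "VCN1 (primary, Region 1 / zone 1)":       "10.10.0.0/16",
    "HubVCN1 (hub, Region 1)":                 "10.11.0.0/16",
    "VCN2 (local standby, Region 1 / zone 2)": "10.20.0.0/16",
    "VCN3 (remote standby, Region 2)":         "10.30.0.0/16",
    "HubVCN3 (hub, Region 2)":                 "10.33.0.0/16",
}

# Compare every pair of networks; any overlap would block peering.
overlaps = [
    (a, b)
    for (a, net_a), (b, net_b) in combinations(cidr_plan.items(), 2)
    if ip_network(net_a).overlaps(ip_network(net_b))
]
assert not overlaps, f"Overlapping CIDRs: {overlaps}"
print("CIDR plan is overlap-free")
```

Run this once with your own CIDR values before provisioning; catching an overlap here is far cheaper than re-creating a VCN later.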

Microsoft Azure provides the following components:

  • Azure region

    An Azure region is a geographical area in which one or more physical Azure data centers, called availability zones, reside. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

    Azure and OCI regions are localized geographic areas. For Oracle Database@Azure, an Azure region is connected to an OCI region, with availability zones (AZs) in Azure connected to availability domains (ADs) in OCI. Azure and OCI region pairs are selected to minimize distance and latency.

  • Azure availability zone

    Azure availability zones are physically separate locations within an Azure region, designed to ensure high availability and resiliency by providing independent power, cooling, and networking.

  • Azure Virtual Network (VNet)

    Azure Virtual Network (VNet) is the fundamental building block for your private network in Azure. VNet enables many types of Azure resources, such as Azure virtual machines (VMs), to securely communicate with each other, the internet, and on-premises networks.

  • Azure delegated subnet

    A delegated subnet allows you to insert a managed service, specifically a platform-as-a-service (PaaS) service, directly into your virtual network as a resource. You have full integration management of external PaaS services within your virtual networks.

  • Azure Virtual Network Interface Card (VNIC)

    The physical servers in Azure data centers have physical network interface cards (NICs). Virtual machine instances communicate using virtual NICs (VNICs) associated with the physical NICs. Each instance has a primary VNIC that's automatically created and attached during launch and is available during the instance's lifetime.

OCI provides the following components:

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, hosting availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Availability domains

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain shouldn't affect the other availability domains in the region.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Route table

    Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.

  • Local peering

    Local peering enables you to peer one VCN with another VCN in the same region. Peering means the VCNs communicate using private IP addresses, without the traffic traversing the internet or routing through your on-premises network.

  • Dynamic routing gateway (DRG)

    The DRG is a virtual router that provides a path for private network traffic between VCNs in the same region, between a VCN and a network outside the region, such as a VCN in another Oracle Cloud Infrastructure region, an on-premises network, or a network in another cloud provider.

  • Object storage

    OCI Object Storage provides access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store data directly from the internet or from within the cloud platform. You can scale storage without experiencing any degradation in performance or service reliability.

    Use standard storage for "hot" storage that you need to access quickly, immediately, and frequently. Use archive storage for "cold" storage that you retain for long periods of time and seldom or rarely access.

  • Data Guard

    Oracle Data Guard and Oracle Active Data Guard provide a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases and that enable production Oracle databases to remain available without interruption. Oracle Data Guard maintains these standby databases as copies of the production database by shipping and applying redo data. If the production database becomes unavailable due to a planned or an unplanned outage, Oracle Data Guard can switch any standby database to the production role, minimizing the downtime associated with the outage. Oracle Active Data Guard provides the additional ability to offload read-mostly workloads to standby databases and also provides advanced data protection features.

  • Oracle Database Autonomous Recovery Service

    Oracle Database Autonomous Recovery Service is an Oracle Cloud service that protects Oracle databases. Backup automation and enhanced data protection capabilities for OCI databases allow you to offload all backup processing and storage requirements to Oracle Database Autonomous Recovery Service, which reduces backup infrastructure costs and manual administration overhead.

  • Exadata Database Service on Dedicated Infrastructure

    Oracle Exadata Database Service on Dedicated Infrastructure enables you to leverage the power of Exadata in the cloud. Oracle Exadata Database Service delivers proven Oracle Database capabilities on purpose-built, optimized Oracle Exadata infrastructure in the public cloud. Built-in cloud automation, elastic resource scaling, security, and fast performance for all Oracle Database workloads helps you simplify management and reduce costs.

  • Oracle Database@Azure

    Oracle Database@Azure is the Oracle Database service (Oracle Exadata Database Service on Dedicated Infrastructure and Oracle Autonomous Database Serverless) running on Oracle Cloud Infrastructure (OCI), deployed in Microsoft Azure data centers. The service offers features and price parity with OCI. Purchase the service on Azure Marketplace.

    Oracle Database@Azure integrates Oracle Exadata Database Service, Oracle Real Application Clusters (Oracle RAC), and Oracle Data Guard technologies into the Azure platform. Users manage the service on the Azure console and with Azure automation tools. The service is deployed in Azure Virtual Network (VNet) and integrated with the Azure identity and access management system. The OCI and Oracle Database generic metrics and audit logs are natively available in Azure. The service requires users to have an Azure subscription and an OCI tenancy.

    Autonomous Database is built on Oracle Exadata infrastructure and is self-managing, self-securing, and self-repairing, helping eliminate manual database management and human error. Autonomous Database enables development of scalable AI-powered apps with any data, using built-in AI capabilities with your choice of large language model (LLM) and deployment location.

    Both Oracle Exadata Database Service and Oracle Autonomous Database Serverless are easily provisioned through the native Azure Portal, enabling access to the broader Azure ecosystem.

Recommendations

Use the following recommendations as a starting point when performing disaster recovery for Oracle Exadata Database Service on Dedicated Infrastructure on Oracle Database@Azure. Your requirements might differ from the architecture described here.
  • Use Active Data Guard for comprehensive data corruption prevention with automatic block repair, for online upgrades and migrations, and to offload read-mostly workloads to the standby for scale-out.
  • Enable Application Continuity to mask database outages during planned and unplanned events from end-users and ensure uninterrupted applications.
  • Set up automatic backups to Oracle Database Autonomous Recovery Service (in Azure or OCI) even though the data is protected by Oracle Data Guard. Its incremental-forever backup strategy eliminates weekly full backups, minimizing the backup workload on the database. Alternatively, you can use OCI Object Storage for automatic backups.
  • Enable backups from standby to achieve backup replication across regions.
  • Use OCI Full Stack DR to orchestrate database switchover and failover operations.
  • Use OCI Vault to store the database's Transparent Data Encryption (TDE) keys using customer-managed keys.

Considerations

When performing local and regional disaster recovery for Oracle Exadata Database Service on Oracle Database@Azure, consider the following.

  • When Exadata VM clusters are created in the Oracle Database@Azure child site, each VM cluster is created within its own OCI VCN. Oracle Data Guard requires that the databases communicate with each other to ship redo data, so the VCNs must be peered to enable this communication. Because peering requires non-overlapping address spaces, the Exadata VM cluster VCNs must not have overlapping CIDR ranges.
  • Preparation for a disaster scenario requires a comprehensive approach that considers different business requirements and availability architectures and that encompasses those considerations in an actionable, high availability, and disaster recovery plan. The scenario described here provides guidelines to help select the approach that best fits your application deployment by using a simple but effective failover for the disaster recovery configuration in your OCI and Microsoft Azure environments.
  • OCI is the preferred network for achieving better performance, measured by latency and throughput, and for achieving reduced cost, including the first 10 TB/month egress for free.

Deploy

Learn how to configure the network communication between regions shown in the architecture diagram.

Follow these steps to configure the Primary region:

  1. Add the following ingress rules to the Security List of the client subnet of VCN1 to allow incoming traffic from VCN2 and VCN3.
     Stateless | Source       | IP Protocol | Source Port Range | Destination Port Range | Allows                      | Description
     No        | 10.20.0.0/16 | TCP         | 1521              | 1521                   | TCP traffic for ports: 1521 | Allow ingress from VCN2
     No        | 10.30.0.0/16 | TCP         | 1521              | 1521                   | TCP traffic for ports: 1521 | Allow ingress from VCN3
  2. Create Virtual Cloud Network HubVCN1 with CIDR 10.11.0.0/16.
  3. Create Local Peering Gateway HubLPG1 and HubLPG2 in Virtual Cloud Network HubVCN1.
  4. Create Local Peering Gateway LPG1R and LPG1L in Virtual Cloud Network VCN1.
  5. Create Local Peering Gateway LPG2R and LPG2L in Virtual Cloud Network VCN2.
  6. Establish the local peering connection between LPG1R and HubLPG1.
  7. Establish the local peering connection between LPG2R and HubLPG2.
  8. Establish the local peering connection between LPG1L and LPG2L.
  9. Add route rules to the route table of the client subnet of VCN1 to forward traffic targeted for VCN2 to LPG1L and forward traffic targeted for VCN3 to LPG1R.
     Destination  | Target Type           | Target | Route Type | Description
     10.20.0.0/16 | Local Peering Gateway | LPG1L  | Static     | Traffic to VCN2
     10.30.0.0/16 | Local Peering Gateway | LPG1R  | Static     | Traffic to VCN3
  10. Add route rules to the route table of the client subnet of VCN2 to forward traffic targeted for VCN1 to LPG2L and forward traffic targeted for VCN3 to LPG2R.
     Destination  | Target Type           | Target | Route Type | Description
     10.10.0.0/16 | Local Peering Gateway | LPG2L  | Static     | Traffic to VCN1
     10.30.0.0/16 | Local Peering Gateway | LPG2R  | Static     | Traffic to VCN3
  11. Create Route Table HubLPG1rt in HubVCN1.
  12. Associate Route Table HubLPG1rt to Local Peering Gateway HubLPG1.
  13. Create Route Table HubLPG2rt in HubVCN1 and associate it to Local Peering Gateway HubLPG2.
  14. Create Dynamic Routing Gateway DRG1.
  15. Create Route Table DRG1rt in HubVCN1.
  16. Add two route rules to Route Table DRG1rt: one to forward traffic targeted for VCN1 to HubLPG1 and a second to forward traffic targeted for VCN2 to HubLPG2.
     Destination  | Target Type           | Target  | Route Type | Description
     10.10.0.0/16 | Local Peering Gateway | HubLPG1 | Static     | Traffic to VCN1
     10.20.0.0/16 | Local Peering Gateway | HubLPG2 | Static     | Traffic to VCN2
  17. To attach DRG1 to HubVCN1:
    1. Select Autogenerated Drg Route Table for VCN attachments.
    2. Select the existing route table DRG1rt.
    3. Select VCN CIDR blocks.
  18. Create a Remote Peering Connection in DRG1, named RPC1.
  19. Add a route rule to HubLPG1rt to forward traffic targeted for VCN3 to DRG1. (Traffic from VCN1 to VCN2 flows directly through LPG1L and doesn't traverse the hub.)
     Destination  | Target Type             | Target | Route Type | Description
     10.30.0.0/16 | Dynamic Routing Gateway | DRG1   | Static     | Traffic to VCN3
  20. Add a route rule to HubLPG2rt to forward traffic targeted for VCN3 to DRG1.
     Destination  | Target Type             | Target | Route Type | Description
     10.30.0.0/16 | Dynamic Routing Gateway | DRG1   | Static     | Traffic to VCN3
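The route rules above can also be applied with the OCI CLI by passing a JSON rules file to `oci network route-table update`. The following hedged sketch generates the rules file for step 9 (the client subnet of VCN1); the LPG OCIDs are placeholders you must replace with the real values from your tenancy.

```python
# Sketch: build the route-rules JSON for VCN1's client-subnet route table,
# then apply it with (placeholder OCID):
#   oci network route-table update --rt-id <route-table-ocid> \
#       --route-rules file://vcn1-client-rules.json
import json

LPG1L_OCID = "ocid1.localpeeringgateway.oc1..lpg1l"  # placeholder, not real
LPG1R_OCID = "ocid1.localpeeringgateway.oc1..lpg1r"  # placeholder, not real

rules = [
    {   # Traffic to VCN2 (local standby) via LPG1L
        "destination": "10.20.0.0/16",
        "destinationType": "CIDR_BLOCK",
        "networkEntityId": LPG1L_OCID,
    },
    {   # Traffic to VCN3 (remote standby) via LPG1R
        "destination": "10.30.0.0/16",
        "destinationType": "CIDR_BLOCK",
        "networkEntityId": LPG1R_OCID,
    },
]

with open("vcn1-client-rules.json", "w") as f:
    json.dump(rules, f, indent=2)
```

Note that `oci network route-table update --route-rules` replaces the whole rule set, so the JSON file must list every rule the table should contain, not only the new ones.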

Follow these steps to configure the remote standby region (Region 2):

  1. Add the following ingress rules to the Security List of the client subnet of VCN3 to allow incoming traffic from VCN1 and VCN2.
     Stateless | Source       | IP Protocol | Source Port Range | Destination Port Range | Allows                      | Description
     No        | 10.10.0.0/16 | TCP         | 1521              | 1521                   | TCP traffic for ports: 1521 | Allow ingress from VCN1
     No        | 10.20.0.0/16 | TCP         | 1521              | 1521                   | TCP traffic for ports: 1521 | Allow ingress from VCN2
  2. Create Virtual Cloud Network HubVCN3 with CIDR 10.33.0.0/16.
  3. Create Local Peering Gateway HubLPG3 in Virtual Cloud Network HubVCN3.
  4. Create Local Peering Gateway LPG3R in Virtual Cloud Network VCN3.
  5. Establish the local peering connection between LPG3R and HubLPG3.
  6. Add route rules to the Route Table of the client subnet of VCN3 to forward traffic targeted for VCN1 and VCN2 to LPG3R.
     Destination  | Target Type           | Target | Route Type | Description
     10.10.0.0/16 | Local Peering Gateway | LPG3R  | Static     | Traffic to VCN1
     10.20.0.0/16 | Local Peering Gateway | LPG3R  | Static     | Traffic to VCN2
  7. Create Route Table HubLPG3rt in HubVCN3.
  8. Associate Route Table HubLPG3rt to Local Peering Gateway HubLPG3.
  9. Create Dynamic Routing Gateway DRG3.
  10. Create Route Table DRG3rt in HubVCN3.
  11. Add a route rule to DRG3rt to forward traffic targeted for VCN3 to HubLPG3.
     Destination  | Target Type           | Target  | Route Type | Description
     10.30.0.0/16 | Local Peering Gateway | HubLPG3 | Static     | Traffic to VCN3
  12. To attach DRG3 to HubVCN3:
    1. Select Autogenerated Drg Route Table for VCN attachments.
    2. Select the existing route table DRG3rt.
    3. Select VCN CIDR blocks.
  13. Create a remote peering connection in DRG3, named RPC3.
  14. Establish a remote peering connection between RPC1 (Region 1) and RPC3 (Region 2).
  15. Add two route rules to HubLPG3rt to forward traffic targeted for VCN1 and VCN2 to DRG3.
     Destination  | Target Type             | Target | Route Type | Description
     10.10.0.0/16 | Dynamic Routing Gateway | DRG3   | Static     | Traffic to VCN1
     10.20.0.0/16 | Dynamic Routing Gateway | DRG3   | Static     | Traffic to VCN2
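The ingress rules from step 1 can likewise be expressed as the JSON payload that `oci network security-list update --ingress-security-rules` expects. This is a sketch under the assumption that only SQL*Net (TCP/1521) traffic needs to be admitted; the helper name `sqlnet_ingress` is illustrative, and the sketch constrains only the destination port, so tighten source-port restrictions to match your own security policy.

```python
# Sketch: build the ingress security rules for VCN3's client-subnet
# security list, then apply with (placeholder OCID):
#   oci network security-list update --security-list-id <ocid> \
#       --ingress-security-rules file://vcn3-ingress.json
import json

def sqlnet_ingress(source_cidr: str, description: str) -> dict:
    """Stateful ingress rule allowing TCP/1521 (SQL*Net) from source_cidr."""
    return {
        "isStateless": False,
        "protocol": "6",  # IANA protocol number 6 = TCP
        "source": source_cidr,
        "sourceType": "CIDR_BLOCK",
        "tcpOptions": {"destinationPortRange": {"min": 1521, "max": 1521}},
        "description": description,
    }

rules = [
    sqlnet_ingress("10.10.0.0/16", "Allow ingress from VCN1"),
    sqlnet_ingress("10.20.0.0/16", "Allow ingress from VCN2"),
]

with open("vcn3-ingress.json", "w") as f:
    json.dump(rules, f, indent=2)
```

The same helper can generate the primary-region rules (step 1 of the primary configuration) by swapping in the VCN2 and VCN3 CIDRs.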

Acknowledgments

  • Authors: Sinan Petrus Toma, Sebastian Solbach, Julien Silverston
  • Contributor: Sreya Dutta