Implement Cross-Region Disaster Recovery for Oracle Exadata Database Service on Oracle Database@Azure
When designing applications, it is essential to ensure business continuity by establishing a robust disaster recovery mechanism for restoring operations in the event of an outage.
For many years, customers have trusted Oracle Exadata Database Service using Oracle Maximum Availability Architecture (MAA) to power mission-critical applications both on premises and on Oracle Cloud Infrastructure (OCI). Oracle Exadata Database Service on Oracle Database@Azure offers feature and price parity with Exadata on OCI and can be deployed across multiple Microsoft Azure availability zones (AZs) and regions to ensure high availability and disaster recovery.
Architecture
This architecture shows a high-availability, containerized Azure Kubernetes Service (AKS) application with Oracle Exadata Database Service on Oracle Database@Azure in a cross-region, disaster recovery topology.
A high-availability, containerized Azure Kubernetes Service (AKS) application is deployed in two Azure regions: a primary region and a standby region. The container images are stored in Azure Container Registry and are replicated between the primary and standby regions. Users access the application externally through a public load balancer.
For data protection, the Oracle Database is running in an Exadata virtual machine (VM) cluster in the primary region, with Oracle Data Guard or Oracle Active Data Guard replicating the data to the standby database running on an Exadata VM cluster in the standby region.
The database transparent data encryption (TDE) keys are stored in Oracle Cloud Infrastructure Vault and replicated between the Azure and OCI regions. The automatic backups are in OCI for both the primary and standby regions. Customers can use Oracle Cloud Infrastructure Object Storage or Oracle Database Autonomous Recovery Service as the preferred storage solution.
The Oracle Exadata Database Service on Oracle Database@Azure network is connected to the Exadata client subnet by using a dynamic routing gateway (DRG) managed by Oracle. A DRG is also required to create a peer connection between virtual cloud networks (VCNs) in different regions. Because only one DRG is allowed per VCN in OCI, a second VCN with its own DRG is required in each region to connect the primary and standby VCNs. In this example:
- The primary Exadata VM cluster is deployed in the client subnet of VCN Primary (10.5.0.0/24).
- Hub VCN Primary, the transit network in the primary region, uses 10.15.0.0/29.
- The standby Exadata VM cluster is deployed in the client subnet of VCN Standby (10.6.0.0/24).
- Hub VCN Standby, the transit network in the standby region, uses 10.16.0.0/29.
No subnet is required in the Hub VCNs to enable transit routing, so these VCNs can use a very small CIDR range. The VCNs on the OCI child site are created after the Oracle Exadata Database Service VM clusters on Oracle Database@Azure have been created for the primary and standby databases.
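Peered VCNs need address ranges that do not overlap, so it can help to check the planned CIDR blocks before creating anything. The following is a minimal sketch that uses only the Python standard library and the example ranges above; the dictionary labels and the `find_overlaps` helper are illustrative names, not part of any Oracle or Azure tooling.

```python
import ipaddress
from itertools import combinations

# CIDR blocks taken from the example above; the keys are descriptive labels only.
NETWORKS = {
    "VCN Primary client subnet": ipaddress.ip_network("10.5.0.0/24"),
    "Hub VCN Primary": ipaddress.ip_network("10.15.0.0/29"),
    "VCN Standby client subnet": ipaddress.ip_network("10.6.0.0/24"),
    "Hub VCN Standby": ipaddress.ip_network("10.16.0.0/29"),
}


def find_overlaps(networks):
    """Return the pairs of labels whose CIDR blocks overlap."""
    return [
        (label_a, label_b)
        for (label_a, net_a), (label_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    overlaps = find_overlaps(NETWORKS)
    print(overlaps if overlaps else "no overlapping CIDR blocks")
    for label, net in NETWORKS.items():
        print(f"{label}: {net} ({net.num_addresses} addresses)")
```

With the example ranges, the check reports no overlaps, and each /29 hub range contains only eight addresses, which is consistent with the hub VCNs needing no subnets of their own.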
The following diagram illustrates the architecture:
Microsoft Azure provides the following components:
- Microsoft Azure region: An Azure region is a geographical area in which one or more physical Azure data centers, called availability zones, reside. Regions are independent of other regions, and vast distances can separate them (across countries or even continents). Azure and OCI regions are localized geographic areas. For Oracle Database@Azure, an Azure region is connected to an OCI region, with availability zones (AZs) in Azure connected to availability domains (ADs) in OCI. Azure and OCI region pairs are selected to minimize distance and latency.
- Microsoft Azure availability zone: An availability zone is a physically separate data center within a region that is designed to be highly available and fault tolerant. Availability zones are close enough to have low-latency connections to other availability zones.
- Microsoft Azure Virtual Network: Microsoft Azure Virtual Network (VNet) is the fundamental building block for a private network in Azure. VNet enables many types of Azure resources, such as Azure virtual machines (VMs), to securely communicate with each other, the internet, and on-premises networks.
- Microsoft Azure Delegated Subnet: Subnet delegation allows you to inject a managed service, specifically a platform-as-a-service (PaaS) service, directly into your virtual network. A delegated subnet can be a home for an externally managed service inside your virtual network so that the external service acts as a virtual network resource, even though it is an external PaaS service.
- Microsoft Azure VNIC: The services in Azure data centers have physical network interface cards (NICs). Virtual machine instances communicate using virtual NICs (VNICs) associated with the physical NICs. Each instance has a primary VNIC that's automatically created and attached during launch and is available during the instance's lifetime.
- Microsoft Azure Route table: Virtual route tables contain rules to route traffic from subnets to destinations outside a VNet, typically through gateways. Route tables are associated with subnets in a VNet.
- Azure Virtual Network Gateway: Azure Virtual Network Gateway service establishes secure, cross-premises connectivity between an Azure virtual network and an on-premises network. It allows you to create a hybrid network that spans your data center and Azure.
Oracle Cloud Infrastructure provides the following components:
- Region: An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, hosting availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).
- Availability domain: Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain shouldn't affect the other availability domains in the region.
- Virtual cloud network (VCN) and subnets: A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.
- Route table: Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.
- Security list: For each subnet, you can create security rules that specify the source, destination, and type of traffic that is allowed in and out of the subnet.
- Dynamic routing gateway (DRG): The DRG is a virtual router that provides a path for private network traffic between VCNs in the same region, between a VCN and a network outside the region, such as a VCN in another Oracle Cloud Infrastructure region, an on-premises network, or a network in another cloud provider.
- Service gateway: A service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and does not traverse the internet.
- Local peering: Local peering enables you to peer one VCN with another VCN in the same region. Peering means the VCNs communicate using private IP addresses, without the traffic traversing the internet or routing through your on-premises network.
- Network security group (NSG): NSGs act as virtual firewalls for your cloud resources. With the zero-trust security model of Oracle Cloud Infrastructure, you control the network traffic inside a VCN. An NSG consists of a set of ingress and egress security rules that apply to only a specified set of VNICs in a single VCN.
- Object storage: OCI Object Storage provides access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store data directly from the internet or from within the cloud platform. You can scale storage without experiencing any degradation in performance or service reliability. Use standard storage for "hot" storage that you need to access quickly, immediately, and frequently. Use archive storage for "cold" storage that you retain for long periods of time and rarely access.
- Data Guard: Oracle Data Guard and Oracle Active Data Guard provide a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases and that enable production Oracle databases to remain available without interruption. Oracle Data Guard maintains these standby databases as copies of the production database by using in-memory replication. If the production database becomes unavailable due to a planned or an unplanned outage, Oracle Data Guard can switch any standby database to the production role, minimizing the downtime associated with the outage. Oracle Active Data Guard provides the additional ability to offload read-mostly workloads to standby databases and also provides advanced data protection features.
- Oracle Database Autonomous Recovery Service: Oracle Database Autonomous Recovery Service is an Oracle Cloud service that protects Oracle databases. Backup automation and enhanced data protection capabilities for OCI databases allow you to offload all backup processing and storage requirements to Oracle Database Autonomous Recovery Service, which reduces backup infrastructure costs and manual administration overhead.
- Exadata Database Service: Oracle Exadata is an enterprise database platform that runs Oracle Database workloads of any scale and criticality with high performance, availability, and security. Exadata’s scale-out design employs unique optimizations that let transaction processing, analytics, machine learning, and mixed workloads run faster and more efficiently. Consolidating diverse Oracle Database workloads on Exadata platforms in enterprise data centers, on Oracle Cloud Infrastructure (OCI), and in multicloud environments helps organizations increase operational efficiency, reduce IT administration, and lower costs. Oracle Exadata Database Service enables you to leverage the power of Exadata in the cloud. It delivers proven Oracle Database capabilities on purpose-built, optimized Oracle Exadata infrastructure in the public cloud. Built-in cloud automation, elastic resource scaling, security, and fast performance for all Oracle Database workloads help you simplify management and reduce costs.
- Oracle Database@Azure: Oracle Database@Azure is the Oracle Database service (Oracle Exadata Database Service on Dedicated Infrastructure and Oracle Autonomous Database Serverless) running on Oracle Cloud Infrastructure (OCI), deployed in Microsoft Azure data centers. The service offers feature and price parity with OCI. Purchase the service on Azure Marketplace. Oracle Database@Azure integrates Oracle Exadata Database Service, Oracle Real Application Clusters (Oracle RAC), and Oracle Data Guard technologies into the Azure platform. Users manage the service on the Azure console and with Azure automation tools. The service is deployed in Azure Virtual Network (VNet) and integrated with the Azure identity and access management system. The OCI and Oracle Database generic metrics and audit logs are natively available in Azure. The service requires users to have an Azure subscription and an OCI tenancy. Autonomous Database is built on Oracle Exadata infrastructure and is self-managing, self-securing, and self-repairing, helping eliminate manual database management and human errors. Autonomous Database enables development of scalable AI-powered apps with any data by using built-in AI capabilities with your choice of large language model (LLM) and deployment location. Both Oracle Exadata Database Service and Oracle Autonomous Database Serverless are easily provisioned through the native Azure Portal, enabling access to the broader Azure ecosystem.
Recommendations
- Deploy the required Exadata infrastructure in both primary and standby regions. For each Exadata instance, deploy an Exadata VM cluster in the delegated subnet of a Microsoft Azure virtual network (VNet). The Oracle Real Application Clusters (RAC) database can then be instantiated on the cluster. In the same VNet, deploy Azure Kubernetes Service (AKS) in a separate subnet. Configure Oracle Data Guard to replicate data from one Oracle Database to the other, across regions.
- When Exadata VM clusters are created in the Oracle Database@Azure child site, each is created within its own Oracle Cloud Infrastructure virtual cloud network (VCN). Oracle Data Guard requires that the databases communicate with each other to ship redo data. The VCNs must be peered to enable this communication.
Considerations
When performing cross-region disaster recovery for Oracle Exadata Database Service on Oracle Database@Azure, consider the following.
- Preparation for a disaster scenario requires a comprehensive approach that considers different business requirements and availability architectures and that encompasses those considerations in an actionable, high-availability (HA), disaster-recovery (DR) plan. The scenario described here provides guidelines to help select the approach that best fits your application deployment by using a simple but effective failover for the disaster recovery configuration in your Oracle Cloud Infrastructure (OCI) and Microsoft Azure environments.
- Use Oracle Data Guard across regions for the databases provisioned in the Exadata VM Cluster on Oracle Database@Azure by using an OCI-managed network.
- Oracle Cloud Infrastructure is the preferred network for better performance, measured by latency and throughput, and for reduced cost, because the first 10 TB per month is free.
Deploy
To configure the network communication between regions shown in the above architecture diagram, complete the following high-level steps.
Primary Region
- Create a virtual cloud network (VCN), Hub VCN Primary, in the Oracle Cloud Infrastructure (OCI) primary region.
- Deploy two local peering gateways (LPGs), Primary-LPG and Hub-Primary-LPG, in VCN Primary and Hub VCN Primary, respectively.
- Establish a peer LPG connection between the LPGs for Hub VCN Primary and VCN Primary.
- Create a dynamic routing gateway (DRG), Primary-DRG, in Hub VCN Primary.
- In Hub VCN Primary, create the route table primary_hub_transit_drg, and add a route rule with the VCN Primary client subnet as the destination, a target type of LPG, and the target Hub-Primary-LPG. For example:
  10.5.0.0/24 target type: LPG, Target: Hub-Primary-LPG
- In Hub VCN Primary, create a second route table, primary_hub_transit_lpg, and add a route rule with the VCN Standby client subnet as the destination, a target type of DRG, and the target Primary-DRG. For example:
  10.6.0.0/24 target type: DRG, Target: Primary-DRG
- From Hub VCN Primary, attach Hub VCN Primary to the DRG. Edit the DRG VCN attachment and, under advanced options, edit the VCN route table to associate it with the primary_hub_transit_drg route table. This configuration permits transit routing.
- From the Hub VCN Primary VCN, associate the primary_hub_transit_lpg route table with the Hub-Primary-LPG gateway.
- In the Hub VCN Primary default route table, add a route rule for the VCN Primary client subnet IP address range to use the LPG. Add another route rule for the VCN Standby client subnet IP address range to use the DRG. For example:
  10.5.0.0/24 LPG Hub-Primary-LPG
  10.6.0.0/24 DRG Primary-DRG
- From Primary-DRG, select the DRG route table Autogenerated DRG Route Table for RPC, VC, and IPSec attachments. Add a static route to the VCN Primary client subnet IP address range that uses the Hub VCN Primary VCN, with a next hop attachment type of VCN and the next hop attachment name Primary Hub attachment. For example:
  10.5.0.0/24 VCN Primary Hub attachment
- Use the Primary-DRG remote peering connection attachments menu to create a remote peering connection, RPC.
- In the VCN Primary client subnet, update the network security group (NSG) to create a security rule to allow ingress for TCP port 1521. Optionally, you can add SSH port 22 for direct SSH access to the database servers.
Note: For a more precise configuration, disable the import route distribution of the Autogenerated DRG Route Table for RPC, VC, and IPSec attachments route table. For Autogenerated DRG Route Table for VCN attachments, create and assign a new import route distribution including only the required RPC attachment.
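The two transit route tables in the hub VCN can be easier to reason about when written down as data. The following minimal sketch models the primary-region rules from the steps above and resolves the next hop for a destination address, assuming the most specific (longest-prefix) matching rule wins. It is a local simulation only and calls no OCI API; the `next_hop` helper, the association labels, and the sample host addresses are illustrative assumptions.

```python
import ipaddress

# Route rules copied from the primary-region steps above.
# Each rule is (destination CIDR, target type, target name).
ROUTE_TABLES = {
    "primary_hub_transit_drg": [("10.5.0.0/24", "LPG", "Hub-Primary-LPG")],
    "primary_hub_transit_lpg": [("10.6.0.0/24", "DRG", "Primary-DRG")],
}

# Where each route table is associated, as described in the steps above.
# The keys are descriptive labels, not OCI resource names.
ASSOCIATIONS = {
    "Primary-DRG VCN attachment": "primary_hub_transit_drg",
    "Hub-Primary-LPG": "primary_hub_transit_lpg",
}


def next_hop(entry_point, dest_ip):
    """Return (target type, target name) for traffic entering the hub VCN
    at entry_point, or None when no rule matches the destination."""
    table = ROUTE_TABLES[ASSOCIATIONS[entry_point]]
    addr = ipaddress.ip_address(dest_ip)
    matches = []
    for cidr, target_type, target in table:
        network = ipaddress.ip_network(cidr)
        if addr in network:
            matches.append((network.prefixlen, target_type, target))
    if not matches:
        return None
    # Assumed longest-prefix match: the most specific rule wins.
    _, target_type, target = max(matches, key=lambda m: m[0])
    return target_type, target


if __name__ == "__main__":
    # Redo traffic leaving VCN Primary for a hypothetical standby host.
    print(next_hop("Hub-Primary-LPG", "10.6.0.10"))
    # Traffic arriving from the standby region for a hypothetical primary host.
    print(next_hop("Primary-DRG VCN attachment", "10.5.0.10"))
```

Traffic that enters the hub through Hub-Primary-LPG and is bound for the standby client subnet resolves to Primary-DRG, while traffic that arrives through the DRG attachment and is bound for the primary client subnet resolves to Hub-Primary-LPG. That pairing is what makes the hub VCN a transit network.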
Standby Region
- Create the VCN, Hub VCN Standby, in the OCI standby region.
- Deploy two LPGs, Standby-LPG and Hub-Standby-LPG, in VCN Standby and Hub VCN Standby, respectively.
- Establish a peer LPG connection between the LPGs for VCN Standby and Hub VCN Standby.
- Create a DRG, Standby-DRG, in Hub VCN Standby.
- In Hub VCN Standby, create a route table, standby_hub_transit_drg, and add a route rule with the VCN Standby client subnet as the destination, a target type of LPG, and the target Hub-Standby-LPG. For example:
  10.6.0.0/24 target type: LPG, Target: Hub-Standby-LPG
- In Hub VCN Standby, create a second route table, standby_hub_transit_lpg, and add a route rule with the VCN Primary client subnet as the destination, a target type of DRG, and the target Standby-DRG. For example:
  10.5.0.0/24 target type: DRG, Target: Standby-DRG
- From Hub VCN Standby, attach Hub VCN Standby to the DRG. Edit the DRG VCN attachment and, under advanced options, edit the VCN route table to associate it with the standby_hub_transit_drg route table. This configuration permits transit routing.
- In the Hub VCN Standby default route table, add route rules for the VCN Standby client subnet IP address range to use the LPG and for the VCN Primary client subnet IP address range to use the DRG. For example:
  10.6.0.0/24 LPG Hub-Standby-LPG
  10.5.0.0/24 DRG Standby-DRG
- Associate the standby_hub_transit_lpg route table with the Hub-Standby-LPG gateway.
- From Standby-DRG, select the DRG route table Autogenerated Drg Route Table for RPC, VC, and IPSec attachments. Add a static route to the VCN Standby client subnet IP address range that uses the Hub VCN Standby VCN, with a next hop attachment type of VCN and the next hop attachment name Standby Hub attachment. For example:
  10.6.0.0/24 VCN Standby Hub attachment
- Use the Standby-DRG remote peering connection attachments menu to create a remote peering connection, RPC.
- Select the remote peering connection, select Establish Connection, and provide the primary region and the OCID of the remote peering connection created on Primary-DRG. The peering status changes to Peered, and both regions are connected.
- In the VCN Standby client subnet, update the NSG to create a security rule to allow ingress for TCP port 1521. Optionally, you can add SSH port 22 for direct SSH access to the database servers.
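The NSG rules on each side can be sanity-checked on paper before testing connectivity. The following minimal sketch models ingress rules as plain data and checks whether a connection would be admitted. The `IngressRule` class, the `is_allowed` helper, and the sample host address are hypothetical, and restricting the source to the peer client subnet is an assumption made here for illustration; the steps above only require that TCP port 1521 be allowed.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """Simplified stand-in for one NSG ingress rule (not the OCI API shape)."""
    source_cidr: str
    protocol: str
    port: int


# Assumed rule sets: each side admits the listener port from the peer client subnet.
PRIMARY_NSG = [IngressRule("10.6.0.0/24", "TCP", 1521)]
STANDBY_NSG = [
    IngressRule("10.5.0.0/24", "TCP", 1521),
    IngressRule("10.5.0.0/24", "TCP", 22),  # optional direct SSH access
]


def is_allowed(rules, source_ip, protocol, port):
    """Return True when any rule admits the given source, protocol, and port."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.port == port
        and addr in ipaddress.ip_network(rule.source_cidr)
        for rule in rules
    )


if __name__ == "__main__":
    # Redo transport from a hypothetical primary host to the standby listener.
    print(is_allowed(STANDBY_NSG, "10.5.0.10", "TCP", 1521))
    # SSH into the primary from the standby subnet was not opened in this sketch.
    print(is_allowed(PRIMARY_NSG, "10.6.0.10", "TCP", 22))
```

Because Oracle Data Guard ships redo in whichever direction the current roles require, the port 1521 rule is needed on both the primary and the standby client subnets.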
Data Guard Association
- To enable Oracle Data Guard or Oracle Active Data Guard for the Oracle Database, on the Oracle Database details page, click Data Guard Associations, then click Enable Data Guard.
- On the Enable Data Guard page:
  - Select the standby region.
  - Select the standby availability domain mapped to Azure AZ.
  - Select the standby Exadata infrastructure.
  - Select the desired standby VM cluster.
  - Choose Oracle Data Guard or Oracle Active Data Guard. MAA recommends Oracle Active Data Guard for auto block repair of data corruptions and the ability to offload reporting.
  - For cross-region Oracle Data Guard associations, only the maximum performance protection mode is supported.
  - Select an existing database home or create one. It's recommended to use the same database software image of the primary database for the standby database home, so that both have the same patches available.
  - Enter the password for the SYS user and enable Oracle Data Guard.
  After Oracle Data Guard is enabled, the standby database is listed in the Data Guard Associations section.
- (Optional) Enable automatic failover (Fast-Start Failover) to reduce the recovery time in case of failures by installing Data Guard Observer on a separate VM, preferably in a separate location or in the application network.
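The choices made on the Enable Data Guard page can be reviewed as a small checklist before submitting the request. The following is a minimal sketch, not an Oracle API: the `DataGuardRequest` fields, the `review` helper, and the region and image names are hypothetical. It encodes only the two rules stated above, that cross-region associations support only maximum performance mode and that the standby home should use the same software image as the primary.

```python
from dataclasses import dataclass


@dataclass
class DataGuardRequest:
    """Hypothetical summary of the selections made on the Enable Data Guard page."""
    primary_region: str
    standby_region: str
    protection_mode: str
    primary_software_image: str
    standby_software_image: str


def review(request):
    """Return findings: errors for unsupported settings, warnings for
    deviations from the recommendations above."""
    findings = []
    cross_region = request.primary_region != request.standby_region
    if cross_region and request.protection_mode != "MAXIMUM_PERFORMANCE":
        findings.append(
            "error: cross-region associations support only maximum performance mode"
        )
    if request.primary_software_image != request.standby_software_image:
        findings.append(
            "warning: standby database home uses a different software image"
        )
    return findings


if __name__ == "__main__":
    request = DataGuardRequest(
        primary_region="region-a",   # placeholder region names
        standby_region="region-b",
        protection_mode="MAXIMUM_PERFORMANCE",
        primary_software_image="image-1",  # placeholder image names
        standby_software_image="image-1",
    )
    print(review(request) or "no findings")
```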
Explore More
Learn more about the features of this architecture and about related architectures.
