A disaster can be any event that puts your applications at risk, from network outages to equipment failures to natural disasters. A well-architected disaster recovery (DR) plan enables you to recover quickly from disasters and continue to provide services to your users.
Oracle Cloud Infrastructure provides highly available, secure, and scalable infrastructure and services that enable you to recover your cloud workloads quickly, reliably, and securely.
The first step in planning for DR involves determining the recovery time objective (RTO) and recovery point objective (RPO).
The RTO is the target time within which a given application must be restored after a disaster occurs. Typically, the more critical the applications, the lower the RTO.
The RPO is the maximum period of data loss, measured back from the moment a disaster occurs, that the application can tolerate before the loss begins to affect the business.
To build a plan that ensures the recovery of your applications after a disaster and is cost-effective, you must consider both the target time to recover and the tolerance for data loss.
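As a rough illustration of how the two objectives constrain a plan, the following Python sketch checks a periodic-backup design against a given RTO and RPO. The names `RecoveryObjectives`, `worst_case_data_loss`, and `meets_objectives` are hypothetical and not part of any Oracle Cloud Infrastructure API. The sketch also assumes that the worst-case data loss under periodic backups is about one full backup interval, which ignores the time a backup takes to complete.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Hypothetical container for the two DR targets."""
    rto: timedelta  # target time within which the application must be restored
    rpo: timedelta  # maximum window of lost data the business can tolerate


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    # Assumption: a disaster that strikes just before the next backup
    # loses roughly one full interval of data.
    return backup_interval


def meets_objectives(
    objectives: RecoveryObjectives,
    backup_interval: timedelta,
    estimated_restore_time: timedelta,
) -> bool:
    """Return True only if both the RPO and the RTO are satisfied."""
    rpo_ok = worst_case_data_loss(backup_interval) <= objectives.rpo
    rto_ok = estimated_restore_time <= objectives.rto
    return rpo_ok and rto_ok


if __name__ == "__main__":
    targets = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))
    # Backups every 30 minutes, restore estimated at 2 hours.
    print(meets_objectives(targets, timedelta(minutes=30), timedelta(hours=2)))
```

A design that fails either check needs more frequent backups or replication (to tighten data loss) or a faster restore path (to tighten recovery time), both of which typically raise cost.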
Planning for DR requires a thorough understanding of all the possible scenarios that can cause disasters.
- Application Failure
An application can fail because of failures in the underlying infrastructure or issues related to changes in software or hardware configuration. It’s important to include monitoring capability in your DR solution design so that application failures are detected and alerts are sent. Depending on your requirements, your DR solution can range from simply backing up application data and configurations to a fully active-to-active failover setup that seamlessly mitigates many types of failures.
- Network Failure
For DR, consider potential network outages in your cloud environment. For example, if you use an IPSec VPN connection to connect your on-premises data centers to Oracle Cloud, that connection could suffer degraded performance or an outage. We recommend setting up multiple IPSec VPN connections or using both FastConnect and IPSec VPN connections so that you have sufficient redundancy for your network connections.
- Data Center Failure
An unexpected event could affect an entire data center (availability domain). In your DR solution design, plan for this kind of failure. If your region has multiple availability domains, we recommend deploying your applications across the availability domains to accommodate potential issues for a particular data center. If your region has only one availability domain, consider a combination of multiple fault domains and multiple-region configurations, as defined in the recommendations for a region failure.
- Region Failure
A natural disaster could cause an entire Oracle Cloud Infrastructure region to be out of service. This scenario could be one of the most severe cases in your DR design. To protect against this scenario, deploy your workloads across multiple Oracle Cloud Infrastructure regions. Depending on your DR goals (RTO and RPO), you can back up or replicate your data to another region, or set up a fully active-to-active standby in another region.
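The choice among backing up, replicating, and running an active-to-active setup in another region can be sketched as a simple decision on the tighter of the two objectives. This is only an illustration: the function name `suggest_region_dr_approach` and the threshold values are hypothetical, and real cut-offs depend on your workload, data volume, and budget.

```python
from datetime import timedelta

# Hypothetical thresholds, chosen only for illustration.
NEAR_ZERO = timedelta(minutes=5)
MODERATE = timedelta(hours=4)


def suggest_region_dr_approach(rto: timedelta, rpo: timedelta) -> str:
    """Suggest a cross-region DR approach from the RTO and RPO targets."""
    tightest = min(rto, rpo)  # the stricter objective drives the design
    if tightest <= NEAR_ZERO:
        # Almost no downtime or data loss is tolerated.
        return "active-to-active"
    if tightest <= MODERATE:
        # Some delay is acceptable, but periodic backups alone are too coarse.
        return "replicate"
    # Generous targets: periodic backups copied to another region may suffice.
    return "backup"


if __name__ == "__main__":
    print(suggest_region_dr_approach(timedelta(hours=2), timedelta(hours=1)))
```

Using the minimum of the two targets is a deliberate simplification; a fuller model would treat RTO and RPO separately, because a workload can tolerate a long restore while still requiring near-zero data loss.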