Note:

This tutorial requires access to Oracle Cloud. To sign up for a free account, see Get started with Oracle Cloud Infrastructure Free Tier.
It uses example values for Oracle Cloud Infrastructure credentials, tenancy, and compartments. When completing your lab, substitute these values with ones specific to your cloud environment.

Automate Recovery for Oracle Analytics Cloud Using OCI Full Stack Disaster Recovery

Part 1 - Introduction

Oracle Cloud Infrastructure (OCI) Full Stack Disaster Recovery orchestrates the transition of compute, database, and applications between OCI regions from around the globe with a single click. Customers can automate the steps needed to recover one or more business systems without redesigning or re-architecting existing infrastructure, databases, or applications.

Oracle Analytics Cloud (OAC) is a managed OCI Platform as a Service (PaaS) offering. Full Stack DR cannot manage OAC natively because OAC does not expose compute, storage or database to OCI users. However, Full Stack DR can automate recovery for PaaS offerings as long as the engineering team for a given service such as OAC has documented a way to deploy and recover the service for disaster recovery between OCI regions. The OAC engineering team has written Disaster Recovery Configuration for Oracle Analytics Cloud, which explains how to manually deploy and recover OAC.

Full Stack DR is not used to install, configure or deploy anything about Oracle Analytics Cloud (OAC) including networking, compute, storage, storage replication, databases or the OAC application/service itself. OAC must be fully deployed for DR across regions by following the step-by-step instructions found in Disaster Recovery Configuration for Oracle Analytics Cloud before attempting to use Full Stack DR in any way.

The manual recovery steps prescribed by OAC engineering in Disaster Recovery Configuration for Oracle Analytics Cloud must also be successfully tested for switchover and switchback (also known as fallback in OAC documentation) before using Full Stack DR.

OAC is normally part of a larger system

This tutorial assumes that Oracle Analytics Cloud is the only application being added to the DR protection groups. This is not normal.

This tutorial is unusual in that only OAC is shown and discussed throughout the document. Normally, OAC will be one small part of a much larger, more complex business system that includes many different services and applications in a single Full Stack DR protection group and set of DR plans. It is highly likely you will be following similar Oracle Help Center tutorials for other applications and services such as PeopleSoft, WebLogic Server, Oracle Integration Cloud, and so on.

This tutorial shows how to implement Oracle Analytics Cloud by itself to avoid introducing too many moving parts and to remain focused on what is needed to automate recovery for Oracle Analytics Cloud.

Caution about implementing incrementally

Adding more members to a DR protection group after creating DR plans will delete all existing DR plans in the protection groups at both regions.

Full Stack DR is designed with the assumption the entire application stack for a given business system is already deployed across OCI regions and manual DR has already been proven to work. If your business system includes more than OAC, then add all members for all other applications or OCI services to the DR Protection Groups before creating any DR plans.

How the recovery works

The recovery solution for OAC requires Full Stack DR to execute a series of custom bash scripts during a recovery operation such as a failover or switchover. The scripts referenced in this tutorial are provided by the North American Analytics Specialists team and available in a GitHub repository specifically tailored for this OAC DR solution. The bash scripts are downloaded to a compute instance that is part of the application stack that Full Stack DR will manage during a recovery operation.

This tutorial explains how to download the scripts and how to use them in a later step. This tutorial uses Option 2 below for hosting the bash scripts only because the tutorial does not include anything other than OAC.

Option 1 for hosting scripts

OAC is most often part of a larger, more complex business system that includes an application like Oracle E-Business Suite, PeopleSoft or JD Edwards EnterpriseOne plus other databases, compute instances and home-grown applications. In this case, simply choose any one of the “movable” compute instances that are already part of the business system to host the scripts. The selected compute instance can be anything where Oracle Linux is installed and will most likely be an existing VM that serves another purpose like an application server or admin server of some sort.

This tutorial will refer to this particular compute instance as the Control Node or DR Node even though it really fulfills another purpose in the application stack.

Option 2 for hosting scripts

If this is an unusual circumstance where OAC is going to be the only application service that Full Stack DR is going to manage during a recovery operation, then a compute instance will need to be created just to host the scripts.

Normally, Full Stack DR does not require any specialized management servers to automate recovery operations. However, you will create a compute instance that will act as a specialized management server in this case since OAC is not something Full Stack DR can manage natively. The specialized management server is seen throughout this document as Control Node or DR Node. The entire purpose of the Control Node is simply to act as a server where custom scripts can reside and be called by Full Stack DR during a recovery operation. This tutorial will explain how to create custom, user-defined DR plan groups and steps as part of your DR plans to call the scripts installed on the Control Node.

OAC Deployment Architecture

Oracle Analytics Cloud (OAC) must first be deployed for disaster recovery (DR) across OCI regions before introducing Full Stack DR. It is extremely important that the manual steps to recover OAC as documented by OAC engineering are tested and work correctly before attempting to automate the recovery process using Full Stack DR.

Either of the two reference architectures shown below can be followed when deploying OAC for DR across OCI regions. Both reference architectures illustrate a multi-tier topology with redundant resources distributed across two OCI regions.

Option 1: Deploy OAC public instance

An Oracle Analytics Cloud public instance lives in the Oracle Service Network and can be accessed directly from the internet. The Oracle Analytics Cloud public IP address will be directly configured with the DNS registrar.

Fig 1: OAC reference architecture when using OAC public instances

Option 2: Deploy OAC private instance

An Oracle Analytics Cloud private instance cannot be accessed from the public internet, so it requires an OCI public load balancer to facilitate the access. The public load balancer’s IP address will be added to the DNS registrar.

Fig 2: OAC reference architecture when using OAC private instances

Full Stack DR Deployment Architecture

The following illustrations show the compute resources added as members to each DR Protection Group (DRPG) for Full Stack DR. These represent the various components that Full Stack DR can manage outside the Oracle Analytics Cloud service (OAC).

No OAC components other than the Autonomous Data Warehouse (ADW) are shown in either of the DRPG reference architectures shown below because OAC PaaS is invisible to Full Stack DR. Therefore, nothing about OAC other than ADW is added to DRPGs in either region.

Full Stack DR has built-in automation to handle compute, block storage, file storage, Oracle databases, load balancers and many other resources during a recovery, but it does not have built-in automation for OAC itself. OAC recovery is controlled by a series of bash scripts that can be downloaded from a GitHub repository dedicated to this tutorial. The bash scripts need to be installed on a compute instance of your choice using one of the following options for placement and control of the scripts.

Option 1: Automating recovery for OAC as a standalone application

This deployment architecture is not typical and was devised for very rare situations where OAC is the only application being recovered by Full Stack DR. In this case, a specialized compute instance shown as DR Node below is created to host the custom bash scripts that Full Stack DR calls to manage the recovery of OAC.

Fig 3: Full Stack DR deployment architecture requiring a specialized DR control node

Option 2: Automating recovery for OAC as part of an application stack

The simplistic deployment architecture shown in Figure 4 below is an example of a more common deployment of OAC where it is simply one component of a larger, more complex application stack where many services and applications need to be recovered together. Most business systems are much more complex than the fictitious one shown below and usually include additional databases, other Oracle and/or non-Oracle applications, along with other OCI services such as OIC, ODI, OHS, IAM, etc.

In this case, there is no need to create a specialized DR Node like the one shown in Figure 3 above. The custom bash scripts that manage the recovery of OAC can be installed on any one of the servers shown as movable compute in Figure 4 below.

Fig 4: Full Stack DR deployment architecture without the need for a specialized DR control node

Becoming familiar with the entire process

The Full Stack Disaster Recovery engineering team has created a series of companion videos for this tutorial to help people understand the entire process flow. These videos are part of a YouTube playlist that can be accessed using the following links:

Video 1: Deploy Oracle Analytics Cloud for DR
Video 2: Automate recovery for Oracle Analytics Cloud
Video 3: Scripts used to automate recovery for Oracle Analytics Cloud

Part 2 - Step-by-Step Instructions

This part begins the step-by-step instructions needed to add Oracle Analytics Cloud to Full Stack DR.

Objectives of this tutorial

The following steps will be covered in this tutorial explaining how to automate recovery for Oracle Analytics Cloud (OAC) using Full Stack DR:

Task 1: Deploy OAC for DR across OCI regions
1. Prepare OAC DR Control Node
2. Download custom scripts to the DR Control Node
3. Manually install and deploy OAC for DR across two OCI regions
4. Manually test all recovery steps from desired region 1 to region 2
5. Manually test all recovery steps from desired region 2 to region 1
Task 2: Prepare for Full Stack DR
1. Create IAM policies for Full Stack DR
2. Create IAM policies for other OCI services
3. Create object storage buckets for logs
Task 3: Create DR Protection Groups (DRPG)
Task 4: Add members to region 1 and region 2 DRPGs
Task 5: Create basic DR plans in region 2 (Phoenix)
1. Create switchover plan
2. Create failover plan
Task 6: Customize the switchover plan in region 2 (Phoenix)
Task 7: Customize the failover plan in region 2 (Phoenix)
Task 8: Execute the switchover plan in region 2 (Phoenix)
Task 9: Create basic DR plans in region 1 (Ashburn)
1. Create switchover plan
2. Create failover plan
Task 10: Customize the switchover plan in region 1 (Ashburn)
Task 11: Customize the failover plan in region 1 (Ashburn)

Definitions & assumptions throughout the tutorial

Regions

Region 1 is Ashburn
- Ashburn will start out as the primary region.
- This role will eventually change to standby after you are instructed to perform a switchover in later steps.
Region 2 is Phoenix
- Phoenix will start out as the standby region.
- This role will eventually change to primary after you are instructed to perform a switchover in later steps.

Compartments

You are free to organize OAC and Full Stack DR into any compartment scheme that works within your standards for IT governance. We have chosen to organize applications into their own individual compartments, then organize all DR Protection Groups into a single compartment where completely different business systems can all be seen at a glance.

Organizing all DR protection groups into a single compartment apart from the applications makes it much easier for IT staff to locate and execute DR plans for many completely different business systems.
Having a single compartment for all DR protection groups helps eliminate human error and increases the speed with which DR plans can be found and executed.
- Compartment for Oracle Analytics Cloud: oac-demo. The compartment for OAC itself, storage, storage buckets, compute and database related to OAC is oac-demo in this tutorial.
- Compartment for Full Stack DR: myprojects_NA. The compartment for Full Stack DR protection groups and plans is myprojects_NA in this tutorial.

OAC DR Control Node

The DR Control Node is any compute instance that you designate to host custom bash scripts that perform specific tasks to recover OAC. The scripts are called by Full Stack DR during a recovery operation. This tutorial explains how to add the scripts to Full Stack DR in Tasks 6, 7, 10 and 11.

For OAC as a standalone application: this will be a specialized compute instance you create to act as the host for the custom scripts.
For OAC as part of an application stack: this will be any existing compute instance that is a member of a DR protection group (DRPG). For example, Oracle E-Business Suite or PeopleSoft will have application servers that are members of the same DRPGs that are managing recovery for OAC; any one of these can fulfill the role of DR Control Node in this tutorial.

Prerequisites

Oracle Analytics Cloud should be deployed for disaster recovery across both regions before beginning work with Full Stack DR. This is covered in Task 1 below.

Task 1: Deploy Oracle Analytics Cloud for disaster recovery

Full Stack DR is not involved in any part of this step.

Task 1.1: Prepare the DR Control Node to run custom automation

Designate a compute instance to act as a DR Control Node for OAC. This can be an existing compute instance, or it can be a compute instance created just for this purpose. See the options below for more detail. Ensure the compute instance(s) acting as the DR Control Node has been configured to run commands using the OCI Cloud Agent: Running Commands on an Instance.

Option 1: OAC as a standalone application

This tutorial assumes OAC is a standalone service, so you will create a compute instance with Oracle Linux in region 1. Use the lowest cost shape with Oracle Linux since it will only be used to host the custom bash scripts. The need for a specialized compute instance dedicated to fulfilling this role is unusual; option 2 is the most common scenario for a majority of organizations.

The specialized compute instance will be added as a member of the DR protection group in region 1 in a later step.

Option 2: OAC as part of an application stack

You can use any single existing movable compute that is part of any Oracle or non-Oracle application being managed by Full Stack DR in region 1. This will fulfill the role of the DR Control Node any time this tutorial refers to the DR Control Node.

It is best to use a movable compute instance, but you can also designate a non-movable compute instance in region 1 and another one in region 2 if you don’t have any movable compute as part of your DR solution. You will need to maintain any changes you make to scripts or the guest OS in both regions if non-movable compute is used for this role.
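If non-movable compute is used, one way to notice that the scripts in the two regions have drifted apart is to compare file checksums. The following is a minimal sketch of that idea, not part of the OAC DR scripts; the function names are hypothetical, and it assumes you can produce a manifest on each DR Control Node and bring the two manifests together for comparison.

```python
import hashlib
from pathlib import Path


def script_manifest(directory: str) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 digest of its contents."""
    root = Path(directory)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def drift(region1: dict[str, str], region2: dict[str, str]) -> list[str]:
    """Return the sorted file names that are missing or different in either manifest."""
    names = set(region1) | set(region2)
    return sorted(name for name in names if region1.get(name) != region2.get(name))
```

An empty list from drift() means both nodes hold identical copies of every script; any name in the list needs to be reconciled before the next recovery operation.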

Option 3: OAC as part of an application stack with multiple PaaS offerings

Perhaps your business system also has Oracle HTTP Server (OHS), Oracle Integration Cloud (OIC), and Oracle Data Integrator (ODI). In this case, you might consider creating a specialized compute instance as you would with Option 1 to host recovery scripts for all the various PaaS services.

Task 1.2: Ensure volume group is replicated to region 2

Ensure the boot volume for the DR Control Node is a member of a block volume group and the block volume group is replicated to region 2.

Ensure that any other boot and block volumes belonging to any other movable compute for this Full Stack DR project also belong to block volume groups replicated to region 2.
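The replication check can also be scripted. The sketch below assumes you have saved the JSON printed by the OCI CLI for the volume group (for example from oci bv volume-group get) and that replicas appear under a volume-group-replicas key whose entries carry an availability-domain value containing the region key (such as PHX). Those key names and the sample data are assumptions for illustration; verify them against the output of your own CLI version.

```python
import json


def replica_domains(volume_group_json: str) -> list[str]:
    """Return the availability domains of all replicas listed for the volume group."""
    data = json.loads(volume_group_json).get("data", {})
    replicas = data.get("volume-group-replicas") or []  # assumed key name
    return [replica.get("availability-domain", "") for replica in replicas]


def is_replicated_to(volume_group_json: str, region_key: str) -> bool:
    """True if any replica's availability domain mentions the given region key."""
    return any(region_key.upper() in domain.upper()
               for domain in replica_domains(volume_group_json))


# Made-up sample that mimics the assumed CLI output shape.
sample = json.dumps({
    "data": {
        "display-name": "drnode-vg",
        "volume-group-replicas": [
            {"availability-domain": "Uocm:PHX-AD-1", "display-name": "drnode-vg-replica"}
        ],
    }
})
```

Calling is_replicated_to(sample, "PHX") on the sample returns True, while a volume group with no replicas returns False.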

Task 1.3: Download bash scripts to DR Control Node

Download the custom bash scripts from GitHub that were written specifically for this OAC DR solution. The scripts should be copied to any subdirectory on the compute instance acting as the DR Control Node for OAC.

The link above should resolve to the GitHub repository:

Figure 2-4: Screenshot of GitHub repository containing bash scripts for OAC

Task 1.4: Deploy Oracle Analytics Cloud for DR

Deploy Oracle Analytics Cloud for DR across OCI regions using the step-by-step instructions found in the following documents:

Oracle blog explaining the solution: Disaster Recovery Plan for Oracle Analytics Cloud using Manual Switchover Method.
Oracle Technical paper written by OAC engineering: Disaster Recovery Configuration for Oracle Analytics Cloud.
Oracle Architecture Center reference architecture written by Data Platform cloud architects: Design an Oracle Analytics Cloud DR topology with Full Stack Disaster Recovery Service.

Task 1.5: Manually Test Recovery of Oracle Analytics Cloud

The manual steps to recover OAC documented in Disaster Recovery Configuration for Oracle Analytics Cloud must be tested successfully before working with Full Stack DR.

Task 1.6: Next Steps

Return to this document to begin working with Full Stack DR once the following requirements have been completed.

Manually deploy OAC for DR across two desired OCI regions.
Manually test all recovery steps from region 1 (Ashburn) to region 2 (Phoenix).
Manually test all recovery steps from region 2 (Phoenix) to region 1 (Ashburn).

Task 2: Prepare for Full Stack Disaster Recovery

Full Stack DR is not involved in any part of this step. The following steps prepare the tenancy, compartment, OCI services and OAC for automated recovery by Full Stack DR.

Task 2.1: Create IAM policies for Full Stack DR

Configure the required OCI IAM policies for Full Stack Disaster Recovery as outlined in the following documents.

Task 2.2: Create IAM policies for other services managed by Full Stack DR

Full Stack DR must have the ability to control and manage other key OCI services such as compute, networking, storage, vaults, databases and other miscellaneous services. Configure the required OCI IAM policies for other services as explained in the following document.

Policies for Other Services Managed by Full Stack Disaster Recovery.
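OCI IAM policy statements follow the pattern "Allow group <group> to <verb> <resource-type> in compartment <compartment>". The sketch below generates statements in that shape for the two compartments used in this tutorial. The group name DRAdmins is hypothetical, and the resource types listed are an illustrative subset only; the documents referenced in Tasks 2.1 and 2.2 remain the authoritative list of what Full Stack DR requires.

```python
def policy_statements(group: str, compartment: str, resource_types: list[str]) -> list[str]:
    """Return one 'manage' policy statement per resource type for the given group."""
    return [
        f"Allow group {group} to manage {resource} in compartment {compartment}"
        for resource in resource_types
    ]


# Illustrative subset only; check the referenced policy documents for the full list.
DR_RESOURCE_TYPES = ["disaster-recovery-family"]
OTHER_RESOURCE_TYPES = [
    "instance-family",
    "volume-family",
    "autonomous-database-family",
    "object-family",
]

# DR protection groups live in myprojects_NA; OAC resources live in oac-demo.
dr_policies = policy_statements("DRAdmins", "myprojects_NA", DR_RESOURCE_TYPES)
other_policies = policy_statements("DRAdmins", "oac-demo", OTHER_RESOURCE_TYPES)

for statement in dr_policies + other_policies:
    print(statement)
```

Generating the statements from lists makes it easier to review them against the referenced documents before pasting them into a policy.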

Task 2.3: Create storage buckets for DRPG logs

Note: Skip Task 2.3 entirely if you are adding OAC to existing DR Protection Groups.

Create Object Storage buckets in the primary and standby regions to store logs generated by Full Stack DR during recovery operations: Object Storage.

Task 2.3.1: Navigate to OCI Object Storage

Begin by navigating to Object Storage & Archive Storage as shown in Figure 2-1 below.

Ensure the browser context is set to region 1 (Ashburn).
Select Storage.
Select Buckets.

Figure 2-1: Navigate to object storage

Task 2.3.2: OCI storage bucket in region 1

Create an object storage bucket in region 1. The bucket will be assigned to the DR protection group in region 1 in a later step.

Select the compartment that contains OAC related resources.
Select Create Bucket.
Give the bucket a meaningful name that easily identifies which application and purpose it serves; there is no reason to include the region as part of the name. For example, this name indicates it is used for the Full Stack DR logs related to DR operations for OAC.
Use the default values for storage tier and encryption.
Select Create to create the bucket.

Figure 2-2: Create an object storage bucket in region 1

Task 2.3.3: OCI storage bucket in region 2

Follow the same process to create an object storage bucket in region 2 (Phoenix). The bucket will be assigned to the DR protection group in region 2 in a later step.

Change the context to region 2.
Select the compartment that contains OAC related resources in region 2.
Use the same exact name that was assigned to the bucket in region 1 - this will make it easy to identify in a later step.
Select Create to create the bucket.

Figure 2-3: Create an object storage bucket in region 2
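The same two buckets can be created from the command line instead of the console. The sketch below only builds the OCI CLI commands and prints them so they can be reviewed before running; it assumes the oci os bucket create command with the --compartment-id, --name and --region options. The bucket name oac-fsdr-logs and the compartment OCID are placeholders, not values from this tutorial.

```python
REGIONS = ("us-ashburn-1", "us-phoenix-1")  # region 1 and region 2 in this tutorial


def bucket_create_commands(bucket_name: str, compartment_ocid: str) -> list[list[str]]:
    """Build one bucket-creation command per region, reusing the same bucket name."""
    for region in REGIONS:
        if region in bucket_name:
            # The tutorial recommends leaving the region out of the bucket name.
            raise ValueError("bucket name should not include the region")
    return [
        ["oci", "os", "bucket", "create",
         "--compartment-id", compartment_ocid,
         "--name", bucket_name,
         "--region", region]
        for region in REGIONS
    ]


# Placeholder values; substitute your own bucket name and compartment OCID.
commands = bucket_create_commands("oac-fsdr-logs", "ocid1.compartment.oc1..example")
for argv in commands:
    print(" ".join(argv))
```

Using one function for both regions guarantees the bucket names match exactly, which is what makes the bucket easy to identify when creating the DR protection groups.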

Task 3: Create DR Protection Groups in both regions

Note: Skip Task 3 entirely if OAC is being added to existing DR Protection Groups.

Create DR protection groups in region 1 and region 2 if the protection groups for this application stack do not exist yet.

Task 3.1: Navigate to DR Protection Groups

Begin by navigating to DR Protection Groups (Full Stack DR) as shown in Figure 3-1 below.

Ensure the OCI region context is set to region 1 (Ashburn).
Select Migration & Disaster Recovery.
Select DR Protection Groups.

Figure 3-1: Navigate to DR protection groups

Task 3.2: Create a protection group in region 1

Create a basic DR protection group (DRPG) in region 1 as shown in Figure 3-2 below. The peer, role and members will be assigned in later steps.

Select the compartment where you want the DRPG to be created. This can be the same compartment where OAC resources exist, or as in this case, a compartment acting as a repository containing DRPGs for many different business systems.
Select Create DR protection group to open the dialog.

Figure 3-2: Begin creating DR protection group in region 1

Add a name and object storage bucket for the logs as shown in Figure 3-3 below.

Use a meaningful, simple name for the DRPG; this example shows the name of the business system and the region.
Select the object storage bucket created in Task 2 for region 1.

Figure 3-3: Parameters needed to create DR protection group in region 1

Task 3.3: Create a protection group in region 2

Create a basic DR protection group (DRPG) in region 2 as shown in Figure 3-4 below. The peer, role and members will be assigned in later steps.

Change the OCI region context to region 2.
Select the compartment where you want the DRPG to be created. This can be the same compartment where OAC resources exist, or as in this case, a compartment acting as a repository containing DRPGs for many different business systems.
Select Create DR protection group to open the dialog.

Figure 3-4: Begin creating DR protection group in region 2

Add a name and object storage bucket for the logs as shown in Figure 3-5 below.

Use a meaningful, simple name for the DRPG; this example shows the name of the business system and the region.
Select the object storage bucket created in Task 2 for region 2.

Figure 3-5: Parameters needed to create DR protection group in region 2

Task 3.4: Associate protection groups in region 1 & region 2

Associate the DRPGs in each region as peers of each other and assign the peer roles of primary and standby. This is how Full Stack DR will know which two regions work together for OAC recovery. The roles of primary and standby are automatically changed by Full Stack DR as part of any DR operation/DR plan execution; there is no need to manage the roles manually at any time.
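The role reversal can be pictured with a small model. This is only a conceptual sketch of the behavior described above, not how Full Stack DR is implemented; the class, function and group names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ProtectionGroup:
    name: str
    region: str
    role: str  # "PRIMARY" or "STANDBY"


def reverse_roles(a: ProtectionGroup, b: ProtectionGroup):
    """Swap the roles of two peered groups, as happens after any DR plan execution."""
    if {a.role, b.role} != {"PRIMARY", "STANDBY"}:
        raise ValueError("peers must hold exactly one PRIMARY and one STANDBY role")
    return replace(a, role=b.role), replace(b, role=a.role)


ashburn = ProtectionGroup("oac-ashburn", "Ashburn", "PRIMARY")
phoenix = ProtectionGroup("oac-phoenix", "Phoenix", "STANDBY")

# After a switchover or failover to Phoenix, the roles are reversed.
ashburn_after, phoenix_after = reverse_roles(ashburn, phoenix)
print(ashburn_after.role, phoenix_after.role)
```

Reversing the roles a second time returns both groups to their original state, which mirrors a switchover followed by a switchback.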

Task 3.4.1: Begin the association

Ensure OCI region context is set to region 1 (Ashburn).
Select Associate to begin the process.

Figure 3-6: Begin DRPG association

Task 3.4.2: Associate protection groups in region 1 & region 2

Provide the parameters as shown in Figure 3-7 below.

Select primary role. Full Stack DR will assign the standby role to region 2 automatically.
Select region 2 (Phoenix) where the other DRPG was created.
Select the peer DRPG that was created in region 2.

Figure 3-7: Parameters needed to associate the DRPGs

Task 3.4.3: What you should see after association is complete

Full Stack DR will show something like Figure 3-8 below once the association is completed.

The current primary peer DRPG is Ashburn (region 1).
The current standby peer DRPG is Phoenix (region 2).

Figure 3-8: Showing the peer relationship from the individual DRPG perspective

The same information can be found whenever the context/view is from a global perspective showing all DR protection groups as shown in Figure 3-9 below.

The current primary peer DRPG is Ashburn (region 1).
The current standby peer DRPG is Phoenix (region 2).

Figure 3-9: Showing the peer relationship from the global DRPG perspective

Task 4: Add members to the DR Protection Groups

Note: This step will delete any existing DR plans in both regions when adding members to existing DR Protection Groups. Full Stack DR cannot save copies or make backups of DR protection groups at the time of this writing. Make sure you have recorded all the information about any DR plan groups and steps in a text file or spreadsheet to help recreate the custom, user-defined plan groups and steps. You can also create bash scripts that call Full Stack DR CLI commands to recreate the custom, user-defined plan groups and steps (this is beyond the scope of this tutorial).
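One way to record the custom plan groups before they are deleted is to save each DR plan as JSON (for example with the OCI CLI) and extract the user-defined groups from it. The sketch below assumes the saved JSON has a plan-groups list whose entries carry type, display-name and steps keys, and that custom groups have the type USER_DEFINED. These key names and the sample plan are assumptions for illustration; check them against a plan exported from your own tenancy.

```python
import json


def user_defined_groups(plan_json: str) -> list[dict]:
    """List the user-defined plan groups with their position and step names."""
    plan = json.loads(plan_json).get("data", {})
    saved = []
    for position, group in enumerate(plan.get("plan-groups") or []):
        if group.get("type") == "USER_DEFINED":  # assumed value for custom groups
            saved.append({
                "position": position,
                "group": group.get("display-name"),
                "steps": [step.get("display-name") for step in group.get("steps") or []],
            })
    return saved


# Made-up sample that mimics the assumed shape of an exported plan.
sample_plan = json.dumps({
    "data": {
        "display-name": "oac-switchover",
        "plan-groups": [
            {"type": "USER_DEFINED", "display-name": "Stop OAC",
             "steps": [{"display-name": "Run stop script"}]},
            {"type": "BUILT_IN", "display-name": "Built-in group", "steps": []},
        ],
    }
})

print(user_defined_groups(sample_plan))
```

Keeping the position of each group alongside its steps makes it easier to recreate the custom groups in the right place after the plans are rebuilt.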

Add the database and DR Control Node as members of the DR protection groups. The DR Control Node is either a compute instance you created just to control OAC or it is a compute instance that is part of the application stack you want to manage with Full Stack DR.

You will add the following resources to the primary DRPG in region 1:

The DR Control Node,
The volume group containing the DR Control Node boot volume,
The primary Autonomous Data Warehouse.

Task 4.1: Begin adding members to DRPG in region 1

Begin by selecting the DRPG in region 1 as shown in Figure 4-1 below.

Ensure the OCI region context is region 1 (Ashburn).
Select the DRPG in region 1.
Select Members.
Click on Add Member to begin the process.

Figure 4-1: How to begin adding members to DR protection group in region 1

Task 4.1.1: Add compute instance for DR node

Add the compute instance for the DR Control Node as shown in Figure 4-2 below.

Acknowledge warning about DR plans.
Select Compute as member resource type.
Select the compute instance you want to use as the DR Control Node.
Select moving instance.
Tell Full Stack DR which VCN & subnet to assign to the VNIC(s) at region 2 during a recovery. Figure 4-2 shows a single VNIC. Full Stack DR does not care how many VNICs you have or how they are configured at either region; specify whatever you need that fits your requirements.

Figure 4-2: Parameters needed to add DR Control Node

Task 4.1.2: Add block volume group for DR node

Add the block volume group containing the boot volume for the DR Control Node. The block volume group must already have cross-region replication configured between the two regions before adding it to the DR protection group.

Select Volume group as member resource type.
Ensure the correct compartment containing the volume group is selected, then select the volume group.

Figure 4-3: Parameters needed to add boot volume group for DR Control Node

Task 4.1.3: Add primary Autonomous Data Warehouse

Autonomous Data Guard should already be configured for Autonomous Data Warehouse (ADW) at this point as part of Task 1. Add the primary ADW as a member of the DRPG in region 1.

Select Autonomous database as member resource type.
Ensure the correct compartment containing the ADW is selected, then select the primary ADW for OAC.

Figure 4-4: Parameters needed to add primary ADW

Task 4.1.4: Verify member resources for region 1

The DRPG for region 1 should have three member resources at a minimum as shown in Figure 4-5 below. The names of your member resources will be different.

The primary ADW.
The movable compute instance and block volume group for the compute instance that we designated to act as the OAC DR Control Node.

Figure 4-5: Showing members of DRPG in region 1

Task 4.2: Begin adding members to DRPG in region 2

You will add the following resources to the standby DRPG in region 2:

The standby/remote Autonomous Data Warehouse (ADW).

Begin by selecting the DRPG in region 2 as shown in Figure 4-6 below.

Ensure the OCI region context is region 2 (Phoenix).
Select the DRPG in region 2.
Select Members.
Click on Add Member to begin the process.

Figure 4-6: How to begin adding members to DR protection group in region 2

Task 4.2.1: Add standby Autonomous Data Warehouse

Add the standby ADW as a member of the DRPG in region 2 as shown in Figure 4-7 below.

Change OCI region context to region 2 (Phoenix).
Select the DRPG created in Task 3.3.

Figure 4-7: Parameters needed to add standby ADW

Task 4.2.2: Verify member resources for region 2

The DRPG for region 2 should have one member resource at a minimum as shown in Figure 4-8 below. The names of your member resources will be different.

The standby/remote ADW.

Figure 4-8: Showing the single member of DRPG in region 2
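The minimum membership described in Tasks 4.1.4 and 4.2.2 can be expressed as a simple check. The sketch below is illustrative only: the member type names are assumed labels modeled on the resource types selected in the console, and the function name is hypothetical.

```python
from collections import Counter

# Minimum members per region for OAC as a standalone application (from Task 4).
EXPECTED = {
    "region1": Counter({"COMPUTE_INSTANCE_MOVABLE": 1, "VOLUME_GROUP": 1, "AUTONOMOUS_DATABASE": 1}),
    "region2": Counter({"AUTONOMOUS_DATABASE": 1}),
}


def missing_members(region: str, member_types: list[str]) -> list[str]:
    """Return the member types still missing from a protection group."""
    shortfall = EXPECTED[region] - Counter(member_types)  # keeps positive counts only
    return sorted(shortfall.elements())


# Region 1 without the DR Control Node added yet.
print(missing_members("region1", ["VOLUME_GROUP", "AUTONOMOUS_DATABASE"]))
# Region 2 with only the standby ADW, which is all it needs.
print(missing_members("region2", ["AUTONOMOUS_DATABASE"]))
```

Extra members do not cause a failure, which matches the tutorial's statement that these are minimums; a larger application stack will have many more members.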

Task 5: Create basic DR plans in region 2 (Phoenix)

This step creates basic switchover and failover plans associated with the standby DR Protection Group in region 2 (Phoenix).

The purpose of each plan is to transition the workload from primary region 1 to standby region 2. The roles of the DR protection groups in both regions are automatically reversed as part of any DR operation, so the protection group in region 1 will become the standby and the protection group in region 2 will become primary after a failover or switchover.

Full Stack DR will pre-populate both plans with built-in steps based on the member resources added in the previous step. The plans will be customized in later steps to handle all the tasks related to OAC during a recovery operation.

The switchover plans are always created in the protection group with the standby role; region 2 is currently the standby protection group, so we will begin in Phoenix.

Task 5.1: Begin creating DR plans

Create basic plans by selecting the DRPG in region 2 as shown in Figure 5-1 below.

Ensure the OCI region context is region 2 (Phoenix).
Select the standby DRPG in region 2.
Select Plans.
Click on Create Plan to begin the process.

Figure 5-1: How to begin creating basic DR plans in region 2

Task 5.1.1: Create a switchover plan

Creating a DR plan is simple as shown in Figure 5-2 below.

Make the name of the switchover plan simple but meaningful. The name should be as short as possible but easy to understand at a glance to help reduce confusion and human error during a crisis.
Choose the plan type. There are only two plan types at the time of this writing.

Figure 5-2: The parameters needed to create DR switchover plan

Task 5.1.2: Create a failover plan

Follow the same process to create a basic failover plan as shown in Figure 5-3 below.

Make the name of the failover plan simple but meaningful. The name should be as short as possible but easy to understand at a glance to help reduce confusion and human error during a crisis.
Choose the plan type. There are only two plan types at the time of this writing.

Figure 5-3: The parameters needed to create DR failover plan

The standby DR Protection Group in region 2 should now have the two DR plans as shown below. These will handle transitioning workloads from region 1 to region 2. You will create similar plans at region 1 to transition workloads from region 2 back to region 1 in a later step.

Figure 5-4: Showing the two basic DR plans that must exist in region 2 before proceeding any further
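By the end of the tutorial there will be four plans in total, two in each region. The sketch below enumerates them to make the direction of each plan explicit: a plan always moves the workload into the region where the plan was created. The function name and the dictionary layout are hypothetical.

```python
from itertools import product

REGIONS = ("Ashburn", "Phoenix")        # region 1 and region 2
PLAN_TYPES = ("switchover", "failover")


def plan_matrix() -> list[dict]:
    """List every DR plan with the region that owns it and the direction it moves workload."""
    plans = []
    for target, plan_type in product(REGIONS, PLAN_TYPES):
        source = REGIONS[1] if target == REGIONS[0] else REGIONS[0]
        plans.append({
            "created_in": target,
            "type": plan_type,
            "moves_workload": f"{source} -> {target}",
        })
    return plans


for plan in plan_matrix():
    print(plan)
```

Because plans are created in the protection group that currently holds the standby role, the Ashburn plans can only be created after a switchover to Phoenix has made Ashburn the standby.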

Task 6: Customize the switchover plan in region 2 (Phoenix)

The basic DR plans created in Task 5 contain pre-populated steps for recovery tasks that are built into Full Stack DR and do not contain anything to manage recovery tasks specific to OAC. This step explains how to add custom, user-defined DR Plan Groups and steps to manage those things that need to be accomplished during a switchover for OAC:

Stop OAC at the current primary region 1 before stopping any VMs.
Start OAC at the current standby region 2 after launching any VMs.
Recover the periodic snapshot at the standby region 2. The periodic snapshot was set up as part of Task 1.4 above.
Change the snapshot cron job at the standby region 2. The cron job was set up as part of Task 1.4 above.

Task 6.1: Select the switchover plan

Begin by navigating to the switchover plan created in the previous step.

Figure 6-1: How to begin customizing the switchover plan in region 2

Task 6.2: Enable DR Plan Groups that terminate artifacts (optional)

There are two plan groups that are disabled by default in switchover plans as shown in the screenshot below. They are disabled to provide a level of comfort during testing that nothing is actually being deleted and you still have a viable copy of the artifacts as a backup in case something goes wrong during testing.

However, these two plan groups terminate (delete) artifacts that will never be used again as part of any DR operation in the future. The artifacts will simply continue to accumulate over time as you switch back and forth between the two regions causing confusion about which compute instances and volume groups are the ones that should actually be active.

These plan groups should be enabled once Full Stack DR goes into production. Any artifacts that were left in place during test switchovers and switchbacks while these two plan groups were disabled should be terminated and cleaned up before going into production to reduce confusion and the likelihood of human error during normal operations.

Optionally, these plan groups can be enabled now to avoid having to manually clean up the superfluous artifacts before going into production.

Figure 6-2: Plan groups disabled by default

Here is what the disabled plan groups do when they are enabled:

This plan group terminates artifacts of compute instances that are left behind at region 1 after the replicated versions of the VMs have been launched at region 2 during the OCI block storage operation to reverse the replication from region 2 back to region 1 as part of the switchover. The leftover VMs are not used during a switchback because the operation to reverse block volume replication creates all new VMs in completely new block volume groups.
This plan group terminates artifacts of block volume groups (VGs) that are left behind at region 1 after the replicated versions of the VGs have been activated at region 2 and volume group replication has been reversed during the switchover. The leftover block volume groups are never used again, not even as part of a switchover from region 2 back to region 1.

Task 6.2.1: Enable terminate compute plan group

Enable the plan group.

Select Enable all steps from the context menu to the right of the plan group name.

Figure 6-3: How to enable terminate compute instances

Task 6.2.2: Enable terminate volume groups plan group

Enable the plan group.

Select Enable all steps from the context menu to the right of the plan group name.

Figure 6-4: How to enable terminate volume groups

Task 6.3: Create plan group to stop OAC at region 1 (primary)

Now begin adding custom, user-defined DR Plan Groups.

The first user-defined plan group will stop OAC running at the primary region 1. This plan group will contain a single step that calls the oac-start-stop.sh bash script that was downloaded to the DR Control Node in Task 1.3.
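Before adding any steps, it can help to confirm on the DR Control Node that the helper scripts named in this tutorial are present and executable by the user that will run them. The following is a minimal sketch of such a check, not part of the downloaded scripts. The directory /home/opc/oac-dr is a hypothetical install location; substitute the directory you used when downloading the scripts.

```shell
#!/usr/bin/env bash
# Pre-flight check: confirm the OAC helper scripts exist and are executable.
# SCRIPT_DIR is a hypothetical location (assumption); use your own directory.
SCRIPT_DIR="/home/opc/oac-dr"

# Script names as referenced in this tutorial.
OAC_SCRIPTS="oac-start-stop.sh oac-register-snapshot.sh oac-chg-cronjob.sh oac-create-snapshot.sh"

# Prints one status line per script. Returns non-zero if any script is
# missing or is not executable by the current user.
check_oac_scripts() {
  local dir="$1" rc=0 name
  for name in $OAC_SCRIPTS; do
    if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
      echo "OK         $dir/$name"
    else
      echo "NOT READY  $dir/$name"
      rc=1
    fi
  done
  return "$rc"
}

check_oac_scripts "$SCRIPT_DIR" || echo "Fix the items marked NOT READY before adding plan steps."
```

Run the check as the same user that the plan steps will use, because the executable test applies to the current user.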

Task 6.3.1: Select add plan group

Begin the process to add a plan group.

Click on Add group to begin.

Figure 6-5: Begin adding plan group to stop OAC

Task 6.3.2: Provide plan group name, order & add step

A DR Plan Group can contain many steps that are all executed in parallel. We are just adding a single step to execute a bash script to stop OAC.

Give the plan group a simple but descriptive name. This is optional of course, but a best practice is to note in the name the region where the plan group will execute its steps. In this case it is the primary region, so we’ve added “(Primary)” to the group name.
Select a position where the plan group will be inserted into the DR plan. In this case, we are going to insert our user-defined plan group before the built-in plan group that stops the VMs at region 1.
Select the built-in Stop Compute Instances (Primary) plan group.
Click on Add step to open the dialog where we will specify the script to stop OAC.

Figure 6-6: Parameters to create plan group and add step to stop OAC

Task 6.3.3: Provide step name & local script parameters

The Add plan group step dialog allows us to specify parameters about what this one step will perform and how it will behave during recovery. In this case, it will stop OAC at region 1.

We will explain all fields in this dialog, but leave out this detail in all the remaining screenshots in subsequent steps since we are just performing the same process repeatedly.

A descriptive name explaining what task this step performs.
The DR plan should stop if the script fails to stop OAC. This will allow anyone to see there is a problem and fix it. Full Stack DR provides the opportunity to continue running the switchover plan after fixing the problem.
The default timeout before Full Stack DR declares a failure is one hour. This value can be changed to 30 minutes or to any other value that is more realistic for your environment.
Always select the region where the DR Control Node is running right now, not where it will be running during a switchover. Full Stack DR will keep track of where the VM is running, so you just need to specify where it is right now. In this case, the DR Control Node is running in region 1 (Ashburn).
Select Run local script to inform Full Stack DR that the script will be found on a compute instance. The bash scripts were downloaded to the DR Control Node in Task 1.3.
Select the correct compartment that contains the DR Control Node - it can be any compartment. Then select the compute instance that was designated as the DR Control Node (it may be an application server or VM that was created just for this project/tutorial).
Paste in the absolute path where you installed the oac-start-stop.sh script on the DR Control Node. Add stop as the first parameter and the OCI region ID as the second.
Specify opc as the user to execute the script.
Click Add step to add this step to the plan group.

Figure 6-7: Parameters to create the plan step for stopping OAC
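The value pasted into the script field is simply the absolute script path followed by its two parameters. The sketch below assembles that string and rejects obvious mistakes such as a relative path or a misspelled action. The install path and the region ID value IAD are assumptions for illustration; use the path and region ID that apply to your environment.

```shell
#!/usr/bin/env bash
# Build the command line for a "Run local script" step that calls
# oac-start-stop.sh: action first, OCI region ID second.
# The install path below is hypothetical (assumption).
OAC_START_STOP="/home/opc/oac-dr/oac-start-stop.sh"

oac_step_command() {
  local action="$1" region="$2"
  # Only the two actions described in this tutorial are accepted.
  case "$action" in
    start|stop) ;;
    *) echo "action must be 'start' or 'stop', got '$action'" >&2; return 1 ;;
  esac
  if [ -z "$region" ]; then
    echo "region ID is required as the second parameter" >&2
    return 1
  fi
  # The step dialog expects an absolute path.
  case "$OAC_START_STOP" in
    /*) ;;
    *) echo "script path must be absolute: $OAC_START_STOP" >&2; return 1 ;;
  esac
  printf '%s %s %s\n' "$OAC_START_STOP" "$action" "$region"
}

# IAD is an assumed region ID for region 1 (Ashburn).
oac_step_command stop IAD
```

The same helper produces the string for a start step by passing start as the action and the region ID that applies to that step.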

Task 6.3.4: Complete adding plan group & step

The step to stop OAC is now added to the DR plan group as shown in figure 6-8 below.

This shows the plan step that was just added. It is possible to add additional steps to a DR plan group, but this plan group will only include the step to stop OAC.
Click on Add to add the DR plan group and step to the DR plan.

Figure 6-8: Finalize adding plan group and step to stop OAC

Task 6.4: Create plan group to start OAC at region 2 (standby)

The second user-defined plan group will start OAC after the DR Control Node is launched at the standby region 2. This plan group will contain a single step that calls the oac-start-stop.sh bash script that was downloaded to the DR Control Node in Task 1.3.

Task 6.4.1: Select add plan group

As before, click Add group to begin.

Figure 6-9: Begin adding plan group to start OAC at standby

Task 6.4.2: Provide plan group name, order & add step

Create a DR plan group to start OAC.

Give the plan group a simple but descriptive name. It’s always a good practice to add “(Standby)” to the group name so it is obvious at a glance which region the steps apply to.
Select a position where the plan group will be inserted into the DR plan. In this case, we are going to insert our user-defined plan group after the built-in plan group that launches the replicated version of the DR Control Node at region 2.
Select the built-in Launch Compute Instances (Standby) plan group.
Click on Add step to open the dialog where we will specify the script to start OAC.

Figure 6-10: Parameters to create plan group and add step to start OAC at standby

Task 6.4.3: Provide step name & local script parameters

The Add plan group step dialog allows us to specify parameters about what this one step will perform and how it will behave during recovery. In this case, it will start OAC at region 2.

Everything in this step is the same as Task 6.3.3 except for the items shown in Figure 6-11 below.

A descriptive name explaining what task this step performs.
Paste in the absolute path where you installed the oac-start-stop.sh script on the DR Control Node. Add start as the first parameter and the OCI region ID as the second.
Click Add step to add this step to the plan group.

Figure 6-11: Parameters to create the plan step for starting OAC at standby

Task 6.4.4: Complete adding plan group & step

The step to start OAC is now added to the DR plan group as shown in figure 6-12 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 6-12: Finalize adding plan group and step to start OAC at standby

Task 6.5: Create plan group to recover snapshot at region 2 (standby)

A cron job was set up on the DR Control Node as part of Task 1.4. The cron job calls a bash script named oac-create-snapshot.sh that is responsible for exporting a snapshot of OAC data in region 1 and saving it to the object storage bucket in the standby region 2. The cron job and buckets were also created in Task 1.4.

The third user-defined plan group will recover OAC at the standby region 2 using the periodic snapshot that is replicated from the object storage bucket in region 1 to region 2. This plan group will contain a single step that calls the oac-register-snapshot.sh bash script that was downloaded to the DR Control Node in Task 1.3.

Task 6.5.1: Select add plan group

As before, click Add group to begin.

Figure 6-13: Begin adding plan group to recover snapshot at standby

Task 6.5.2: Provide plan group name, order & add step

Create a DR plan group to recover OAC at the standby region 2.

Give the plan group a simple but descriptive name.
Select a position where the plan group will be inserted into the DR plan. In this case, insert the user-defined plan group after the user-defined plan group created in the previous task to start OAC.
Select the user-defined Start OAC (Standby) plan group created in Task 6.4.
Click on Add step to open the dialog where we will specify the script to recover the OAC snapshot.

Figure 6-14: Parameters to create plan group and add step to recover the OAC snapshot at standby

Task 6.5.3: Provide step name & local script parameters

The Add plan group step dialog allows us to specify parameters about what this one step will perform and how it will behave during recovery. In this case, it will recover the OAC snapshot at region 2. The snapshot is taken at the primary region during normal operations and stored in an object storage bucket at region 2.

Everything in this task is the same as Task 6.3.3 except for the items shown in Figure 6-15 below.

A descriptive name explaining what task this step performs.
Paste in the absolute path where you installed the oac-register-snapshot.sh script on the DR Control Node. Add the OCI region ID as the only parameter (PHX in this example).
Click Add step to add this step to the plan group.

Figure 6-15: Parameters to create the plan step for recovering snapshot at standby

Task 6.5.4: Complete adding plan group & step

The step to recover OAC is now added to the DR plan group as shown in figure 6-16 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 6-16: Finalize adding plan group and step to recover snapshot at standby

Task 6.6: Create plan group to reverse snapshot at region 2 (standby)

The last user-defined plan group will change the cron job discussed in Task 6.5 above. Full Stack DR will call the oac-chg-cronjob.sh script to modify the cron job so it saves the exported OAC snapshot to the storage bucket in region 1.

Task 6.6.1: Select add plan group

As before, click Add group to begin.

Figure 6-17: Begin adding plan group to reverse snapshot copy at standby

Task 6.6.2: Provide plan group name, order & add step

Create a DR plan group to reverse the OAC snapshot to region 1.

Give the plan group a simple but descriptive name. It’s always a good practice to add “(Standby)” to the group name so it is obvious at a glance which region the steps apply to.
Select a position where the plan group will be inserted into the DR plan. In this case, we are going to insert our user-defined plan group after the user-defined plan group that recovers the OAC snapshot at region 2.
Select the user-defined Recover OAC snapshot (Standby) plan group created in Task 6.5.
Click on Add step to open the dialog where we will specify the script to change the cron job.

Figure 6-18: Parameters to create plan group and add step to reverse snapshot copy at standby

Task 6.6.3: Provide step name & local script parameters

The Add plan group step dialog allows us to specify parameters about what this one step will perform and how it will behave during recovery. In this case, it will reverse the OAC snapshot so it is saved to region 1 which will automatically become the standby region after the switchover has completed.

Everything in this task is the same as Task 6.3.3 except for the items shown in Figure 6-19 below.

A descriptive name explaining what task this step performs.
Paste in the absolute path where you installed the oac-chg-cronjob.sh script on the DR Control Node. Add the OCI region key for region 1 (iad in this example) as the first parameter and the region key for region 2 (phx in this example) as the second parameter.
Click Add step to add this step to the plan group.

Figure 6-19: Parameters to create the plan step for reversing snapshot copy at standby
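Because this step takes two region keys whose order matters, a small check before pasting the command can catch swapped, duplicated, or mistyped keys. The sketch below validates the two keys and prints the resulting command. The install path is hypothetical, and the rule that a key is exactly three lowercase letters is an assumption based on the iad and phx examples used here.

```shell
#!/usr/bin/env bash
# Build the command line for the step that calls oac-chg-cronjob.sh.
# The install path below is hypothetical (assumption).
OAC_CHG_CRONJOB="/home/opc/oac-dr/oac-chg-cronjob.sh"

# Assumption: a region key is exactly three lowercase letters (iad, phx).
is_region_key() {
  [ "${#1}" -eq 3 ] || return 1
  case "$1" in
    *[!abcdefghijklmnopqrstuvwxyz]*) return 1 ;;
  esac
  return 0
}

# Prints "<script> <first key> <second key>" or fails with a message.
chg_cronjob_command() {
  local first="$1" second="$2"
  is_region_key "$first"  || { echo "bad region key: '$first'" >&2; return 1; }
  is_region_key "$second" || { echo "bad region key: '$second'" >&2; return 1; }
  [ "$first" != "$second" ] || { echo "region keys must differ" >&2; return 1; }
  printf '%s %s %s\n' "$OAC_CHG_CRONJOB" "$first" "$second"
}

# Region 1 key first, region 2 key second, as described in this task.
chg_cronjob_command iad phx
```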

Task 6.6.4: Complete adding plan group & step

The step to reverse OAC snapshot direction is now added to the DR plan group as shown in figure 6-20 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 6-20: Finalize adding plan group and step to reverse snapshot copy at standby

The switchover plan should now include the four DR Plan Groups for OAC as shown in the screenshot below. You may have additional plan groups if your protection group includes other applications or OCI services along with OAC.

Figure 6-21: Showing the four user-defined plan groups added to the switchover plan
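The position of each user-defined plan group relative to its neighbors is what makes the switchover work, so it is worth double-checking the final order. The sketch below records the expected order as a plain list and verifies the four relationships described in Task 6. The names Stop OAC (Primary) and Reverse OAC snapshot (Standby) are hypothetical examples of group names, built-in groups unrelated to OAC are omitted, and the list would need to be typed in by hand from what the console shows.

```shell
#!/usr/bin/env bash
# Expected relative order of plan groups in the customized switchover plan.
# "Stop OAC (Primary)" and "Reverse OAC snapshot (Standby)" are hypothetical
# group names (assumption); other built-in groups are omitted.
PLAN_ORDER='Stop OAC (Primary)
Stop Compute Instances (Primary)
Launch Compute Instances (Standby)
Start OAC (Standby)
Recover OAC snapshot (Standby)
Reverse OAC snapshot (Standby)'

# Print the 1-based position of group $2 in the newline-separated list $1.
group_position() {
  printf '%s\n' "$1" | grep -n -x -F -- "$2" | head -n 1 | cut -d: -f1
}

# Succeed only if both groups exist and $2 appears before $3 in list $1.
comes_before() {
  local a b
  a="$(group_position "$1" "$2")"
  b="$(group_position "$1" "$3")"
  [ -n "$a" ] && [ -n "$b" ] && [ "$a" -lt "$b" ]
}

check() {
  if comes_before "$PLAN_ORDER" "$1" "$2"; then
    echo "OK     '$1' runs before '$2'"
  else
    echo "WRONG  '$1' should run before '$2'"
  fi
}

check "Stop OAC (Primary)" "Stop Compute Instances (Primary)"
check "Launch Compute Instances (Standby)" "Start OAC (Standby)"
check "Start OAC (Standby)" "Recover OAC snapshot (Standby)"
check "Recover OAC snapshot (Standby)" "Reverse OAC snapshot (Standby)"
```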

Task 7: Customize the failover plan in region 2 (Phoenix)

This task explains how to add custom, user-defined DR Plan Groups and steps to manage those things that need to be accomplished during a failover for OAC at region 2 during an actual outage or loss of access to region 1. These will be a subset of the same steps that were just added to the switchover plan in Task 6 above. However, only steps that are executed at standby region 2 will be added to the failover plan since it is assumed region 1 is completely inaccessible during a failover.

Start OAC at the standby region 2 after launching any VMs.
Recover the periodic snapshot at the standby region 2. The periodic snapshot was set up as part of Task 1.4 above.
Change the snapshot cron job at the standby region 2. The cron job was set up as part of Task 1.4 above.
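If the plan groups in the switchover plan were named with the region suffix recommended in Task 6, the failover subset can be read straight from the names: it is every group marked “(Standby)”. The sketch below illustrates that filter. The group names Stop OAC (Primary) and Reverse OAC snapshot (Standby) are hypothetical examples.

```shell
#!/usr/bin/env bash
# User-defined plan groups added to the switchover plan in Task 6.
# Two of the names are hypothetical examples (assumption).
SWITCHOVER_GROUPS='Stop OAC (Primary)
Start OAC (Standby)
Recover OAC snapshot (Standby)
Reverse OAC snapshot (Standby)'

# Keep only the groups that execute at the standby region, since the
# primary region is assumed to be unreachable during a failover.
failover_groups() {
  printf '%s\n' "$1" | grep -F '(Standby)'
}

failover_groups "$SWITCHOVER_GROUPS"
```

This is also a practical argument for the naming convention: a consistent suffix makes it easy to see which groups belong in which plan.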

Task 7.1: Select the failover plan

Begin by navigating to the failover plan created in Task 5.

Ensure standby region 2 is still the current region context in the console.
Select the failover plan.

Figure 7-1: How to begin customizing the failover plan in region 2

Task 7.2: Select add plan group

The first user-defined plan group will start OAC running at the standby region 2. This plan group will contain a single step that calls the oac-start-stop.sh bash script that was downloaded to the DR Control Node in Task 1.3.

Click on Add group to begin.

Figure 7-2: Begin adding plan group to start OAC

Task 7.2.1: Provide plan group name, order & add step

A DR Plan Group can contain many steps that are all executed in parallel. We are just adding a single step to execute a bash script to start OAC.

Give the plan group a simple but descriptive name. This is optional of course, but a best practice is to note in the name the region where the plan group will execute its steps. In this case it is the standby region 2, so we have added “(Standby)” to the group name.
Select a position where the plan group will be inserted into the DR plan. In this case, we are going to insert our user-defined plan group after the built-in plan group that launches the replicated VMs at region 2.
Select the built-in Launch Compute Instances (Standby) plan group.
Click on Add step to open the dialog where we will specify the script to start OAC.

Figure 7-3: Parameters to create plan group and add step to start OAC

Task 7.2.2: Provide step name & local script parameters

The Add plan group step dialog allows us to specify parameters about what this one step will perform and how it will behave during recovery. In this case, it will start OAC at region 2 as shown in Figure 7-4 below.

We will explain all fields in this dialog, but leave out this detail in all the remaining screenshots in subsequent steps since we are just performing the same process repeatedly.

A descriptive name explaining what task this step performs.
The DR plan should stop if the script fails to start OAC. This will allow anyone to see there is a problem and fix it. Full Stack DR provides the opportunity to continue running the failover plan after fixing the problem.
The default timeout before Full Stack DR declares a failure is one hour. This value can be changed to 30 minutes or to any other value that is more realistic for your environment.
Always select the region where the DR Control Node is running right now, not where it will be running during a failover. Full Stack DR will keep track of where the VM is running, so you just need to specify where it is right now. In this case, the DR Control Node is running in region 1 (Ashburn).
Select Run local script to inform Full Stack DR that the script will be found on a compute instance. The bash scripts were downloaded to the DR Control Node in Task 1.3.
Select the correct compartment that contains the DR Control Node - it can be any compartment. Then select the compute instance that was designated as the DR Control Node (it may be an application server or VM that was created just for this project/tutorial).
Paste in the absolute path where you installed the oac-start-stop.sh script on the DR Control Node. Add start as the first parameter and the OCI region ID as the second.
Specify opc as the user to execute the script.
Click Add step to add this step to the plan group.

Figure 7-4: Parameters to create the plan step for starting OAC at standby
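One way to choose a realistic timeout for a step is to time the script by hand during a planned test and see how long it actually takes. The sketch below wraps any command in the GNU coreutils timeout utility, which exits with status 124 when the limit is reached, and reports the elapsed time. The sleep command is only a stand-in. Running the real OAC scripts this way changes the state of the service, so do that only when it is safe.

```shell
#!/usr/bin/env bash
# Run a command under a time limit, similar in spirit to a plan step timeout.
# Relies on GNU coreutils `timeout`, which exits with 124 on expiry.
run_with_limit() {
  local seconds="$1"; shift
  local start end rc=0
  start="$(date +%s)"
  timeout "$seconds" "$@" || rc=$?
  end="$(date +%s)"
  if [ "$rc" -eq 124 ]; then
    echo "TIMED OUT after ${seconds}s: $*"
  else
    echo "finished in $((end - start))s with exit code $rc: $*"
  fi
  return "$rc"
}

# Stand-in command. On the DR Control Node the command would be the helper
# script with its parameters, and 1800 would correspond to 30 minutes.
run_with_limit 5 sleep 1
```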

Task 7.2.3: Complete adding plan group & step

The step to start OAC is now added to the DR plan group as shown in figure 7-5 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 7-5: Finalize adding plan group and step to start OAC

Task 7.3: Create plan group to recover snapshot at region 2 (standby)

The second user-defined plan group will recover OAC at the standby region 2 using the periodic snapshot that is replicated from the object storage bucket in region 1 to region 2. This is the same task that was added to the switchover plan in Task 6.

Task 7.3.1: Select add plan group

As before, click Add group to begin.

Figure 7-6: Begin adding plan group to recover snapshot at standby

Task 7.3.2: Provide plan group name, order & add step

Create a DR plan group to recover OAC at the standby region 2.

Give the plan group a simple but descriptive name.
Select a position where the plan group will be inserted into the DR plan. In this case, insert the user-defined plan group after the user-defined plan group created in the previous task to start OAC.
Select the user-defined Start OAC (Standby) plan group created in Task 7.2.
Click on Add step to open the dialog where we will specify the script to recover the OAC snapshot.

Figure 7-7: Parameters to create plan group and add step to recover the OAC snapshot at standby

Task 7.3.3: Provide step name & local script parameters

Everything in this task is the same as Task 7.2.2 except for the items shown in Figure 7-8 below.

A descriptive name explaining what task this step performs.
Paste in the absolute path where you installed the oac-register-snapshot.sh script on the DR Control Node. Add the OCI region ID as the only parameter (PHX in this example).
Click Add step to add this step to the plan group.

Figure 7-8: Parameters to create the plan step for recovering snapshot at standby

Task 7.3.4: Complete adding plan group & step

The step to recover OAC is now added to the DR plan group as shown in figure 7-9 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 7-9: Finalize adding plan group and step to recover snapshot at standby

Task 7.4: Create plan group to reverse snapshot at region 2 (standby)

The last user-defined plan group will change the cron job so the OAC snapshot will be saved to region 1 once it becomes accessible again. This is the same task that was added to the switchover plan in Task 6.

Task 7.4.1: Select add plan group

As before, click Add group to begin.

Figure 7-10: Begin adding plan group to reverse snapshot copy at standby

Task 7.4.2: Provide plan group name, order & add step

Create a DR plan group to reverse the OAC snapshot to region 1.

Give the plan group a simple but descriptive name. It’s always a good practice to add “(Standby)” to the group name so it is obvious at a glance which region the steps apply to.
Select a position where the plan group will be inserted into the DR plan. In this case, we are going to insert our user-defined plan group after the user-defined plan group that recovers the OAC snapshot at region 2.
Select the user-defined Recover OAC snapshot (Standby) plan group created in Task 7.3.
Click on Add step to open the dialog where we will specify the script to change the cron job.

Figure 7-11: Parameters to create plan group and add step to reverse snapshot copy at standby

Task 7.4.3: Provide step name & local script parameters

Everything in this task is the same as Task 7.2.2 except for the items shown in Figure 7-12 below.

A descriptive name explaining what task this step performs.
Paste in the absolute path where you installed the oac-chg-cronjob.sh script on the DR Control Node. Add the OCI region key for region 1 (iad in this example) as the first parameter and region key for region 2 (phx in this example) as the second parameter.
Click Add step to add this step to the plan group.

Figure 7-12: Parameters to create the plan step for reversing snapshot copy at standby

Task 7.4.4: Complete adding plan group & step

The step to reverse OAC snapshot direction is now added to the DR plan group as shown in figure 7-13 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 7-13: Finalize adding plan group and step to reverse snapshot copy at standby

The failover plan should now include the three DR Plan Groups for OAC as shown in the screenshot below. You may have additional plan groups if your protection group includes other applications or OCI services along with OAC.

Figure 7-14: Showing the three user-defined plan groups added to the failover plan

Task 8: Execute the switchover plan in region 2 (Phoenix)

Both switchover and failover DR plans have been completed in the standby region 2 (Phoenix). The DR plans in region 2 allow Full Stack DR to transition workloads from region 1 to region 2. The next task is to create switchover and failover plans in the protection group for region 1 (Ashburn) so Full Stack DR can transition workloads from region 2 back to region 1.

However, DR plans can only be created and modified in the protection group with the standby role. The DR protection group in region 1 is currently the primary, which means DR plans cannot be created in region 1.

Therefore, we need to reverse the roles of the protection groups so region 1 is the standby and region 2 is the primary. Execute the switchover plan that was just created to transition the workload from region 1 (Ashburn) to region 2 (Phoenix).
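The reasoning above can be summarized as a toy model: plans can only be created where the role is standby, and a switchover swaps the two roles. The sketch below is purely illustrative and does not call any OCI service. The variable and function names are hypothetical.

```shell
#!/usr/bin/env bash
# Toy model of DR protection group roles before and after a switchover.
# Nothing here talks to OCI; all names are hypothetical (assumption).
REGION1_ROLE="PRIMARY"   # Ashburn
REGION2_ROLE="STANDBY"   # Phoenix

# DR plans can only be created in the protection group with the standby role.
can_create_plans() {
  [ "$1" = "STANDBY" ]
}

# A switchover or failover reverses the roles of the two protection groups.
swap_roles() {
  local tmp="$REGION1_ROLE"
  REGION1_ROLE="$REGION2_ROLE"
  REGION2_ROLE="$tmp"
}

report() {
  local r1=no r2=no
  if can_create_plans "$REGION1_ROLE"; then r1=yes; fi
  if can_create_plans "$REGION2_ROLE"; then r2=yes; fi
  echo "region 1: $REGION1_ROLE (plans allowed: $r1), region 2: $REGION2_ROLE (plans allowed: $r2)"
}

report        # before the switchover: plans can be created in region 2 only
swap_roles    # effect of executing the switchover plan
report        # after the switchover: plans can be created in region 1 only
```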

Task 8.1: Begin plan execution

Execute the DR plan to begin the process of transitioning the OAC workload from region 1 to region 2.

Ensure the region context is still set to standby region 2 (Phoenix).
Use the breadcrumbs at the top of the console to confirm that the DR protection group details page is the current context.
Ensure the correct DR protection group in region 2 is selected; it should have the Standby role.
Ensure both the failover and switchover plans exist before proceeding; if not, go back to the previous steps to create both DR plans.
Click the Execute DR plan button.

Figure 8-1: Showing how to execute a switchover to standby region

Task 8.2: Select switchover plan and execute

This task executes the switchover plan in region 2.

Select the switchover plan.
Ensure enable prechecks is selected.
Click the Execute DR plan button to begin.

Figure 8-2: Choose and execute the switchover plan

Task 8.3: Next steps

Monitor the switchover plan until the OAC workload has been fully transitioned from region 1 to region 2. Full Stack DR will take care of cleaning up artifacts and changing the roles of primary and standby between the regions.

Region 2 (Phoenix) will be the primary region and region 1 (Ashburn) will be the standby region once Full Stack DR has completed the switchover.
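Monitoring is normally done in the console, but the same idea can be scripted as a simple polling loop. The sketch below shows only the loop. The function get_execution_state is a hypothetical stub that you would replace with whatever reports the plan execution state in your environment, and the state names SUCCEEDED, FAILED, and CANCELED are assumptions used for illustration.

```shell
#!/usr/bin/env bash
# Poll a status source until a DR plan execution reaches a terminal state.
# get_execution_state is a hypothetical stub (assumption): replace its body
# with a real status lookup. State names are assumptions as well.
get_execution_state() {
  echo "SUCCEEDED"
}

# $1 = seconds between polls, $2 = maximum number of polls.
# Returns 0 on success, 1 on failure or cancellation, 2 if polls run out.
wait_for_execution() {
  local interval="$1" max_polls="$2" state i=0
  while [ "$i" -lt "$max_polls" ]; do
    state="$(get_execution_state)"
    echo "poll $((i + 1)): $state"
    case "$state" in
      SUCCEEDED) return 0 ;;
      FAILED|CANCELED) return 1 ;;
    esac
    sleep "$interval"
    i=$((i + 1))
  done
  echo "gave up after $max_polls polls"
  return 2
}

# With the stub above this returns on the first poll.
wait_for_execution 30 120
```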

Task 9: Create DR plans in region 1 (Ashburn)

Create the same basic switchover and failover plans in the DR Protection Group for region 1 (Ashburn) which is now the standby peer.

The purpose of each plan is to transition the workload from region 2 to region 1 whenever region 2 is the primary peer. The roles of the DR protection groups in both regions are automatically reversed as part of any DR operation, so the protection group in region 2 will become the standby and the protection group in region 1 will become primary after a failover or switchover.

DR plans are always created in the protection group with the standby role; region 1 is currently the standby protection group after executing the switchover plan in Task 8.

Task 9.1: Begin creating DR plans

Create the basic plans by selecting the DRPG in region 1 as shown in Figure 9-1 below.

Ensure the OCI region context is region 1 (Ashburn).
Select the standby DRPG in region 1.
Select Plans.
Click on Create Plan to begin the process.

Figure 9-1: How to begin creating basic DR plans in region 1

Task 9.1.1: Create a switchover plan

Creating a DR plan is simple as shown in Figure 9-2 below.

Make the name of the switchover plan simple but meaningful. The name should be as short as possible but easy to understand at a glance to help reduce confusion and human error during a crisis.
Choose the plan type. There are only two plan types at the time of this writing.

Figure 9-2: The parameters needed to create DR switchover plan

Task 9.1.2: Create a failover plan

Follow the same process to create a basic failover plan as shown in Figure 9-3 below.

Make the name of the failover plan simple but meaningful. The name should be as short as possible but easy to understand at a glance to help reduce confusion and human error during a crisis.
Choose the plan type. There are only two plan types at the time of this writing.
Click on Create to create a basic failover plan prepopulated with the built-in steps.

Figure 9-3: The parameters needed to create DR failover plan

The standby DR Protection Group in region 1 should now have the two DR plans as shown below. These will handle transitioning workloads from region 2 back to region 1.

Figure 9-4: Showing the two basic DR plans that must exist in region 1 before proceeding any further

Task 10: Customize the switchover plan in region 1 (Ashburn)

Everything about this task is almost exactly the same as what we did in Task 6 for region 2 except this is being done in region 1.

The basic DR plans created in Task 9 do not contain anything to manage recovery tasks specific to OAC. This task explains how to add custom, user-defined DR Plan Groups and steps to manage those things that need to be accomplished during a switchover for OAC:

Stop OAC at the current primary region 2 before stopping any VMs.
Start OAC at the current standby region 1 after launching any VMs.
Recover the periodic snapshot at the standby region 1. The periodic snapshot was set up as part of Task 1.4 above.
Change the snapshot cron job at the standby region 1. The cron job was set up as part of Task 1.4 above.

Task 10.1: Select the switchover plan

Begin by navigating to the switchover plan created in the previous step.

Figure 10-1: How to begin customizing the switchover plan in region 1

Task 10.2: Enable DR Plan Groups that terminate artifacts (optional)

These are the same steps performed for region 2 in an earlier step; the same process needs to be followed for region 1.

Two plan groups are disabled by default in switchover plans as shown in the screenshot below. They are disabled to provide a level of comfort during testing that nothing is actually being deleted, and you still have a viable copy of the artifacts as a backup in case something goes wrong during testing.

However, these two plan groups terminate (delete) artifacts that will never be used again as part of any DR operation in the future. The artifacts will simply continue to accumulate over time as you switch back and forth between the two regions causing confusion for humans about which compute instances and volume groups are the ones that should actually be active.

Optionally, these plan groups can be enabled now to avoid having to manually clean up the superfluous artifacts before going into production.

Figure 10-2: Plan groups disabled by default

Here is what the disabled plan groups do when they are enabled:

This plan group terminates artifacts of compute instances that are left behind at region 2 after the replicated versions of the VMs have been launched at region 1 during the OCI block storage operation to reverse the replication from region 1 back to region 2 as part of the switchover. The leftover VMs are not used during a switchback because the operation to reverse block volume replication creates all new VMs in completely new block volume groups.
This plan group terminates artifacts of block volume groups (VGs) that are left behind at region 2 after the replicated versions of the VGs have been activated at region 1 and volume group replication has been reversed during the switchover. The leftover block volume groups are never used again, not even as part of a switchover from region 1 back to region 2.

Task 10.2.1: Enable terminate compute plan group

Enable the plan group.

Select Enable all steps from the context menu to the right of the plan group name.

Figure 10-3: How to enable terminate compute instances

Task 10.2.2: Enable terminate volume groups plan group

Enable the plan group.

Select Enable all steps from the context menu to the right of the plan group name.

Figure 10-4: How to enable terminate volume groups

Task 10.3: Create plan group to stop OAC at region 2 (primary)

Now begin adding custom, user-defined DR Plan Groups.

Task 10.3.1: Select add plan group

Begin the process to add a plan group.

Click on Add group to begin.

Figure 10-5: Begin adding plan group to stop OAC

Task 10.3.2: Provide plan group name, order & add step

A DR Plan Group can contain many steps that are all executed in parallel. We are just adding a single step to execute a bash script to stop OAC.

Give the plan group a simple but descriptive name. This is optional of course, but a best practice is to note in the name the region where the plan group will execute its steps. In this case it is the primary region, so we’ve added “(Primary)” to the group name.
Select a position where the plan group will be inserted into the DR plan. In this case, we are going to insert our user-defined plan group before the built-in plan group that stops the VMs at region 2.
Select the built-in Stop Compute Instances (Primary) plan group.
Click on Add step to open the dialog where we will specify the script to stop OAC.

Figure 10-6: Parameters to create plan group and add step to stop OAC

Task 10.3.3: Provide step name & local script parameters

The Add plan group step dialog allows us to specify parameters about what this one step will perform and how it will behave during recovery. In this case, it will stop OAC at region 2.

We will explain all fields in this dialog, but leave out this detail in all the remaining screenshots in subsequent steps since we are just performing the same process repeatedly.

A descriptive name explaining what task this step performs.
The DR plan should stop if the script fails to stop OAC. This will allow anyone to see there is a problem and fix it. Full Stack DR provides the opportunity to continue running the switchover plan after fixing the problem.
The default timeout before Full Stack DR declares a failure is one hour. This value can be changed to 30 minutes or whatever is a more realistic timeout for the script.
Always select the region where the DR Control Node is running right now, not where it will be running during a switchover. Full Stack DR will keep track of where the VM is running, so you just need to specify where it is right now. In this case, the DR Control Node is running in region 2 (Phoenix).
Select Run local script to inform Full Stack DR that the script will be found on a compute instance. The bash scripts were downloaded to the DR Control Node in Task 1.3.
Select the compartment that contains the DR Control Node; it can be any compartment. Then select the compute instance that was designated as the DR Control Node (it may be an application server or a VM that was created just for this project/tutorial).
Paste in the absolute path where you installed the oac-start-stop.sh script on the DR Control Node. Add stop as the first parameter and the OCI region ID as the second.
Specify opc as the user to execute the script.
Click Add step to add this step to the plan group.
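
As a rough illustration of the script path and parameters described above, the following sketch assembles the text that would be pasted into the step. The install directory /home/opc/full-stack-dr-scripts and the helper name build_oac_step_command are hypothetical and do not come from the tutorial's scripts; only the script name oac-start-stop.sh, the start or stop action, and the region parameter are taken from the steps above.

```bash
#!/usr/bin/env bash
# Sketch only: SCRIPT_DIR is a hypothetical install location on the DR Control Node.
SCRIPT_DIR="/home/opc/full-stack-dr-scripts"

# Print the command line for a plan step: <absolute path> <action> <region>.
# Returns non-zero if the action is not start or stop, or if the region is empty.
build_oac_step_command() {
  local action="$1" region="$2"
  case "$action" in
    start|stop) ;;
    *) echo "action must be start or stop" >&2; return 1 ;;
  esac
  if [ -z "$region" ]; then
    echo "region is required" >&2
    return 1
  fi
  printf '%s/oac-start-stop.sh %s %s\n' "$SCRIPT_DIR" "$action" "$region"
}
```

For example, calling build_oac_step_command stop PHX, with PHX standing in for the region value used in your environment, prints the full string for the stop step. Catching a mistyped action here is cheaper than discovering it during a switchover.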

Figure 10-7: Parameters to create the plan step for stopping OAC

Task 10.3.4: Complete adding plan group & step

The step to stop OAC is now added to the DR plan group as shown in figure 10-8 below.

This shows the plan step that was just added. It is possible to add additional steps to a DR plan group, but this plan group will only include the step to stop OAC.
Click on Add to add the DR plan group and step to the DR plan.

Figure 10-8: Finalize adding plan group and step to stop OAC

Task 10.4: Create plan group to start OAC at region 1 (standby)

Task 10.4.1: Select add plan group

As before, click Add group to begin.

Figure 10-9: Begin adding plan group to start OAC at standby

Task 10.4.2: Provide plan group name, order & add step

Create a DR plan group to start OAC.

Give the plan group a simple but descriptive name. It’s always a good practice to add “(Standby)” to the group name so it is obvious at a glance which region the steps apply to.
Select a position where the plan group will be inserted into the DR plan. In this case, we are going to insert our user-defined plan group after the built-in plan group that launches the replicated version of the DR control node at region 1.
Select the built-in Launch Compute Instances (Standby) plan group.
Click on Add step to open the dialog where we will specify the script to start OAC.

Figure 10-10: Parameters to create plan group and add step to start OAC at standby

Task 10.4.3: Provide step name & local script parameters

The Add plan group step dialog allows us to specify parameters about what this one step will perform and how it will behave during recovery. In this case, it will start OAC at region 1.

Everything in this task is the same as Task 10.3.3 except for the items shown in Figure 10-11 below.

A descriptive name explaining what task this step performs.
Paste in the absolute path where you installed the oac-start-stop.sh script on the DR Control Node. Add start as the first parameter and the OCI region ID as the second.
Click Add step to add this step to the plan group.

Figure 10-11: Parameters to create the plan step for starting OAC at standby

Task 10.4.4: Complete adding plan group & step

The step to start OAC is now added to the DR plan group as shown in figure 10-12 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 10-12: Finalize adding plan group and step to start OAC at standby

Task 10.5: Create plan group to recover snapshot at region 1 (standby)

The third user-defined plan group will recover OAC at the standby region 1 using the periodic snapshot that is replicated from the object storage bucket in region 2 to region 1. This plan group will contain a single step that calls the oac-register-snapshot.sh bash script that was downloaded to the DR Control Node in Task 1.3.

Task 10.5.1: Select add plan group

As before, click Add group to begin.

Figure 10-13: Begin adding plan group to recover snapshot at standby

Task 10.5.2: Provide plan group name, order & add step

Create a DR plan group to recover OAC at the standby region 1.

Give the plan group a simple but descriptive name.
Select a position where the plan group will be inserted into the DR plan. In this case, insert the user-defined plan group after the user-defined plan group created in the previous step to start OAC.
Select the user-defined Start OAC (Standby) plan group.
Click on Add step to open the dialog where we will specify the script to recover the OAC snapshot.

Figure 10-14: Parameters to create plan group and add step to recover the OAC snapshot at standby

Task 10.5.3: Provide step name & local script parameters

The Add plan group step dialog allows us to specify parameters about what this one step will perform and how it will behave during recovery. In this case, it will recover the OAC snapshot at region 1. The snapshot is taken at the primary region during normal operations and stored in an object storage bucket at region 1.

Everything in this task is the same as Task 10.3.3 except for the items shown in Figure 10-15 below.

A descriptive name explaining what task this step performs.
Paste in the absolute path where you installed the oac-register-snapshot.sh script on the DR Control Node. Add the OCI region ID as the only parameter (PHX in this example).
Click Add step to add this step to the plan group.

Figure 10-15: Parameters to create the plan step for recovering snapshot at standby

Task 10.5.4: Complete adding plan group & step

The step to recover OAC is now added to the DR plan group as shown in figure 10-16 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 10-16: Finalize adding plan group and step to recover snapshot at standby

Task 10.6: Create plan group to reverse snapshot at region 1 (standby)

The last user-defined plan group will change the cron job that creates the periodic snapshot discussed in Task 10.5 above. Full Stack DR will call the oac-chg-cronjob.sh script to modify the cron job so it saves the exported OAC snapshot to the storage bucket in region 2.

Task 10.6.1: Select add plan group

As before, click Add group to begin.

Figure 10-17: Begin adding plan group to reverse snapshot copy at standby

Task 10.6.2: Provide plan group name, order & add step

Create a DR plan group to reverse the OAC snapshot to region 2.

Give the plan group a simple but descriptive name. It’s always a good practice to add “(Standby)” to the group name so it is obvious at a glance which region the steps apply to.
Select a position where the plan group will be inserted into the DR plan. In this case, we are going to insert our user-defined plan group after the user-defined plan group that recovers the OAC snapshot at region 1.
Select the user-defined Recover OAC snapshot (Standby) plan group.
Click on Add step to open the dialog where we will specify the script to reverse the snapshot copy.

Figure 10-18: Parameters to create plan group and add step to reverse snapshot copy at standby

Task 10.6.3: Provide step name & local script parameters

The Add plan group step dialog allows us to specify parameters about what this one step will perform and how it will behave during recovery. In this case, it will reverse the OAC snapshot so it is saved to region 2, which will automatically become the standby region after the switchover has completed.

Everything in this task is the same as Task 10.3.3 except for the items shown in Figure 10-19 below.

A descriptive name explaining what task this step performs.
Paste in the absolute path where you installed the oac-chg-cronjob.sh script on the DR Control Node. Add the OCI region key for region 2 (phx in this example) as the first parameter and region key for region 1 (iad in this example) as the second parameter.
Click Add step to add this step to the plan group.

Figure 10-19: Parameters to create the plan step for reversing snapshot copy at standby

Task 10.6.4: Complete adding plan group & step

The step to reverse OAC snapshot direction is now added to the DR plan group as shown in figure 10-20 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 10-20: Finalize adding plan group and step to reverse snapshot copy at standby

Figure 10-21: Showing the four user-defined plan groups added to the switchover plan

Task 11: Customize the failover plan in region 1 (Ashburn)

This task explains how to add the custom, user-defined DR Plan Groups and steps needed to fail over OAC to region 1 during an actual outage or loss of access to region 2. These are a subset of the steps that were just added to the switchover plan in Task 10 above. Only steps that are executed at standby region 1 are added to the failover plan, since region 2 is assumed to be completely inaccessible during a failover.

Start OAC at the standby region 1 after launching any VMs.
Recover the periodic snapshot at the standby region 1. The periodic snapshot was set up as part of Task 1.4 above.
Change the snapshot cron job at the standby region 1. The cron job was set up as part of Task 1.4 above.
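
The relationship between the two plans can be sketched in a few lines of shell: if the user-defined plan groups follow the “(Primary)” and “(Standby)” naming convention suggested in Task 10, the failover subset is every group whose name ends in “(Standby)”. The group names below are illustrative. This tutorial only shows Start OAC (Standby) and Recover OAC snapshot (Standby); the other two names, the array, and the function are assumptions made for this sketch.

```bash
#!/usr/bin/env bash
# User-defined plan groups in the switchover plan, in execution order.
# "Stop OAC (Primary)" and "Reverse OAC snapshot (Standby)" are assumed names.
SWITCHOVER_GROUPS=(
  "Stop OAC (Primary)"
  "Start OAC (Standby)"
  "Recover OAC snapshot (Standby)"
  "Reverse OAC snapshot (Standby)"
)

# Print only the groups that run at the standby region, one per line.
failover_groups() {
  local group
  for group in "${SWITCHOVER_GROUPS[@]}"; do
    case "$group" in
      *"(Standby)") printf '%s\n' "$group" ;;
    esac
  done
}
```

Running failover_groups lists the three groups that this task adds to the failover plan, in the same relative order as in the switchover plan.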

Task 11.1: Create plan group to start OAC at region 1 (standby)

Begin by navigating to the failover plan created in Task 9.

Ensure standby region 1 is still the current region context in the console.
Select the failover plan.

Figure 11-1: How to begin customizing the failover plan in region 1

Task 11.2: Select add plan group

The first user-defined plan group will start OAC running at the standby region 1. This plan group will contain a single step that calls the oac-start-stop.sh bash script that was downloaded to the DR Control Node in Task 1.3.

Click on Add group to begin.

Figure 11-2: Begin adding plan group to start OAC

Task 11.2.1: Provide plan group name, order & add step

A DR Plan Group can contain many steps that are all executed in parallel. We are just adding a single step to execute a bash script to start OAC.

Give the plan group a simple but descriptive name. This is optional, but a best practice is to note the region in which the plan group will execute its steps. In this case it is the standby region 1, so we’ve added “(Standby)” to the group name.
Select a position where the plan group will be inserted into the DR plan. In this case, we are going to insert our user-defined plan group after the built-in plan group that launches the replicated VMs at region 1.
Select the built-in Launch Compute Instances (Standby) plan group.
Click on Add step to open the dialog where we will specify the script to start OAC.

Figure 11-3: Parameters to create plan group and add step to start OAC

Task 11.2.2: Provide step name & local script parameters

We will explain all fields in this dialog, but leave out this detail in all the remaining screenshots in subsequent steps since we are just performing the same process repeatedly.

A descriptive name explaining what task this step performs.
The DR plan should stop if the script fails to start OAC. This will allow anyone to see there is a problem and fix it. Full Stack DR provides the opportunity to continue running the failover plan after fixing the problem.
The default timeout before Full Stack DR declares a failure is one hour. This value can be changed to 30 minutes or whatever is a more realistic timeout for the script.
Always select the region where the DR Control Node is running right now, not where it will be running during a switchover. Full Stack DR will keep track of where the VM is running, so you just need to specify where it is right now. In this case, the DR Control Node is running in region 2 (Phoenix).
Select Run local script to inform Full Stack DR that the script will be found on a compute instance. The bash scripts were downloaded to the DR Control Node in Task 1.3.
Select the compartment that contains the DR Control Node; it can be any compartment. Then select the compute instance that was designated as the DR Control Node (it may be an application server or a VM that was created just for this project/tutorial).
Paste in the absolute path where you installed the oac-start-stop.sh script on the DR Control Node. Add start as the first parameter and the OCI region ID as the second.
Specify opc as the user to execute the script.
Click Add step to add this step to the plan group.
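
Before relying on the timeout and failure settings above, it can help to rehearse a script by hand on the DR Control Node. The sketch below wraps any command with the coreutils timeout utility and reports how it ended. The helper name rehearse_step is hypothetical, and treating a non-zero exit status as a failed step is an assumption about how the scripts signal problems, not something this tutorial states.

```bash
#!/usr/bin/env bash
# Run a command with a time limit in seconds and report the outcome.
# Exit status: 0 on success, 124 if the limit was reached (coreutils timeout),
# otherwise the command's own non-zero status.
rehearse_step() {
  local limit_seconds="$1"
  shift
  timeout "$limit_seconds" "$@"
  local rc=$?
  case "$rc" in
    0)   echo "step succeeded" ;;
    124) echo "step timed out after ${limit_seconds}s" ;;
    *)   echo "step failed with exit status ${rc}" ;;
  esac
  return "$rc"
}
```

A rehearsal with a 30-minute limit would pass 1800 as the first argument, followed by the absolute path to oac-start-stop.sh and its two parameters. Only rehearse against an environment where starting or stopping OAC is acceptable.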

Figure 11-4: Parameters to create the plan step for starting OAC at standby

Task 11.2.3: Complete adding plan group & step

The step to start OAC is now added to the DR plan group as shown in figure 11-5 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 11-5: Finalize adding plan group and step to start OAC

Task 11.3: Create plan group to recover snapshot at region 1 (standby)

The second user-defined plan group will recover OAC at the standby region 1 using the periodic snapshot that is replicated from the object storage bucket in region 2 to region 1. This is the same task that was added to the switchover plan in Task 10.5.

Task 11.3.1: Select add plan group

As before, click Add group to begin.

Figure 11-6: Begin adding plan group to recover snapshot at standby

Task 11.3.2: Provide plan group name, order & add step

Create a DR plan group to recover OAC at the standby region 1.

Give the plan group a simple but descriptive name.
Select a position where the plan group will be inserted into the DR plan. In this case, insert the user-defined plan group after the user-defined plan group created in the previous step to start OAC.
Select the user-defined Start OAC (Standby) plan group.
Click on Add step to open the dialog where we will specify the script to recover the OAC snapshot.

Figure 11-7: Parameters to create plan group and add step to recover the OAC snapshot at standby

Task 11.3.3: Provide step name & local script parameters

The Add plan group step dialog allows us to specify parameters about what this one step will perform and how it will behave during recovery. In this case, it will recover the OAC snapshot at region 1. The snapshot is taken at the primary region during normal operations and stored in an object storage bucket at region 1.

Everything in this task is the same as Task 11.2.2 except for the items shown in Figure 11-8 below.

A descriptive name explaining what task this step performs.
Paste in the absolute path where you installed the oac-register-snapshot.sh script on the DR Control Node. Add the OCI region ID as the only parameter (PHX in this example).
Click Add step to add this step to the plan group.

Figure 11-8: Parameters to create the plan step for recovering snapshot at standby

Task 11.3.4: Complete adding plan group & step

The step to recover OAC is now added to the DR plan group as shown in figure 11-9 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 11-9: Finalize adding plan group and step to recover snapshot at standby

Task 11.4: Create plan group to reverse snapshot at region 1 (standby)

The last user-defined plan group will change the cron job so the OAC snapshot will be saved to region 2 once it becomes accessible again. This is the same task that was added to the switchover plan in Task 10.

Task 11.4.1: Select add plan group

As before, click Add group to begin.

Figure 11-10: Begin adding plan group to reverse snapshot copy at standby

Task 11.4.2: Provide plan group name, order & add step

Create a DR plan group to reverse the OAC snapshot to region 2.

Give the plan group a simple but descriptive name. It’s always a good practice to add “(Standby)” to the group name so it is obvious at a glance which region the steps apply to.
Select a position where the plan group will be inserted into the DR plan. In this case, we are going to insert our user-defined plan group after the user-defined plan group that recovers the OAC snapshot at region 1.
Select the user-defined Recover OAC snapshot (Standby) plan group.
Click on Add step to open the dialog where we will specify the script to reverse the snapshot copy.

Figure 11-11: Parameters to create plan group and add step to reverse snapshot copy at standby

Task 11.4.3: Provide step name & local script parameters

The Add plan group step dialog allows us to specify parameters about what this one step will perform and how it will behave during recovery. In this case, it will reverse the OAC snapshot so it is saved to region 2, which will become the standby region after the failover has completed.

Everything in this task is the same as Task 11.2.2 except for the items shown in Figure 11-12 below.

A descriptive name explaining what task this step performs.
Paste in the absolute path where you installed the oac-chg-cronjob.sh script on the DR Control Node. Add the OCI region key for region 2 (phx in this example) as the first parameter and region key for region 1 (iad in this example) as the second parameter.
Click Add step to add this step to the plan group.

Figure 11-12: Parameters to create the plan step for reversing snapshot copy at standby

Task 11.4.4: Complete adding plan group & step

The step to reverse OAC snapshot direction is now added to the DR plan group as shown in figure 11-13 below.

This shows the plan step that was just added.
Click on Add to add the DR plan group and step to the DR plan.

Figure 11-13: Finalize adding plan group and step to reverse snapshot copy at standby

Figure 11-14: Showing the three user-defined plan groups added to the failover plan

Next Steps

Full Stack DR for OAC should be fully implemented at this point. However, full functionality should be validated before using Full Stack DR for production. All failover and switchover plans should be executed to validate that everything works as expected and the recovery team fully understands the entire process.

Testing switchover plans

Switchover plans are designed to clean up all artifacts and ensure all roles for built-in recovery steps such as load balancer, block storage, file systems, BaseDB, ExaCS and Autonomous Database are ready to be recovered from the standby region without human intervention.

Testing failover plans

Failovers are different. Failovers by their very nature cannot clean up artifacts or ensure services and databases at the failed region are ready to transition workloads back to region 1. The recovery team needs to understand and perform tasks to ensure Data Guard is in the correct state, artifacts for storage and compute instances have been terminated, etc. Please read Resetting DR Configuration After a Failover in OCI Full Stack DR documentation to understand the process.

Validate all DR plans for final acceptance

The recovery team needs to perform a final validation to demonstrate the readiness of Full Stack DR protection groups and plans for production workloads. Region 2 (Phoenix) should be the primary region at this point in the process. Begin the final validation of all plans by completing the following steps:

Test switchover from region 2 (primary) back to region 1 (standby).
Test failover from region 1 (primary) to region 2 (standby).
Prepare region 1 (former primary) for failover from region 2.
Test failover from region 2 (primary) to region 1 (standby).
Prepare region 2 (former primary) for either a failover or switchover to region 2.
The DR protection groups and application stack should be in a normal operational state and ready for a failover or switchover at this point.

Acknowledgments

Author - Bala Guddeti (NACE Cloud Solution Specialist)
Contributors - Greg King (Full Stack Disaster Recovery Product Manager), Suraj Ramesh (Full Stack Disaster Recovery Product Manager)

Title and Copyright Information

Automate Recovery for Oracle Analytics Cloud Using OCI Full Stack Disaster Recovery

F88861-01

December 2023