Note:
- This tutorial requires access to Oracle Cloud. To sign up for a free account, see Get started with Oracle Cloud Infrastructure Free Tier.
- It uses example values for Oracle Cloud Infrastructure credentials, tenancy, and compartments. When completing your lab, substitute these values with ones specific to your cloud environment.
Automate Switchover and Failover for PostgreSQL Database with Oracle Cloud Infrastructure Full Stack Disaster Recovery
Introduction
Oracle Cloud Infrastructure Full Stack Disaster Recovery (OCI Full Stack DR) orchestrates the transition of compute, database, and applications between Oracle Cloud Infrastructure (OCI) regions from around the globe with a single click. Customers can automate the steps needed to recover one or more business systems without redesigning or rearchitecting existing infrastructure, databases, or applications and without needing specialized management or conversion servers.
OCI Full Stack DR is a highly flexible service, we can take advantage of user-defined plan groups in the Disaster Recovery (DR) plan to handle OCI Object Storage bucket DR operations.
Deployment Architecture
Note: The primary region is Frankfurt and the DR region is London.
Objectives
- Automate switchover and failover for PostgreSQL database with OCI Full Stack DR.
Prerequisites
-
Administrator privileges or configure the required Oracle Cloud Infrastructure Identity and Access Management (OCI IAM) policies for OCI Full Stack DR. For more information, see Configuring Identity and Access Management (IAM) policies to use OCI Full Stack DR and Policies for OCI Full Stack DR.
-
You need to enable Compute Instance Run Command plugin in both primary and standby as OCI Full Stack DR will execute
sudo
commands to mount and unmount commands during switchover. For more information about run commands on OCI Compute instance, see Running Commands on OCI Compute instance. -
Use run command with OCI Full Stack DR. For more information, see Invoke custom scripts using the run command with Oracle Cloud Infrastructure Full Stack Disaster Recovery.
-
Establish a Remote Peering Connection (RPC) between two Dynamic Routing Gateways (DRG) located in the same tenancies. For more information, see Remote VCN Peering through an Upgraded DRG.
-
Make sure that both machines can communicate on PostgreSQL port
5432
for database replication, configuring the respective security list or network security group. -
Use Bastion as a Service or a Bastion host to connect to the PostegreSQL Servers.
-
OCI Object Storage bucket (one in each region).
Task 1: Install and Configure PostgreSQL
-
To install PostgreSQL, connect to the host using your favorite SSH client and run the following command.
sudo dnf install -y postgresql postgresql-server postgresql-contrib sudo /usr/bin/postgresql-setup initdb sudo systemctl enable postgresql sudo systemctl start postgresql
-
Enable the Linux firewall to receive connections.
sudo firewall-cmd --zone=public --add-port=5432/tcp --permanent sudo firewall-cmd --reload
-
Edit the
postgresql.conf
(found at xxxxx) file to allow connection and change thewal
parameters as follows.... listen_addresses = '*' ... wal_level = logical wal_log_hints = on ...
-
Connect to the database and then create the replication user.
sudo su postgres psql
CREATE ROLE replica_user WITH REPLICATION LOGIN PASSWORD 'YourUserPassword';
-
Add this user to the
pg_hba.conf
(found at xxxxx) file.host all all all md5 host replication replica_user <IP Host 1> md5 host replication replica_user <IP Host 2> md5
-
Restart the PostgreSQL database to apply the changes.
sudo systemctl restart postgresql
-
Repeat this process in the second server.
Task 2: Replicate the PostgreSQL Database
In this task, we will configure the native PostgreSQL database replication.
-
Connect to the primary database and create the replica slot.
sudo su postgres psql
select pg_create_physical_replication_slot('replica_0');
-
Connect to the secondary database and run the following commands to create the replication.
sudo systemctl stop postgresql sudo rm -rv /var/lib/pgsql/data/ sudo pg_basebackup -d "host=<IP Host 1> port=5432 user=replica_user password=YourUserPassword" -w -X stream -S replica_0 -v -R -W -D /var/lib/pgsql/data/ --no-password sudo chown postgres -R /var/lib/pgsql/data/ sudo systemctl start postgresql
-
Check the replication is flowing on the primary database.
select * from pg_stat_replication;
You should be able to see one line with the streaming process.
Task 3: Create the Switchover Plan
Task 3.1: Create and Associate Disaster Recovery Protection Groups (DRPG)
-
In the primary region (Frankfurt), click the Hamburger menu, navigate to Migration & Disaster Recovery, Disaster Recovery, Disaster Recovery Protection Group and Create Disaster Recovery protection Group.
-
In the standby region (London), click the Hamburger menu, navigate to Migration & Disaster Recovery, Disaster Recovery, Disaster Recovery Protection Group and Create Disaster Recovery protection Group.
-
In primary region (Frankfurt), enter the following information and click Associate.
- Role:
Primary
. - Peer region:
UK South (London)
. - Peer DR protection group:
postgresql-lon
.
- Role:
Task 3.2: Add Members to Primary and Standby DRPG
-
In the primary DRPG (
postgresql-fra
), select Members and add the compute VM (postgresql01
) as a member.Note: While adding compute as member, select Non-Moving Instance in the compute instance type, as we have active-passive setup for the PostgreSQL database.
-
In the standby DRPG (
postgresql-lon
), select Members, and add the compute VM (postgresql02
) as a member.Note: While adding compute as members, select Non-Moving Instance in the compute instance type, as we have active-passive setup for the PostgreSQL database.
Task 3.3: Create a Switchover plan in the Standby DRPG (London)
Switchover Plan: A type of DR plan that performs a planned transition of services from the primary DRPG to the standby DRPG.
-
Create a switchover plan in standby DRPG
postgresql-lon
. -
Select Plans and click Create plan.
-
Enter the Name as
postgresql-switchover-frankfurt-london
and select the Plan type asSwitchover (planned)
. -
Click Create.
Note: Make sure to create the switchover plan from the standby DRPG (London).
Task 3.4: Customize the Switchover Plan with Various User-defined Plan Groups
We have added Compute (non-moving instance) as members in both the DR Protection Groups. We need to customize the switchover plan with additional user-defined plan groups. We also need to create the following bash scripts and upload them to the OCI Object Storage, so we can invoke those scripts during the switchover.
-
stop_postgresql.sh
.#!/bin/bash sudo systemctl stop postgresql
-
promote_postgresql.sh
.#!/bin/bash sudo -u postgres /usr/bin/pg_ctl promote -D /var/lib/pgsql/data/ sudo -u postgres psql -c "select pg_create_physical_replication_slot('replica_1');"
-
change_replication_direction.sh
.#!/bin/bash sudo rm -rv /var/lib/pgsql/data/ sudo sudo pg_basebackup -d "host=<IP Host 2> port=5432 user=replica_user password=YourUserPassword" -w -X stream -S replica_1 -v -R -W -D /var/lib/pgsql/data/ --no-password sudo chown postgres -R /var/lib/pgsql/data/ sudo systemctl start postgresql sudo -u postgres psql -c "select pg_create_physical_replication_slot('replica_0');"
Note: These scripts should be uploaded to the bucket at London region.
A typical workflow for DR switchover is as follows:
-
Stop PostgreSQL Database at Frankfurt region: Plan group will have steps to stop the PostgreSQL database in the Frankfurt region.
-
Promote PostgreSQL Database at London region: Plan group will have steps to promote the PostgreSQL database in the London region.
-
Change PostgreSQL Database replication direction: Plan group will have steps to change the PostgreSQL database replication direction (LON-FRA).
Note: Make sure the steps are executed in the same order.
Create the following user-defined plan group.
-
Create Stop PostgreSQL Database at Frankfurt region plan group.
-
Under Plan Groups, click Add group to create a custom plan group to stop PostgreSQL database in the Frankfurt region. Enter Group name as
Stop PostgreSQL Database at Frankfurt Region
. -
Click Add Step and enter the following information. We will add one step in the plan group.
-
Make sure to choose the Region as
Frankfurt
, and the Target instance aspostgresql01
. Select the Run object storage script and the location of the script chooses the OCI Object storage bucketposgres-dr-lon
. -
Enter the script
stop_postgresql.sh
details with the script location. The script will shut down the PostgreSQL database in the FRA region. -
Click Add Step and verify the added step.
-
-
-
Create Promote PostgreSQL Database at London region plan group.
-
Under Plan Groups, click Add group to create a custom plan group to promote PostgreSQL database in London. Enter Group name as
Promote PostgreSQL Database at London Region
. -
Click Add Step and enter the following details. We will add one step in the plan group.
-
Make sure to choose the Region as
London
, the target instance aspostgresql02
. Select the Run object storage script and location of the script chooses the OCI Object Storage bucketposgres-dr-lon
. -
Enter the script
promote_postgresql.sh
details with the script location. The script will shut down the PostgreSQL database in the FRA region. -
Click Add Step and verify the added step.
-
-
-
Create Change PostgreSQL Database replication direction plan group.
-
Under Plan Groups, click Add group to create a custom plan group to change PostgreSQL database replication direction. Enter Group name as
Promote PostgreSQL Database at London Region
. -
Click Add Step and enter the following information. We will add one step in the plan group.
-
Make sure to choose the Region as
Frankfurt
, and the Target instance aspostgresql01
. Select the Run object storage script and location of the script choose the OCI Object Storage bucketposgres-dr-lon
. -
Enter the script
change_replication_direction.sh
details with the script location. The script will shut down the PostgreSQL database in the FRA region. -
Click Add Step and verify the added step.
-
-
Task 4: Execute Prechecks and Run the Switchover Plan
The Run Prechecks option performs a quick validation of all steps in a DR plan and the members associated with the steps.
-
From the switchover plan
postgresql-switchover-frankfurt-london
, click Run Prechecks to run prechecks.Verify for the successful completion.
-
From the switchover plan
postgresql-switchover-frankfurt-london
, click Execute Disaster Recovery Plan to execute the plan. -
Check the status of the plan and ensure all the steps in the plan succeed.
-
The switchover plan performs all the plan groups in a specific order. After the successful plan execution, the PostgreSQL database will be promoted to master in the London region.
-
Automatically the role will change in the DRPG. Now, London will have primary and Frankfurt will have the standby role.
-
Verify the PostgreSQL database in London.
Task 5: Create Switchover Plan in the Standby DRPG (Frankfurt) to Switchback
Task 5.1: Create a Switchover Plan in the Standby DRPG (London)
-
Create a switchover plan in Standby DRPG
postgresql-fran
. -
Select Plans and click Create plan.
-
Enter the Name as
postgresql-switchover-london-frankfurt
and select the Plan type asSwitchover (planned)
. -
Click Create.
Note: Make sure to create the switchover plan from the standby DRPG (Frankfurt).
Task 5.2: Customize the Switchover Plan with Various User-defined Plan Groups
We need to customize the switchover plan with additional user-defined plan groups. We also need to create the following bash scripts and upload to the OCI Object Storage, so we are able to invoke those scripts during the switchover.
-
rewind_postgresql.sh
.#!/bin/bash sudo systemctl stop postgresql sudo rm -rv /var/lib/pgsql/data/ sudo -u postgres pg_basebackup -d "host=<IP Host 2> port=5432 user=replica_user password=YourUserPassword" -v -R -D /var/lib/pgsql/data/ --no-password sudo chown postgres -R /var/lib/pgsql/data/ sudo systemctl start postgresql sudo -u postgres /usr/bin/pg_ctl promote -D /var/lib/pgsql/data/ sudo -u postgres psql -c "select pg_create_physical_replication_slot('replica_0');"
-
replicate_postgresql.sh
.#!/bin/bash sudo systemctl stop postgresql sudo rm -rv /var/lib/pgsql/data/ sudo pg_basebackup -d "host=<IP Host 2> port=5432 user=replica_user password=YourUserPassword" -w -X stream -S replica_0 -v -R -W -D /var/lib/pgsql/data/ --no-password sudo chown postgres -R /var/lib/pgsql/data/ sudo systemctl start postgresql
Note: These scripts should be uploaded to the bucket at Frankfurt region.
A typical workflow for DR switchover is as follows:
-
Rewind PostgreSQL Databases: Plan group to rewind the database. Frankfurt will sync the content from London database.
-
Re-establish PostgreSQL Database replication to London: Plan group will have step to start replicating the PostgreSQL database from Frankfurt to the London region.
Note: Make sure the steps are executed in the same order.
Create the following two user-defined plan groups
-
Create Rewind PostgreSQL Databases plan group.
-
Under Plan Groups, click Add group to create a custom plan group to rewind PostgreSQL database in the Frankfurt region. Enter Group name as
Rewind PostgreSQL Databases at Frankfurt Region
. -
Click Add Step and enter the following information. We will add one step to the plan group.
-
Make sure to choose the Region as
Frankfurt
, and the Target instance aspostgresql01
. Select the Run object storage script and location of the script chooses the OCI Object Storage bucketposgres-dr-lon
. -
Enter the script
rewind_postgresql.sh
details with the script location. The script will shut down the PostgreSQL database in the FRA region. -
Click Add Step and verify the added step.
-
-
-
Create Re-establish PostgreSQL Database replication to London plan group.
-
Under Plan Groups, click Add group to create a custom plan group to re-establish PostgreSQL database replication to London. Enter Group name as
Promote PostgreSQL Database at London Region
. -
Click Add Step and enter the following information. We will add one step in the plan group.
-
Make sure to choose the Region as
London
, and the Target instance aspostgresql02
. Select the Run object storage script and the location of the script chooses the OCI Object Storage bucketposgres-dr-lon
. -
Enter the script
replicate_postgresql.sh
details with the script location. The script will shut down the PostgreSQL database in the FRA region. -
Click Add Step and verify the added step.
-
-
Task 6: Execute Prechecks and Run the Switchover Plan
The Run Prechecks performs a quick validation of all steps in a DR plan and the members associated with the steps.
-
From the switchover plan
postgresql-switchover-london-frankfurt
, click Run Prechecks to run prechecks.Verify for the successful completion.
-
From the switchover plan
postgresql-switchover-london-frankfurt
, click Execute Disaster Recovery Plan to execute the plan. -
Check the status of the plan and ensure all the steps in the plan succeed.
-
The switchover plan performs all the plan groups in a specific order. After the successful plan execution, the PostgreSQL database will be promoted to master in the Frankfurt region.
-
Automatically the role will change in the DRPG. Now, Frankfurt will have primary and London will have the standby role.
-
Verify the PostgreSQL database in Frankfurt.
Next Steps
You have seen how to automatically perform the PostgreSQL database switchover and switch back Disaster Recovery operations using OCI Full Stack DR. For more information, see the OCI Full Stack DR documentation in the Related Links section.
Related Links
Acknowledgments
-
Author - Raphael Teixeira (Principal Member of Technical Staff for FSDR )
-
Contributors - Ricardo Malhado (Hyperscaler to OCI Specialist), Suraj Ramesh (Principal Product Manager for MAA)
More Learning Resources
Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.
For product documentation, visit Oracle Help Center.
Automate Switchover and Failover for PostgreSQL Database with Oracle Cloud Infrastructure Full Stack Disaster Recovery
F96158-01
April 2024