Note:

Deploy a highly available Postgres cluster on Oracle Cloud Infrastructure

Introduction

This tutorial outlines the design and implementation of Postgres in a HA configuration using Patroni and other components. PostgreSQL lacks inbuilt automatic failover and automatic mechanism to add failed master back to the cluster. Patroni is the new age High Availability (HA) solution for PostgreSQL with cloud-native features and advanced options for failover/switchover and automated bootstrap and replica setup.

Patroni is a template to create your own customized HA solution using Python and for maximum accessibility, a distributed configuration store like etcd.

Below are some limitations with native Postgres replication and its solution with Patroni:

Limitations in native Postgres replication Patroni based solutions
The default replication mechanism does not support automatic failover. Patroni provides automatic failover.
Using external tools for failover may need additional effort to keep them up and running. Patroni takes care of failover.
Monitoring Postgres can also be a challenge. Patroni has an in-built mechanism which monitors the Postgres service.
Automatically adding the failed node back to the cluster requires advanced scripting skills. Patroni has built-in automation for bringing back a failed node to the cluster.
Cannot handle the Split Brain scenarios. Patroni with the help of ETCD will be able to elect a new Leader.

Other PostgreSQL HA solutions include:

However, using Patroni with Postgres implementation simplifies the overall cluster management lifecycle significantly.

Objective

This tutorial lists a viable high performance solution with potential alternatives for customers to migrate Postgres databases (from AWS or other cloud vendors) to OCI. The key requirements being HA and real-time data migration.

Architecture

The following architecture consists of 3 ETCD servers, 3 (Postgres + Patroni + Pgbackrest) servers, Object Storage bucket and Network Load Balancer.

Architecture Diagram

Recommendations

  1. For HA: use 3 ETCD Servers, 3 nodes for PostgreSQL and place them in different Availability Domains.
  2. For better throughput, create separate block volumes for data, temp, wal and log files.
  3. Define all custom parameters in patroni.yaml bootstrap section at the time of cluster creation.

Configure and install the PostgreSQL HA components

The setup is divided into two parts:

Task 1: Provision the infrastructure

  1. Create compute VMs:

    • 3 ETCD Servers
    • 3 Servers for Postgres + Patroni (1 Leader and 2 Replica)
  2. Create an Object Storage Bucket for storing backups.

  3. Create an OCI tenancy user with READ/WRITE access on the above Object Store bucket.

Task 2: Install and configure the software

  1. Configure ETCD.

    • Install ETCD on 3 servers.

      cd /tmp
      wget https://github.com/etcd-io/etcd/releases/download/v3.5.2/etcd-v3.5.2-linux-amd64.tar.gz
      tar xzvf /tmp/etcd-v3.5.2-linux-amd64.tar.gz
      cd etcd-v3.5.2-linux-amd64
      cp etcdutl etcdctl etcd /usr/local/bin/
      mkdir -p /etc/etcd
      mkdir -p /var/lib/etcd
      groupadd -f -g 1501 etcd
      useradd -c "etcd user" -d /var/lib/etcd -s /bin/false -g etcd -u 1501 etcd
      chown -R etcd:etcd /var/lib/etcd
      
    • Configure and start ETCD on all 3 servers.

      Below are the ETCD server details:

      | Hostname | IP Address | Availability Domain |
      | --- | --- | --- |    
      | pg-etcd-01 | 192.0.2.4  | AD1 |
      | pg-etcd-02 | 192.0.2.5  | AD2 |
      | pg-etcd-03 | 192.0.2.6  | AD3 |
      

      Note: Modify IP’s as per the requirement.

      pg-etcd-01

      vi /etc/etcd/etcd.conf
      
      ###Node1
      ##192.0.2.4
      
      ETCD_NAME="pg-etcd-01"
      ETCD_INITIAL_CLUSTER="pg-etcd-01=http://192.0.2.4:2380"
      ETCD_LISTEN_CLIENT_URLS="http://192.0.2.4:2379,http://127.0.0.1:2379"
      ETCD_ADVERTISE_CLIENT_URLS="http://192.0.2.4:2379"
      ETCD_LISTEN_PEER_URLS="http://192.0.2.4:2380,http://127.0.0.1:7001"
      ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.0.2.4:2380"
      ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
      ETCD_INITIAL_CLUSTER_STATE="new"
      ETCD_DATA_DIR="/var/lib/etcd"
      ETCD_ELECTION_TIMEOUT="5000"
      ETCD_HEARTBEAT_INTERVAL="1000"
      ETCD_ENABLE_V2="true"
      
      

      pg-etcd-02

      vi /etc/etcd/etcd.conf
      
      ##Node2
      ##192.0.2.5
      ETCD_NAME=" pg-etcd-02"
      ETCD_INITIAL_CLUSTER="pg-etcd-01=http://192.0.2.4:2380,pg-etcd-02=http://192.0.2.5:2380"
      ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
      ETCD_INITIAL_CLUSTER_STATE="existing"
      ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.0.2.5:2380"
      ETCD_LISTEN_PEER_URLS="http://192.0.2.5:2380,http://127.0.0.1:7001"
      ETCD_LISTEN_CLIENT_URLS="http://192.0.2.5:2379,http://127.0.0.1:2379"
      ETCD_ADVERTISE_CLIENT_URLS="http://192.0.2.5:2379"
      ETCD_DATA_DIR="/var/lib/etcd"
      ETCD_ELECTION_TIMEOUT="5000"
      ETCD_HEARTBEAT_INTERVAL="1000"
      ETCD_ENABLE_V2="true"
      

      pg-etcd-03

      vi /etc/etcd/etcd.conf
      
      ##Node3
      ##192.0.2.6
      ETCD_NAME="pg-etcd-03"
      ETCD_INITIAL_CLUSTER="pg-etcd-01=http://192.0.2.4:2380,pg-etcd-02=http://192.0.2.5:2380,pg-etcd-03=http://192.0.2.6:2380"
      ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
      ETCD_INITIAL_CLUSTER_STATE="existing"
      ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.0.2.6:2380"
      ETCD_LISTEN_PEER_URLS="http://192.0.2.6:2380,http://127.0.0.1:7001"
      ETCD_LISTEN_CLIENT_URLS="http://192.0.2.6:2379,http://127.0.0.1:2379"
      ETCD_ADVERTISE_CLIENT_URLS="http://192.0.2.6:2379"
      ETCD_DATA_DIR="/var/lib/etcd"
      ETCD_ELECTION_TIMEOUT="5000"
      ETCD_HEARTBEAT_INTERVAL="1000"
      ETCD_ENABLE_V2="true"
      
    • Add members to pg-etcd-01.

      etcdctl member add pg-etcd-02 --peer-urls=http://192.0.2.5:2380
      etcdctl member add pg-etcd-03 --peer-urls=http://192.0.2.6:2380
      
    • Run the following command to display the member list.

      etcdctl member list
      
    • Create a service.

      vi /etc/systemd/system/etcd.service
      
      [Unit]
      Description=Etcd Server
      After=network.target
      After=network-online.target
      Wants=network-online.target
      
      [Service]
      Type=notify
      WorkingDirectory=/var/lib/etcd
      EnvironmentFile=-/etc/etcd/etcd.conf
      User=etcd
      # set GOMAXPROCS to number of processors
      ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/local/bin/etcd"
      Restart=on-failure
      LimitNOFILE=65536
      IOSchedulingClass=best-effort
      IOSchedulingPriority=0
      
      [Install]
      WantedBy=multi-user.target
      
      systemctl daemon-reload
      
      systemctl enable etcd
      
  2. Install Postgres on all nodes. Run the following script to install Postgres:

    • Postgres Installation Script

      Note: For Replica, stop postgresql and delete data directory as it will be copied from the leader once the Patroni configuration is completed.

      /opt/pgsql/bin/pg_ctl -D /opt/pgsql/data/$MAJORVERSION stop
      cd /opt/pgsql/data/
      rm *
      
  3. Install extensions: pg_squeeze and pgaudit.

    • Install PGAUDIT

      cd /usr/local/src/postgresql-12.6/contrib/
      wget https://github.com/pgaudit/pgaudit/archive/refs/heads/REL_12_STABLE.zip
      unzip REL_12_STABLE.zip
      make install USE_PGXS=1 PG_CONFIG=/opt/pgsql/bin/pg_config
      
    • Install PG_SQUEEZE

      cd /usr/local/src/postgresql-12.6/contrib/
      wget https://github.com/cybertec-postgresql/pg_squeeze/archive/refs/heads/master.zip
      unzip master.zip
      make install USE_PGXS=1 PG_CONFIG=/opt/pgsql/bin/pg_config
      
  4. Install Patroni on all nodes.

    yum update –y
    yum -y install epel-release
    yum -y install python3
    yum install -y python3-devel
    yum install -y psutils
    yum install -y gcc
    yum install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
    yum install python3-psycopg2
    pip3 install python-etcd
    pip3 install patroni
    
  5. Install and configure pgBackrest on all the nodes.

    • Install Pgbackrest

      yum install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
      sudo yum install pgbackrest -y
      
    • Configure Pgbackrest on all nodes

      /etc/pgbackrest.conf
      [global]
      repo1-type=s3
      repo1-path=/backup
      repo1-s3-uri-style=path
      repo1-s3-region=us-ashburn-1
      repo1-s3-endpoint=https://xxxxx0011.compat.objectstorage.us-ashburn-1.oraclecloud.com
      repo1-s3-key=0000000a90d57eddexxxxxa3dc7c4f700000ed2e6
      repo1-s3-key-secret=1UhXj69xxxx1e6OyF+00000cccccyyyuuu=
      repo1-s3-bucket=pg_backup
      repo1-retention-full=10
      log-level-console=info
      log-level-file=debug
      log-path=/var/log/pgbackrest/
      
      [stanza-name]
      pg1-path=/opt/pgsql/data
      pg1-user=Postgres
      
    • Create a bootstrap file on all nodes

      vi /etc/patroni/boot_pgbackrest.sh
      
      #!/ usr / bin /env bash
      while getopts ": -:" optchar ; do
              [[ "${ optchar }" == "-" ]] || continue
              case "${ OPTARG }" in
              datadir =* )
                      DATA_DIR =${ OPTARG #*=}
                      ;;
              scope =* )
                      SCOPE =${ OPTARG #*=}
                      ;;
              esac
      done
      /bin/pgbackrest --stanza=$SCOPE --link-all restore
      
  6. Configure Patroni on all nodes.

    Hostname IP Address Availability Domain
    pg-db-01 192.0.2.1 AD1
    pg-db-02 192.0.2.2 AD2
    pg-db-03 192.0.2.3 AD3

    Note: Modify IP’s as per the requirement.

    mkdir -p /etc/patroni
    mkdir -p /opt/pgsql/patroni
    chown postgres:postgres -R /opt/pgsql/patroni
    chmod 700 /opt/pgsql/patroni
    
    vi /etc/patroni/patroni.yml
    
    scope:pg-ha-cluster
    name:pg-db-01
    namespace:/opt/pgsql/patroni/
    ###
    restapi:
      listen: "192.0.2.1:8008"
      connect_address: "192.0.2.1:8008"
    ###ETCD Configuration
     etcd:
       hosts: "192.0.2.4:2379,192.0.2.5:2379, 192.0.2.6:2379"
     ###Bootstrap
     bootstrap:
       dcs:
         ttl: 120
         loop_wait: 10
         retry_timeout: 10
         maximum_lag_on_failover: 1048576
         postgresql:
           use_pg_rewind: true
           use_slots: true
         parameters:
           archive_command: "pgbackrest --stanza=<stanza-name> archive-push %p"
           archive_mode: on
           archive_timeout: 900s
           log_file_mode: "0640"
           log_filename: postgresql-%u.log
           log_rotation_age: 1d
           log_truncate_on_rotation: "on"
           logging_collector: "on"
           max_connections: 2000
           max_replication_slots: 10
           max_wal_senders: 10
           max_wal_size: 5GB
           max_worker_processes: 40
           min_wal_size: 1GB
           wal_level: "replica"
           password_encryption: scram-sha-256
           superuser_reserved_connections: 200
           create_replica_methods:
             - pgbackrest
           pgbackrest:
             command: "/usr/bin/pgbackrest --stanza=<stanza-name>  restore --delta --link-all"
             keep_data: True
             no_params: True
         recovery_conf:
           recovery_target_timeline: latest
           restore_command: '/usr/bin/pgbackrest --stanza=<stanza-name>  archive-get %f %P'
        method: pgbackrest
        pgbackrest:
          command: /etc/patroni/boot_pgbackrest.sh
          keep_existing_recovery_conf: False
          recovery_conf:
            recovery_target_timeline: latest
            restore_command: '/usr/bin/pgbackrest --stanza=<stanza-name>  archive-get %f %P'
       initdb:
         - encoding: UTF8
         - data-checksums
       pg_hba:
         - host replication replicator 127.0.0.1/32 md5
         - host replication replicator 192.0.2.1/32 md5
         - host replication replicator 192.0.2.2/32 md5
         - host replication replicator 192.0.2.3/32 md5
         - host all all x.x.x.0/0 md5
       users:
         admin:
           password: admin
           options:
             - createrole
             - createdb
     #####Local Postgresql Parameters
     postgresql:
     listen: "192.0.2.1:5432"
     connect_address: "192.0.2.1:5432"
     data_dir: /opt/pgsql/data
     pgpass: /opt/pgsql/patroni/pgpass
     authentication:
       replication:
         username: replicator
         password: password
       superuser:
         username: postgres
         password: password
     #    rewind:
     #      username: replicator
     #      password: password
     parameters:
       unix_socket_directories: "/var/run/postgresql, /tmp"
     ###Any Tags
     tags:
       nofailover: false
       noloadbalance: false
       clonefrom: false
       nosync: false
    

    Note: Change the IP for each node respectively.

    chmod 640 /etc/patroni/patroni.yml
    chown postgres:postgres -R /etc/patroni/patroni.yml
    
    • Create Patroni Service on all nodes
    vi /etc/systemd/system/patroni.service
    
    [Unit]
    Description=Runners to orchestrate a high-availability PostgreSQL - patroni
    After=syslog.target network.target
    
    [Service]
    Type=simple
    
    User=postgres
    Group=postgres
    
    # Read in configuration file if it exists, otherwise proceed
    EnvironmentFile=-/etc/patroni_env.conf
    
    WorkingDirectory=~
    
    # Where to send early-startup messages from the server
    # This is normally controlled by the global default set by systemd
    # StandardOutput=syslog
    
    # Pre-commands to start watchdog device
    # Uncomment if watchdog is part of your patroni setup
    #ExecStartPre=-/usr/bin/sudo /sbin/modprobe softdog
    #ExecStartPre=-/usr/bin/sudo /bin/chown postgres /dev/watchdog
    
    # Start the patroni process
    ExecStart=/usr/local/bin/patroni /etc/patroni/patroni.yml
    
    # Send HUP to reload from patroni.yml
    ExecReload=/bin/kill -s HUP $MAINPID
    
    # Only kill the patroni process, not its children, so it will gracefully stop postgres
    KillMode=process
    
    # Give a reasonable amount of time for the server to start up/shut down
    TimeoutSec=60
    
    # Do not restart the service if it crashes, we want to manually inspect database on failure
    Restart=no
    
    [Install]
    WantedBy=multi-user.target
    
    sudo systemctl daemon-reload
    sudo systemctl enable patroni
    sudo systemctl start patroni
    sudo systemctl status patroni
    

    Note: On Replica nodes, configure Patroni, however don’t start the services.

    • Run the following command to see Patroni cluster status on Master.

      patronictl -c /etc/patroni/patroni.yml list
      
      [root@pg-db-01 ~]# /usr/local/bin/patronictl -c /etc/patroni/patroni.yml list
      +--------------+-------------+---------+---------+----+-----------+
      | Member       | Host        | Role    | State   | TL | Lag in MB |
      + Cluster: pg-ha-cluster (7089354141421597068) --+----+-----------+
      | pg-db-01 | 192.0.2.1 | Leader  | running | 1 |           |
      +--------------+-------------+---------+---------+----+-----------+
      
  7. Create Stanza and take full backup.

    pgbackrest --stanza=<stanza-name> stanza-create
    pgbackrest  --type=full --stanza= pgha --process-max=10 backup
    pgbackrest info
    
  8. Start Patroni on Replica nodes and it should start catching up automatically from the Leader.

    sudo systemctl start patroni
    sudo systemctl status patroni
    
    • Run the following command to see cluster status:
    patronictl -c /etc/patroni/patroni.yml list
    

    Status should be as below:

    [root@pg-db-01 ~]# /usr/local/bin/patronictl -c /etc/patroni/patroni.yml list
    +--------------+-------------+---------+---------+----+-----------+
    | Member       | Host        | Role    | State   | TL | Lag in MB |
    + Cluster: pg-ha-cluster (7089354141421597068) --+----+-----------+
    | pg-db-01 | 192.0.2.1 | Leader  | running | 2 |           |
    | pg-db-02 | 192.0.2.2  | Replica | running |2 |         0 |
    | pg-db-03 | 192.0.2.3  | Replica | running | 2 |         0 |
    +--------------+-------------+---------+---------+----+-----------+
    
  9. Finally setup the cron jobs for log cleanup, backups, and so on.

    crontab –e
    #Pgbackrest FULL Backup on Sunday @1AM
    00 01 * * 0 postgres pgbackrest  --type=full --stanza=ash1-adeprod1-pgcluster --process-max=10 backup  &> /dev/null
    #Pgbackrest INC Backup on Mon-Sat @1AM
    00 01 * * 1-6 postgres pgbackrest  --type=diff --stanza=ash1-adeprod1-pgcluster --process-max=10 backup &> /dev/null
    #Delete PostgreSQL logs
    08 * * * find /opt/pgsql/log/* -mtime +15 -exec rm {} \; &>/dev/null
    

Additional Patroni Commands

Acknowledgments

More Learning Resources

Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.

For product documentation, visit Oracle Help Center.