Note:
- This tutorial requires access to Oracle Cloud. To sign up for a free account, see Get started with Oracle Cloud Infrastructure Free Tier.
- It uses example values for Oracle Cloud Infrastructure credentials, tenancy, and compartments. When completing your lab, substitute these values with ones specific to your cloud environment.
Set up a Linux Virtual IP Failover on Oracle Cloud Infrastructure managed by Pacemaker
Introduction
In many environments, it is still essential to run infrastructure on an active/passive Linux cluster, which requires the use of a floating IP. In the cloud, that floating (secondary) IP address must be managed not only by the operating system but also by the cloud infrastructure itself.
In this tutorial, we will see how the floating IP of a Linux cluster can be managed as an integrated resource by Pacemaker, in a simple manner and without custom code. For more information, see Task 3: Set up the Samba Cluster and Automatic Virtual IP Failover on Oracle Cloud Infrastructure.
Architecture Design
Objectives
- Deploy a reliable active/passive Ubuntu Linux cluster in high availability (HA) with the Oracle Cloud Infrastructure (OCI) floating IP directly managed by Pacemaker.
Prerequisites
- Access to an OCI tenancy.
- Two OCI Compute instances with a Linux image installed (Ubuntu).
- OCI Command Line Interface (CLI) installed. For more information, see Installing the CLI.
- jq installed.
- A secondary private IP address configured on the VNIC of node1. For more information, see Assigning a New Secondary Private IP to a VNIC.
- An OCI dynamic group with a policy attached. For more information, see Managing Dynamic Groups. The policy must include the following statement:

  allow dynamic-group <GROUP_NAME> to use virtual-network-family in compartment id <COMPARTMENT_ID>

- The additional OCIVIP resource agent for Pacemaker. For more information, see the ocivip resource agent on GitHub.
Task 1: Set up the Environment
1. Launch two compute instances and select Ubuntu 22 as the operating system for each instance.

2. Assign a secondary private IP address to the Virtual Network Interface Card (VNIC) of node1. For more information, see Assigning a New Secondary Private IP to a VNIC. This will be the floating IP. For example, 10.10.1.115.

3. Create a dynamic group.

   a. Log in to the OCI Console, navigate to Identity & Security, Dynamic Groups and click Create Dynamic Group.

   b. Enter the following information.

      - Name: Enter OCIVIP.
      - Add the following rule to include instances in the specified compartment.

        All {instance.compartment.id = 'Your compartment OCI ID'}
4. Add a policy to the dynamic group.

   a. Navigate to Identity & Security, Policies and click Create Policy.

   b. Enter the following information.

      - Name: Enter OCIVIP_policy.
      - Add the following statement to allow the dynamic group to use the virtual network family:

        allow dynamic-group OracleIdentityCloudService/OCIVIP to use virtual-network-family in compartment id 'Your compartment OCI ID'
Task 2: Configure the Cluster and the Floating IP
After the environment is set up, we can proceed with configuring Pacemaker and integrating the OCIVIP resource agent. Connect to the instances using SSH and perform the cluster installation operations on both nodes up to and including step 10.

1. Update the operating system.

sudo apt update
sudo apt upgrade
2. Install the OCI CLI and verify its functionality.

bash -c "$(curl -L https://raw.githubusercontent.com/oracle/oci-cli/master/scripts/install/install.sh)"

   Set up the OCI CLI.

oci setup config

   Verify the OCI CLI installation.

oci os ns get
3. For a test environment, you can remove the reject rule at line 6 in the INPUT chain of iptables, and then make the change persistent to allow communication between the instances. Remember to configure iptables securely and appropriately in production environments.

sudo iptables -D INPUT 6
sudo su
sudo iptables-save > /etc/iptables/rules.v4
sudo ip6tables-save > /etc/iptables/rules.v6
4. Update the /etc/hosts file with the private IP addresses assigned to your two instances: node1 and node2.

   Run the following command to edit the file.

sudo nano /etc/hosts

   Add your node names and IP addresses.

10.10.1.111 node1
10.10.1.118 node2
5. Install the cluster-related packages, including jq.

sudo apt install -y pacemaker corosync pcs jq
6. Back up the corosync.conf file.

sudo cp /etc/corosync/corosync.conf /etc/corosync/corosync.conf.bk

   Edit the corosync.conf file.

sudo nano /etc/corosync/corosync.conf

   Copy the following content into the corosync.conf file.

# Please read the corosync.conf.5 manual page
system {
    # This is required to use transport=knet in an unprivileged
    # environment, such as a container. See man page for details.
    allow_knet_handle_fallback: yes
}

totem {
    version: 2
    # Corosync itself works without a cluster name, but DLM needs one.
    # The cluster name is also written into the VG metadata of newly
    # created shared LVM volume groups, if lvmlockd uses DLM locking.
    cluster_name: ha_cluster
    transport: udpu
    secauth: off

    # crypto_cipher and crypto_hash: Used for mutual node authentication.
    # If you choose to enable this, then do remember to create a shared
    # secret with "corosync-keygen".
    # enabling crypto_cipher, requires also enabling of crypto_hash.
    # crypto works only with knet transport
    crypto_cipher: none
    crypto_hash: none
}

logging {
    # Log the source file and line where messages are being
    # generated. When in doubt, leave off. Potentially useful for
    # debugging.
    fileline: off
    # Log to standard error. When in doubt, set to yes. Useful when
    # running in the foreground (when invoking "corosync -f")
    to_stderr: yes
    # Log to a log file. When set to "no", the "logfile" option
    # must not be set.
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    # Log to the system log daemon. When in doubt, set to yes.
    to_syslog: yes
    # Log debug messages (very verbose). When in doubt, leave off.
    debug: off
    # Log messages with time stamps. When in doubt, set to hires (or on)
    #timestamp: hires
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}

quorum {
    # Enable and configure quorum subsystem (default: off)
    # see also corosync.conf.5 and votequorum.5
    provider: corosync_votequorum
    two_node: 1
    wait_for_all: 1
    last_man_standing: 1
    auto_tie_breaker: 0
}

nodelist {
    # Change/uncomment/add node sections to match cluster configuration
    node {
        # Hostname of the node.
        # name: node1
        # Cluster membership node identifier
        nodeid: 101
        # Address of first link
        ring0_addr: node1
        # When knet transport is used it's possible to define up to 8 links
        #ring1_addr: 192.168.1.1
    }
    # ...
    node {
        ring0_addr: node2
        nodeid: 102
    }
}
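A two-node cluster cannot hold a conventional majority when one node is lost, which is why the quorum settings above deserve a closer look. The fragment below repeats that stanza with explanatory comments added (values as in the file above):

```
quorum {
    provider: corosync_votequorum
    two_node: 1           # a single surviving node keeps quorum
    wait_for_all: 1       # on a cold start, wait until both nodes have been seen
    last_man_standing: 1  # recalculate expected votes as nodes leave
    auto_tie_breaker: 0   # tie-breaking is unnecessary when two_node is set
}
```

Without two_node, losing either node would drop the cluster below the 2-vote majority and stop all resources; wait_for_all prevents a freshly booted node from claiming quorum alone before it has ever seen its peer.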
7. Add the resource that Pacemaker will use to manage the OCI floating IP natively into the /usr/lib/ocf/resource.d/heartbeat/ directory. Download the content of the file from here: ocivip.txt.

   Note: This resource is not developed by Oracle, but by third-party developers.

   This is the content of the ocivip file.

#!/bin/sh
#
#
# Manage Secondary Private IP in Oracle Cloud Infrastructure with Pacemaker
#
#
# Copyright 2016-2018 Lorenzo Garuti <garuti.lorenzo@gmail.com>
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#
#
# Prerequisites:
#
# - OCI CLI installed (https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/climanualinst.htm)
# - jq installed
# - dynamic group with a policy attached
# - the policy must have this statement:
#   allow dynamic-group <GROUP_NAME> to use virtual-network-family in compartment id <COMPARTMENT_ID>
# - a reserved secondary private IP address for Compute Instances high availability
#
#######################################################################
# Initialization:

: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

#######################################################################

#
# Defaults
#
OCF_RESKEY_ocicli_default="/usr/local/bin/oci"
OCF_RESKEY_api_delay_default="3"
OCF_RESKEY_cidr_netmask_default="24"
OCF_RESKEY_interface_alias_default="0"
export OCI_CLI_AUTH=instance_principal

: ${OCF_RESKEY_ocicli=${OCF_RESKEY_ocicli_default}}
: ${OCF_RESKEY_api_delay=${OCF_RESKEY_api_delay_default}}
: ${OCF_RESKEY_cidr_netmask=${OCF_RESKEY_cidr_netmask_default}}
: ${OCF_RESKEY_interface_alias=${OCF_RESKEY_interface_alias_default}}

meta_data() {
    cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="ocivip">
<version>1.0</version>

<longdesc lang="en">
Resource Agent for OCI Compute instance Secondary Private IP Addresses.

It manages OCI Secondary Private IP Addresses for Compute instances
with oci cli.

See https://docs.oracle.com/en-us/iaas/Content/API/Concepts/cliconcepts.htm for more information about oci cli.

Prerequisites:

- OCI CLI installed (https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/climanualinst.htm)
- jq installed
- dynamic group with a policy attached
- the policy must have this statement:
  allow dynamic-group GROUP_NAME to use virtual-network-family in compartment id COMPARTMENT_ID
- a reserved secondary private ip address for Compute Instances high availability
</longdesc>
<shortdesc lang="en">OCI Secondary Private IP Address for Compute instances Resource Agent</shortdesc>

<parameters>

<parameter name="ocicli" unique="0">
<longdesc lang="en">
OCI Command line interface (CLI) tools
</longdesc>
<shortdesc lang="en">OCI cli tools</shortdesc>
<content type="string" default="${OCF_RESKEY_ocicli_default}" />
</parameter>

<parameter name="secondary_private_ip" unique="1" required="1">
<longdesc lang="en">
reserved secondary private ip for compute instance
</longdesc>
<shortdesc lang="en">reserved secondary private ip for compute instance</shortdesc>
<content type="string" default="" />
</parameter>

<parameter name="cidr_netmask" unique="0">
<longdesc lang="en">
netmask for the secondary_private_ip
</longdesc>
<shortdesc lang="en">netmask for the secondary_private_ip</shortdesc>
<content type="integer" default="${OCF_RESKEY_cidr_netmask_default}" />
</parameter>

<parameter name="interface_alias" unique="0">
<longdesc lang="en">
numeric alias for the interface
</longdesc>
<shortdesc lang="en">numeric alias for the interface</shortdesc>
<content type="integer" default="${OCF_RESKEY_interface_alias_default}" />
</parameter>

<parameter name="api_delay" unique="0">
<longdesc lang="en">
a short delay between API calls, to avoid sending API too quick
</longdesc>
<shortdesc lang="en">a short delay between API calls</shortdesc>
<content type="integer" default="${OCF_RESKEY_api_delay_default}" />
</parameter>

</parameters>

<actions>
<action name="start"        timeout="30s" />
<action name="stop"         timeout="30s" />
<action name="monitor"      timeout="30s" interval="20s" depth="0" />
<action name="migrate_to"   timeout="30s" />
<action name="migrate_from" timeout="30s" />
<action name="meta-data"    timeout="5s" />
<action name="validate"     timeout="10s" />
<action name="validate-all" timeout="10s" />
</actions>
</resource-agent>
END
}

#######################################################################

ocivip_usage() {
    cat <<END
usage: $0 {start|stop|monitor|migrate_to|migrate_from|validate|validate-all|meta-data}

Expects to have a fully populated OCF RA-compliant environment set.
END
}

ocivip_start() {
    ocivip_monitor && return $OCF_SUCCESS

    $OCICLI network vnic assign-private-ip --vnic-id $VNIC_ID \
        --unassign-if-already-assigned \
        --ip-address ${SECONDARY_PRIVATE_IP}
    RETOCI=$?
    ip addr add ${SECONDARY_PRIVATE_IP}/${CIDR_NETMASK} dev ${PRIMARY_IFACE} label ${PRIMARY_IFACE}:${INTERFACE_ALIAS}
    RETIP=$?

    # delay to avoid sending request too fast
    sleep ${OCF_RESKEY_api_delay}

    if [ $RETOCI -ne 0 ] || [ $RETIP -ne 0 ]; then
        return $OCF_NOT_RUNNING
    fi

    ocf_log info "secondary_private_ip has been successfully brought up (${SECONDARY_PRIVATE_IP})"
    return $OCF_SUCCESS
}

ocivip_stop() {
    ocivip_monitor || return $OCF_SUCCESS

    $OCICLI network vnic unassign-private-ip --vnic-id $VNIC_ID \
        --ip-address ${SECONDARY_PRIVATE_IP}
    RETOCI=$?
    ip addr del ${SECONDARY_PRIVATE_IP}/${CIDR_NETMASK} dev ${PRIMARY_IFACE}:${INTERFACE_ALIAS}
    RETIP=$?

    # delay to avoid sending request too fast
    sleep ${OCF_RESKEY_api_delay}

    if [ $RETOCI -ne 0 ] || [ $RETIP -ne 0 ]; then
        return $OCF_NOT_RUNNING
    fi

    ocf_log info "secondary_private_ip has been successfully brought down (${SECONDARY_PRIVATE_IP})"
    return $OCF_SUCCESS
}

ocivip_monitor() {
    $OCICLI network private-ip list --vnic-id $VNIC_ID | grep -q "${SECONDARY_PRIVATE_IP}"
    RETOCI=$?

    if [ $RETOCI -ne 0 ]; then
        return $OCF_NOT_RUNNING
    fi
    return $OCF_SUCCESS
}

ocivip_validate() {
    check_binary ${OCICLI}
    check_binary jq

    if [ -z "${VNIC_ID}" ]; then
        ocf_exit_reason "vnic_id not found. Is this a Compute instance?"
        return $OCF_ERR_GENERIC
    fi

    return $OCF_SUCCESS
}

case $__OCF_ACTION in
    meta-data)
        meta_data
        exit $OCF_SUCCESS
        ;;
esac

OCICLI="${OCF_RESKEY_ocicli}"
SECONDARY_PRIVATE_IP="${OCF_RESKEY_secondary_private_ip}"
CIDR_NETMASK="${OCF_RESKEY_cidr_netmask}"
INTERFACE_ALIAS="${OCF_RESKEY_interface_alias}"
VNIC_ID="$(curl -s -H "Authorization: Bearer Oracle" -L http://169.254.169.254/opc/v2/vnics/ | jq -r '.[0].vnicId')"
PRIMARY_IFACE=$(ip -4 route ls | grep default | grep -Po '(?<=dev )(\S+)' | head -n1)

case $__OCF_ACTION in
    start)
        ocivip_validate || exit $?
        ocivip_start
        ;;
    stop)
        ocivip_stop
        ;;
    monitor)
        ocivip_monitor
        ;;
    migrate_to)
        ocf_log info "Migrating ${OCF_RESOURCE_INSTANCE} to ${OCF_RESKEY_CRM_meta_migrate_target}."
        ocivip_stop
        ;;
    migrate_from)
        ocf_log info "Migrating ${OCF_RESOURCE_INSTANCE} from ${OCF_RESKEY_CRM_meta_migrate_source}."
        ocivip_start
        ;;
    reload)
        ocf_log info "Reloading ${OCF_RESOURCE_INSTANCE} ..."
        ;;
    validate|validate-all)
        ocivip_validate
        ;;
    usage|help)
        ocivip_usage
        exit $OCF_SUCCESS
        ;;
    *)
        ocivip_usage
        exit $OCF_ERR_UNIMPLEMENTED
        ;;
esac

rc=$?
ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"
exit $rc
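At startup the agent discovers its VNIC OCID by querying the OCI instance metadata service (IMDS v2) with curl and extracting `.[0].vnicId` with jq. The sketch below mirrors that extraction against a fabricated sample response, using only grep and sed so it runs without jq and without access to the metadata endpoint; the OCID value is invented for illustration only.

```shell
# Fabricated IMDS v2 vnics response; the agent fetches the real one with:
#   curl -s -H "Authorization: Bearer Oracle" -L http://169.254.169.254/opc/v2/vnics/
sample='[{"vnicId": "ocid1.vnic.oc1.example", "privateIp": "10.10.1.111"}]'

# Equivalent of jq -r '.[0].vnicId': take the first vnicId field.
vnic_id=$(printf '%s' "$sample" \
  | grep -o '"vnicId": *"[^"]*"' \
  | head -n 1 \
  | sed 's/.*"\([^"]*\)"$/\1/')

echo "$vnic_id"   # prints ocid1.vnic.oc1.example
```

If the instance has no metadata (or the call fails), VNIC_ID ends up empty, which is exactly the condition ocivip_validate checks before letting the resource start.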
8. Edit the ocivip file and change the path of the OCI CLI executable in the OCF_RESKEY_ocicli_default variable to your OCI CLI path.

   If you kept the default path during the OCI CLI installation on Ubuntu, the variable will be /home/ubuntu/bin/oci.

OCF_RESKEY_ocicli_default="/home/ubuntu/bin/oci"

   Create the file and copy the downloaded code from step 7 with the updated variable.

sudo nano /usr/lib/ocf/resource.d/heartbeat/ocivip

   Change the permissions and the owner of the file.

sudo chown root /usr/lib/ocf/resource.d/heartbeat/ocivip
sudo chmod 755 /usr/lib/ocf/resource.d/heartbeat/ocivip
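When editing the default, note how the agent applies it: `:` is a no-op shell command, and `${VAR=default}` assigns the default only if VAR is unset, so any value Pacemaker passes through the resource environment overrides OCF_RESKEY_ocicli_default. A standalone illustration of the idiom (the variable values here are just examples):

```shell
# ':' is a no-op; ${VAR=default} assigns only when VAR is unset.
OCF_RESKEY_ocicli_default="/home/ubuntu/bin/oci"

unset OCF_RESKEY_ocicli
: ${OCF_RESKEY_ocicli=${OCF_RESKEY_ocicli_default}}
echo "$OCF_RESKEY_ocicli"       # prints /home/ubuntu/bin/oci (fell back to default)

OCF_RESKEY_cidr_netmask="25"    # value supplied by the cluster configuration
: ${OCF_RESKEY_cidr_netmask=24}
echo "$OCF_RESKEY_cidr_netmask" # prints 25 (existing value preserved)
```

This is why changing the `_default` variable is safe: it only changes what happens when the resource definition does not set the `ocicli` parameter explicitly.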
9. Enable the services at boot, restart them, and check that they are functioning correctly.

sudo systemctl enable corosync
sudo systemctl enable pacemaker
sudo systemctl enable pcsd
sudo systemctl restart pcsd
sudo systemctl restart corosync
sudo systemctl restart pacemaker
sudo systemctl status pcsd
sudo systemctl status corosync
sudo systemctl status pacemaker
10. Set the password for the user ocicluster.

sudo passwd ocicluster
11. Run the following command to authenticate the nodes.

sudo pcs cluster auth node1 node2 -u ocicluster -p YOUR_PASSWORD
12. Create the cluster.

sudo pcs cluster setup ha_cluster node1 node2
13. Start the cluster and enable it at boot on all nodes.

sudo pcs cluster start --all
sudo pcs cluster enable --all
14. Check that the cluster is active and functioning.

sudo pcs status
15. Add the OCIVIP resource for managing the floating IP.

    Note: Change the virtual IP address to the one assigned as secondary to your VNIC in Task 1, step 2.

sudo pcs resource create OCIVIP ocf:heartbeat:ocivip secondary_private_ip="10.10.1.115" cidr_netmask="24" op monitor timeout="30s" interval="20s" OCF_CHECK_LEVEL="0"
16. Verify that the resource has been added and is functioning correctly.

sudo pcs status
17. Verify that the secondary IP address can migrate between the instances: for example, restart node1 and check in the OCI Console that the address has been assigned to the other instance, and vice versa.

    Before restarting node1, you can also ping the floating address from a third virtual machine and check that it continues to respond after node1 is shut down. A brief interruption of a few pings is normal.
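One simple way to watch the failover from that third machine is a timestamped reachability probe against the floating IP. This is only a sketch: the address is the example value from this tutorial, and the loop is bounded to three probes here so it terminates; in practice you would let it run (for example with `while :`) across the node1 restart.

```shell
# Example floating IP from this tutorial; substitute your own.
FLOATING_IP=10.10.1.115

# Probe roughly once per second; during failover expect a short run of
# DOWN lines before the surviving node answers again.
for i in 1 2 3; do
    if ping -c 1 -W 1 "$FLOATING_IP" >/dev/null 2>&1; then
        state=up
    else
        state=DOWN
    fi
    printf '%s %s\n' "$(date '+%H:%M:%S')" "$state"
done
```

Comparing the timestamps of the last `up` before the restart and the first `up` after it gives a rough measure of the failover window.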
Your active/passive cluster is up and running. You can now add the services that require business continuity.
Acknowledgments
- Author - Marco Santucci (EMEA Enterprise Cloud Solution Architect)
- Contributor - Lorenzo Garuti (Developer of the ocivip resource file)
More Learning Resources
Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.
For product documentation, visit Oracle Help Center.
G12923-01
August 2024