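Deploy Automatic Virtual IP Failover

Before you integrate with Private Cloud Appliance and test secondary IP failover, you must install Corosync and Pacemaker on the cluster.

Install on the Cluster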

  1. Install the Corosync and Pacemaker packages.
    yum install corosync pacemaker pcs
    yum install python36-oci-cli
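    Set up a proxy to access the yum repositories, as required by your lab environment.
  2. Back up the IPaddr2 resource agent so that you can fall back to the original configuration.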
    sudo cp /usr/lib/ocf/resource.d/heartbeat/IPaddr2 /usr/lib/ocf/resource.d/heartbeat/IPaddr2.bck
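  3. Run firewalld and define its rules according to the customer's security requirements for the cluster.
  4. Update the /etc/hosts file with the IP address and hostname of every cluster node to set up local name resolution.
    Example: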
    <Instance-Node1 Private IP> node1-name
    <Instance-Node2 Private IP> node2-name
  5. Set a password for the hacluster cluster user.
    sudo passwd hacluster
  6. Start and enable the cluster services.
    sudo systemctl start pcsd
    sudo systemctl enable pacemaker
    sudo systemctl enable corosync
    sudo systemctl enable pcsd
  7. Authenticate the cluster nodes.
    sudo pcs cluster auth <node1-name> <node2-name>
    <node1-name> : Cluster Node1/Hostname
    <node2-name> : Cluster Node2/Hostname
    Example:
    sudo pcs cluster auth a-node1 a-node2
    Username: hacluster
    Password: 
    a-node1: Authorized
    a-node2: Authorized
  8. Set up the cluster.
    sudo pcs cluster setup --name <clustername> <node1-name> <node2-name>
    <clustername> : Cluster Name
    <node1-name> : Cluster Node1/Hostname
    <node2-name> : Cluster Node2/Hostname
    Example:
    sudo pcs cluster auth a-node1 a-node2
    Username: hacluster
    Password: 
    a-node1: Authorized
    a-node2: Authorized
    [root@a-node1 opc] # sudo pcs cluster setup HACluster a-node1 a-node2
    Error: A cluster name (--name <name>) is required to setup a cluster
    [root@a-node1 opc] # sudo pcs cluster setup --name HACluster a-node1 a-node2
    Destroying cluster on nodes: a-node1, a-node2...
    a-node1: Stopping Cluster (pacemaker)...
    a-node2: Stopping Cluster (pacemaker)...
    a-node2: Successfully destroyed cluster
    a-node1: Successfully destroyed cluster
    Sending 'pacemaker_remote authkey' to 'a-node1', 'a-node2'
    a-node1: successful distribution of the file 'pacemaker_remote authkey'
    a-node2: successful distribution of the file 'pacemaker_remote authkey'
    Sending cluster config files to the nodes...
    a-node1: Succeeded
    a-node2: Succeeded
    Synchronizing pcsd certificates on nodes a-node1, a-node2...
    a-node1: Success
    a-node2: Success
    Restarting pcsd on the nodes in order to reload the certificates...
    a-node1: Success
    a-node2: Success
  9. From any one cluster node, start the cluster on all cluster nodes.
    sudo pcs cluster start --name <clustername> --all
    Example:
    sudo pcs cluster start --name HACluster --all
    a-node1: Starting Cluster (corosync)...
    a-node2: Starting Cluster (corosync)...
    a-node2: Starting Cluster (pacemaker)...
    a-node1: Starting Cluster (pacemaker)...
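  10. Set the Pacemaker cluster properties. These settings disable STONITH fencing and keep resources running when quorum is lost.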
    sudo pcs property set stonith-enabled=false
    sudo pcs property set no-quorum-policy=ignore
  11. Verify the status of the running cluster.
    sudo pcs cluster status
    Example:
    sudo pcs cluster status
    Cluster Status:
    Stack: corosync
    Current DC: a-node2 (version 1.1.23-1.0.1.el7_9.1-9acf116022) - partition with quorum
    Last updated: Fri Aug 19 03:07:25 2022
    Last change: Fri Aug 19 03:06:13 2022 by root via cibadmin on a-node1
    2 nodes configured
    0 resource instances configured
    PCSD Status:
    a-node1: Online
    a-node2: Online
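  12. Set up the OCI configuration on all cluster nodes to match your Private Cloud Appliance setup. The configuration profile is required to connect to Private Cloud Appliance.
    Example: /root/.oci/config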
    [DEFAULT]
    user=<User-ocid1>
    fingerprint=<fingerprint>
    key_file=<Key-Location>
    tenancy=<Tenancy ocid1>
    region=<PCA FQDN>
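
Before moving on, it can help to confirm that the OCI configuration on each node defines every key shown in step 12. The following is a minimal sketch, not part of the OCI CLI: the function name check_oci_config is hypothetical, and it only checks that the keys are present, not that their values are valid.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check that an OCI CLI config file defines every key
# used in the example profile above. Prints each missing key and returns
# non-zero if any key is absent (or if the file cannot be read).
check_oci_config() {
    local config_file="$1" key missing=0
    for key in user fingerprint key_file tenancy region; do
        # Match "key=value" with optional surrounding whitespace.
        if ! grep -Eqs "^[[:space:]]*${key}[[:space:]]*=" "$config_file"; then
            echo "missing: ${key}"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, run `check_oci_config /root/.oci/config` on each cluster node. No output and a zero exit status mean that all five keys are present.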

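Define Heartbeat Settings on the Cluster and Integrate with Private Cloud Appliance X9-2 Instances for VIP Failover

  1. For Private Cloud Appliance, run:
    sudo sed -i '633i\export OCI_CLI_CERT_BUNDLE=/root/.oci/ca-chain.cert.pem\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    Alternatively, you can pass the Private Cloud Appliance certificate to the assign-private-ip command with the --cert-bundle <Certification Location for Private Cloud Appliance> option.
    The following example is for Oracle Linux 7.9. For Oracle Linux 8 and CentOS, find the line number of the add_interface () function in IPaddr2 and adjust the line numbers accordingly when you update the Linux HA IPaddr2 resource entries.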
    sudo sed -i '628i\server="`hostname -s`"\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '629i\node1vnic="ocid1.vnic.pca.NODE1-vNIC-OCID"\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '630i\node2vnic="ocid1.vnic.pca.NODE2-vNIC-OCID"\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '631i\vnicip="10.212.15.13"\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '632i\export LC_ALL=C.UTF-8\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '633i\export LANG=C.UTF-8\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '633i\export OCI_CLI_CERT_BUNDLE=/root/.oci/ca-chain.cert.pem\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '634i\touch /tmp/error.log\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '635i\##### OCI/IPaddr Integration\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '636i\ if [ $server = "node1" ]; then\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '637i\ oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $node1vnic --ip-address $vnicip >/tmp/error.log 2>&1\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '638i\ else \' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '639i\ oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $node2vnic --ip-address $vnicip >/tmp/error.log 2>&1\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    sudo sed -i '640i\ fi \' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
    
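  2. Adapt the code inserted into IPaddr2 to your environment.
    • Replace ocid1.vnic.pca.NODE1-vNIC-OCID and ocid1.vnic.pca.NODE2-vNIC-OCID with your own OCI VNIC (virtual network interface card) OCIDs.
    • Replace the node1 and node2 hostname entries with your own cluster node hostnames.
    • In OCI_CLI_CERT_BUNDLE, define the location of the certificate bundle for Private Cloud Appliance.
    • For the VNIC IP, define the address according to your configuration, subnet, and VCN, and make sure that it is a unique IP that is not assigned to any other VNIC.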
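
Because the sed line numbers depend on where add_interface () sits in your copy of IPaddr2, it is worth locating that function before you run the commands on a different operating system. The following is a minimal sketch: the function name find_add_interface_line is hypothetical, and it assumes the definition starts a line in the form `add_interface () {`, which you should confirm against your own file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the line number of the first line that defines
# add_interface() in a resource agent script. Prints nothing if no
# definition is found.
find_add_interface_line() {
    local agent_file="$1"
    # grep -n prefixes each match with "<line>:"; keep the first match only
    # and strip everything after the line number.
    grep -n -E '^[[:space:]]*add_interface[[:space:]]*\(\)' "$agent_file" \
        | head -n 1 \
        | cut -d: -f1
}
```

For example, `find_add_interface_line /usr/lib/ocf/resource.d/heartbeat/IPaddr2` reports where the function begins, which you can compare with the Oracle Linux 7.9 line numbers used above before you adjust the sed commands.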

Set Up Cluster Resources

To set up the cluster resource, run:

pcs resource create <Cluster-Resource-Name> ocf:heartbeat:IPaddr2 ip=10.212.15.13 cidr_netmask=24 op monitor interval=20

Note:

  • cidr_netmask=24 in the Pacemaker command assumes a /24 subnet. Adjust it to match the size of your subnet.
  • ip=10.212.15.13 is the secondary private IP.
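
If you know your subnet only by its dotted-decimal netmask, you need the equivalent prefix length for cidr_netmask. The following is a minimal sketch of that conversion. The function name mask_to_prefix is hypothetical and not part of pcs, and it sums the bits of each octet without checking that the mask is contiguous.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a dotted-decimal netmask such as
# 255.255.255.0 to the prefix length expected by cidr_netmask.
# Assumption: the mask is a valid contiguous netmask; this is not verified.
mask_to_prefix() {
    local mask="$1" prefix=0 octet old_ifs="$IFS"
    IFS=.
    set -- $mask                # split the mask into its octets
    IFS="$old_ifs"
    for octet in "$@"; do
        case "$octet" in
            255) prefix=$((prefix + 8)) ;;
            254) prefix=$((prefix + 7)) ;;
            252) prefix=$((prefix + 6)) ;;
            248) prefix=$((prefix + 5)) ;;
            240) prefix=$((prefix + 4)) ;;
            224) prefix=$((prefix + 3)) ;;
            192) prefix=$((prefix + 2)) ;;
            128) prefix=$((prefix + 1)) ;;
            0)   ;;
            *)   echo "invalid netmask octet: $octet" >&2; return 1 ;;
        esac
    done
    echo "$prefix"
}
```

For example, `mask_to_prefix 255.255.255.0` prints 24, which matches the cidr_netmask=24 value used in the command above.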

Example:

pcs status
Cluster name: HACluster
Stack: corosync
Current DC: a-node2 (version 1.1.23-1.0.1.el7_9.1-9acf116022) - partition with quorum
Last updated: Fri Aug 19 03:34:51 2022
Last change: Fri Aug 19 03:34:44 2022 by root via cibadmin on a-node1
2 nodes configured
1 resource instance configured
Online: [ a-node1 a-node2 ]
Full list of resources:
HAFailover (ocf::heartbeat:IPaddr2): Started a-node1
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled

Test Failover of the Secondary IP

To test failover of the secondary IP, run:

sudo pcs resource move <Cluster-Resource-Name> <node2-name>
<Cluster-Resource-Name> : Cluster resource name
<node2-name> : Cluster Node2/Hostname

For example:

sudo pcs resource move HAFailover a-node2

Verify Successful Failover

The resource should now be started on node2. To verify that the failover succeeded, run:

# pcs status
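
To check the result in a script rather than by eye, you can extract the node name from the resource line of the status output. The following is a minimal sketch: the function name resource_node is hypothetical, and it assumes the resource line has the form shown in the earlier example (`HAFailover (ocf::heartbeat:IPaddr2): Started a-node1`). Other pcs versions may format this line differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `pcs status` output on stdin and print the node
# on which the named resource is started. Returns non-zero if no
# "<resource> ... Started <node>" line is found.
resource_node() {
    awk -v res="$1" '
        $1 == res && $(NF - 1) == "Started" { print $NF; found = 1 }
        END { exit !found }
    '
}
```

For example, after the move in the previous section, `sudo pcs status | resource_node HAFailover` should print a-node2 if the failover succeeded.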