6 Working With Quorum Devices
A quorum device acts as a third-party arbitrator in cases where the standard quorum rules can't adequately handle node failure. A quorum device is typically used where a cluster contains an even number of nodes. For example, in a two-node cluster, if the nodes can't communicate with each other, a split-brain condition can occur in which both nodes function as primary at the same time, which can corrupt data. With a quorum device, quorum arbitration decides which node survives.
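The vote arithmetic behind this arbitration can be sketched in shell. The following is a minimal illustration only, assuming the usual majority rule in which a partition is quorate when it holds more than half of the expected votes, and ignoring any special two-node options. The function names `quorum_threshold` and `is_quorate` are hypothetical and aren't part of any cluster tool.

```shell
#!/bin/sh
# Illustrative only: majority-rule arithmetic, not a cluster command.

# Votes needed for quorum: more than half of the expected votes.
quorum_threshold() {
    expected=$1
    echo $(( expected / 2 + 1 ))
}

# Succeeds (exit 0) if a partition holding $1 votes is quorate
# out of $2 expected votes.
is_quorate() {
    [ "$1" -ge "$(quorum_threshold "$2")" ]
}

# Two nodes, no quorum device: 2 expected votes, so an isolated
# node holding 1 vote isn't quorate.
is_quorate 1 2 && echo "1 of 2: quorate" || echo "1 of 2: not quorate"

# Two nodes plus a quorum device: 3 expected votes. A node that
# also receives the device vote holds 2 votes.
is_quorate 2 3 && echo "2 of 3: quorate" || echo "2 of 3: not quorate"
```

With three expected votes the threshold is two, so exactly one side of a split two-node cluster can reach quorum: the side that the quorum device votes for.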
A quorum device is a network-bound service that runs on a system that isn't a node in the cluster, ideally on a separate physical network from the cluster itself. Although a quorum device can serve multiple clusters at the same time, it should be the only quorum device for each cluster that it serves. Each node in the cluster is configured to use the quorum device.
Installing and Enabling a Quorum Device
Installing the quorum device requires that you install the pcs and corosync-qnetd packages on the system where you intend to run the quorum device service, and then install the corosync-qdevice package on each of the nodes in the existing cluster.
- On the system assigned to run the quorum device service,
run:
sudo dnf install -y pcs corosync-qnetd
- Enable and start the systemd pcsd service by running:
sudo systemctl enable --now pcsd
- If you're running a firewall on the quorum device service host, you must open the firewall
ports to allow the host to communicate with the cluster. For example,
run:
sudo firewall-cmd --permanent --add-service=high-availability
sudo systemctl restart firewalld
- On the quorum device service host, enable and start the quorum device service by setting the configuration for the host to use the net model. Run:
sudo pcs qdevice setup model net --enable --start
This command creates the quorum device configuration on the host, sets the model to net, and enables and starts the service, so that the corosync-qnetd daemon loads and runs at boot. In the examples that follow, this host is named qdev.
- On each of the nodes within the existing cluster, install the corosync-qdevice package by running:
sudo dnf install -y corosync-qdevice
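The installation steps above can be gathered into a small script. This is a sketch rather than a supported installer: the `run` wrapper, the `DRY_RUN` variable, and the two function names are hypothetical helpers introduced here, and the script only prints the commands unless you set `DRY_RUN=0`. It uses the `pcs qdevice` spelling of the subcommand, as in the service control commands later in this chapter.

```shell
#!/bin/sh
# Sketch: print (or run) the quorum device setup steps in order.
# DRY_RUN defaults to 1 so that nothing changes unless you opt in.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps for the host that runs the quorum device service.
setup_qdevice_host() {
    run sudo dnf install -y pcs corosync-qnetd
    run sudo systemctl enable --now pcsd
    # The two firewall steps apply only if a firewall runs on this host.
    run sudo firewall-cmd --permanent --add-service=high-availability
    run sudo systemctl restart firewalld
    run sudo pcs qdevice setup model net --enable --start
}

# Step for each node in the existing cluster.
setup_cluster_node() {
    run sudo dnf install -y corosync-qdevice
}

setup_qdevice_host
```

Run `setup_qdevice_host` on the quorum device service host and `setup_cluster_node` on each cluster node. Review the printed commands before switching off the dry-run default.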
Configuring the Cluster for a Quorum Device
The node running the quorum device service must be authenticated to the rest of the cluster and must then be added to the cluster. When you add the quorum device service node, you can set configuration options such as which algorithm to use to determine quorum. After the quorum device is added to the cluster you can verify the quorum device status to check that the device is functioning correctly.
- Authenticate the quorum device service node to the cluster. On a node within the existing
cluster, to authenticate the node named qdev,
run:
sudo pcs host auth qdev
You're prompted for the cluster username and password.
- Check that no quorum device is already configured for the cluster. A cluster must never
have more than one quorum device configured. On a node within the existing cluster,
run:
sudo pcs quorum status
The output includes membership information:
...
Membership information
----------------------
    Nodeid      Votes    Qdevice Name
         1          1         NR node1 (local)
         2          1         NR node2
Under the Qdevice column, the value NR is displayed. The NR value indicates that no quorum devices are registered with any of the nodes within the cluster. If any other value is displayed, don't proceed with adding another quorum device to the cluster without removing the existing device first.
- Add the quorum device to the cluster. On one of the nodes within the existing cluster, run:
sudo pcs quorum device add model net host=qdev algorithm=ffsplit
Note that you specify the host to match the host where you're running the quorum device service, in this case named qdev, and the algorithm that you want to use to determine quorum, in this case ffsplit.
Algorithm options are:
  - ffsplit: a fifty-fifty split algorithm that favors the partition with the highest number of active nodes in the cluster.
  - lms: a last-man-standing algorithm that returns a vote for the nodes that are still able to connect to the quorum device service node. If a single node is still active and it can connect to the quorum device service, the cluster remains quorate. If none of the nodes can connect to the quorum device service and any one node loses connection with the rest of the cluster, the cluster becomes inquorate.
See the corosync-qdevice(8) manual page for more information.
- Verify that the quorum device is configured within the cluster. On any node in the existing cluster, run:
sudo pcs quorum config
The output displays that a quorum device is configured and indicates the algorithm that is in use:
Options:
Device:
  Model: net
    algorithm: ffsplit
    host: qdev
You can also query the quorum status for the cluster by running:
sudo pcs quorum status
The output displays the quorum status.
Quorum information
------------------
Date:             Fri Jul 15 14:19:07 2022
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          1
Ring ID:          1/8272
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
         1          1    A,V,NMW node1 (local)
         2          1    A,V,NMW node2
         0          1            Qdevice
Note that the membership information now displays the values A,V,NMW for the Qdevice field. Values for this field can be any of the following:
  - A/NA: indicates whether the quorum device is alive or not alive to each node in the cluster.
  - V/NV: indicates whether the quorum device has provided a vote to a node. In the case where the cluster is split, one node would be set to V and the other to NV.
  - MW/NMW: indicates whether the quorum device master_wins flag is set. Any node with an active quorum device that also has the master_wins flag set becomes quorate regardless of the node votes of the cluster. By default the option is unset.
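The Qdevice column can also be read by a script, for example to confirm that every node reports NR before you add a device, or A and V afterward. The following sketch assumes the membership table layout shown in this section (Nodeid, Votes, Qdevice, Name). The function names `node_qdevice_flags` and `describe_flags` are hypothetical helpers and aren't part of pcs.

```shell
#!/bin/sh
# Sketch: read `pcs quorum status` output on stdin and print
# "<node name> <Qdevice flags>" for each node in the membership table.
node_qdevice_flags() {
    awk '
        $1 == "Nodeid" && $3 == "Qdevice" { in_table = 1; next }
        # Node rows have at least four fields. The row for the
        # quorum device itself ("0 1 Qdevice") has only three.
        in_table && NF >= 4 && $1 ~ /^[0-9]+$/ { print $4, $3 }
    '
}

# Translate a flag string such as "A,V,NMW" into one line per flag.
describe_flags() {
    echo "$1" | tr ',' '\n' | while read -r flag; do
        case "$flag" in
            A)   echo "A: quorum device is alive" ;;
            NA)  echo "NA: quorum device is not alive" ;;
            V)   echo "V: quorum device has given this node a vote" ;;
            NV)  echo "NV: quorum device has not given this node a vote" ;;
            MW)  echo "MW: master_wins is set" ;;
            NMW) echo "NMW: master_wins is not set" ;;
            NR)  echo "NR: no quorum device registered" ;;
            *)   echo "$flag: unrecognized flag" ;;
        esac
    done
}

# Example using the membership rows from the sample output.
printf '%s\n' \
    '    Nodeid      Votes    Qdevice Name' \
    '         1          1    A,V,NMW node1 (local)' \
    '         2          1    A,V,NMW node2' \
    '         0          1            Qdevice' |
node_qdevice_flags
describe_flags "A,V,NMW"
```

On a cluster node, you could pipe live output into the first function with `sudo pcs quorum status | node_qdevice_flags`.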
Managing Quorum Devices
The quorum device service must be managed from the host system where the quorum device service is running.
Quorum configuration for the cluster and configuration of the quorum device on the cluster nodes are performed by running operations on any of the nodes within the cluster itself.
Controlling the Quorum Device Service
You can perform various operations to directly control the quorum device service. Commands that control the quorum device service must be run on the host where the quorum device service is running.
- To view the full status for the service, run:
sudo pcs qdevice status net --full
Output similar to the following is displayed:
QNetd address:                  *:5403
TLS:                            Supported (client certificate required)
Connected clients:              2
Connected clusters:             1
Maximum send/receive size:      32768/32768 bytes
Cluster "test":
    Algorithm:          ffsplit
    Tie-breaker:        Node with lowest node ID
    Node ID 2:
        Client address:         ::ffff:192.168.2.25:33526
        HB interval:            8000ms
        Configured node list:   1, 2
        Ring ID:                1.16
        Membership node list:   1, 2
        TLS active:             Yes (client certificate verified)
        Vote:                   ACK (ACK)
    Node ID 1:
        Client address:         ::ffff:192.168.2.26:48786
        HB interval:            8000ms
        Configured node list:   1, 2
        Ring ID:                1.16
        Membership node list:   1, 2
        TLS active:             Yes (client certificate verified)
        Vote:                   ACK (ACK)
- To start the service, run:
sudo pcs qdevice start net
- To stop the service, run:
sudo pcs qdevice stop net
- To enable the service so that it runs at boot time,
run:
sudo pcs qdevice enable net
- To disable the service to prevent it from restarting at boot,
run:
sudo pcs qdevice disable net
- To force the service to stop if the normal stop process is not working,
run:
sudo pcs qdevice kill net
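When you check the service regularly, it can help to reduce the full status output to one line per node. The following sketch assumes that the "Node ID" and "Vote:" lines appear as they do in the sample status output in this section. The function name `summarize_votes` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: read `pcs qdevice status net --full` output on stdin and
# print the current vote recorded for each node.
summarize_votes() {
    awk '
        # "Node ID 2:" starts the block for one node; drop the colon.
        $1 == "Node" && $2 == "ID" { node = $3; sub(/:$/, "", node) }
        # "Vote: ACK (ACK)": the second field is the vote value.
        $1 == "Vote:"              { print "node " node ": " $2 }
    '
}

# Example using lines taken from the sample status output.
printf '%s\n' \
    'Cluster "test":' \
    '    Algorithm:          ffsplit' \
    '    Node ID 2:' \
    '        Vote:                   ACK (ACK)' \
    '    Node ID 1:' \
    '        Vote:                   ACK (ACK)' |
summarize_votes
```

On the quorum device service host, you could pipe live output into the function with `sudo pcs qdevice status net --full | summarize_votes`.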
Updating Quorum Device Settings
The quorum device can be updated in the cluster configuration at any time. Modifications to the quorum device configuration must be performed on a node within the cluster. Typically, modifications to the quorum device involve changing the algorithm. However, you can modify other options that are available for a quorum device in the same way.
To update the algorithm used for the quorum device, run:
sudo pcs quorum device update model algorithm=lms
The example changes the algorithm to lms, the last-man-standing algorithm.
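If you script algorithm changes, a small guard can catch a mistyped value before it reaches pcs. The following sketch accepts only the two algorithms described in this chapter; the corosync-qdevice(8) manual page might list others. The function name `qdevice_update_cmd` is hypothetical, and the function prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: build the update command only for the algorithms that
# this chapter describes (ffsplit and lms).
qdevice_update_cmd() {
    case "$1" in
        ffsplit|lms)
            echo "sudo pcs quorum device update model algorithm=$1"
            ;;
        *)
            echo "algorithm not described here: $1 (expected ffsplit or lms)" >&2
            return 1
            ;;
    esac
}

qdevice_update_cmd lms
```

The printed command is the one to run on a node within the cluster.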
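Note:
You can't update the host for a quorum device. You must remove the device and add it back into the cluster if you need to change the host.
Removing the Quorum Device From the Cluster
To remove the quorum device from the cluster, run the following command on one of the nodes within the cluster:
sudo pcs quorum device remove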
Removing the quorum device updates the cluster configuration to remove any configuration entries for the quorum device, reloads the cluster configuration into the cluster and then disables and stops the quorum device on each node.
Because you might use the same quorum device service across multiple clusters, removing the quorum device from the cluster doesn't affect the quorum device service in any way. The service continues to run on the service host, but no longer serves the cluster where it has been removed.
Destroying the Quorum Device Service
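To destroy the quorum device service, run the following command on the host where the quorum device service is running:
sudo pcs qdevice destroy net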
Note:
Remove the quorum device from any clusters that it services before destroying the quorum device service.
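As a sanity check before destroying the service, you can confirm that the status output reports no connected clusters. The following sketch assumes the "Connected clusters:" line appears as it does in the sample output of `pcs qdevice status net --full`. The function name `no_connected_clusters` is hypothetical. A cluster that's offline doesn't appear as connected, so treat this as a hint and not as proof that every cluster has removed the device.

```shell
#!/bin/sh
# Sketch: read `pcs qdevice status net --full` output on stdin and
# succeed only if it reports zero connected clusters.
no_connected_clusters() {
    count=$(awk -F: '$1 == "Connected clusters" { gsub(/[ \t]/, "", $2); print $2 }')
    # An empty count (line not found) is treated as "not safe".
    [ "$count" = "0" ]
}

# Example with a line taken from the sample status output.
if printf 'Connected clusters:             1\n' | no_connected_clusters; then
    echo "no clusters connected"
else
    echo "clusters still connected; remove the quorum device from them first"
fi
```

On the quorum device service host, you could pipe live output into the function with `sudo pcs qdevice status net --full | no_connected_clusters` and run the destroy command only when it succeeds.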