5.9 Designing Cluster Heartbeat Networks

Oracle VM uses OCFS2 as its underlying clustering file system to manage its storage repositories and provide access to shared storage.

A cluster heartbeat is an essential component of any OCFS2 cluster. It is responsible for accurately determining whether each node (in this case, each Oracle VM Server) is dead or alive. There are two types of heartbeats used in OCFS2:

The quorum is the group of Oracle VM Servers in a cluster that is allowed to operate on the shared storage. When there is a failure in the cluster, the Oracle VM Servers may be split into groups that can communicate within their own group and with the shared storage, but not with the other groups. In this case, OCFS2 determines which group is allowed to continue and initiates fencing of the other group(s).

Fencing is the act of forcefully removing an Oracle VM Server from a cluster. An Oracle VM Server with OCFS2 mounted fences itself when it detects that it does not have quorum in a degraded cluster. It does this so that the other Oracle VM Servers are not left stuck trying to access the cluster's resources. A fenced Oracle VM Server is rebooted and then rejoins the cluster. Virtual machines that were running on the fenced Oracle VM Server are migrated and restarted on other Oracle VM Servers only if they are HA enabled; virtual machines that are not HA enabled are not migrated.
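The quorum decision described above can be illustrated with a short sketch. This text does not spell out the exact rule, so the code assumes a simplified majority rule with a lowest-node-number tie-break for even splits. The function name `has_quorum` and its parameters are hypothetical and are not part of any Oracle VM or OCFS2 interface.

```python
def has_quorum(node, heartbeating, reachable):
    """Decide whether `node` may keep operating on the shared storage.

    node         -- this server's node number
    heartbeating -- node numbers still seen alive on the shared storage
    reachable    -- node numbers this server can reach over the network

    Assumed rule (illustrative only): a group keeps quorum if it holds
    more than half of the live nodes, or exactly half and it contains
    the lowest-numbered live node.
    """
    live = set(heartbeating) | {node}
    group = (set(reachable) & live) | {node}

    if 2 * len(group) > len(live):
        return True
    if 2 * len(group) == len(live):
        # Even split: the group holding the lowest node number continues.
        return min(live) in group
    return False


def should_self_fence(node, heartbeating, reachable):
    """A server without quorum fences itself (it is rebooted and rejoins)."""
    return not has_quorum(node, heartbeating, reachable)
```

For example, in a four-server cluster split into groups {1, 2} and {3, 4}, servers 1 and 2 keep quorum under this assumed rule, while servers 3 and 4 fence themselves.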

The cluster heartbeat is sensitive to network interruptions and therefore the Cluster Heartbeat network should be given special attention and be treated separately to make sure that:

For more information on the implementation of OCFS2 in Oracle VM, see Section 6.2, “Server Pool Clusters”.