6.8.1 Clustering for x86 Server Pools

Oracle VM works in concert with Oracle OCFS2 to provide shared access to server pool resources residing in an OCFS2 file system. This shared access feature is crucial in the implementation of high availability (HA) for virtual machines running on x86 Oracle VM Servers that belong to a server pool with clustering enabled.

OCFS2 is a cluster file system developed by Oracle for Linux, which allows multiple nodes (Oracle VM Servers) to access the same disk at the same time. OCFS2, which provides both performance and HA, is used in many applications that are cluster-aware or that have a need for shared file system facilities. With Oracle VM, OCFS2 ensures that Oracle VM Servers belonging to the same server pool access and modify resources in the shared repositories in a controlled manner.

The OCFS2 software includes the core file system, which offers the standard file system interfaces and behavioral semantics, and a component that supports the shared disk cluster feature. The shared disk component resides mostly in the kernel and is referred to as the O2CB cluster stack. It includes:

  • A disk heartbeat to detect live servers.

  • A network heartbeat for communication between the nodes.

  • A Distributed Lock Manager (DLM) which allows shared disk resources to be locked and released by the servers in the cluster.

OCFS2 also offers several tools to examine and troubleshoot the OCFS2 components. For detailed information on OCFS2, see the OCFS2 documentation at:

http://oss.oracle.com/projects/ocfs2/documentation/

Oracle VM decouples storage repositories and clusters so that if a storage repository is taken off-line, the cluster is still available. The loss of one heartbeat device does not force an Oracle VM Server to self-fence.

When you create a server pool, you can choose to activate the cluster function, which offers these benefits:

  • Shared access to the resources in the repositories accessible by all Oracle VM Servers in the cluster.

  • Protection of virtual machines in the event of a failure of any Oracle VM Server in the server pool.

You can configure the server pool cluster and enable HA when you create or edit a server pool within Oracle VM Manager. See Create Server Pool and Edit Server Pool in the Oracle VM Manager User's Guide for more information on creating and editing a server pool.

During server pool creation, the server pool file system specified for the new server pool is accessed and formatted as an OCFS2 file system. This formatting creates several management areas on the file system, including a region for the global disk heartbeat. Oracle VM formats the server pool file system as an OCFS2 file system whether the Oracle VM Servers access it as an NFS share, an FC LUN, or an iSCSI LUN. See Section 3.8, “How is Storage Used for Server Pool Clustering?” for more information on how storage is used for the cluster file system and the requirements for a stable cluster heartbeat function.

As Oracle VM Servers are added to a newly created server pool, Oracle VM:

  1. Selects a Master Oracle VM Server.

  2. Configures the Virtual IP address selected during pool creation as a virtual network interface on top of the management interface for the Master Oracle VM Server.

  3. Creates the cluster configuration file and the cluster time-out file.

  4. Pushes the configuration files to all Oracle VM Servers in the server pool.

  5. Starts the cluster.

On each Oracle VM Server in the cluster, the cluster configuration file is located at /etc/ocfs2/cluster.conf, and the cluster time-out file is located at /etc/sysconfig/o2cb. The cluster timeout can be configured during server pool creation. The recommended way to set timeout values is to use the functionality provided within Oracle VM Manager when creating or editing a server pool. See Create Server Pool and Edit Server Pool in the Oracle VM Manager User's Guide for more information on setting the timeout value.
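To make the role of the cluster configuration file more concrete, the following Python sketch parses text laid out in the stanza style commonly documented for O2CB (a stanza name followed by indented key = value lines). The stanza layout, the sample pool name, node names, and addresses are all assumptions chosen for illustration and are not taken from a real server. The sketch only reads a string; it does not open or change any file on an Oracle VM Server.

```python
from collections import defaultdict

# Hypothetical sample in the assumed stanza layout; all values are made up.
SAMPLE_CLUSTER_CONF = """\
cluster:
\tnode_count = 2
\tname = examplepool

node:
\tip_port = 7777
\tip_address = 192.0.2.11
\tnumber = 0
\tname = ovs1
\tcluster = examplepool

node:
\tip_port = 7777
\tip_address = 192.0.2.12
\tnumber = 1
\tname = ovs2
\tcluster = examplepool
"""


def parse_o2cb_conf(text):
    """Parse stanza-style text into {stanza_type: [{key: value}, ...]}."""
    stanzas = defaultdict(list)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.endswith(":") and "=" not in line:
            # A line such as "node:" opens a new stanza of that type.
            current = {}
            stanzas[line[:-1]].append(current)
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return dict(stanzas)


conf = parse_o2cb_conf(SAMPLE_CLUSTER_CONF)
for node in conf["node"]:
    print(node["number"], node["name"], node["ip_address"], node["ip_port"])
```

A read-only view like this can help when comparing the node list on several Oracle VM Servers; changes themselves should still be made through Oracle VM Manager.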

Starting the cluster activates several services and processes on each of the Oracle VM Servers in the cluster. The most important processes and services are discussed in Table 6.1, “Cluster services”.

Table 6.1 Cluster services

Service: o2net
Description: The o2net process creates TCP/IP intra-cluster node communication channels on port 7777 and sends regular keep-alive packets to each node in the cluster to check whether the nodes are alive. The intra-cluster node communication uses the network with the Cluster Heartbeat role. By default, this is the Server Management network. You can, however, create a separate network for this function. See Section 5.6, “How are Network Functions Separated in Oracle VM?” for information about the Cluster Heartbeat role. Make sure the firewall on each Oracle VM Server in the cluster allows network traffic on the heartbeat network. By default, the firewall is disabled on Oracle VM Servers after installation.

Service: o2hb-diskid
Description: The server pool cluster also employs a disk heartbeat check. The o2hb process is responsible for the global disk heartbeat component of the cluster. The heartbeat feature uses a file in the hidden region of the server pool file system. Each pool member writes to its own block of this region every two seconds, indicating it is alive. It also reads the region to maintain a map of live nodes. If a server pool member's block is no longer updated, the Oracle VM Server is considered dead. If an Oracle VM Server dies, the Oracle VM Server is fenced. Fencing forcefully removes dead members from the server pool to make sure active pool members are not obstructed from accessing the fenced Oracle VM Server's resources.

Service: o2cb
Description: The o2cb service is central to cluster operations. When an Oracle VM Server boots, the o2cb service starts automatically. This service must be up for the mount of shared repositories to succeed.

Service: ocfs2
Description: The OCFS2 service is responsible for the file system operations. This service also starts automatically.

Service: ocfs2_dlm and ocfs2_dlmfs
Description: The DLM modules (ocfs2_dlm, ocfs2_dlmfs) and processes (user_dlm, dlm_thread, dlm_wq, dlm_reco_thread, and so on) are part of the Distributed Lock Manager.

OCFS2 uses a DLM to track and manage locks on resources across the cluster. It is called distributed because each Oracle VM Server in the cluster only maintains lock information for the resources it is interested in. If an Oracle VM Server dies while holding locks for resources in the cluster, for example, a lock on a virtual machine, the remaining Oracle VM Servers in the server pool gather information to reconstruct the lock state maintained by the dead Oracle VM Server.
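The disk heartbeat described in Table 6.1 can be illustrated with a small in-memory simulation. The Python sketch below is a toy model, not the o2hb implementation: each node records a timestamp in its own slot, and any node can derive a map of live nodes by checking how stale each slot is. The two-second write interval comes from the table. The class name, the node names, the threshold value, and the rule that a node is declared dead after (threshold - 1) * 2 seconds are assumptions; the rule follows the way the OCFS2 documentation is generally understood to describe the heartbeat threshold and should be checked against your release.

```python
HEARTBEAT_INTERVAL_S = 2  # from Table 6.1: each member writes every two seconds


def dead_after_seconds(threshold):
    """Seconds of silence before a node is declared dead (assumed formula)."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_S


class HeartbeatRegion:
    """Toy stand-in for the shared heartbeat region: one slot per node."""

    def __init__(self, node_names):
        # Each slot holds the time of the node's last write, or None.
        self.slots = {name: None for name in node_names}

    def write(self, node, now):
        self.slots[node] = now

    def live_map(self, now, threshold):
        """Return {node: True if its slot was updated recently enough}."""
        limit = dead_after_seconds(threshold)
        return {
            node: last is not None and (now - last) < limit
            for node, last in self.slots.items()
        }


region = HeartbeatRegion(["ovs1", "ovs2", "ovs3"])
THRESHOLD = 31  # hypothetical setting, giving a 60-second limit here

# All three nodes write at t=0; ovs3 then stops writing.
for t in range(0, 62, HEARTBEAT_INTERVAL_S):
    for node in ("ovs1", "ovs2"):
        region.write(node, t)
    if t == 0:
        region.write("ovs3", t)

status = region.live_map(now=60, threshold=THRESHOLD)
print(status)  # ovs3 is reported as not alive once 60 seconds have passed
```

In the real cluster the slots live on shared storage, so every Oracle VM Server sees the same region and reaches the same conclusion about which members must be fenced.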


Warning

Do not manually modify the cluster configuration files, or start and stop the cluster services. Oracle VM Manager automatically starts the cluster on Oracle VM Servers that belong to a server pool. Manually configuring or operating the cluster may lead to cluster failure.

When you create a repository on a physical disk, an OCFS2 file system is created on the physical disk. This occurs for local repositories as well. The resources in the repositories, for example, virtual machine configuration files, virtual disks, ISO files, templates and assemblies, can then be shared safely across the server pool. When a server pool member stops or dies, the resources owned by the departing server are recovered, and the change in status of the server pool members is propagated to all the remaining Oracle VM Servers in the server pool.

Figure 6.1, “Server Pool clustering with OCFS2 features” illustrates server pool clustering, the disk and network heartbeats, and the use of the DLM feature to lock resources across the cluster.

Figure 6.1 Server Pool clustering with OCFS2 features

This figure shows an Oracle VM configuration with a clustered server pool. It has shared attached storage on a fibre channel disk subsystem and a server pool file system on an NFS server. The surrounding text explains the functions and features of clustering and OCFS2.

Figure 6.1, “Server Pool clustering with OCFS2 features” represents a server pool with three Oracle VM Servers. The server pool file system associated with this server pool resides on an NFS share. During server pool creation, the NFS share is accessed, a disk image is created on the NFS share and the disk image is formatted as an OCFS2 file system. This technique allows all Oracle VM Server pool file systems to be accessed in the same manner, using OCFS2, whether the underlying storage element is an NFS share, an iSCSI LUN or a Fibre Channel LUN.

The network heartbeat, which is illustrated as a private network connection between the Oracle VM Servers, is configured before creating the first server pool in your Oracle VM environment. After the server pool is created, the Oracle VM Servers are added to the server pool. At that time, the cluster configuration is created, and the cluster state changes from off-line to heartbeating. Finally, the server pool file system is mounted on all Oracle VM Servers in the cluster and the cluster state changes from heartbeating to DLM ready. As seen in Figure 6.1, “Server Pool clustering with OCFS2 features”, the heartbeat region is global to all Oracle VM Servers in the cluster, and resides on the server pool file system. Using the network heartbeat, the Oracle VM Servers establish communication channels with other Oracle VM Servers in the cluster, and send keep-alive packets to detect any interruption on the channels.
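Because the network heartbeat depends on TCP connections between the Oracle VM Servers (Table 6.1 names port 7777 for o2net), a basic reachability check can help when a firewall is suspected of blocking the heartbeat network. The Python sketch below only attempts a TCP connection and reports whether it succeeded; it does not speak the o2net protocol. The function name and the loopback demonstration are assumptions made so that the example can run anywhere.

```python
import socket

O2NET_PORT = 7777  # port named in Table 6.1 for o2net


def can_connect(host, port=O2NET_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


# Demonstration against a throwaway listener on the loopback interface.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    demo_port = listener.getsockname()[1]
    print(can_connect("127.0.0.1", demo_port))
```

On a real deployment you would call can_connect with the address that a peer Oracle VM Server uses on the network carrying the Cluster Heartbeat role.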

For each newly added repository on a physical storage element, an OCFS2 file system is created on the repository, and the repository is usually presented to all Oracle VM Servers in the pool. Figure 6.1, “Server Pool clustering with OCFS2 features” shows that Repository 1 and Repository 2 are accessible by all of the Oracle VM Servers in the pool. While this is the usual configuration, it is also feasible that a repository is accessible by only one Oracle VM Server in the pool. This is indicated in the figure by Repository 3, which is accessible by Oracle VM Server 1 only. Any virtual machine whose resources reside on this repository cannot take advantage of the high availability feature afforded by the server pool.
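The relationship between repository presentation and high availability can be expressed as a simple set check. The Python sketch below mirrors the layout described for Figure 6.1: two repositories presented to every Oracle VM Server and one presented to a single server. The function name, the repository keys, and the virtual machine names are hypothetical, and the rule encoded here covers only the case stated above, namely that a virtual machine with resources on a single-server repository cannot use HA.

```python
def vms_without_ha(repo_access, vm_repos):
    """Return VMs that have at least one resource on a single-server repository.

    repo_access maps a repository name to the servers it is presented to.
    vm_repos maps a virtual machine name to the repositories it uses.
    """
    single_server_repos = {
        repo for repo, servers in repo_access.items() if len(set(servers)) <= 1
    }
    return sorted(
        vm for vm, repos in vm_repos.items() if single_server_repos & set(repos)
    )


# Presentation as described for Figure 6.1 (names are illustrative).
repo_access = {
    "repository1": ["ovs1", "ovs2", "ovs3"],
    "repository2": ["ovs1", "ovs2", "ovs3"],
    "repository3": ["ovs1"],
}

# Hypothetical virtual machines and the repositories holding their resources.
vm_repos = {
    "vm_a": ["repository1"],
    "vm_b": ["repository1", "repository2"],
    "vm_c": ["repository3"],
}

print(vms_without_ha(repo_access, vm_repos))
```

A check of this kind is only a planning aid; whether a given virtual machine can actually be restarted elsewhere also depends on the HA settings of the server pool and the virtual machine.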

Note that repositories built on NFS shares are not formatted as OCFS2 file systems. See Section 3.9, “Where are Virtual Machine Resources Located?” for more information on repositories.

Figure 6.1, “Server Pool clustering with OCFS2 features” shows several virtual machines with resources in shared Repositories 1 and 2. As virtual machines are created, started, stopped, or migrated, the resources for these virtual machines are locked by the Oracle VM Servers needing these resources. Each Oracle VM Server ends up managing a subset of all the locked resources in the server pool. A resource may have several locks against it. An exclusive lock is requested when anticipating a write to the resource while several read-only locks can exist at the same time on the same resource. Lock state is kept in memory on each Oracle VM Server as shown in the diagram. The distributed lock manager (DLM) information kept in memory is exposed to user space in the synthetic file system called dlmfs, mounted under /dlm.

If an Oracle VM Server fails, its locks are recovered by the other Oracle VM Servers in the cluster and virtual machines running on the failed Oracle VM Server are restarted on another Oracle VM Server in the cluster. If an Oracle VM Server is no longer communicating with the cluster via the heartbeat, it can be forcibly removed from the cluster. This is called fencing. An Oracle VM Server can also fence itself if it realizes that it is no longer part of the cluster. The Oracle VM Server uses a machine reset to fence. This is the quickest way for the Oracle VM Server to rejoin the cluster.
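The locking rule described in this section, where several read-only locks can coexist on a resource while an exclusive lock excludes all others, can be sketched as a small compatibility check. The Python example below is a single-process toy and not the OCFS2 DLM: the class, the method names, the server names, and the resource name are hypothetical, and dropping a server's locks is only a crude stand-in for the lock recovery that the remaining Oracle VM Servers perform after a failure.

```python
from enum import Enum


class Mode(Enum):
    READ_ONLY = "read-only"
    EXCLUSIVE = "exclusive"


def compatible(held, requested):
    """Two locks can coexist only when both are read-only."""
    return held is Mode.READ_ONLY and requested is Mode.READ_ONLY


class ToyLockTable:
    """In-memory lock table keyed by resource name (illustration only)."""

    def __init__(self):
        self.locks = {}  # resource -> list of (server, mode)

    def request(self, server, resource, mode):
        """Grant the lock if it is compatible with every lock already held."""
        holders = self.locks.setdefault(resource, [])
        if all(compatible(held, mode) for _, held in holders):
            holders.append((server, mode))
            return True
        return False

    def release_all(self, server):
        """Drop every lock held by a server, for example after it fails."""
        for resource in list(self.locks):
            self.locks[resource] = [
                holder for holder in self.locks[resource] if holder[0] != server
            ]


table = ToyLockTable()
print(table.request("ovs1", "vdisk-01", Mode.READ_ONLY))   # granted
print(table.request("ovs2", "vdisk-01", Mode.READ_ONLY))   # granted, readers coexist
print(table.request("ovs3", "vdisk-01", Mode.EXCLUSIVE))   # refused while readers hold locks

table.release_all("ovs1")
table.release_all("ovs2")
print(table.request("ovs3", "vdisk-01", Mode.EXCLUSIVE))   # granted once readers are gone
```

The real DLM distributes this state across the cluster, so each Oracle VM Server holds only the lock information for the resources it uses.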