High Availability and Optimization
You can configure Oracle Linux Virtualization Manager so that your cluster is optimized and your hosts and virtual machines are highly available. You can also enable or disable devices (hot plug) while a virtual machine is running.
For more information about high availability and optimization, see Deployment Optimization in the Oracle Linux Virtualization Manager: Administration Guide.
Clusters
- Virtual machines run on hosts up to the specified overcommit threshold. Higher values conserve memory at the expense of greater CPU usage.
- Hosts can run virtual machines with a total number of CPU cores greater than the number of cores in the host.
- Memory ballooning enables memory overcommitment on virtual machines running on the hosts in the cluster.
- Memory Overcommitment Manager (MoM) runs Kernel Same-page Merging (KSM) when it can yield a memory saving benefit. To use KSM, you have to explicitly enable it at the cluster level.
You can set cluster optimization so that MoM starts ballooning where and when possible, limited by the guaranteed memory size of each virtual machine. For ballooning to run, a virtual machine needs a balloon device with the relevant drivers. Each virtual machine includes a balloon device unless it is specifically removed. Each host in the cluster receives a balloon policy update when its status changes to Up. If necessary, you can manually update the balloon policy on a KVM host without changing its status.
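The guaranteed-memory limit on ballooning described above can be pictured with a short sketch. This is not MoM's actual policy code; the `VmMemory` type, its field names, and the `balloon_target` function are hypothetical and only illustrate that a guest is never shrunk below its guaranteed memory size and that a virtual machine without a balloon device is left untouched.

```python
from dataclasses import dataclass


@dataclass
class VmMemory:
    """Hypothetical view of one virtual machine's memory settings (MiB)."""
    name: str
    current_mib: int           # memory currently assigned to the guest
    guaranteed_mib: int        # guaranteed memory size configured for the VM
    has_balloon: bool = True   # balloon device present with working drivers


def balloon_target(vm: VmMemory, reclaim_mib: int) -> int:
    """Return the guest memory size after trying to reclaim `reclaim_mib`.

    The result never drops below the guaranteed memory size, and a VM
    without a balloon device keeps its current size.
    """
    if not vm.has_balloon or reclaim_mib <= 0:
        return vm.current_mib
    # The floor cannot exceed what the guest currently has.
    floor = min(vm.guaranteed_mib, vm.current_mib)
    return max(vm.current_mib - reclaim_mib, floor)


if __name__ == "__main__":
    vm = VmMemory(name="vm01", current_mib=4096, guaranteed_mib=2048)
    print(balloon_target(vm, 1024))  # within the limit
    print(balloon_target(vm, 4000))  # clamped at the guaranteed size
```

In this sketch, a request to reclaim more memory than the difference between the current and guaranteed sizes is simply clamped rather than rejected.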
Hosts
If you want a cluster to be responsive when unexpected host failures happen, you should configure fencing. Fencing keeps hosts in a cluster highly available by enforcing any associated policies for power saving, load balancing, and virtual machine availability. If you want highly available virtual machines on a particular host:
- You must also enable and configure power management for the host.
- The host must have access to the Power Management interface via the ovirtmgmt network.
Important:
For power management operations, you need at least two KVM hosts in a cluster or data center that are in Up or Maintenance status.
The Manager does not communicate directly with fence agents. Instead, the Engine uses a proxy to send power management commands to a host power management device. The Engine uses VDSM to execute power management device actions, so another host in the environment is used as a fencing proxy. You can select between:
- Any host in the same cluster as the host requiring fencing.
- Any host in the same data center as the host requiring fencing.
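The proxy choice above can be sketched as a simple search over candidate hosts. The `Host` type, the `pick_fence_proxy` function, the `"cluster"`/`"dc"` scope names, and the search order are hypothetical and are not the Engine's implementation. Treating hosts in Up or Maintenance status as eligible is an assumption taken from the Important note above.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Assumption: statuses that make a host usable as a proxy, based on the
# Important note about power management operations.
ELIGIBLE_STATUSES = {"Up", "Maintenance"}


@dataclass(frozen=True)
class Host:
    """Hypothetical host record used only for this illustration."""
    name: str
    cluster: str
    data_center: str
    status: str


def pick_fence_proxy(
    target: Host,
    hosts: List[Host],
    preference: Sequence[str] = ("cluster", "dc"),
) -> Optional[Host]:
    """Return another host that could act as fencing proxy for `target`.

    Scopes in `preference` are tried in order: "cluster" looks in the
    target's cluster, "dc" looks in the target's data center.
    """
    for scope in preference:
        for host in hosts:
            if host.name == target.name or host.status not in ELIGIBLE_STATUSES:
                continue
            if scope == "cluster" and host.cluster == target.cluster:
                return host
            if scope == "dc" and host.data_center == target.data_center:
                return host
    return None  # no proxy available, so fencing cannot proceed


if __name__ == "__main__":
    failed = Host("kvm1", "c1", "dc1", "NonResponsive")
    others = [Host("kvm2", "c2", "dc1", "Up"), Host("kvm3", "c1", "dc1", "Up")]
    print(pick_fence_proxy(failed, [failed] + others).name)
```

The sketch returns `None` when no other eligible host exists, which mirrors the requirement for at least two KVM hosts.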
Each KVM host in a cluster has limited resources. If a KVM host becomes overutilized, there is an adverse impact on the virtual machines that are running on the host. To avoid or mitigate overutilization, you can use scheduling, load balancing, and migration policies to ensure the performance of virtual machines. If a KVM host becomes overutilized, virtual machines are migrated to another KVM host in the cluster.
Virtual Machines
A highly available virtual machine automatically migrates to and restarts on another host in the cluster if the host crashes or becomes non-operational. If a virtual machine is not configured for high availability, it does not restart on another available host. If a virtual machine's host is manually shut down, the virtual machine does not automatically migrate to another host.
Note:
Virtual machines do not live migrate unless you are using shared storage and have explicitly configured your environment for live migration in the event of host failures. Policies, such as power saving or distribution, as well as maintenance events trigger live migrations of virtual machines.
Using the Resource Allocation tab when creating or editing a virtual machine, you can:
- Set the maximum amount of processing capability a virtual machine can access on its host.
- Pin a virtual CPU to a specific physical CPU.
- Guarantee an amount of memory for the virtual machine.
- Enable the memory balloon device for the virtual machine. For this feature to work, memory balloon optimization must also be enabled for the cluster.
- Improve the speed of disks that have a VirtIO interface by pinning them to a thread separate from the virtual machine's other functions.
When a KVM host goes into maintenance mode, all virtual machines are migrated to other hosts in the cluster. This means there is no downtime for virtual machines during planned maintenance windows.
If a virtual machine is unexpectedly terminated, it is automatically restarted, either on the same KVM host or another host in the cluster. This is achieved through monitoring of the hosts and storage to detect any hardware failures. If you configure a virtual machine for high availability and its host fails, the virtual machine automatically restarts on another KVM host in the cluster.
Policies
Load balancing, scheduling, and resiliency policies enable critical virtual machines to be restarted on another KVM host in the event of hardware failure, with three levels of priority.
Scheduling policies enable you to specify the usage and distribution of virtual machines between available hosts. You can define the scheduling policy to enable automatic load balancing across the hosts in a cluster. Regardless of the scheduling policy, a virtual machine does not start on a host with an overloaded CPU. By default, a host’s CPU is considered overloaded if it has a load of more than 80% for 5 minutes, but these values can be changed using scheduling policies.
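The default overload rule (a load of more than 80% for 5 minutes) can be expressed as a check over timestamped load samples. This is a minimal sketch, not the scheduler's code; the function name, the sample format, and the assumption that samples arrive in ascending time order are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def is_cpu_overloaded(
    samples: Sequence[Tuple[float, float]],
    threshold_pct: float = 80.0,   # default threshold from the text
    duration_s: float = 300.0,     # 5 minutes, default from the text
) -> bool:
    """Return True if load stayed above the threshold for the full duration.

    `samples` holds (timestamp_seconds, cpu_load_percent) pairs in
    ascending time order. Any sample at or below the threshold resets
    the measurement.
    """
    streak_start: Optional[float] = None
    for timestamp, load in samples:
        if load > threshold_pct:
            if streak_start is None:
                streak_start = timestamp
            if timestamp - streak_start >= duration_s:
                return True
        else:
            streak_start = None
    return False


if __name__ == "__main__":
    # One sample per minute for five minutes, all at 90% load.
    busy = [(t * 60.0, 90.0) for t in range(6)]
    print(is_cpu_overloaded(busy))
```

Passing different `threshold_pct` and `duration_s` values stands in for changing these values through a scheduling policy.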
There are five default scheduling policies:
- Evenly_Distributed - evenly distributes the memory and CPU processing load across all hosts in a cluster.
  Note: All virtual machines must have the latest qemu-guest-agent installed and its service running.
- Cluster_Maintenance - limits activity in a cluster during maintenance tasks.
- Power_Saving - reduces power consumption on underutilized hosts by distributing memory and CPU processing load across a subset of available hosts.
- VM_Evenly_Distributed - evenly distributes virtual machines between hosts.
- None
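As a rough illustration of the idea behind VM_Evenly_Distributed, the sketch below places a new virtual machine on the host that currently runs the fewest virtual machines. The function name, the input format, and the tie-breaking by host name are hypothetical; the real policy has configurable properties that this sketch ignores.

```python
from typing import Dict, Optional


def pick_host_by_vm_count(vm_counts: Dict[str, int]) -> Optional[str]:
    """Return the host name with the fewest running virtual machines.

    `vm_counts` maps host name to its number of running VMs. Ties are
    broken alphabetically by host name so the result is deterministic.
    """
    if not vm_counts:
        return None
    # sorted() fixes the iteration order; min() keeps the first minimum.
    return min(sorted(vm_counts), key=lambda host: vm_counts[host])


if __name__ == "__main__":
    print(pick_host_by_vm_count({"kvm1": 3, "kvm2": 1, "kvm3": 1}))
```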
Migration policies enable you to define the conditions for live migrating virtual machines in the event of KVM host failure. These conditions include how long a virtual machine can be down during migration, how much network bandwidth is used, and how the virtual machines are prioritized.
Resilience policies enable you to define how the virtual machines are prioritized in migration. You can configure the policy so that all or no virtual machines migrate, or so that only highly available virtual machines migrate, which helps to prevent overloading hosts.
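The three resilience options can be sketched as a filter over the virtual machines on a host. The enum, the `Vm` type, the numeric `priority` field (higher migrates first), and the function name are hypothetical and only illustrate the selection logic, not the Manager's data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ResiliencePolicy(Enum):
    """Hypothetical names for the three options described in the text."""
    MIGRATE_ALL = "all"
    MIGRATE_HA_ONLY = "highly_available_only"
    MIGRATE_NONE = "none"


@dataclass
class Vm:
    name: str
    highly_available: bool
    priority: int = 1  # assumption: larger number means higher priority


def vms_to_migrate(vms: List[Vm], policy: ResiliencePolicy) -> List[Vm]:
    """Return the VMs selected for migration, highest priority first."""
    if policy is ResiliencePolicy.MIGRATE_NONE:
        return []
    if policy is ResiliencePolicy.MIGRATE_HA_ONLY:
        selected = [vm for vm in vms if vm.highly_available]
    else:
        selected = list(vms)
    # sorted() is stable, so equal priorities keep their input order.
    return sorted(selected, key=lambda vm: vm.priority, reverse=True)


if __name__ == "__main__":
    vms = [Vm("web", False, 1), Vm("db", True, 3), Vm("cache", True, 2)]
    chosen = vms_to_migrate(vms, ResiliencePolicy.MIGRATE_HA_ONLY)
    print([vm.name for vm in chosen])
```

Restricting migration to highly available virtual machines keeps the list short, which is how this option helps avoid overloading the remaining hosts.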
For more information on policies, refer to the Administration Guide in oVirt Documentation.