High Availability Policies

The high availability (HA) policy for an orchestration determines whether the system monitors the components in the orchestration and whether it automatically restarts them after a failure.

Values for the Policy

The high availability policy can take one of the following values:

  • Active: Enables HA for instances. Monitors the state of the VM and automatically restarts the VM if it stops unexpectedly due to one of the following causes:

    • Node power outage

    • Node network failure

    • Hypervisor failing for any reason

    Note:

    The instance might be restarted on a different node.

    Note:

    • If an instance fails to launch or does not reach the running state, the orchestration reflects the error state and the instance cannot be restarted. To restart the orchestration, first stop it, resolve the error, and then start it again.

    • When a compute node goes down or network connectivity is lost, the instance enters the unreachable state and is not restarted. If network connectivity is restored, the compute node is up, and the VM is still running, the instance state transitions to running. If the compute node goes down and the VM is not running, the instance goes to the error state and is restarted on another compute node.

    • When the compute node is unreachable, the instances with HA policy set to active are not restarted automatically.

    • An instance will not be automatically restarted unless it is verified that the corresponding virtual machine is not running. Check with your cloud administrator to verify that the virtual machine of the instance is not running. A cloud administrator can forcefully shut down the instance and restart it.

    • No agents run inside the instance to report on its health. Failures such as kernel panics must be handled by the instance itself.

  • Monitor: Monitors the state of the instance and notifies administrators if the instance stops, fails, or becomes unreachable. The components are not restarted. This policy can be set only on objects of type launchplan and storagevolume; that is, only instances and storage volumes can be monitored.

    Monitoring certain components in an orchestration allows the status of the set of monitored components to be aggregated. For example, if you want to track the aggregate status of a set of instances, create them in a single launch plan within an orchestration and set the ha_policy on the launch plan to monitor. You can then query the status of the orchestration; if any of the instances is not in the running state, this is reflected in the orchestration status.

    Note:

    Launch plans are always monitored by default, regardless of how the ha_policy element is set in the orchestration.
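
The unreachable-node rules listed under Active and the aggregate status described under Monitor can be sketched as a small model. This is a simplified illustration under stated assumptions, not the system's implementation: the function names, the tri-state vm_running argument, and the shape of the status dictionary are hypothetical, and launch failures are not modeled.

```python
from typing import Dict, List, Optional, Tuple


def resolve_instance(node_reachable: bool,
                     vm_running: Optional[bool]) -> Tuple[str, bool]:
    """Return (instance_state, restart_on_another_node) for an active-policy instance.

    vm_running is None when nobody has verified whether the VM is running
    (hypothetical modeling choice, not taken from the product).
    """
    if vm_running is False:
        # Verified that the VM is not running: error state, then restart.
        return ("error", True)
    if node_reachable and vm_running is True:
        # Node is up and the VM survived: the instance is running again.
        return ("running", False)
    # Node or network is down, or the VM state is unverified: no restart.
    return ("unreachable", False)


def non_running_instances(states: Dict[str, str]) -> List[str]:
    """List the monitored instances that are not in the running state."""
    return sorted(name for name, state in states.items() if state != "running")


# Hypothetical instance names and states for one monitored launch plan.
states = {"web-1": "running", "web-2": "unreachable", "web-3": "running"}
print(resolve_instance(node_reachable=False, vm_running=None))
print(non_running_instances(states))
```

In this model an empty list from non_running_instances corresponds to an orchestration whose monitored instances are all running; any other result is what the orchestration status would have to reflect.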

If you do not want the components to be restarted or monitored, do not include the ha_policy element in the configuration of those components in the orchestration.
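
As a rough illustration of where the policy is set, the following sketch builds an orchestration as a Python dictionary and serializes it to JSON. Only ha_policy, its values active and monitor, and the object types launchplan and storagevolume come from this section. The surrounding field names (name, oplans, label, obj_type, objects), the helper make_object_plan, and all labels and paths are assumptions made for the example; check them against the orchestration format you actually use.

```python
import json

# Policy values named in this section. Passing None omits the element.
ALLOWED_HA_POLICIES = {"active", "monitor"}
# Object types that this section says can carry the monitor policy.
MONITORABLE_TYPES = {"launchplan", "storagevolume"}


def make_object_plan(label, obj_type, objects, ha_policy=None):
    """Build one object plan, adding ha_policy only when a policy is requested."""
    if ha_policy is not None and ha_policy not in ALLOWED_HA_POLICIES:
        raise ValueError(f"unsupported ha_policy: {ha_policy!r}")
    if ha_policy == "monitor" and obj_type not in MONITORABLE_TYPES:
        raise ValueError(f"monitor cannot be set on obj_type {obj_type!r}")
    plan = {"label": label, "obj_type": obj_type, "objects": objects}
    if ha_policy is not None:
        plan["ha_policy"] = ha_policy
    return plan


orchestration = {
    "name": "/Compute-example/user/ha_demo",  # hypothetical name
    "oplans": [  # assumed key for the list of object plans
        # Restart these instances automatically if the VM stops unexpectedly.
        make_object_plan("web-servers", "launchplan", [{"instances": []}],
                         ha_policy="active"),
        # Only watch this volume and report its status.
        make_object_plan("data-volume", "storagevolume", [],
                         ha_policy="monitor"),
        # No ha_policy element: no restart policy is requested.
        make_object_plan("scratch", "launchplan", [{"instances": []}]),
    ],
}

print(json.dumps(orchestration, indent=2))
```

Leaving the key out entirely, rather than writing an empty or null value, mirrors the guidance above that the ha_policy element should simply not be included when no policy is wanted.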