7 Backing up and Restoring a Kubernetes Cluster

This chapter discusses how to back up and restore a Kubernetes cluster in Oracle Cloud Native Environment.

Backing up Control Plane Nodes

Adopting a backup strategy to protect a Kubernetes cluster against control plane node failures is important, especially for clusters with only one control plane node. High availability clusters with several control plane nodes also need a fallback plan in case failures exceed the resilience that replication and failover provide.

You don't need to bring down the cluster to perform a backup as part of a disaster recovery plan. On the operator node, use the olcnectl module backup command to back up the key containers and manifests for all the control plane nodes in the cluster.

Important:

Only the key containers required for the Kubernetes control plane node are backed up. No application containers are backed up.

For example:

olcnectl module backup \
--environment-name myenvironment \
--name mycluster
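Because the backup runs while the cluster stays up, it can be scripted and run on a schedule. The following is a minimal sketch, not a supported tool: run_backup is a hypothetical helper name, and the sketch assumes that olcnectl returns a nonzero exit status when the backup fails. It uses only the options shown in the example above.

```shell
#!/bin/bash
# Hypothetical wrapper: run a backup and log whether it succeeded.
# Assumption: olcnectl exits with a nonzero status when the backup fails.
run_backup() {
  local env_name="$1" module_name="$2"
  if olcnectl module backup \
      --environment-name "$env_name" \
      --name "$module_name"; then
    echo "backup of $module_name in $env_name succeeded"
  else
    local status=$?   # exit status of the olcnectl command
    echo "backup of $module_name in $env_name failed (exit $status)" >&2
    return "$status"
  fi
}
```

On the operator node, calling run_backup myenvironment mycluster from a cron job or similar scheduler would take a backup with the example names used in this chapter.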

The backup files are stored in the /var/olcne/backups directory on the operator node. Each backup is saved to a timestamped folder that follows the pattern:

/var/olcne/backups/environment-name/kubernetes/module-name/timestamp
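To see which backup is the most recent, you can list the timestamped folders for a module. The following is a minimal sketch: latest_backup is a hypothetical helper name, and the sketch assumes that the timestamp folder names sort chronologically under a plain lexical sort, which this chapter doesn't state.

```shell
#!/bin/bash
# Hypothetical helper: print the newest timestamped backup folder.
# Assumption: folder names sort chronologically under a lexical sort.
latest_backup() {
  local base="$1"   # for example /var/olcne/backups/myenvironment/kubernetes/mycluster
  local dir latest=""
  for dir in "$base"/*/; do
    [ -d "$dir" ] || continue   # skip the literal pattern when nothing matches
    latest="${dir%/}"           # the glob expands in sorted order, so the last match wins
  done
  if [ -z "$latest" ]; then
    return 1                    # no backup folders found
  fi
  printf '%s\n' "$latest"
}
```

For the example environment and module names, the call would be latest_backup /var/olcne/backups/myenvironment/kubernetes/mycluster. The function returns a nonzero status if no backup folder exists.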

Restoring Control Plane Nodes

These restore steps are intended for use when a Kubernetes cluster must be reconstructed as part of a planned disaster recovery scenario. Unless a total cluster failure occurs, you don't need to manually recover individual control plane nodes in a high availability cluster, because the cluster can self-heal through replication and failover.

To restore a control plane node, you must have an existing Oracle Cloud Native Environment, and have deployed the Kubernetes module. You can't restore to an environment that doesn't exist.

To restore a control plane node:

  1. Ensure the Platform Agent is running correctly on the control plane nodes before proceeding:

    systemctl status olcne-agent.service

  2. On the operator node, use the olcnectl module restore command to restore the key containers and manifests for the control plane nodes in the cluster. For example:

    olcnectl module restore \
    --environment-name myenvironment \
    --name mycluster

    The files from the latest timestamped folder from /var/olcne/backups/environment-name/kubernetes/module-name/ are used to restore the cluster to its previous state.

    The Platform CLI might prompt you to perform extra setup steps on the control plane nodes to meet the prerequisites. Follow the instructions, then run the olcnectl module restore command again.

  3. Verify that the restore operation succeeded by using the kubectl command on a control plane node. For example, to list the nodes, use:

    kubectl get nodes

    And to list the pods running in the kube-system namespace, use:

    kubectl get pods --namespace kube-system
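The node check in the last step can also be scripted. The following is a minimal sketch: nodes_ready is a hypothetical helper name, and the sketch assumes that the second column of the kubectl get nodes output is the node status and that a healthy node reports a status beginning with Ready.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if every listed node reports a Ready status.
nodes_ready() {
  local out
  out="$(kubectl get nodes --no-headers)" || return 1
  [ -n "$out" ] || return 1   # an empty node list counts as a failure
  # Column 2 is STATUS; any value not starting with "Ready" (for example NotReady) fails.
  printf '%s\n' "$out" | awk '$2 !~ /^Ready/ { bad = 1 } END { exit bad + 0 }'
}
```

Run on a control plane node after the restore, nodes_ready returns zero only when no node reports a status other than Ready, so it can gate any follow-up steps in a recovery script.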