Configure for Disaster Recovery
Create an etcd snapshot in a primary Kubernetes cluster and restore it in another
(secondary) Kubernetes cluster or in the source cluster itself. It's
important to plan the configuration and understand the requirements before
downloading and using the scripts to configure your etcd snapshot.
Note:
This solution assumes that both Kubernetes clusters, including the control plane and worker nodes, already exist. The recommendations and utilities provided in this playbook do not check resources, control plane, or worker node capacity and configuration.

Plan the Configuration

Restore, as described here, can be applied in clusters that “mirror” primary (same
number of control plane nodes, same number of worker nodes). The procedures assume
that a primary Kubernetes cluster created with kubeadm exists. Host
names in the secondary system are configured to mimic primary, as described in the
next paragraphs. The secondary cluster is then also created with
kubeadm, but only AFTER the required host name resolution is in place.
Complete the following requirements for Restore when planning your configuration:
- Confirm that the required worker nodes and resources in the primary are
available in secondary. This includes shared storage mounts, load balancers, and databases used by the pods and systems in the namespaces that will be restored.
- Configure your host name resolution so that the host names used by the control
plane and worker nodes are valid in secondary.

  For example, if your primary site resolves the cluster similar to the following:

      [opc@olk8-m1 ~]$ kubectl get nodes -A
      NAME      STATUS   ROLES           AGE      VERSION
      olk8-m1   Ready    control-plane   552d     v1.25.12
      olk8-m2   Ready    control-plane   552d     v1.25.12
      olk8-m3   Ready    control-plane   2y213d   v1.25.12
      olk8-w1   Ready    <none>          2y213d   v1.25.12
      olk8-w2   Ready    <none>          2y213d   v1.25.12
      olk8-w3   Ready    <none>          2y213d   v1.25.12

      [opc@olk8-m1 ~]$ nslookup olk8-m1
      Server:         169.254.169.254
      Address:        169.254.169.254#53
      Non-authoritative answer:
      Name:   olk8-m1.k8dbfrasubnet.k8dbvcn.oraclevcn.com
      Address: 10.11.0.16

  Then, your secondary site must use the same node names. For the control plane node in the previous example, the host name in region 2 is the same but maps to a different IP.

      [opc@k8dramsnewbastion ~]$ nslookup olk8-m1
      Server:         169.254.169.254
      Address:        169.254.169.254#53
      Non-authoritative answer:
      Name:   olk8-m1.sub01261629121.k8drvcnams.oraclevcn.com
      Address: 10.5.176.144
      [opc@k8dramsnewbastion ~]$

  The resulting configuration in secondary (after using kubeadm to create the cluster and adding the worker nodes) will use the exact same node names, even if internal IPs and other values differ.

      [opc@k8dramsnewbastion ~]$ kubectl get nodes -A
      NAME      STATUS   ROLES           AGE      VERSION
      olk8-m1   Ready    control-plane   552d     v1.25.11
      olk8-m2   Ready    control-plane   552d     v1.25.11
      olk8-m3   Ready    control-plane   2y213d   v1.25.11
      olk8-w1   Ready    <none>          2y213d   v1.25.11
      olk8-w2   Ready    <none>          2y213d   v1.25.11
      olk8-w3   Ready    <none>          2y213d   v1.25.11

  A quick resolution check from a secondary host is sketched below.
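  For illustration only, the following is a minimal sketch of such a check. It assumes the example node names used above and a standard nslookup binary; replace the list with your own node names.

      # Check, from a secondary host, that each primary node name resolves locally.
      for node in olk8-m1 olk8-m2 olk8-m3 olk8-w1 olk8-w2 olk8-w3; do
        if nslookup "$node" > /dev/null 2>&1; then
          echo "OK:      $node resolves in the secondary location"
        else
          echo "MISSING: $node does not resolve; fix DNS or /etc/hosts before creating the secondary cluster"
        fi
      done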
- Use a similar “host name aliasing” for the kube-api front-end address.

  Note:
  Your primary Kubernetes cluster should NOT use IP addresses for the front-end kube-api. You must use a host name so that this front end can be aliased in the secondary system. See the maak8s-kube-api-alias.sh script for an example of how to add a host name alias to your existing primary kube-api system.

  For example, if the primary’s kube-api address resolution is as follows:

      [opc@olk8-m1 ~]$ grep server .kube/config
          server: https://k8lbr.paasmaaoracle.com:6443
      [opc@olk8-m1 ~]$ grep k8lbr.paasmaaoracle.com /etc/hosts
      132.145.247.187 k8lbr.paasmaaoracle.com k8lbr

  Then, the secondary’s kube-api should use the same host name (you can map it to a different IP):

      [opc@k8dramsnewbastion ~]$ grep server .kube/config
          server: https://k8lbr.paasmaaoracle.com:6443
      [opc@k8dramsnewbastion ~]$ grep k8lbr.paasmaaoracle.com /etc/hosts
      144.21.37.81 k8lbr.paasmaaoracle.com k8lbr

  You can achieve this by using virtual hosts, local /etc/hosts resolution, or different DNS servers in each location. A minimal /etc/hosts sketch for the secondary is shown at the end of this requirement. To determine the host name resolution method used by a particular host, search for the value of the hosts parameter in the /etc/nsswitch.conf file on the host.

  - If you want to resolve host names locally on the host, then make the files entry the first entry for the hosts parameter. When files is the first entry for the hosts parameter, entries in the host's /etc/hosts file are used first to resolve host names.

    Specifying the use of local host name resolution in the /etc/nsswitch.conf file:

        hosts: files dns nis

  - If you want to resolve host names by using DNS on the host, then make the dns entry the first entry for the hosts parameter. When dns is the first entry for the hosts parameter, DNS server entries are used first to resolve host names.

    Specifying the use of DNS host name resolution in the /etc/nsswitch.conf file:

        hosts: dns files nis
For simplicity and consistency, Oracle recommends that all the hosts within a site (production site or standby site) use the same host name resolution method (resolving host names locally or resolving host names using separate DNS servers or a global DNS server).
The “host name aliasing” technique has been used for many years in Disaster Protection for Middleware systems. You can find details and examples in Oracle’s documentation, including the Oracle Fusion Middleware Disaster Recovery Guide and other documents pertaining to Oracle Cloud Disaster Protection, such as Oracle WebLogic Server for Oracle Cloud Infrastructure Disaster Recovery and SOA Suite on Oracle Cloud Infrastructure Marketplace Disaster Recovery.
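  If you choose local /etc/hosts resolution, the alias entry on the secondary hosts could look like the following minimal sketch. The IP address and host name are the example values used in this playbook; substitute your own.

      # Add the kube-api front-end alias on each secondary host that needs it
      # (144.21.37.81 and k8lbr.paasmaaoracle.com are the example values shown above).
      echo "144.21.37.81 k8lbr.paasmaaoracle.com k8lbr" | sudo tee -a /etc/hosts
      # Confirm that the alias now resolves to the secondary load balancer IP.
      getent hosts k8lbr.paasmaaoracle.com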
- Create the secondary cluster using the same host name for the front-end kube-api load balancer as in primary.

  Perform this step after your host name resolution is ready. See the Kubernetes kubeadm tool documentation. Use the same kubeadm and Kubernetes versions as in primary. Container runtimes may differ, but you should use the same versions of the Kubernetes infrastructure in both regions.

  For example, if the primary cluster was created with the following:

      kubeadm init --control-plane-endpoint $LBR_HN:$LBR_PORT --pod-network-cidr=10.244.0.0/16 --node-name $mnode1 --upload-certs --v=9

  Then, use the exact same $LBR_HN:$LBR_PORT and CIDR values in secondary as in primary. The same applies if you use other cluster creation tools, such as kOps and Kubespray. A sketch of the variable values assumed by this command is shown below.
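  For illustration, the variables referenced by the kubeadm init command could be set as follows before creating the secondary cluster. The values are the example front-end and node names used in this playbook; treat them as placeholders for your own environment.

      # Placeholder values matching the examples in this playbook; adjust to your environment.
      export LBR_HN=k8lbr.paasmaaoracle.com   # front-end kube-api host name (same as in primary)
      export LBR_PORT=6443                    # front-end kube-api port (same as in primary)
      export mnode1=olk8-m1                   # first control plane node name (same as in primary)
      kubeadm init --control-plane-endpoint $LBR_HN:$LBR_PORT --pod-network-cidr=10.244.0.0/16 \
        --node-name $mnode1 --upload-certs --v=9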
- When adding additional control plane or worker nodes, ensure that you use the same node names in primary and secondary.

      kubeadm join $LBR_HN:$LBR_PORT --token $token --node-name $host --discovery-token-ca-cert-hash $token_ca --control-plane --certificate-key $cp_ca

- Once the secondary cluster is configured, the same host names should appear when retrieving the node information from Kubernetes.

  The $host variables used in secondary for each control plane and worker node must be the same as those used in primary.
Primary Cluster
Run the following command on primary to confirm the control plane and worker node status, role, age, version, internal IP, external IP, OS image, kernel version, and container runtime:

    [opc@olk8-m1 ~]$ kubectl get nodes -o wide

The following is example output.

    [opc@olk8-m1 ~]$ kubectl get nodes -o wide
    NAME      STATUS   ROLES           AGE      VERSION    INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                  KERNEL-VERSION                      CONTAINER-RUNTIME
    olk8-m1   Ready    control-plane   578d     v1.25.12   10.11.0.16      <none>        Oracle Linux Server 7.9   4.14.35-1902.302.2.el7uek.x86_64    cri-o://1.26.1
    olk8-m2   Ready    control-plane   578d     v1.25.12   10.11.210.212   <none>        Oracle Linux Server 7.9   5.4.17-2136.301.1.3.el7uek.x86_64   cri-o://1.26.1
    olk8-m3   Ready    control-plane   2y238d   v1.25.12   10.11.0.18      <none>        Oracle Linux Server 7.9   4.14.35-2047.527.2.el7uek.x86_64    cri-o://1.26.1
    olk8-w1   Ready    <none>          2y238d   v1.25.12   10.11.0.20      <none>        Oracle Linux Server 7.9   4.14.35-1902.302.2.el7uek.x86_64    cri-o://1.26.1
    olk8-w2   Ready    <none>          2y238d   v1.25.12   10.11.0.21      <none>        Oracle Linux Server 7.9   4.14.35-1902.302.2.el7uek.x86_64    cri-o://1.26.1
    olk8-w3   Ready    <none>          2y238d   v1.25.12   10.11.0.22      <none>        Oracle Linux Server 7.9   4.14.35-1902.302.2.el7uek.x86_64    cri-o://1.26.1
    [opc@olk8-m1 ~]$

Run the following command on primary to identify where the Kubernetes control plane and the Core DNS are running.

    [opc@olk8-m1 ~]$ kubectl cluster-info

Secondary Cluster
Run the following command on secondary to confirm the control plane and worker node status, role, age, version, internal IP, external IP, OS image, kernel version, and container runtime:

    [opc@k8dramsnewbastion ~]$ kubectl get node -o wide

The following is example output.

    [opc@k8dramsnewbastion ~]$ kubectl get node -o wide
    NAME      STATUS   ROLES           AGE      VERSION    INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                  KERNEL-VERSION                     CONTAINER-RUNTIME
    olk8-m1   Ready    control-plane   579d     v1.25.11   10.5.176.144   <none>        Oracle Linux Server 8.7   5.15.0-101.103.2.1.el8uek.x86_64   containerd://1.6.21
    olk8-m2   Ready    control-plane   579d     v1.25.11   10.5.176.167   <none>        Oracle Linux Server 8.7   5.15.0-101.103.2.1.el8uek.x86_64   containerd://1.6.21
    olk8-m3   Ready    control-plane   2y239d   v1.25.11   10.5.176.154   <none>        Oracle Linux Server 8.7   5.15.0-101.103.2.1.el8uek.x86_64   containerd://1.6.21
    olk8-w1   Ready    <none>          2y239d   v1.25.11   10.5.176.205   <none>        Oracle Linux Server 8.7   5.15.0-101.103.2.1.el8uek.x86_64   containerd://1.6.22
    olk8-w2   Ready    <none>          2y239d   v1.25.11   10.5.176.247   <none>        Oracle Linux Server 8.7   5.15.0-101.103.2.1.el8uek.x86_64   containerd://1.6.22
    olk8-w3   Ready    <none>          2y239d   v1.25.11   10.5.176.132   <none>        Oracle Linux Server 8.7   5.15.0-101.103.2.1.el8uek.x86_64   containerd://1.6.22

Run the following command on secondary to identify where the Kubernetes control plane and the Core DNS are running.

    [opc@k8dramsnewbastion ~]$ kubectl cluster-info
    Kubernetes control plane is running at https://k8lbr.paasmaaoracle.com:6443
    CoreDNS is running at https://k8lbr.paasmaaoracle.com:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

    To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
    [opc@k8dramsnewbastion ~]$
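As an additional check, you can compare the node names reported by both clusters; they should be identical even though the internal IPs differ. The following is a minimal sketch that assumes kubeconfig files for both clusters are available on one host (the two file paths are hypothetical placeholders).

    # Compare the sorted node names of the primary and secondary clusters.
    # The kubeconfig paths below are hypothetical; point them at your own files.
    diff \
      <(kubectl --kubeconfig /home/opc/.kube/config-primary   get nodes -o name | sort) \
      <(kubectl --kubeconfig /home/opc/.kube/config-secondary get nodes -o name | sort) \
      && echo "Node names match in primary and secondary"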
With the default settings in kubeadm cluster creation, etcd will use the same ports in primary and secondary. If the cluster in secondary needs to use different ports, then you must modify the scripts to handle it. You can use different storage locations in primary and secondary for the etcd database. The scripts will take care of restoring in the appropriate location that the secondary cluster is using for etcd.

- Install etcdctl both in the primary and secondary locations (the nodes executing the backup and restore scripts).

  The scripts for backup and restore will use etcdctl to obtain information from the cluster and to create and apply etcd snapshots. To install etcdctl, see the https://github.com/etcd-io/etcd/releases documentation. One possible download sequence is sketched below.
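  As an illustration only (not the only supported method), etcdctl can be copied out of one of the release archives published on that page. The version shown matches the etcd version reported by the example clusters; pick the release that matches your own cluster, and place the binary in the directory that you will reference as etcdctlhome.

      # Download an etcd release archive and extract only the etcdctl binary.
      ETCD_VER=v3.5.6
      curl -L -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz \
        https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz
      mkdir -p /scratch/etcdctl
      tar xzf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /scratch/etcdctl \
        --strip-components=1 etcd-${ETCD_VER}-linux-amd64/etcdctl
      /scratch/etcdctl/etcdctl version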
- Ensure that the appropriate firewall and security rules are in place so that the node executing the backup and restore operations is enabled for this type of access.

  The scripts will also need to access the cluster with kubectl and reach the different nodes through SSH and HTTP (for shell commands and etcdctl operations). A simple reachability check is sketched below.
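  The following minimal sketch checks that kind of access from the node that will run the scripts. It reuses the example user, SSH key, control plane node names, and etcd client port from this playbook; substitute your own values.

      # Verify kubectl access from the backup/restore node.
      kubectl get nodes > /dev/null && echo "kubectl access OK"
      # Verify SSH access and etcd client port reachability for each control plane node.
      for node in olk8-m1 olk8-m2 olk8-m3; do
        ssh -i /home/opc/KeyMAA.ppk -o ConnectTimeout=5 opc@${node} "echo SSH to ${node} OK"
        (echo > /dev/tcp/${node}/2379) 2>/dev/null \
          && echo "etcd port 2379 reachable on ${node}" \
          || echo "etcd port 2379 NOT reachable on ${node}"
      done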
Configure
Configure for disaster recovery.
The steps for a restore involve the following:
- Take an etcd backup in a primary location.
- Ship the backup to the secondary location.
- Restore that etcd backup in a secondary cluster.
Perform the following steps:
- Create an etcd backup in a primary Kubernetes cluster.
  - Download ALL of the scripts for etcd snapshot DR from the "Download Code" section of this document.

    Note:
    All of the scripts must be in the same path because the main scripts use other auxiliary scripts.

  - Obtain the advert_port from a control plane node's etcd configuration.

        [opc@olk8-m1 ~]$ sudo grep etcd.advertise-client-urls /etc/kubernetes/manifests/etcd.yaml | awk -F ":" '{print $NF}'
        2379

    And the same for the init_port:

        [opc@olk8-m1 ~]$ sudo grep initial-advertise-peer-urls /etc/kubernetes/manifests/etcd.yaml | awk -F ":" '{print $NF}'
        2380

    These ports are the default ones and are used by all of the control plane's etcd pods. In the rare situations where etcd has been customized to use a different init and advertise port in each node, you must customize the scripts to consider those. You can also customize the value for the infra_pod_list if other network plugins are used, or if other relevant pods or deployments must be restarted after restore in your particular case. However, in general, it can be defaulted to the values provided in the file.

  - Edit the maak8s.env script and update the variables according to your environment.

    The following is an example maak8s.env file:

        [opc@olk8-m1 ~]$ cat maak8s.env
        #sudo ready user to ssh into the control plane nodes
        export user=opc
        #ssh key for the ssh
        export ssh_key=/home/opc/KeyMAA.ppk
        #etcdctl executable's location
        export etcdctlhome=/scratch/etcdctl/
        #etcd advertise port
        export advert_port=2379
        #etcd init cluster port
        export init_port=2380
        #infrastructure pods that will be restarted on restore
        export infra_pod_list="flannel proxy controller scheduler"
  - Run the maak8-etcd-backup.sh script and provide as arguments the following fields in this order:
    - The directory where the backup will be stored
    - A “LABEL/TEXT” describing the backup
    - The location of the cluster configuration to run kubectl operations

    For example:

        [opc@olk8-m1 ~]$ ./maak8-etcd-backup.sh /backup-volumes/ "ETCD Snapshot after first configuration " /home/opc/.kubenew/config

    The script performs the following tasks:
    - Creates an etcd snapshot from the etcd master node
    - Creates a copy of the current configuration of each control plane node (manifests and certs for each control plane node), including the signing keys for the cluster
    - Records the list of nodes, pods, services, and cluster configuration
    - Stores all of the information above in a directory labeled with the date

    If the directory specified in the command line argument is /backup-volume, then the backup is stored under /backup-volume/etcd_snapshot_date, where date is the date of the backup. For example, /backup-volume/etcd_snapshot_2022-08-29_15-56-59.

    If you want to take these backups on a schedule, see the sketch below.
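    Scheduling is not part of the provided scripts, but as an illustration, a cron entry on the node that runs the backups could invoke the same command periodically. The script path, backup directory, label, and kubeconfig below are placeholders based on the examples above; adjust them to your environment.

        # Hypothetical cron entry: take an etcd snapshot backup every day at 02:00.
        0 2 * * * /home/opc/maak8-etcd-backup.sh /backup-volumes/ "Daily etcd snapshot" /home/opc/.kube/config >> /home/opc/maak8-etcd-backup-cron.log 2>&1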
- Copy the entire directory (/backup-volume/etcd_snapshot_date) to the secondary cluster.
  - Use an sftp tool or create a tar with the directory and send it to the secondary location.
  - Untar or unzip the file to make it available in the secondary system, as it was in primary.
  - Make a note of the date label in the backup (in the example above, it would be 2022-08-29_15-56-59).

  For example,

      [opc@olk8-m1 ~]$ scp -i KeyMAA.ppk -qr /backup-volume/etcd_snapshot_2022-08-29_15-56-59 154.21.39.171:/restore-volume
      [opc@olk8-m1 ~]$ ssh -i KeyMAA.ppk 154.21.39.171 "ls -lart /restore-volume"
      total 4
      drwxrwxrwt. 6 root root  252 Aug 30 15:11 ..
      drwxrwxr-x. 3 opc  opc    47 Aug 30 15:12 .
      drwxrwxr-x. 5 opc  opc  4096 Aug 30 15:12 etcd_snapshot_2022-08-29_15-56-59
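  If you prefer to ship a single archive instead of copying the directory recursively, a tar-based variant of the same transfer could look like the following sketch. It uses the example backup label and secondary address shown above; adjust them to your environment.

      # Create a tar of the backup directory, copy it, and unpack it on the secondary.
      BACKUP=etcd_snapshot_2022-08-29_15-56-59
      tar czf /tmp/${BACKUP}.tgz -C /backup-volume ${BACKUP}
      scp -i KeyMAA.ppk /tmp/${BACKUP}.tgz 154.21.39.171:/restore-volume/
      ssh -i KeyMAA.ppk 154.21.39.171 "tar xzf /restore-volume/${BACKUP}.tgz -C /restore-volume && rm /restore-volume/${BACKUP}.tgz"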
- Once the backup is available in the secondary location, follow these steps to
restore it:
  - Download ALL the scripts for etcd snapshot DR from the "Download Code" section to the secondary region node that will run the restore.

    Remember that this node must also have etcdctl installed and kubectl access to the secondary cluster.

    Note:
    Because the main scripts use other auxiliary scripts, you must have all scripts in the same path when executing the different steps.

  - Edit the maak8s.env script and update the variables according to your environment.

    You can alter the user, ssh key, and etcdctl location according to your secondary nodes, but the advert and init ports should be the same as those that are used in primary.

    The following is an example maak8s.env file:

        [opc@olk8-m1 ~]$ cat maak8s.env
        #sudo ready user to ssh into the control plane nodes
        export user=opc
        #ssh key for the ssh
        export ssh_key=/home/opc/KeyMAA.ppk
        #etcdctl executable's location
        export etcdctlhome=/scratch/etcdctl/
        #etcd advertise port
        export advert_port=2379
        #etcd init cluster port
        export init_port=2380
        #infrastructure pods that will be restarted on restore
        export infra_pod_list="flannel proxy controller scheduler"
  - Run the restore using the maak8-etcd-restore.sh script. Provide, as arguments, the root directory where the backup was copied from primary to standby, the timestamp of the backup, and the location of the kubectl configuration for the cluster.

    For example,

        [opc@k8dramsnewbastion ~]$ ./maak8-etcd-restore.sh /restore-volume 2022-08-29_15-56-59 /home/opc/.kube/config

    The script looks in the /restore-volume directory for a subdirectory named etcd_snapshot_date. Using the example, it will use /restore-volume/etcd_snapshot_2022-08-29_15-56-59.

    The restore performs the following tasks:
    - Force stops the control plane in secondary, if it is running
    - Restores the etcd snapshot in all of the control plane nodes
    - Replaces the cluster signing keys in all of the control plane nodes
    - Starts the control plane
    - Recycles all infrastructure pods (proxy, scheduler, controllers) and deployments in the cluster (to bring it to a consistent state)

    At the end of the restore, a report displays the status of the pods and etcd subsystem. For example,

        NAMESPACE      NAME                                         READY   STATUS              RESTARTS       AGE
        default        dnsutils                                     1/1     Running             0              27d
        default        nginx-deployment-566ff9bd67-6rl7f            1/1     Running             0              19s
        default        nginx-deployment-566ff9bd67-hnx69            1/1     Running             0              17s
        default        nginx-deployment-566ff9bd67-hvrwq            1/1     Running             0              15s
        default        test-pd                                      1/1     Running             0              26d
        kube-flannel   kube-flannel-ds-4f2fz                        1/1     Running             3 (22d ago)    35d
        kube-flannel   kube-flannel-ds-cvqzh                        1/1     Running             3 (22d ago)    35d
        kube-flannel   kube-flannel-ds-dmbhp                        1/1     Running             3 (22d ago)    35d
        kube-flannel   kube-flannel-ds-skhz2                        1/1     Running             3 (22d ago)    35d
        kube-flannel   kube-flannel-ds-zgkkp                        1/1     Running             4 (22d ago)    35d
        kube-flannel   kube-flannel-ds-zpbn7                        1/1     Running             3 (22d ago)    35d
        kube-system    coredns-8f994fbf8-6ghs4                      0/1     ContainerCreating   0              15s
        kube-system    coredns-8f994fbf8-d79h8                      1/1     Running             0              19s
        kube-system    coredns-8f994fbf8-wcknd                      1/1     Running             0              12s
        kube-system    coredns-8f994fbf8-zh8w4                      1/1     Running             0              19s
        kube-system    etcd-olk8-m1                                 1/1     Running             22 (89s ago)   44s
        kube-system    etcd-olk8-m2                                 1/1     Running             59 (88s ago)   44s
        kube-system    etcd-olk8-m3                                 1/1     Running             18 (88s ago)   26s
        kube-system    kube-apiserver-olk8-m1                       1/1     Running             26 (89s ago)   44s
        kube-system    kube-apiserver-olk8-m2                       1/1     Running             60 (88s ago)   42s
        kube-system    kube-apiserver-olk8-m3                       1/1     Running             18 (88s ago)   27s
        kube-system    kube-controller-manager-olk8-m1              1/1     Running             19 (89s ago)   10s
        kube-system    kube-controller-manager-olk8-m2              1/1     Running             18 (88s ago)   10s
        kube-system    kube-controller-manager-olk8-m3              1/1     Running             18 (88s ago)   10s
        kube-system    kube-flannel-ds-62dcq                        1/1     Running             0              19s
        kube-system    kube-flannel-ds-bh5w7                        1/1     Running             0              19s
        kube-system    kube-flannel-ds-cc2rk                        1/1     Running             0              19s
        kube-system    kube-flannel-ds-p8kdk                        1/1     Running             0              19s
        kube-system    kube-flannel-ds-vj8r8                        1/1     Running             0              18s
        kube-system    kube-flannel-ds-wz2kv                        1/1     Running             0              18s
        kube-system    kube-proxy-28d98                             1/1     Running             0              14s
        kube-system    kube-proxy-2gb99                             1/1     Running             0              15s
        kube-system    kube-proxy-4dfjd                             1/1     Running             0              14s
        kube-system    kube-proxy-72l5q                             1/1     Running             0              14s
        kube-system    kube-proxy-s8zbs                             1/1     Running             0              14s
        kube-system    kube-proxy-tmqnm                             1/1     Running             0              14s
        kube-system    kube-scheduler-olk8-m1                       0/1     Pending             0              5s
        kube-system    kube-scheduler-olk8-m2                       1/1     Running             18 (88s ago)   5s
        kube-system    kube-scheduler-olk8-m3                       1/1     Running             18 (88s ago)   5s
        newopns        weblogic-operator-5d74f56886-mtjp6           0/1     Terminating         0              26d
        newopns        weblogic-operator-webhook-768d9f6f79-tdt8b   0/1     Terminating         0              26d
        soans          soaedgdomain-adminserver                     0/1     Running             0              22d
        soans          soaedgdomain-soa-server1                     0/1     Running             0              22d
        soans          soaedgdomain-soa-server2                     0/1     Running             0              22d

        +--------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
        |   ENDPOINT   |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
        +--------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
        | olk8-m1:2379 | 63c63522f0be24a6 |   3.5.6 |  146 MB |      true |      false |         2 |       1195 |               1195 |        |
        | olk8-m2:2379 | 697d3746d6f10842 |   3.5.6 |  146 MB |     false |      false |         2 |       1195 |               1195 |        |
        | olk8-m3:2379 |  7a23c67093a3029 |   3.5.6 |  146 MB |     false |      false |         2 |       1195 |               1195 |        |
        +--------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
        +------------------+---------+---------+----------------------+---------------------------+------------+
        |        ID        | STATUS  |  NAME   |      PEER ADDRS      |       CLIENT ADDRS        | IS LEARNER |
        +------------------+---------+---------+----------------------+---------------------------+------------+
        |  7a23c67093a3029 | started | olk8-m3 | https://olk8-m3:2380 | https://10.5.176.154:2379 |      false |
        | 63c63522f0be24a6 | started | olk8-m1 | https://olk8-m1:2380 | https://10.5.176.144:2379 |      false |
        | 697d3746d6f10842 | started | olk8-m2 | https://olk8-m2:2380 | https://10.5.176.167:2379 |      false |
        +------------------+---------+---------+----------------------+---------------------------+------------+

        Restore completed at 2023-08-30_15-18-22
        [opc@k8dramsnewbastion ~]$
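    If you want to re-check the etcd subsystem on the secondary at any later time, the following is a minimal sketch. It assumes it is run on a control plane node of the secondary (where the kubeadm-generated etcd certificates are in their default location), reuses the etcdctlhome and advert_port values from the example maak8s.env file, and uses the example control plane node names; substitute your own.

        # Query etcd endpoint status on the restored (secondary) cluster.
        # Run with sudo if the certificate files are not readable by your user.
        export ETCDCTL_API=3
        /scratch/etcdctl/etcdctl --endpoints https://olk8-m1:2379,https://olk8-m2:2379,https://olk8-m3:2379 \
          --cacert /etc/kubernetes/pki/etcd/ca.crt \
          --cert   /etc/kubernetes/pki/etcd/server.crt \
          --key    /etc/kubernetes/pki/etcd/server.key \
          endpoint status -w table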
Verify
After running the maak8DR-apply.sh script, verify that all of the artifacts that existed in the primary cluster have been replicated to the secondary cluster. Look at the secondary cluster and verify that the pods in the secondary site are running without errors.
- Check the status of the secondary until the required pods match the state in primary. By default, the pods and deployments are started in the secondary region. At the end of the restore, the status of the secondary cluster is shown. Some pods may take additional time to reach the RUNNING state. A simple status check is sketched below.
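  One way to watch the secondary converge (a sketch only; it lists the pods that are not yet Running or Completed) is:

      # List pods in the restored cluster that are not yet Running or Completed.
      kubectl get pods -A --no-headers | awk '$4 != "Running" && $4 != "Completed"'
      # Or simply watch the overall pod status until it matches primary.
      kubectl get pods -A -o wide --watch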
- Check the restore log in the secondary for possible errors.

  The log location is reported at the beginning of the restore. By default, the log is created under the directory where the backup itself was located, at /backup_dir/etcd_snapshot_backup-date/restore_attempted_restore-date/restore.log. Another log is created specifically for the etcd snapshot restore operation, at /backup_dir/etcd_snapshot_backup-date/restore_attempted_restore-date/etcd_op.log. A quick scan of both logs is sketched below.
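  For illustration, both logs can be scanned for problems in one pass; the backup-date and restore-date placeholders must be replaced with your actual timestamps.

      # Search the restore logs for errors or failures (replace the placeholders with real values).
      RESTORE_DIR=/backup_dir/etcd_snapshot_backup-date/restore_attempted_restore-date
      grep -iE "error|fail" ${RESTORE_DIR}/restore.log ${RESTORE_DIR}/etcd_op.log \
        || echo "No errors found in the restore logs"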
- (Optional) Revert back.

  In addition to the restore logs, a backup of the previous /etc/kubernetes configuration is created for each one of the control plane nodes under the /backup_dir/etcd_snapshot_backup-date/restore_attempted_restore-date/current_etc_kubernetes directory. Similarly, the etcd databases in each node BEFORE the restore are copied to /backup_dir/etcd_snapshot_backup-date/restore_attempted_restore-date/node_name. You can use these to revert back to the cluster configuration that existed before the restore was executed.