Node Failure Guide
A Node failure can occur for many reasons, including hardware failures and network outages. This guide describes what to
expect when a Node fails and how to recover. Recovery depends on the Storage Provisioner and the type of storage that you use.
NOTE
This guide assumes that the storage provided in the cluster is physically separate from the Node and is recoverable. It does not apply to local storage on the Node.
What to expect
By default, when a Node fails:
- It may take up to a minute for the failure to be reflected in the Kubernetes API server and for the Node status to change to NotReady. (A sketch for checking Node and Pod status from a script follows this list.)
- After the Node status has been NotReady for about five minutes, the status of the Pods on that Node changes to Unknown or NodeLost.
- The status of Pods managed by controllers, such as DaemonSets, StatefulSets, and Deployments, changes to Terminating.
  NOTE: Pods without a controller, started directly from a PodSpec, are not terminated. They must be manually deleted and recreated.
- New Pods start on the Nodes that remain in the Ready status.
  NOTE: StatefulSets are a special case. The StatefulSet controller maintains an ordinal list of Pods, one for each name, and it does not start a new Pod with the same name as an existing Pod.
- Pods with associated Persistent Volumes of type ReadWriteOnce do not become Ready. The new Pods try to attach to volumes that are still attached to the old Pods, which are still Terminating. This happens because a Persistent Volume of type ReadWriteOnce can be attached to only a single Node at a time, and the new Pod resides on another Node.
- If multiple Availability Domains are used in the Kubernetes cluster and the failed Node is the last one in its Availability Domain, the existing volumes are no longer reachable by new Pods in a different Availability Domain.
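If you prefer to check this behavior from a script rather than with kubectl, the following is a minimal sketch using the official Kubernetes Python client. It is not part of the recovery procedure; it assumes a reachable kubeconfig and simply lists NotReady Nodes and the Pods still bound to them.

```python
# Sketch: list NotReady Nodes and the Pods still scheduled on them.
# Assumes a reachable kubeconfig; equivalent information is available from
# "kubectl get nodes" and "kubectl get pods -o wide".
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() when running in a Pod
v1 = client.CoreV1Api()

for node in v1.list_node().items:
    ready = next((c for c in node.status.conditions or [] if c.type == "Ready"), None)
    if ready is None or ready.status != "True":
        print(f"Node {node.metadata.name} is NotReady")

        # Pods scheduled on the failed Node, with their phase and termination state.
        pods = v1.list_pod_for_all_namespaces(
            field_selector=f"spec.nodeName={node.metadata.name}")
        for pod in pods.items:
            terminating = pod.metadata.deletion_timestamp is not None
            print(f"  {pod.metadata.namespace}/{pod.metadata.name}: "
                  f"phase={pod.status.phase}, terminating={terminating}")
```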
About recovery
After a Node fails, if the Node can be recovered within five minutes, the Pods return to the Running state. If the Node is not recovered within five minutes,
the Pods complete termination and are deleted from the Kubernetes API server. New Pods that use Persistent Volumes of type ReadWriteOnce
can then mount those Persistent Volumes and change to Running.
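To see why a new Pod cannot mount a ReadWriteOnce volume yet, it can help to look at which volumes the cluster still considers attached to the failed Node. The sketch below is one way to do that with the Kubernetes Python client; it assumes your Storage Provisioner is a CSI driver that records attachments as VolumeAttachment objects, and "worker-1" is a placeholder Node name.

```python
# Sketch: list volumes the cluster still considers attached to a failed Node.
# Assumes a CSI driver that records attachments as VolumeAttachment objects;
# "worker-1" is a placeholder for the failed Node's name.
from kubernetes import client, config

config.load_kube_config()
storage = client.StorageV1Api()

FAILED_NODE = "worker-1"  # placeholder

for va in storage.list_volume_attachment().items:
    if va.spec.node_name == FAILED_NODE:
        pv_name = va.spec.source.persistent_volume_name
        attached = va.status.attached if va.status else None
        print(f"{va.metadata.name}: PV={pv_name}, attached={attached}")
```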
If a Node cannot be recovered and is replaced, deleting the old Node from the Kubernetes API server terminates
its old Pods and releases its Persistent Volumes of type ReadWriteOnce so that they can be mounted by new Pods.
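For reference, you can delete the failed Node with kubectl delete node <name>, or with a short script such as the sketch below, where the Node name is a placeholder. Only do this for a Node that will not rejoin the cluster.

```python
# Sketch: delete a failed Node object so that its Pods finish terminating and
# its ReadWriteOnce Persistent Volumes are released for new Pods.
# "worker-1" is a placeholder; equivalent to "kubectl delete node worker-1".
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

v1.delete_node(name="worker-1")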
If multiple Availability Domains are used in the Kubernetes cluster, the replacement Node should be added to the same Availability Domain
that the deleted Node occupied. This allows the Pods to be scheduled on the replacement Node, which can reach the Persistent Volumes in that
Availability Domain, after which the Pod status changes to Running.
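If you want to verify that the replacement Node landed in the correct Availability Domain, one option is to compare zone labels. The sketch below assumes your cluster sets the standard topology.kubernetes.io/zone label on Nodes and uses Persistent Volume node affinity to pin volumes to a zone; the Node name is a placeholder, and your provisioner may use a different (for example, legacy) zone label.

```python
# Sketch: compare a replacement Node's zone label with the zones that existing
# Persistent Volumes are pinned to via node affinity.
# Assumes the standard "topology.kubernetes.io/zone" label; names are placeholders.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

ZONE_LABEL = "topology.kubernetes.io/zone"

node = v1.read_node(name="worker-1-replacement")  # placeholder Node name
print("Replacement Node zone:", node.metadata.labels.get(ZONE_LABEL))

for pv in v1.list_persistent_volume().items:
    affinity = pv.spec.node_affinity
    if affinity and affinity.required:
        for term in affinity.required.node_selector_terms:
            for expr in term.match_expressions or []:
                if expr.key == ZONE_LABEL:
                    print(f"PV {pv.metadata.name} requires zone(s): {expr.values}")
```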
Do not force delete Pods or Persistent Volumes on a failed Node that you plan to recover or replace. Force deleting Pods or Persistent Volumes on a failed Node
can lead to data loss and, in the case of StatefulSets, to split-brain scenarios. For more information about StatefulSets,
see Force Delete StatefulSet Pods in the Kubernetes documentation.
You can force delete Pods and Persistent Volumes when a failed Node cannot be recovered or cannot be replaced in the same Availability Domain as
the original Node.
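If you do reach that point, the force deletion itself is typically done with kubectl delete pod --force --grace-period=0. A minimal equivalent with the Python client is sketched below; the Pod name and namespace are placeholders, and this should only be run for Pods on a Node that will never return.

```python
# Sketch: force delete a Pod that is stuck Terminating on a Node that will not return.
# Last resort only (see the warning above); name and namespace are placeholders.
# Equivalent to: kubectl delete pod my-app-0 -n my-namespace --force --grace-period=0
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

v1.delete_namespaced_pod(
    name="my-app-0",
    namespace="my-namespace",
    grace_period_seconds=0,
)
```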