6 Scaling a Kubernetes Cluster

A Kubernetes cluster might consist of a single control plane node or many control plane nodes, and one or more worker nodes. The more applications you run in a cluster, the more resources (nodes) you need. So, what do you do if you need more resources to handle a high volume of workloads or traffic, or to deploy more services to the cluster? You add extra nodes to the cluster. Or, what happens if nodes in the cluster are faulty? You remove them.

Scaling a Kubernetes cluster means updating the cluster by adding nodes to it or removing nodes from it. When you add nodes to a Kubernetes cluster, you're scaling up the cluster; when you remove nodes from the cluster, you're scaling down the cluster.

To replace a node in a cluster, first scale up the cluster (add the new node) and then scale down the cluster (remove the old node).
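
For example, a minimal sketch of replacing a worker node in two separate commands, reusing the environment name, module name, and port from the examples in this chapter; workerold.example.com and workernew.example.com are hypothetical names for the node being retired and its replacement, and only the worker node list is given, so the control plane nodes are left unchanged:

    # Scale up first: add the replacement node alongside the existing nodes.
    olcnectl module update \
    --environment-name myenvironment \
    --name mycluster \
    --worker-nodes worker1.example.com:8090,workerold.example.com:8090,workernew.example.com:8090

    # Then scale down: repeat the command without the node being retired.
    olcnectl module update \
    --environment-name myenvironment \
    --name mycluster \
    --worker-nodes worker1.example.com:8090,workernew.example.com:8090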

Note:

We recommend that you don't scale the cluster up and down at the same time. Scale up, then scale down, in two separate commands. To avoid split-brain scenarios, scale the Kubernetes cluster control plane nodes in odd numbers. For example, using 3, 5, or 7 control plane nodes ensures the reliability of the cluster. An odd number preserves quorum: a cluster with 2N+1 control plane nodes can tolerate the failure of N of them.

If you used the --apiserver-advertise-address option when you created a Kubernetes module, you can't scale up from a cluster with a single control plane node to a highly available (HA) cluster with many control plane nodes. However, if you used the --virtual-ip or --load-balancer options, you can scale up, even from a cluster with a single control plane node.

Important:

The --apiserver-advertise-address option has been deprecated. Use the --control-plane-nodes option.

When you scale a Kubernetes cluster, the following actions are completed:

  1. A backup of the cluster is taken. If something goes wrong during scaling up or scaling down, you can revert to the previous state to restore the cluster. For more information about backing up and restoring a Kubernetes cluster, see Backing up and Restoring a Kubernetes Cluster. (A manual backup command is sketched after this list.)

  2. Any nodes that you want to add to the cluster are validated. If the nodes have any validation issues, such as firewall issues, the update to the cluster can't proceed and the nodes can't be added to the cluster. You're prompted with the actions needed to resolve the validation issues so that the nodes can be added to the cluster.

  3. The control plane nodes and worker nodes are added to or removed from the cluster.

  4. The cluster is checked to ensure all nodes are healthy. After validation of the cluster is completed, the cluster is scaled and you can access it.
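
The backup described in item 1 is taken automatically. To also take a manual backup of the control plane nodes before scaling, a minimal sketch, assuming the environment and module names used in the examples in this chapter (see Backing up and Restoring a Kubernetes Cluster for the full procedure):

    olcnectl module backup \
    --environment-name myenvironment \
    --name mycluster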

Tip:

The examples in this chapter show you how to scale up and down by changing the control plane nodes and worker nodes at the same time, providing all the nodes to be included in the cluster using the --control-plane-nodes and --worker-nodes options. If you want to scale only the control plane nodes, provide only the list of control plane nodes to include in the cluster using the --control-plane-nodes option (you don't need to provide the worker nodes). Similarly, if you want to scale only the worker nodes, provide only the list of worker nodes using the --worker-nodes option.
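
For example, a minimal sketch of scaling up only the worker nodes, reusing the environment name, module name, port, and worker node names from the examples in this chapter; the control plane nodes aren't listed, so they're left unchanged:

    olcnectl module update \
    --environment-name myenvironment \
    --name mycluster \
    --worker-nodes worker1.example.com:8090,worker2.example.com:8090,worker3.example.com:8090,worker4.example.com:8090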

Scaling Up a Kubernetes Cluster

Before you scale up a Kubernetes cluster, set up the new nodes so they can be added to the cluster.

To prepare a node:

  1. Set up the node so it can be added to a Kubernetes cluster. For information on setting up a Kubernetes node, see Getting Started.

  2. If you're using private X.509 certificates for nodes, copy the certificates to the node. You don't need to do anything if you're using Vault to provide certificates for nodes. For information on using X.509 certificates, see Getting Started.

  3. Start the Platform Agent service. For information on starting the Platform Agent, see Getting Started. (A sketch of the command follows this list.)
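
For step 3, a minimal sketch of starting the Platform Agent on each new node, assuming the olcne-agent systemd service installed with the Platform Agent packages (see Getting Started for the full steps):

    sudo systemctl enable --now olcne-agent.service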

After completing these actions, use the instructions in this procedure to add nodes to a Kubernetes cluster.

To scale up a Kubernetes cluster:

  1. From a control plane node of the Kubernetes cluster, use the kubectl get nodes command to see the control plane nodes and worker nodes of the cluster.

    kubectl get nodes

    The output looks similar to:

    NAME                   STATUS   ROLES           AGE     VERSION
    control1.example.com   Ready    control-plane   26h     version
    control2.example.com   Ready    control-plane   26h     version
    control3.example.com   Ready    control-plane   26h     version
    worker1.example.com    Ready    <none>          26h     version
    worker2.example.com    Ready    <none>          26h     version
    worker3.example.com    Ready    <none>          26h     version

    In this example, three control plane nodes are in the Kubernetes cluster:

    • control1.example.com

    • control2.example.com

    • control3.example.com

    Three worker nodes are also in the cluster:

    • worker1.example.com

    • worker2.example.com

    • worker3.example.com

  2. Use the olcnectl module update command to scale up a Kubernetes cluster.

    In this example, the Kubernetes cluster is scaled up so that it has four control plane nodes and five worker nodes. This example adds a new control plane node (control4.example.com) and two new worker nodes (worker4.example.com and worker5.example.com) to the Kubernetes module named mycluster. From the operator node, run:

    olcnectl module update \
    --environment-name myenvironment \
    --name mycluster \
    --control-plane-nodes control1.example.com:8090,control2.example.com:8090,control3.example.com:8090,control4.example.com:8090 \
    --worker-nodes worker1.example.com:8090,worker2.example.com:8090,worker3.example.com:8090,worker4.example.com:8090,worker5.example.com:8090

    If you're scaling up from a single control plane node to a highly available cluster, ensure that a load balancer is specified for the cluster. Without a load balancer, you can't scale up the control plane nodes, and so you can't move from a single control plane node to a highly available cluster.

    You can optionally include the --generate-scripts option. This option generates scripts you can run for each node in the event of any validation failures during scaling. A script is created for each node in the module, saved to the local directory, and named hostname:8090.sh.

    You can also optionally include the --force option to suppress the prompt displayed to confirm you want to continue with scaling the cluster.
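
    For example, a sketch of the same scale-up command with both optional flags appended:

    olcnectl module update \
    --environment-name myenvironment \
    --name mycluster \
    --control-plane-nodes control1.example.com:8090,control2.example.com:8090,control3.example.com:8090,control4.example.com:8090 \
    --worker-nodes worker1.example.com:8090,worker2.example.com:8090,worker3.example.com:8090,worker4.example.com:8090,worker5.example.com:8090 \
    --generate-scripts \
    --force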

  3. On a control plane node of the Kubernetes cluster, use the kubectl get nodes command to verify the cluster is scaled up to include the new control plane node and worker nodes.

    kubectl get nodes

    The output looks similar to:

    NAME                   STATUS   ROLES           AGE     VERSION
    control1.example.com   Ready    control-plane   26h     version
    control2.example.com   Ready    control-plane   26h     version
    control3.example.com   Ready    control-plane   26h     version
    control4.example.com   Ready    control-plane   2m38s   version
    worker1.example.com    Ready    <none>          26h     version
    worker2.example.com    Ready    <none>          26h     version
    worker3.example.com    Ready    <none>          26h     version
    worker4.example.com    Ready    <none>          2m38s   version
    worker5.example.com    Ready    <none>          2m38s   version

Scaling Down a Kubernetes Cluster

This procedure shows you how to remove nodes from a Kubernetes cluster.

Attention:

Be careful if you're scaling down the control plane nodes of the cluster. If you have two control plane nodes and you scale down to only one control plane node, you have a single point of failure.

To scale down a Kubernetes cluster:

  1. From a control plane node of the Kubernetes cluster, use the kubectl get nodes command to see the control plane nodes and worker nodes of the cluster.

    kubectl get nodes

    The output looks similar to:

    NAME                   STATUS   ROLES           AGE     VERSION
    control1.example.com   Ready    control-plane   26h     version
    control2.example.com   Ready    control-plane   26h     version
    control3.example.com   Ready    control-plane   26h     version
    control4.example.com   Ready    control-plane   2m38s   version
    worker1.example.com    Ready    <none>          26h     version
    worker2.example.com    Ready    <none>          26h     version
    worker3.example.com    Ready    <none>          26h     version
    worker4.example.com    Ready    <none>          2m38s   version
    worker5.example.com    Ready    <none>          2m38s   version

    In this example, four control plane nodes are in the Kubernetes cluster:

    • control1.example.com

    • control2.example.com

    • control3.example.com

    • control4.example.com

    Five worker nodes are also in the cluster:

    • worker1.example.com

    • worker2.example.com

    • worker3.example.com

    • worker4.example.com

    • worker5.example.com

  2. Use the olcnectl module update command to scale down a Kubernetes cluster.

    In this example, the Kubernetes cluster is scaled down so that it has three control plane nodes and three worker nodes. This example removes a control plane node (control4.example.com) and two worker nodes (worker4.example.com and worker5.example.com) from the Kubernetes module named mycluster. As the nodes are no longer listed in the --control-plane-nodes or --worker-nodes options, they're removed from the cluster. From the operator node, run:

    olcnectl module update \
    --environment-name myenvironment \
    --name mycluster \
    --control-plane-nodes control1.example.com:8090,control2.example.com:8090,control3.example.com:8090 \
    --worker-nodes worker1.example.com:8090,worker2.example.com:8090,worker3.example.com:8090

  3. On a control plane node of the Kubernetes cluster, use the kubectl get nodes command to verify the cluster is scaled down to remove the control plane node and worker nodes.

    kubectl get nodes

    The output looks similar to:

    NAME                   STATUS   ROLES           AGE     VERSION
    control1.example.com   Ready    control-plane   26h     version
    control2.example.com   Ready    control-plane   26h     version
    control3.example.com   Ready    control-plane   26h     version
    worker1.example.com    Ready    <none>          26h     version
    worker2.example.com    Ready    <none>          26h     version
    worker3.example.com    Ready    <none>          26h     version