Patching the Kubernetes Cluster

Caution:

Ensure that all preparation steps for system patching have been completed. For instructions, see Prepare for Patching.

Patching of the Kubernetes container orchestration environment is also kept separate from operating system patching. With a single command, all Kubernetes packages, such as kubeadm, kubectl, and kubelet, are patched on the three management nodes and all the compute nodes. Note that this patching does not include the microservices running in Kubernetes containers.

Note:

In software version 3.0.2-b892153 or later, all patch operations are based on the upgrade plan, which is generated when the pre-upgrade command is executed. For more information, see Prepare for Patching. When a component is already at the required version, the patch operation is skipped. However, patching with the same version can be forced using the Service Web UI or the Service CLI command option (force=True), if necessary.

Before patching Kubernetes, ensure that synchronization of the mirror on the shared storage is complete by issuing the syncUpstreamUlnMirror command. For more information, see Prepare for Patching.
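
For illustration, the mirror synchronization followed by a forced re-patch at the same version might look as follows in the Service CLI. This is a sketch based on the commands and the force option described above; output is omitted.

PCA-ADMIN> syncUpstreamUlnMirror
PCA-ADMIN> patchKubernetes force=True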

About the Kubernetes Upgrade Process

To ensure compatibility and continuation of service, Kubernetes must be upgraded one version at a time. Skipping versions, whether major or minor, is not supported. The Private Cloud Appliance Upgrader manages this process by upgrading or patching all parts of the Kubernetes cluster to the next available version, repeating the same sequence of operations until the entire environment runs the latest Kubernetes version available from the appliance software repositories.

Upgrading or patching the Kubernetes cluster is a time-consuming process that involves the Private Cloud Appliance management nodes and compute nodes. Each additional compute node extends the process by approximately 10 minutes for each incremental version of Kubernetes.

With appliance software version 3.0.2-b925538, the container orchestration environment is upgraded or patched from Kubernetes version 1.20.x to version 1.25.y, meaning the entire process must run 5 times. After each successful run, the repository is synchronized to retrieve the next required version. However, with this version of the appliance software the repository is reconfigured to allow multiple versions of the Kubernetes packages, so the resync is no longer required.

Each individual Kubernetes node upgrade is expected to take around 10 minutes. Testing indicates that upgrading or patching the Private Cloud Appliance Kubernetes cluster from version 1.20 to version 1.25 takes approximately 4-5 hours for a base rack configuration with 3 management nodes and 3 compute nodes. On a full rack with 20 compute nodes the entire process requires at least 9 hours and may take up to 18 hours to complete. The estimated time for the rack's specific configuration is reported in the upgrade plan.
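
As a quick plausibility check of the base rack estimate, the stated assumptions can be multiplied out (an illustration only; actual timings vary):

  6 nodes × 5 incremental versions × 10 minutes per node = 300 minutes ≈ 5 hours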

To monitor the upgrade or patching progress, periodically check the job status or the logs.

  • Check job status through the Service CLI: getUpgradeJob upgradeJobId=<id>

  • View Upgrader logs on a management node: tail -f /nfs/shared_storage/pca_upgrader/log/pca-upgrader_kubernetes_cluster_<time_stamp>.log, as shown in the sketch after this list.
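
For example, on a management node you can follow the most recent Kubernetes cluster log with a shell one-liner. This is a sketch; it assumes standard shell globbing and that the newest matching file belongs to the running job.

# tail -f $(ls -t /nfs/shared_storage/pca_upgrader/log/pca-upgrader_kubernetes_cluster_*.log | head -1)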

During Kubernetes upgrade or patching, certain services may be temporarily unavailable.

  • The Compute Web UI, Service Web UI, OCI CLI, and Service CLI can all become temporarily unavailable. Users should wait a few minutes before attempting their operations again. Administrative operations in the Service Enclave (UI or CLI) must be avoided during upgrade or patching.

  • When the Kubernetes upgrade is initiated, the Kubernetes Workload Monitoring Operator (Sauron service) is taken down. As a result, the Grafana, Prometheus, and other Sauron ingress endpoints cannot be accessed. They become available again after both the Kubernetes cluster and the containerized microservices (platform layer) upgrade or patching processes have been completed.

Managing Unprovisioned Compute Nodes

If you upgrade or patch the Kubernetes cluster on a Private Cloud Appliance that contains unprovisioned compute nodes, provisioning issues can occur later. Because those compute nodes were not part of the Kubernetes cluster when the newer version was applied, you may need to rediscover them before they can be provisioned.

If compute node provisioning fails after upgrading or patching the Kubernetes cluster, log in to one of the management nodes using SSH. Rediscover the unprovisioned compute nodes by running the following command for each affected host name:

# pca-admin compute node rediscover --hostname pcacn000
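
If several compute nodes are affected, the same command can be repeated for each of them, for example with a small shell loop. This is an illustrative sketch; the host names pcacn004 and pcacn005 are placeholders for your own unprovisioned nodes.

# for cn in pcacn004 pcacn005; do pca-admin compute node rediscover --hostname "$cn"; done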

When the compute nodes have been rediscovered, provisioning is expected to work as intended.

For more information about provisioning, refer to "Performing Compute Node Operations" in the chapter Hardware Administration of the Oracle Private Cloud Appliance Administrator Guide.

Patch the Kubernetes Cluster Using the Service Web UI

  1. In the navigation menu, click Upgrade & Patching.

  2. In the top-right corner of the Upgrade Jobs page, click Create Upgrade or Patch.

    The Create Request window appears. Choose Patch as the Request Type.

  3. Select the appropriate patch request type: Patch Kubernetes.

  4. If required, fill out the patch parameters:

    • ULN: Enter the fully qualified domain name of the ULN mirror in your data center. This parameter is deprecated in software version 3.0.2-b892153 and later.

    • Advanced Options JSON: Not available.

    • Log Level: Optionally, select a specific log level for the upgrade log file. The default log level is "Information". For maximum detail, select "Debug".

  5. Click Create Request.

    The new patch request appears in the Upgrade Jobs table.

Patch the Kubernetes Cluster Using the Service CLI

  1. Enter the patch command.

    PCA-ADMIN> patchKubernetes
    Command: patchKubernetes
    Status: Success
    Time: 2022-01-18 20:02:05,408 UTC
    Data: Service request has been submitted.  Upgrade Job ID = 1642509549088-kubernetes-51898 \
    Upgrade Request ID = UWS-4f0d9e99-a515-4170-ab35-9f8bdcbdb2b5
  2. Use the request ID and the job ID to check the status of the upgrade process.

    PCA-ADMIN> getupgradejobs
    Command: getupgradejobs
    Status: Success
    Time: 2023-01-22 19:52:16,398 UTC
    Data:
      id                               upgradeRequestId                           commandName   result
      --                               ----------------                           -----------   ------
      1642509549088-kubernetes-51898   UWS-4f0d9e99-a515-4170-ab35-9f8bdcbdb2b5   kubernetes    Passed
      1642492793827-oci-12162          UWS-6e06bbb7-16b8-49ba-9c33-f42fffbe1323   oci           Passed
    
    PCA-ADMIN> getUpgradeJob upgradeJobId=1642509549088-kubernetes-51898
    Command: getUpgradeJob upgradeJobId=1642509549088-kubernetes-51898
    Status: Success
    Time: 2023-01-22 20:11:43,804 UTC
    Data:
      Upgrade Request Id = UWS-4f0d9e99-a515-4170-ab35-9f8bdcbdb2b5
      Name = kubernetes
    [...]