Upgrading the Kubernetes Cluster
Caution:
Ensure that all preparation steps for system upgrade have been completed. For instructions, see Preparing the Upgrade Environment.
The Kubernetes container orchestration environment upgrade is also kept separate from the operating system. With a single command, all Kubernetes packages, such as kubeadm, kubectl and kubelet, are upgraded on the three management nodes and all the compute nodes. Note that this upgrade does not include the microservices running within the Kubernetes cluster.
For dependency reasons, Kubernetes must be upgraded after the management node host operating system. The Kubernetes upgrade command has no mandatory parameters.
About the Kubernetes Upgrade Process
To ensure compatibility and continuation of service, Kubernetes must be upgraded one version at a time. Skipping versions – major or minor – is not supported. The Private Cloud Appliance Upgrader manages this process by upgrading or patching all parts of the Kubernetes cluster to the next available version, repeating the same sequence of operations until the entire environment runs the latest Kubernetes version available from the appliance software repositories.
Upgrading or patching the Kubernetes cluster is a time-consuming process that involves the Private Cloud Appliance management nodes and compute nodes. Each additional compute node extends the process by approximately 10 minutes for each incremental version of Kubernetes.
With appliance software version 3.0.2-b925538, the container orchestration environment is upgraded or patched from Kubernetes version 1.20.x to version 1.25.y, meaning the entire process must run 5 times. Previously, the repository was synchronized after each successful run to retrieve the next required version. With this version of the appliance software, the repository is reconfigured to allow multiple versions of the Kubernetes packages, so this resync is no longer required.
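The one-version-at-a-time rule can be illustrated with a minimal sketch. This is not the Upgrader's actual logic; the function name `upgrade_path` is hypothetical and only shows why going from 1.20 to 1.25 requires 5 runs.

```python
def upgrade_path(current: str, target: str) -> list[str]:
    """Return the minor versions to step through, one at a time.

    Hypothetical helper for illustration only; patch suffixes are ignored.
    """
    cur_major, cur_minor = (int(p) for p in current.split(".")[:2])
    tgt_major, tgt_minor = (int(p) for p in target.split(".")[:2])
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles a shared major version")
    if tgt_minor < cur_minor:
        raise ValueError("target version is older than current version")
    # Every intermediate minor version is visited; none is skipped.
    return [f"{cur_major}.{m}" for m in range(cur_minor + 1, tgt_minor + 1)]


steps = upgrade_path("1.20", "1.25")
print(steps)       # ['1.21', '1.22', '1.23', '1.24', '1.25']
print(len(steps))  # 5
```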
Each individual Kubernetes node upgrade is expected to take around 10 minutes. Testing indicates that upgrading or patching the Private Cloud Appliance Kubernetes cluster from version 1.20 to version 1.25 takes approximately 4-5 hours for a base rack configuration with 3 management nodes and 3 compute nodes. On a full rack with 20 compute nodes the entire process requires at least 9 hours and may take up to 18 hours to complete. The estimated time for the rack's specific configuration is reported in the upgrade plan.
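As a rough cross-check of these figures, the sketch below assumes a purely linear model: about 10 minutes per node per incremental version, with nodes upgraded one after another. The function name and the model itself are assumptions for illustration. The model gives about 5 hours for the base rack, in line with the tested range, but lands slightly above the quoted upper bound for a full rack, so the estimate reported in the upgrade plan remains the figure to rely on.

```python
# Approximate per-node figure quoted in the text; treated here as a constant.
MINUTES_PER_NODE_PER_VERSION = 10


def estimate_minutes(management_nodes: int, compute_nodes: int, version_steps: int) -> int:
    """Linear estimate: every node is upgraded once per incremental version."""
    nodes = management_nodes + compute_nodes
    return nodes * version_steps * MINUTES_PER_NODE_PER_VERSION


base = estimate_minutes(3, 3, 5)    # base rack, 1.20 -> 1.25
full = estimate_minutes(3, 20, 5)   # full rack, 1.20 -> 1.25
print(f"base rack: {base} min (~{base / 60:.1f} h)")  # base rack: 300 min (~5.0 h)
print(f"full rack: {full} min (~{full / 60:.1f} h)")  # full rack: 1150 min (~19.2 h)
```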
To monitor the upgrade or patching progress, periodically check the job status or the logs.
-
Check job status through the Service CLI:
getUpgradeJob upgradeJobId=<id>
-
View Upgrader logs on a management node:
tail -f /nfs/shared_storage/pca_upgrader/log/pca-upgrader_kubernetes_cluster_<time_stamp>.log
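Because the log file name contains a time stamp, it can help to locate the most recent Kubernetes log before tailing it. The following is a minimal sketch, assuming it runs on a management node with access to the shared log directory shown above; the helper name `newest_kubernetes_log` is hypothetical.

```python
from pathlib import Path

# Directory and file name pattern taken from the tail command in the text.
LOG_DIR = "/nfs/shared_storage/pca_upgrader/log"
LOG_PATTERN = "pca-upgrader_kubernetes_cluster_*.log"


def newest_kubernetes_log(log_dir: str = LOG_DIR):
    """Return the most recently modified Kubernetes Upgrader log, or None."""
    logs = list(Path(log_dir).glob(LOG_PATTERN))
    if not logs:
        return None
    return max(logs, key=lambda p: p.stat().st_mtime)


latest = newest_kubernetes_log()
print(latest if latest else "no Kubernetes upgrader log found")
```

The returned path can then be passed to `tail -f`.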
During Kubernetes upgrade or patching, certain services could be temporarily unavailable.
-
The Compute Web UI, Service Web UI, OCI CLI, and Service CLI can all become temporarily unavailable. Users should wait a few minutes before attempting their operations again. Administrative operations in the Service Enclave (UI or CLI) must be avoided during upgrade or patching.
-
When the Kubernetes upgrade is initiated, the Kubernetes Workload Monitoring Operator (Sauron service) is taken down. As a result, the Grafana, Prometheus, and other Sauron ingress endpoints cannot be accessed. They become available again after both the Kubernetes cluster and the containerized microservices (platform layer) upgrade or patching processes have been completed.
Managing Unprovisioned Compute Nodes
If you upgrade or patch the Kubernetes cluster on a Private Cloud Appliance that contains unprovisioned compute nodes, there could be provisioning issues later. Because those compute nodes were not part of the Kubernetes cluster when the newer version was applied, you may need to rediscover them first.
If compute node provisioning fails after upgrading or patching the Kubernetes cluster, log on to one of the management nodes using ssh. Rediscover the unprovisioned compute nodes by running the following command with the appropriate host names:
# pca-admin compute node rediscover --hostname pcacn000
When the compute nodes have been rediscovered, provisioning is expected to work as intended.
For more information about provisioning, refer to "Performing Compute Node Operations" in the chapter Hardware Administration of the Oracle Private Cloud Appliance Administrator Guide.
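When several unprovisioned compute nodes are affected, the rediscover command is repeated once per host name. The sketch below only builds the command lines so they can be reviewed before being run as root on a management node; the host names `pcacn001` and `pcacn002` are hypothetical placeholders.

```python
import shlex


def rediscover_commands(hostnames):
    """Build one rediscover command line per compute node host name."""
    return [
        shlex.join(["pca-admin", "compute", "node", "rediscover", "--hostname", host])
        for host in hostnames
    ]


# Hypothetical host names; substitute the unprovisioned nodes of your rack.
for command in rediscover_commands(["pcacn001", "pcacn002"]):
    print(command)
```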
Upgrade the Kubernetes Cluster Using the Service Web UI
-
In the navigation menu, click Upgrade & Patching.
-
In the top-right corner of the Upgrade Jobs page, click Create Upgrade or Patch.
The Create Request window appears. Choose Upgrade as the Request Type.
-
Select the appropriate upgrade request type: Upgrade Kubernetes.
-
If required, fill out the upgrade request parameters:
-
Advanced Options JSON: Optionally, add a JSON string to provide additional command parameters.
-
Image Location: This parameter is deprecated.
-
ISO Checksum: This parameter is deprecated.
-
Log Level: Optionally, select a specific log level for the upgrade log file. The default log level is "Information". For maximum detail, select "Debug".
-
Click Create Request.
The new upgrade request appears in the Upgrade Jobs table.
Upgrade the Kubernetes Cluster Using the Service CLI
-
Enter the upgrade command.
PCA-ADMIN> upgradeKubernetes
Command: upgradeKubernetes
Status: Success
Time: 2021-09-26 17:20:09,423 UTC
Data:
  Service request has been submitted. Upgrade Job Id = 1632849609034-kubernetes-35545 Upgrade Request Id = UWS-edfa3b32-c32a-4b67-8df5-2357096052bf
-
Use the request ID and the job ID to check the status of the upgrade process.
PCA-ADMIN> getUpgradeJobs
  id                              upgradeRequestId                          commandName  result
  --                              ----------------                          -----------  ------
  1632849609034-kubernetes-35545  UWS-edfa3b32-c32a-4b67-8df5-2357096052bf  kubernetes   Passed
  1632826770954-etcd-26973        UWS-fec15d32-fc2b-48bd-9ae0-62f49587a284  etcd         Passed
  1632850933353-vault-16966       UWS-352df3d1-c21f-441b-8f6e-9381ac075906  vault        Passed

PCA-ADMIN> getUpgradeJob upgradeJobId=1632849609034-kubernetes-35545
Command: getUpgradeJob upgradeJobId=1632849609034-kubernetes-35545
Status: Success
Time: 2021-09-26 17:43:38,443 UTC
Data:
  Upgrade Request Id = UWS-edfa3b32-c32a-4b67-8df5-2357096052bf
  Name = kubernetes
  Start Time = 2021-09-26T17:20:09
  End Time = 2021-09-26T17:21:52
  Pid = 35545
  Host = pcamn02
  Log File = /nfs/shared_storage/pca_upgrader/log/pca-upgrader_kubernetes_cluster_2021_09_26-17.20.09.log
  Arguments = {"verify_only":false,"upgrade":false,"diagnostics":false,"host_ip":null,"result_override":null,"log_level":null,"switch_type":null,"precheck_status":false,"task_time":0,"fail_halt":false,"fail_upgrade":null,"component_names":null,"upgrade_to":null,"image_location":"http://host.example.com/pca-3.0.1-b535176.iso","epld_image_location":null,"expected_iso_checksum":null,"checksum":"240420cfb9478f6fd026f0a5fa0e998e086275fc45e207fb5631e2e99732e192e8e9d1b4c7f29026f0a5f58dadc4d792d0cfb0279962838e95a0f0a5fa31dca7","composition_id":null,"request_id":"UWS-edfa3b32-c32a-4b67-8df5-2357096052bf","display_task_plan":false,"dry_run_tasks":false}
  Status = Passed
  Execution Time(sec) = 249
  Tasks 1 - Name = Retrieving Cluster Status
  Tasks 1 - Description = Retrieving cluster status and upgrade data from the kubernetes nodes
  Tasks 1 - Time = 2021-09-26T17:20:10
[...]
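If the getUpgradeJob output is captured as text, the job fields can be extracted for scripted status checks. This is a minimal sketch, assuming the captured output has one `Key = Value` field per line; the function name `parse_job_data` is hypothetical and the sample is an abbreviated copy of the output above.

```python
def parse_job_data(text: str) -> dict:
    """Collect 'Key = Value' lines from captured getUpgradeJob output."""
    fields = {}
    for line in text.splitlines():
        # Split on the first ' = ' only, so values containing '=' stay intact.
        key, separator, value = line.strip().partition(" = ")
        if separator:
            fields[key] = value
    return fields


sample = """\
Command: getUpgradeJob upgradeJobId=1632849609034-kubernetes-35545
Status: Success
Data:
  Upgrade Request Id = UWS-edfa3b32-c32a-4b67-8df5-2357096052bf
  Name = kubernetes
  Host = pcamn02
  Status = Passed
  Execution Time(sec) = 249
"""

job = parse_job_data(sample)
print(job["Name"], job["Status"])  # kubernetes Passed
```

Note that `Status: Success` in the header reports the result of the CLI command itself, while `Status = Passed` under Data reports the result of the upgrade job; the sketch reads only the latter.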