Managing Patches for ODH Clusters

You can list patches, install patches, and view patch history for a Big Data Service cluster on the Cluster details page.

For more information, see:

Planning ODH Patch Installation Downtime

Plan required downtime for Big Data Service ODH cluster patch installation.

This section explains how to quantify the downtime and how to measure when it's complete. For information on patch installation stages and required downtime, see Monitoring ODH Patching Workflow Steps.

Gauging downtime

Applying an ODH patch to an HA and secure cluster with 7 to 25 nodes can take between 60 and 100 minutes. However, downtime is expected only during the last 20 to 40 minutes of that period.

Monitoring ODH Patching Workflow Steps

After you start the installation of the ODH patch, you can see work requests in the Console. The work request log for the ODH patch contains messages describing the last completed stage. Each completed stage corresponds to a step in the following table.

Note: When you see PREPARE_UPGRADE, the patch is about to be applied, and downtime is about to start.

Step  Patch Stage                    Downtime Information
1     DOWNLOAD patch                 No downtime required
2     PROCESS_PATCH_METADATA patch   No downtime required
3     PATCH_AMBARI_SERVER_JAR patch  Requires no downtime/Ambari restart
4     REGISTER_PATCH patch           No downtime required
5     CREATE_PATCH_REPO patch        No downtime required
6     APPLY_CUSTOM_PATCH patch       Requires no downtime/Ambari restart
7     INSTALL_PATCH patch            No downtime required
8     PREPARE_UPGRADE patch          Required downtime
9     APPLY_UPGRADE patch            Required downtime
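
As a rough illustration of how the table can be used while watching a work request log, the following Python sketch finds the furthest stage named in a list of log messages and reports whether that stage requires downtime. The stage names and downtime flags come from the table. The function name latest_stage, the plain substring matching, and the sample message text are assumptions for illustration only, since the exact wording of the log messages is not specified here.

```python
# Stage names and downtime flags copied from the table above, in step order.
PATCH_STAGES = [
    ("DOWNLOAD", False),
    ("PROCESS_PATCH_METADATA", False),
    ("PATCH_AMBARI_SERVER_JAR", False),
    ("REGISTER_PATCH", False),
    ("CREATE_PATCH_REPO", False),
    ("APPLY_CUSTOM_PATCH", False),
    ("INSTALL_PATCH", False),
    ("PREPARE_UPGRADE", True),
    ("APPLY_UPGRADE", True),
]


def latest_stage(log_messages):
    """Return (step, stage, requires_downtime) for the furthest stage
    mentioned in log_messages, or None if no stage name is found.

    Assumption: each log message contains the stage name verbatim.
    """
    latest = None
    for message in log_messages:
        for step, (name, requires_downtime) in enumerate(PATCH_STAGES, start=1):
            if name in message and (latest is None or step > latest[0]):
                latest = (step, name, requires_downtime)
    return latest


if __name__ == "__main__":
    # Hypothetical log lines; real work request messages may be worded differently.
    sample_log = [
        "Completed stage DOWNLOAD",
        "Completed stage INSTALL_PATCH",
    ]
    print(latest_stage(sample_log))
```

With this sketch, a result whose step is 8 or higher (PREPARE_UPGRADE onward) signals that the downtime stages have been reached.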

High-level patch timeline

T0: Click Apply Patch in the OCI Console.

T1: The cluster health is checked for patch readiness, and the ODH patch bundle is downloaded to the cluster nodes. No downtime is required (stages 1 and 2 in the previous table).

T2: While the ODH patch is being prepared on the cluster, the Ambari Server can be restarted. If you're signed in to Ambari, you must sign out and sign back in. There is no downtime for your Hadoop jobs (stages 4 to 7 in the previous table).

T3: Downtime starts. All ODH/Hadoop services are stopped, the ODH patch is applied to all the nodes in the cluster, and all Hadoop services are started again (stages 8 and 9 in the previous table).

T4: Patch application is completed, and downtime ends. (stage 9 in the previous table)

Measuring when downtime starts and how much time is between T0 and T3

For an HA and secure ODH cluster with 7 to 25 nodes, downtime is expected to start 40 to 50 minutes after you start the patch.
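
To make the planning arithmetic concrete, the following Python sketch turns a planned start time (T0) into an estimated downtime window. It uses only the figures stated in this section for an HA and secure cluster with 7 to 25 nodes: downtime starts 40 to 50 minutes after T0, and the whole patch takes 60 to 100 minutes. The function name estimate_downtime_window is hypothetical, and the sketch assumes that downtime ends when the patch completes (T4). Treat the result as a planning estimate, not a guarantee.

```python
from datetime import datetime, timedelta

# Figures from this section (HA and secure cluster, 7 to 25 nodes).
DOWNTIME_START_MINUTES = (40, 50)   # minutes from T0 until downtime starts (T3)
TOTAL_PATCH_MINUTES = (60, 100)     # minutes from T0 until the patch completes (T4)


def estimate_downtime_window(t0):
    """Estimate when downtime starts and ends for a patch started at t0.

    Assumption: downtime ends when patch application completes (T4),
    so the end bounds come from the total patch duration.
    """
    return {
        "earliest_start": t0 + timedelta(minutes=DOWNTIME_START_MINUTES[0]),
        "latest_start": t0 + timedelta(minutes=DOWNTIME_START_MINUTES[1]),
        "earliest_end": t0 + timedelta(minutes=TOTAL_PATCH_MINUTES[0]),
        "latest_end": t0 + timedelta(minutes=TOTAL_PATCH_MINUTES[1]),
    }


if __name__ == "__main__":
    # Example: patch started at 22:00 on an arbitrary date.
    window = estimate_downtime_window(datetime(2024, 1, 1, 22, 0))
    for label, moment in window.items():
        print(f"{label}: {moment:%H:%M}")
```

For a conservative maintenance window, plan around the span from earliest_start to latest_end.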