Troubleshooting Exadata Database Server Updates

8.11.1 Troubleshooting Exadata Database Server Updates

You can use the log files generated by the update utility to troubleshoot updates.

The update utility orchestrates updates of the Exadata database servers. When you update database nodes with the patchmgr tool, it prints only minimal information to the screen. If you need more detail, view the patchmgr logs and the dbnodeupdate.sh logs that patchmgr copies over from the other servers, if available. The log file (dbnodeupdate.log) and the diag file (dbnodeupdate.<runid>.diag) eventually exist in two locations:

On each updated database server, in the /var/log/cellos directory
Consolidated on the node running the update utility.

On the node running the update utility, if the -log_dir flag is set to auto, the log files are stored in the log/<directory based on contents of nodes in list file> directory, relative to the directory from which the update utility is started. For example, if the update utility is located in /u01/dbserver.patch, then the log directory may be /u01/dbserver.patch/dm01db01_dm01db02_e8f1f753.

Important files found in the log directory are:

patchmgr.log contains the consolidated screen output from running the remote update commands on the different database servers.
<hostname>_dbnodeupdate.<runid>.diag is the diag file for the specific run on a database server.
<hostname>_dbnodeupdate.log contains dbnodeupdate.log output appended from /var/log/cellos from the remote database server.
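As an illustration of the naming scheme above, the following minimal sketch gathers the consolidated files for one database server from a log directory. The function name find_update_logs and the returned dictionary keys are hypothetical, and the sketch assumes only the file names listed above.

```python
from pathlib import Path


def find_update_logs(log_dir, hostname):
    """Return the consolidated log files for one database server.

    Assumes the naming described above: <hostname>_dbnodeupdate.log and
    <hostname>_dbnodeupdate.<runid>.diag, stored next to patchmgr.log.
    """
    log_dir = Path(log_dir)
    return {
        "patchmgr_log": log_dir / "patchmgr.log",
        "node_log": log_dir / f"{hostname}_dbnodeupdate.log",
        # There is one diag file per run, so match any runid.
        "diag_files": sorted(log_dir.glob(f"{hostname}_dbnodeupdate.*.diag")),
    }
```

The paths for patchmgr.log and the node log are returned whether or not the files exist, so check them with Path.exists() before opening.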

When a prerequisite check, backup, update, or rollback fails, the error messages on screen should indicate which step failed on which node. Consult the log files mentioned above if you need more information. Search the log file for zzz, which marks the start of a new run.

Check whether the time of that entry matches your run. If it matches, note the runid for further reference. Then search for ERROR.
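The search described above can be sketched as follows. This is a minimal example that assumes only what the text states: a line containing zzz marks the start of a run, and failures are reported on lines containing ERROR. The function name errors_by_run is hypothetical, and the exact layout of the log lines is not assumed.

```python
def errors_by_run(lines, run_marker="zzz", error_marker="ERROR"):
    """Group ERROR lines under the run-start line that precedes them.

    Returns a list of (run_start_line, [error_lines]) tuples, in log order.
    ERROR lines that appear before the first run marker are ignored.
    """
    runs = []
    for line in lines:
        text = line.rstrip("\n")
        if run_marker in text:
            # A new run starts here; later errors belong to this run.
            runs.append((text, []))
        elif error_marker in text and runs:
            runs[-1][1].append(text)
    return runs
```

To use it, pass an open log file (for example, the <hostname>_dbnodeupdate.log file) as lines, pick the entry whose run-start line matches the time of your run, and review its list of errors.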

If an update action fails before the actual YUM update, you can retry the update after resolving the error. If the update failed halfway, the recommended approach is to roll back, resolve the error, and then retry.

In rare cases, patchmgr may be unable to determine whether an update succeeded. In such cases, it displays a message that the update failed, even though the update may have completed successfully. To determine the actual status of the update:

Check the image status of the (database) node by running the imageinfo command. The Image status line displays the status.
Check the version of the Exadata software. This can also be determined from the imageinfo command.
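Both checks can be read from the same imageinfo output. The sketch below is a minimal example that parses "label: value" lines and compares the result with the version you expect. It assumes the version is reported on a line labeled Image version, alongside the Image status line mentioned above. The function names parse_imageinfo and update_succeeded are hypothetical.

```python
def parse_imageinfo(output):
    """Return (image_status, image_version) from imageinfo output text.

    Assumes "label: value" lines and the labels "Image status" and
    "Image version"; a missing label yields None for that value.
    """
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().lower()] = value.strip()
    return fields.get("image status"), fields.get("image version")


def update_succeeded(output, expected_version):
    """True if the status is success and the version is the expected one."""
    status, version = parse_imageinfo(output)
    return status == "success" and version == expected_version
```

One way to obtain the text is to capture the output of imageinfo on the node and pass it to update_succeeded together with the target release.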

If the image status is success and the Exadata version is the expected new version, then the update was successful and you can ignore the update failed message. In that case, proceed as follows:

Run dbnodeupdate.sh -c manually on the particular node to perform the completion steps of the update.
Remove the completed node from the (database) node file.
Rerun patchmgr to perform the update on the remaining nodes.
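Removing the completed node from the node file can be done by hand. As a minimal sketch, assuming the node file lists one host name per line, a small helper could produce the new contents. The function name remaining_nodes is hypothetical.

```python
def remaining_nodes(node_file_text, completed_host):
    """Return node-file contents with completed_host removed.

    Assumes one host name per line; blank lines are dropped.
    """
    kept = [
        line.strip()
        for line in node_file_text.splitlines()
        if line.strip() and line.strip() != completed_host
    ]
    # Keep a trailing newline only when at least one node remains.
    return "\n".join(kept) + ("\n" if kept else "")
```

Write the returned text back to the node file, or to a copy of it, before rerunning patchmgr on the remaining nodes.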

If the update fails, also consider the following:

Confirm that you used the correct patchmgr syntax for updating database nodes. The syntax is described in the patchmgr online help.
Confirm that SSH equivalence is configured. It must be in place before you use patchmgr.
Download the latest dbserver.patch.zip from My Oracle Support note 1553103.1.
Open a service request with Oracle Support Services to analyze why the patchmgr orchestration failed.