This chapter describes common OPatchAuto problems that may occur during usage.
In order for OPatchAuto to fully automate the patching process, it accesses various tools/utilities to carry out different patching phases. The four primary tools/utilities are:
OPatch - Applies patches to product (e.g., Fusion Middleware) homes.
rootcrs - Controls GI Home access by unlocking files so they are patchable, as well as stopping and starting the GI stack.
patchgen - Records the patch level.
datapatch - Applies SQL changes to database instances.
These tools/utilities are accessed during the patching process. Troubleshooting OPatchAuto, therefore, involves diagnosing issues with the individual tools.
When using OPatchAuto, problems may arise where it is not clear how to proceed with a resolution. The following use cases illustrate common patching scenarios in which you may encounter such problems, along with general procedures you can use to resolve them.
The rootcrs.pl script performs the operations necessary to configure the Grid Infrastructure stack on a cluster. During an OPatchAuto session, you may encounter errors stemming from the rootcrs.pl script.
Command issued by OPatchAuto: $GRID_HOME/crs/install/rootcrs.pl -prepatch
If rootcrs.pl fails, error codes and their associated messages will be generated, as shown in the following example:
CRS-1159: The cluster cannot be set to rolling patch mode because Grid Infrastructure is not active on at least one remote node.
If the message is not clear, you can obtain additional help by running the OERR utility to obtain cause and recommended action information.
Running OERR for a specific error code will generate both the cause and action for the specified error code.
$GRID_HOME/bin/oerr crs 1159
Cause: The cluster could not be set to rolling patch mode because Grid Infrastructure was not active on any of the remote nodes.
Action: Start Grid Infrastructure on at least one remote node and retry the 'crsctl start rollingpatch' command, or retry patching using the non-rolling option.
CLSRSC-400: A system reboot is required to continue installing.

oerr clsrsc 400
Cause: The installation of new drivers requires a node reboot to continue the install.
Action: Reboot the node and rerun the install or patch configuration step.
The following table lists common error codes that you may encounter during a patching session. For an exhaustive list, see the Oracle® Database Error Messages manual.
Error Code | Console Message
---|---
1153 | There was an error setting Grid Infrastructure to rolling patch mode.
1154 | There was an error setting Oracle ASM to rolling patch mode.
1156 | Rejecting the rolling patch mode change because the cluster is in the middle of an upgrade.
1157 | Rejecting the rolling patch mode change because the cluster was forcibly upgraded.
1158 | There was an error setting the cluster to rolling patch mode.
1159 | The cluster cannot be set to rolling patch mode because Grid Infrastructure is not active on at least one remote node.
1162 | Rejecting rolling patch mode change because the patch level is not consistent across all nodes in the cluster. The patch level on nodes <node_list> is not the same as the expected patch level <patch_level> found on nodes <node_list>.
1163 | There was an error resetting Grid Infrastructure rolling patch mode.
1164 | There was an error resetting Oracle ASM rolling patch mode.
1166 | Rejecting rolling patch mode change because Oracle ASM is in <current_state> state.
1168 | There was an error resetting the cluster rolling patch mode.
1171 | Rejecting rolling patch mode change because the patch level is not consistent across all nodes in the cluster. The patch level on nodes <node_list> is not the same as the patch level <patch_level> found on nodes <node_list>.
1181 | There was an error retrieving the Grid Infrastructure release patch level.
1183 | Grid Infrastructure release patch level is <patch_level> and an incomplete list of patches <patch_list> have been applied on the local node.
1191 | There was an error retrieving the Grid Infrastructure software patch level.
Issue 1: Non-rollable Patch is Applied in Rolling Mode
You have a two-node (node 1 and node 2) configuration and are attempting to apply a non-rollable patch in rolling mode.
Note:
By default, OPatchAuto applies patches in rolling mode.

Because you are applying the patch in rolling mode, you have not shut down all databases and stacks. When OPatchAuto is run, it prints out the stack inventory and updates the binaries as expected.
When rootcrs.pl -postpatch (which performs the required steps after the Oracle patching tool, OPatch, is invoked) is run, it fails because the ASM instances on different nodes are at different patch levels. In this situation, OPatchAuto (which runs OPatch) fails with a non-zero exit code. However, the patch is left in the GI Home. The stack cannot be brought up.
It is important to note that, in this situation, it is not necessary to roll back the patch, as it has already been applied to node 1. In general, make sure any attempt to bring up the stack is the very last step performed: even if the stack fails to come up, the patch has been successfully applied to the node.
Because the patch is non-rollable, resolve the stack issue as follows (a command sketch follows these steps):
1. Bring down the stack on all nodes.
2. Patch the remaining nodes by following the manual instructions provided in the patch README.
3. Bring the stack back up on all nodes.
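A minimal command sketch of these steps, run as root; the paths and patch location are illustrative, and the steps in the patch README take precedence:

# 1. Bring the stack down on every node:
$GRID_HOME/bin/crsctl stop crs
# 2. On each remaining node, patch manually per the README, for example:
#    $GRID_HOME/OPatch/opatch apply /u01/stage/<patch_id>
# 3. Bring the stack back up on every node:
$GRID_HOME/bin/crsctl start crs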
Issue 2: OPatchAuto Fails to Patch the GI Home
You have a system patch containing sub-patches P1 and P2. When opatchauto apply is run, it first patches the RAC homes: P1 is applied to the RAC Home at time t1, and P2 is applied at time t2. OPatchAuto then attempts to apply sub-patch P2 to the GI Home at time t3, but fails.
OPatchAuto fails with a non-zero exit code. The error message indicates failure occurred when applying sub-patch P2 on the GI Home. Note that the error message will provide you with a log file location. The RAC Home now contains P1 and P2, but the GI Home is missing P2.
You need to apply the missing patch to the GI Home. Because the system patch has already been successfully applied to the RAC Home, there is no need to roll back the patch.
From the log file, determine what caused patch application to fail for the GI Home.
Fix the issue that caused the GI Home patch application to fail.
When patch application fails for the GI Home, there are three possible causes:
patchgen failed. In this situation, refer to the recommended action specified for the patchgen use case (see "Patchgen"). You will have to manually patch the GI Home; refer to the patch README for instructions.
The opatch call failed. In this situation, an error occurred during OPatch execution. For example, OPatch could not copy a required file.
rootcrs.pl -prepatch (which performs the required steps before OPatch is invoked) failed.
Regardless of the cause of failure, you must resolve the issue and then manually patch the GI Home.
Re-run opatchauto resume on the GI Home. OPatchAuto resumes the patch application from where it failed.
When applying a system patch, OPatchAuto fails as a result of error conditions encountered by patchgen.
OPatchAuto fails with a STDOUT error message indicating a patching failure due to problems encountered by patchgen.
Determine whether the error message is the result of a patchgen error. From the message output, you can determine whether or not it is of patchgen origin by searching for the keyword "patchgen". The following example shows a sample error message generated by patchgen, including the keyword "patchgen" and the associated error code.
Example 9-3 Patchgen Error Output
$ export ORACLE_HOME=/scratch/GI12/product/12.1.0/crs
$ /scratch/GI12/product/12.1.0/crs/bin/patchgen commit -pi 13852018
loading the appropriate library for linux
java.lang.UnsatisfiedLinkError: /scratch/GI12/product/12.1.0/crs/lib/libpatchgensh12.so (libasmclntsh12.so: cannot open shared object file: No such file or directory)
With the patchgen error code, run the oerr command to obtain the cause and recommended action(s) to resolve the specific problem encountered by patchgen, then implement the suggested action. See "Running OERR".
When patchgen errors out, it asks whether you want to keep the patch or roll it back. By default, patchgen rolls back the patch. Whether or not the patch is rolled back determines your course of action in the next step.
If the patch was not rolled back, run patchgen again. Despite the error, the patch itself still exists in the GI/RAC home, since it was not rolled back.
If the patch has been rolled back, you may find that OPatchAuto has applied the system patch to the RAC Home, but not all sub-patches to the GI Home. At this point, you need to apply only part of the system patch to the GI Home.
OPatch will tell you, via lsinventory, which patches have not been applied. In order to apply specific sub-patches, you must resort to manual patching, as described in the steps following the example below.
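For example, to list which patches are currently applied to a home (run as the home owner):

$ORACLE_HOME/OPatch/opatch lsinventory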
1. Shut down the stack.
2. Run opatch apply (not OPatchAuto) on the GI Home. Refer to the patch README for explicit instructions on applying a patch manually.
The following table lists possible patchgen error codes.
Table 9-2 Patchgen Error Codes
Error Code | Reason | Debugging Information
---|---|---
2 | Internal Error | Generic failure error code.
3 | Internal Error | MS Windows: Resource file read error.
4 | Internal Error | MS Windows: Resource file write failed.
5 | Internal Error | Unix: Open for patch repository failed.
6 | Internal Error | Unix: Normalization of full path libasmclntsh failed.
7 | Internal Error | Unix: Write to patch repository failed.
18 | Internal Error | PGA initialization failed.
19 | Internal Error | Patch iterator init failed.
40 | Syntax error; an appropriate message is displayed. | No argument to patchgen.
41 | Syntax error; an appropriate message is displayed. | No arguments supplied to the flag.
42 | Syntax error; an appropriate message is displayed. | -pi patchids are not numbers.
43 | Syntax error; an appropriate message is displayed. | -rb patchids are not numbers.
44 | Syntax error; an appropriate message is displayed. | Argument to the flag is something other than the expected value.
45 | Syntax error; an appropriate message is displayed. | Patchgen invoked with an invalid argument.
46 | Loading libpatchgensh12.so failed. |
You attempt to run OPatchAuto to patch four product (e.g., Fusion Middleware) homes. This patch contains both bits and SQL to update the database. When you run OPatchAuto, it performs two actions:
Applies bits to the GI/RAC home
Runs SQL (via the datapatch command)
Typically, you run OPatchAuto on each GI/RAC home. With each run, OPatchAuto calls datapatch to run the patch SQL. datapatch will do nothing on the first n-1 nodes (a no-op). On the last (nth) node, datapatch tries to execute the patch SQL.
If datapatch fails, you will see an error message. To find out whether the error is from datapatch, view the OPatchAuto debug log.
You see a warning message indicating SQLPatch/datapatch has failed. The warning message was generated when datapatch failed to apply the SQL to the last node.
In general, you can ignore the warning message and then run datapatch manually on the last node. Datapatch establishes a connection to the database and uses Queryable Inventory (http://docs.oracle.com/cd/E16655_01/appdev.121/e17602/d_qopatch.htm) to get information about the patch inventory of the Oracle home. Any issues with establishing a connection to the Oracle database may result in ORA-nnnnn errors, which are described under the Oracle error codes with suitable remedial steps listed (http://docs.oracle.com/cd/B28359_01/server.111/b28278/toc.htm). In addition, Queryable Inventory has some expected ORA-nnnnn errors; the list of these errors can be referenced at http://docs.oracle.com/cd/E16655_01/appdev.121/e17602/d_qopatch.htm#CEGIFCHH. For any other issues, please contact Oracle Support.
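For example, a manual datapatch run on the last node, followed by a check of the SQL patch registry; the Oracle home path and SID shown are illustrative:

export ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1
export ORACLE_SID=orcl
$ORACLE_HOME/OPatch/datapatch -verbose
# Confirm which SQL patches were applied:
echo "select patch_id, action, status from dba_registry_sqlpatch;" | $ORACLE_HOME/bin/sqlplus -s / as sysdba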
Rollable vs. Non-Rollable Patches: Patches are designed to be applied in either rolling mode or non-rolling mode. Whether the patch is rollable or non-rollable determines the course of action.
If a patch is rollable, the patch has no dependency on the SQL script. The database can be brought up without issue. Note that a rollable patch can be applied in either rolling or non-rolling mode.
If, however, the patch is non-rollable, then the patch must first be rolled back. Note that OPatchAuto will prevent you from applying a non-rollable patch in rolling mode.
OPatchAuto succeeds with a warning on datapatch/sqlpatch.
For rollable patches:
Ignore datapatch errors on nodes 1 through n-1.
On the last node (node n), run datapatch again. You can cut and paste this command from the log file.
If you still encounter datapatch errors on the last node, call Oracle Support or open a Service Request.
For non-rollable patches:
Bring down all databases and stacks manually for all nodes.
Run opatchauto apply on every node.
Bring up the stack and databases. Note that the databases must be up in order for datapatch to connect and apply the SQL.
Manually run datapatch on the last node. Note that if you do not run datapatch, the SQL for the patch will not be applied, and you will not benefit from the bug fix. In addition, you may encounter incorrect system behavior depending on the changes the SQL is intended to implement. (A command sketch of this flow follows these steps.)
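A minimal sketch of the non-rollable flow; the database name, paths, and patch location are illustrative, and the patch README takes precedence:

# 1. Bring down the stack (and with it the databases) on every node, as root:
$GRID_HOME/bin/crsctl stop crs
# 2. Run opatchauto apply on every node:
$GRID_HOME/OPatch/opatchauto apply /u01/stage/<patch_id>
# 3. Bring the stack and databases back up:
$GRID_HOME/bin/crsctl start crs
$RAC_HOME/bin/srvctl start database -d orcl
# 4. Manually run datapatch on the last node (see the datapatch example earlier in this chapter).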
If datapatch continues to fail, you must roll back the patch. Call Oracle Support for assistance or open a Service Request.
OPatchAuto provides multiple avenues to diagnose problems with OPatchAuto operations and patch application issues.
See also: Chapter 9, "Troubleshooting OPatchAuto" for more information.
There are multiple log files that provide useful information to diagnose operational issues with OPatchAuto and the patching process. To ensure that the log files contain the requisite diagnostic information (such as patch and system configuration details), run OPatchAuto in debug mode.
The following steps detail the typical troubleshooting process:
Look at the log files.
Log Files on the Local Node
The log files will be located in the ORACLE_HOME from which OPatchAuto was run.
Location: <ORACLE_HOME>/cfgtoollogs/opatchauto
Log Files on Remote Nodes
The log files will be located in the <ORACLE_HOME> of the remote node.
Location: <ORACLE_HOME on remote node>/cfgtoollogs/opatchauto
The <ORACLE_HOME> information for the remote node can be found in the main log file of the local node.
The local console and main log file also contain log information about the remote node. From the console, specific log file information will be available for both the local as well as remote nodes. However, to view detailed log information, you should view the local and remote node log files directly.
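For example, to locate and scan the most recent local OPatchAuto log for failures (a simple illustrative check):

ls -t $ORACLE_HOME/cfgtoollogs/opatchauto/*.log | head -1
grep -iE "error|fail" $(ls -t $ORACLE_HOME/cfgtoollogs/opatchauto/*.log | head -1)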
If there is a failure, what are the suggested steps to follow in order to understand the issue in detail?
In case of failure, view the logs to determine why patch orchestration failed. Once the issue is resolved, patch orchestration can be resumed.
Determine where patches are staged on the remote node.
Patches will be copied temporarily to the remote node at the following location:
<ORACLE_HOME>/OPatch/auto/dbtmp/<patch_id>/
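To confirm that the patch was staged, you can list that directory on the remote node (the patch id is illustrative):

ls <ORACLE_HOME>/OPatch/auto/dbtmp/<patch_id>/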
OPatchAuto generates a system configuration log. The location is displayed on the console.
For example:
<ORACLE_HOME>/cfgtoollogs/opatchautodb/systemconfig*<timestamp>.log
This log contains all details of the OPatchAuto flow before patch execution activity starts, such as bootstrapping, identification of GI/SIHA, and user credential check on local node.
OPatchAuto interacts with other components, such as SRVCTL, Grid utilities like rootcrs, and OPatch, to perform patching. Failures can occur with these components as well. The following basic checks can be done for each component to help you isolate where the problem is occurring.
SRVCTL: A utility used to get information about, or alter the state of, database homes. OPatchAuto uses it for operations such as stopping a home, starting a home, checking home status, and relocating an instance. You can find out which operation opatchauto was trying to perform from the opatchauto logs. If something fails in this area, srvctl can be used directly to check the status of the database home.
If opatchauto was executed in debug mode, the log for the failed srvctl command will be available in the location below. This log can be analyzed to find the reason for the failure; if it is related to the system configuration, it needs to be fixed before using opatchauto again.
/tmp/liveoutput_hostname.trc
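For example, to check the database home status directly with srvctl (the database name, node, and state file shown are illustrative):

$RAC_HOME/bin/srvctl status database -d orcl
$RAC_HOME/bin/srvctl status home -o $RAC_HOME -s /tmp/home_state.txt -n node1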
Grid/SIHA Home utility: OPatchAuto uses rootcrs.pl/roothas.pl to stop and start GI/SIHA homes before and after applying the patch. This utility can fail in some scenarios, depending on the system configuration or due to the patch itself. The log generated by this utility can be found at:
<GIHome>/cfgtoollogs/crsconfig/crspatch*<timestamp>.log (GI version < 12.2)
<OracleBase>/crsdata/<host>/crsconfig/crspatch*<timestamp>.log (GI version >= 12.2)
<OracleBase>/diag/ (GI version >= 12.2)
This log can be opened to check the reason for the failure; if the reason is not familiar, it can be searched to find a possible cause or fix. If you are still unable to resolve it, the issue can be taken up with the development team for further investigation. Please ensure all the logs are attached, along with the details of the initial analysis done on the issue; this will save time for the development team by giving them a heads up.
OPatchCore: OPatchAuto uses the OPatch core APIs for apply/rollback. Logs for the execution of these APIs can be found in the following locations:
<ORACLE_HOME>/cfgtoollogs/opatchauto/opatchauto_<timestamp>_binary.log
<ORACLE_HOME>/cfgtoollogs/opatchauto/core/opatch/opatch<timestamp>.log
You can verify whether patching has been performed correctly.
Verifying that patching steps have been executed on local and/or remote nodes.
If patching is executed on the local node, host information will not be available from the console.
If patching has been executed on a remote node, host information will be available from the console.
Verifying that patching has been performed in rolling mode.
You can verify that patches have been applied in rolling mode directly from the console. The following sequence of phases occur when patches are applied in rolling mode:
Init (for this phase only, both the local and remote nodes are completed)
Shutdown
Offline
Startup
Online
Finalize
All of these phases are performed end-to-end on the local node before proceeding to the remote node. For multi-node environments, all of these phases are performed end-to-end on a given node before moving on to the next node.
Verifying that patching has been performed in non-rolling mode.
You can verify that patches have been applied in non-rolling mode directly from the console. The following sequence of phases occur when patches are applied in non-rolling mode:
Init (for this phase only, both the local and remote nodes are completed)
Shutdown
Offline
Startup
Online
Finalize
For Oracle Database 11.2 releases, each phase will be executed in parallel on all nodes.
For Oracle Database release 12.0 and greater, all of the phases will first be completed end-to-end on the local node. Each phase will then be executed in parallel on n-2 nodes (n being the number of nodes in the cluster). For the nth node, all phases will be completed end-to-end.
Verifying whether patches have been applied or rolled back.
Both the GRID and RAC homes must be in the same state before and after applying/rolling back patches.
To verify the current status of the GRID and RAC homes, run the following commands:
crsctl check crs
srvctl status database -d <database>
"Using the "opatch lsinv" command user can verify the patches available in the system.
Issue: ohasd failure happens during rootcrs postpatch when OPatchAuto is used to apply patches on a 12.1.0.2.0 GI/RAC with the OCT/JAN PSU, or after multiple apply/rollback cycles of 12.1.0.2 PSUs.
Resolution: Change the following in <crs_home>/crs/sbs/crswrap.sh.sbs.
The workaround involves modifying two lines in the ohasd script:
UID=`/usr/xpg4/bin/id -u`
# Check for root privilege
if [ $UID -eq 0 ];
Instead of using UID as a local variable, use anything else (UID1, for instance). This will prevent the following issue:
bash-3.2# ./ohasd restart
./ohasd: line 279: UID: readonly variable
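A minimal sketch of the corrected lines, using UID1 as the replacement variable name:

UID1=`/usr/xpg4/bin/id -u`
# Check for root privilege
if [ $UID1 -eq 0 ];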
The following patching scenarios illustrate known issues that may be encountered while patching.
The following issues pertain to rootcrs.pl execution.
opatchauto rollback fails in rootcrs.pl -postpatch when the -norestart option is specified.
Running OPatchAuto fails in -norestart mode during the rootcrs.pl -postpatch step when rolling back the October PSU patch.
Starting CRS ... Failed
Command "/usr/bin/perl /scratch/GI12/app/12.1.0/grid/crs/install/rootcrs.pl -postpatch -norestart" execution failed: Died at /scratch/GI12/app/12.1.0/grid/crs/install/crspatch.pm line 851.
A prerequisite one-off patch is required.
opatchauto fails on the leaf node of a Flex cluster if the stack on the leaf node is not running.

Running OPatchAuto fails on the leaf node of a Flex cluster if the stack is not up on the cluster. This occurs in both rolling and nonrolling patching modes. rootcrs.pl -prepatch fails with the console message shown in the following example.
Using configuration parameter file: crs/install/crsconfig_params
2013/09/27 06:00:01 CLSRSC-455: Failed attempt to initiate patch on a Leaf node
Bring up the stack on the leaf node before patching.
The following issues pertain to datapatch execution.
When running opatchauto rollback, SQL changes are rolled back on the first node itself.

SQL changes are rolled back from the very first node.
Output from the command:

2013-10-07_05-16-28 :
SQL Patching tool version 12.1.0.1.0 on Mon Oct 7 05:15:31 2013
Copyright (c) 2012, Oracle. All rights reserved.

Connecting to database...OK
Determining current state...done
The following patches will be rolled back: 17027533
Ignore the message if the patch is going to be rolled back from all the nodes. No workaround is available.
The following issue pertains to OPatch execution.
opatch napply failure: OPatchAuto fails during the opatch napply step on the CRS home due to active files.

Opatchauto fails when patching the Grid Home.
[Sep 19, 2013 6:52:14 PM] Following executables are active :
/u01/app/12.1.0/grid/lib/libclntsh.so.12.1
[Sep 19, 2013 6:52:14 PM] Prerequisite check "CheckActiveFilesAndExecutables" failed.
Wait a short period of time and then run opatchauto resume.
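If the check keeps failing, you can identify which processes are still holding the library open; fuser is one common approach (an illustrative command, not from this guide):

/sbin/fuser -v /u01/app/12.1.0/grid/lib/libclntsh.so.12.1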
The following issues pertain to OPatchAuto execution.
The following issue pertains to OPatchAuto version 12.1.0.1.7.
Issue: While trying to create the system instance for a GI/RAC setup where the name of the RAC database is in upper case, OPatchAuto encounters a null pointer exception. (Bug 20858866)
Symptom: Because the failure occurs in the early stages of setting up the system, and no operations are performed on the GI/RAC setup, there is no adverse impact.
Recommended Actions:
Use OPatch 12.1.0.1.6 (non-HP)
Use the OPatch ZIP file specified in the base bug.
The following issues pertain to OPatchAuto version 12.1.0.1.5 only.
Issue: OPatchAuto does not support software-only homes.
Symptom: config.sh fails with the following error message:
kfod.bin: cannot execute: No such file or directory
Recommended Action: Follow the instructions in the patch README for manually applying the patch.
Issue: OPatchAuto errors out because it cannot find a shared home. This can happen on both shared and non-shared homes.
Symptom: OPatchAuto generates the following error:
System Configuration Collection failed: oracle.osysmodel.driver.crs.productdriver.ProductDriverException: Unable to determine if "ORACLE HOME" is a shared oracle home.
Recommended Action: Run the following command (see examples below) as ROOT on the Oracle Home in order to determine the underlying issue. It should be run from the same location where opatchauto was run.
On a RAC HOME:
su <RAC OWNER> -c "$GRID_HOME/bin/cluvfy comp ssa -t software -s $DB_HOME -n $NODELIST -display_status"
On a GRID HOME:
su <GRID OWNER> -c "$GRID_HOME/bin/cluvfy comp ssa -t software -s $GRID_HOME/crs/install -n $NODELIST -display_status"
Example:
su <GRID OWNER> -c "$GRID_HOME/bin/cluvfy comp ssa -t software -s $GRID_HOME/crs/install -n node1,node2,node3 -display_status"
After resolving the underlying issue, re-run opatchauto.
Issue: OPatchAuto fails to detect the status of a RAC One database. Hence, it fails to apply the SQL changes on it.
Symptom: OPatchAuto displays the following message from the console:
[WARNING] The local database instance 'INST' from 'RAC_HOME' is not running. SQL changes, if any, will not be applied.
Recommended Action: Manually run the datapatch command on the RAC One database. The exact command will be shown in the opatchauto log file.
Issue: When OPatchAuto is run in -norestart mode, it still displays the message Starting CRS ... Successful.
Symptom: OPatchAuto displays this message on the console:
Starting CRS ... Successful
Recommended Action: Ignore the message. OPatchAuto performs the required operations without actually starting the CRS.
Issue: OPatchAuto fails to apply a system patch if it contains a one-off that is a subset of an existing patch.
Symptom: The opatch prereq CheckConflictAgainstOH command is reported to have failed.
Recommended Action: Roll back the superset patch in the home, apply the system patch and then apply the superset patch again.
Issue: OPatchAuto fails to start the RAC home.
Symptom: The error message contains the code CRS-2717.
Recommended Action: Manually run the pending steps listed in the OPatchAuto log file.
Issue: OPatchAuto fails to create the system instance.
Symptom: System Configuration Collection fails with: oracle.osysmodel.driver.crs.productdriver.ProductDriverException: oracle.ops.mgmt.cluster.ClusterInfoException: PRKC-1094 : Failed to retrieve the active version of crs
Recommended Action: Refer to Bug 19262534 for available fixes.
Running opatchauto apply or opatchauto rollback runs datapatch on all the nodes.
On the first node to be patched, the customer will see the message shown in Example 9-8 if RAC databases are configured on that node.
SQL changes, if any, are applied successfully on the following database(s):
This message can be ignored. The full information about the SQL changes made by the datapatch step can be obtained from the OPatchAuto debug log. It is also possible that datapatch might have applied SQL changes pending from a previous patching session.
opatchauto resume -reboot does not run the datapatch step.
The datapatch step is not executed by the opatchauto resume -reboot command.
The datapatch step can be manually executed from any one node. All pending changes will also be executed by the next OPatchAuto session's datapatch command.
Set the environment variables ORACLE_HOME and ORACLE_SID, and execute the following command:
$ORACLE_HOME/OPatch/datapatch
The opatchauto command fails without any error message or stack trace on the console.
The user sees the following messages in the console and log files.
Failed to run this command :
/usr/bin/perl $GRID_HOME/crs/install/rootcrs.pl -postpatch
Executing command: $RAC_HOME/bin/srvctl start home …
Refer to the crspatch log file at this location, and make sure the timestamp points to the OPatchAuto execution time:

$GRID_HOME/cfgtoollogs/crsconfig/crspatch_<hostname>_<timestamp>.log
If this file contains the message CLSRSC-400: A system reboot is required to continue installing, follow these steps:
Reboot the machine.
Run the following command:
opatchauto resume -reboot