Oracle® Fusion Middleware Oracle Process Manager and Notification Server Administrator's Guide
Release 11g (18.104.22.168.0)
Part Number E14007-02
This chapter describes some troubleshooting tips for OPMN. It features the following topics:
This section describes some of the common problems encountered when using OPMN. It features the following topics:
Unable to start a system process using OPMN.
Try the following if you are unable to start a system process using OPMN:
Verify and, if necessary, correct the command input. Confirm the spelling and choice of options for the command you are entering.
Note: Do not use command line scripts or utilities from previous versions of Oracle Application Server or Oracle Fusion Middleware for starting OPMN or system components.
Most managed processes now have their own log directory, and OPMN uses that if it is present as the location for the process log file.
If the default directory for a managed process does not exist, OPMN places the console log in its own log directory:
For example, the standard output log for Oracle HTTP Server may be
Verify the dependency requirements for the system process you are attempting to start.
Verify the element values for the system component in the opmn.xml file. Use the opmnctl validate command to verify the configuration of the opmn.xml file. You may have misconfigured the opmn.xml file for the system component you are attempting to start.
Your system processes are dying or unreachable.
If your system processes are dying or unreachable:
Review the system component specific output in the
Look at the
/diagnostics/logs/OPMN/opmn/opmn.log for system processes. Look for
process crashed or
process unreachable messages. OPMN automatically restarts system processes that die or become unresponsive.
See Also: OPMN Log Files for more information about OPMN log files
Create event scripts for any pre-stop or post-crash events. The event scripts could be used to create a specific log file or send you an email about a failure.
See Also: Troubleshooting with Event Scripts for information about troubleshooting with event scripts
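
The log check described above can be scripted. The following is a minimal sketch, assuming the messages appear in the log with the wording quoted in this section ("process crashed" and "process unreachable"); the function name `count_opmn_failures` is hypothetical, and the log path is supplied by the caller rather than assumed.

```shell
#!/bin/sh
# Count "process crashed" / "process unreachable" lines in an OPMN log.
# The log path is passed in by the caller; no default location is assumed.
count_opmn_failures() {
    log_file=$1
    if [ ! -r "$log_file" ]; then
        echo "cannot read log file: $log_file" >&2
        return 1
    fi
    # grep -c prints the number of matching lines. It exits with status 1
    # when the count is 0, which is not an error here, hence "|| true".
    grep -i -c -e 'process crashed' -e 'process unreachable' "$log_file" || true
}
```

Call it as `count_opmn_failures <path to opmn.log>`. A count that keeps growing between runs suggests that a process is being restarted repeatedly.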
The time it takes to execute an
opmnctl command depends on the type of system process and the available computer hardware. Because of this, the time it takes to execute an
opmnctl command may not be readily apparent.
To verify successful execution of the
opmnctl command, try the following:
timeout attribute for the component that is not starting. Set the timeout in the
opmn.xml file at a level that allows OPMN to wait for the process to come up. This functionality is available with the
startproc command which starts all the relevant processes configured in the
start element in the
opmn.xml file and change the
retry attribute to a higher increment of time.
Look at the
/diagnostics/logs/ for the system process that is not starting.
Review the component-specific log file for the system component that is not starting. For example,
See Also: Chapter 6 for more information about the common configuration of the
A system component is automatically restarted by OPMN.
If a system component is automatically restarted by OPMN, try the following:
Review the message for the system component in the
Verify that the ping timeout for the system component is sufficient. A system component that receives a lot of activity may require an increase in the length of time for the timeout. Increase the ping timeout element in the system component
Occasionally, there is unexpected behavior when you use the
opmnctl start command to start OPMN; either only OPMN is started or OPMN makes a best effort to start system processes. Typically, this unexpected behavior is due to turning off or rebooting your computer without first shutting down OPMN. When you restart your computer, all system component processes are started.
Oracle recommends that you shut down OPMN before shutting down your computer. Use the
opmnctl stopall command to stop OPMN and system component processes.
On the Microsoft Windows operating system, you can use the Windows services control panel to stop OPMN and system components.
Note: OPMN keeps a record on disk of the expected status of the processes it manages. If a computer goes down while OPMN is running, upon restart OPMN uses the information cached on disk and makes a best-effort attempt to automatically restart all processes that were running at the time the system went down. This may surprise users who start only OPMN and notice that processes managed by OPMN have also been started, even though no explicit request to start those processes was issued.
You can suppress this automatic process recovery by setting the following attribute for the
process-manager element in the
Unable to start a system process.
If you are unable to start a system process, check whether an element in the system opmn.xml file is disabled. If an element in the opmn.xml file is disabled, OPMN generates an output message of
If you have multiple Oracle Fusion Middleware installations on one host and you start them at the same time (for example, to start a farm), OPMN may become unresponsive. You may receive an error message such as:
"failed to restart a managed process after the maximum retry limit"
This may occur when two Oracle instances on the same host use the same ports: an OHS component in one Oracle instance is trying to use the same port as an OHS component in a different Oracle instance. Port allocation for all OHS instances within Oracle Fusion Middleware is controlled by OPMN; there can be overlapping port ranges within a single
opmn.xml file. However, when two OPMN processes on a host start at the same time, there is no coordination between them on port usage.
Make sure the ports assigned to OHS in each Oracle instance are unique and not used by any other process on the computer.
It is also recommended that you increase the maximum number of retries for starting Oracle instances. If you have identical port ranges in two Oracle homes and increase the number of times OPMN attempts to restart a process, OPMN eventually selects a port that works. This technique ultimately does not eliminate the problem, because there is the possibility that OPMN will not find a port that works in the number of port connection attempts that you have specified in the opmn.xml file.
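
One way to check for overlapping ports before starting both instances is to compare the port lists by hand or with a small helper. The following is a sketch only: the function name `find_shared_ports` is hypothetical, and it assumes you have already collected the OHS ports of each Oracle instance into a whitespace-separated list.

```shell
#!/bin/sh
# Print every port number that appears in both lists, one per line.
# Each argument is a whitespace-separated list of port numbers.
find_shared_ports() {
    for port_a in $1; do
        for port_b in $2; do
            if [ "$port_a" = "$port_b" ]; then
                echo "$port_a"
            fi
        done
    done
}
```

For example, `find_shared_ports "7777 7778" "7778 7779"` reports the one port that both lists contain (the port numbers here are illustrative). Empty output means the two lists do not overlap; it does not show whether another process on the computer already uses one of the ports.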
If you are unable to stop system components using the
opmnctl stop or
opmnctl stopall commands, the component or process was most likely not started using OPMN. The component or process might have been started using a startup script or utility.
System components should never be started or stopped manually. Do not use command line scripts or utilities from previous versions of Oracle Application Server or Oracle Fusion Middleware for starting and stopping system components.
Use the Fusion Middleware Control Console and the
opmnctl command line utility to start or stop system components.
See Also: Chapter 5 for OPMN command-line examples
You may receive a
globalInitNLS error when executing the
opmnctl command. The following error message is displayed:
"globalInitNLS: NLS boot file not found or invalid -- default linked-in boot block used XML parser init: error 201."
This error occurs when the ORA_NLS33 environment variable is set. This environment variable should not be set.
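
A quick way to confirm whether the variable is set in the shell that runs opmnctl is sketched below. The function name `check_ora_nls33` and its messages are hypothetical; only the variable name comes from this section.

```shell
#!/bin/sh
# Report whether ORA_NLS33 is set in the current environment.
check_ora_nls33() {
    # ${ORA_NLS33+set} expands to "set" when the variable is set,
    # even if its value is empty.
    if [ -n "${ORA_NLS33+set}" ]; then
        echo "ORA_NLS33 is set; unset it before running opmnctl"
        return 1
    fi
    echo "ORA_NLS33 is not set"
}
```

If the check reports that the variable is set, run `unset ORA_NLS33` in the same shell before retrying the command.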
On some computers, when OPMN starts up, it consumes a large amount of CPU capacity, typically about 50% to 60%. On affected computers, the CPU consumption continues until OPMN is shut down.
One possible cause of the excessive CPU consumption is that the installation environment uses multibyte character sets, such as Japanese.
When trying to start OPMN using the
opmnctl start or
opmnctl startall commands you receive the following error messages:
pingwait exits with 1220384
pingwait exits with 1220396
These error messages are generated when there are syntax errors in the
/config/OPMN/opmn/opmn.xml that need to be corrected.
If you encounter these error messages, do the following:
Run the following command (with the complete directory path to the
opmnctl validate opmn.xml
remove all empty tags from the
If you install Oracle Fusion Middleware on a computer that contains a previous installation of the Oracle Database or Oracle Fusion Middleware, the
/diagnostics/logs/OPMN/opmn/opmn.log file increases in size to over 4100000 KB due to continuous logging. The file may contain the following error message:
"[ons-connect] Local connection 127.0.0.1,6100 invalid form factor "
Change the request port for the
opmn.xml file to a value greater than 6100. For example:
<port local ="6202" remote="6302" request="6105"/>
This error is most often caused by a conflict with a previous Oracle Database or Oracle Fusion Middleware installation on the same computer. The above occurs due to entries in the
oraInventory directory of an existing or previous installation of Oracle Database or Oracle Fusion Middleware.
I've set up an AS cluster of OHS1 + J2EE1 + OHS2 + J2EE2. OHS1 and J2EE1 are on system1; OHS2 and J2EE2 are on system2. I use OHS2 as the OPMN static discovery server, and I configure this in all four instances. When I run opmnctl cluster status, I see all four instances. This is true for all four instances. Now I crash system2 (with shutdown -r now). OHS1 and J2EE1 lose OHS2 and J2EE2 from their map of clustered instances. After system2 comes back online, I start all the processes in the J2EE2 instance. Here is where the bug is: J2EE2 does not join the map of OHS1 and J2EE1. The cluster status in OHS1 and J2EE1 shows only the two instances on system1. Similarly, J2EE2 is only aware of itself.
For high availability, multiple discovery servers should always be configured.
If this is not practical, then multi-cast discovery should be used.
Using multiple Discovery Servers does help in this case. I added OHS1 as a DS
to all four nodes and then retried the same scenario as mentioned above. The
cluster ring is now not broken. The discover line in opmn.xml is like
For more information refer to Tip
Your Oracle instance or component is providing some incorrect defaults for your
opmnctl deleteinstance command with the
-force true option to clean up the Oracle instance or component.
opmnctl deleteinstance -oracleInstance /scratch/bsong/demo1/inst1 -instanceName inst1 \
   -adminHost myhost -adminPort myport -force true
Use discretion when utilizing the
-force option. When you use this option with the
opmnctl deleteinstance command, a number of safeguards that protect the integrity of the Oracle instance or component are overridden. Misuse of this option may produce undesirable results.
Explicitly listing all applicable arguments (for example,
adminPort) with the
opmnctl deleteinstance command can help attain the desired results.
During cleanup, a forced command displays warnings or exceptions consistent with the damaged state of the Oracle instance or component. These warnings are provided as visual feedback for the inconsistencies encountered and do not necessarily indicate that further corrective action is needed.
There are several methods for troubleshooting any problems you may have using OPMN:
The OPMN log files enable you to troubleshoot difficulties you might have in execution and use of OPMN and system component processes. OPMN and system component processes generate log files during processing. You can review the following generated log files to verify successful or unsuccessful execution of an OPMN command:
/diagnostics/OPMN/opmn/opmn.out: contains the standard output (
stdout) and standard error (
stderr) logs of OPMN. Also referred to as the OPMN "console log". After a certain point in OPMN initialization, nothing else is written to this file. Only a small set of messages ever appears in this file; therefore, this file may not be present if you conduct a search through the log file directories.
/opmn/logs/opmn.log: tracks command execution and operation progress. It contains messages useful for monitoring the operations of the OPMN server. Output written to the
opmn.log file contains the exit status of a child OPMN process. A status code of 4 indicates a normal reload of OPMN. All other status codes indicate an abnormal termination of the child OPMN process. The
opmn.log file is configured using the
<log> element in the
opmn.xml file. Refer to Chapter 6 for more information about the common configuration of the
/diagnostics/logs/OPMN/opmn/debug.log: contains OPMN debug log messages (English only) for ONS and PM. Review the error codes and messages that are shown in the
debug.log file. The PM portion of OPMN generates and outputs the error messages in this file. The
debug.log file tracks command execution and operation progress. The level of detail that gets logged in the
debug.log can be modified by configuration of the
<debug> element in the
Refer to Chapter 6 for examples of debug levels.
debug.log file to debug the ONS portion of OPMN or for early OPMN errors. The ONS portion of OPMN is initialized before PM. Therefore, errors that occur early in OPMN initialization show up in the
Enable usage of the
debug.log file only after conferring with Oracle Support. The
debug.log file is used by Oracle Support to debug and diagnose OPMN issues. Messages that are contained in the
debug.log file are typically not readily comprehensible to the user.
OPMN enables you to rotate the
debug.log files based on parameters of file size, specific time, or both, as a basis for file rotation. You can enable rotation by configuring the
rotation-hour attributes of the
<debug> tags in the
opmn.xml file. When the log file grows to the specified size, the specified time of day is reached, or both, the OPMN logging mechanism closes the file, renames it with a unique time stamp suffix, and then creates a new
The OPMN console log file (
opmn.out) is not rotated; this file is typically very small. Once OPMN passes a certain point of initialization, output is no longer written to the console log file; therefore, only a relatively small set of messages appears in this file.
At process startup, before handing off an existing console log file to a managed process, OPMN checks the size against a configured limit (
rotation-size attribute of the
<log> tag). If the file size exceeds the limit, OPMN renames the existing file to include a time stamp, and then creates a new file for the managed process. If the
rotation-size attribute is not configured, OPMN is not able to rotate the process console log file.
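
The size-based rotation described above can be illustrated with a short script. This is a sketch of the described behavior only, not OPMN's own implementation; the function name `rotate_if_large`, the byte-based limit, and the time stamp format are all assumptions made for the example.

```shell
#!/bin/sh
# Rename a log file with a time stamp suffix and start a new empty file
# once it exceeds a size limit. Illustration only, not OPMN's own code.
rotate_if_large() {
    log_file=$1
    limit_bytes=$2
    # Nothing to rotate if the file does not exist yet.
    [ -f "$log_file" ] || return 0
    # Arithmetic expansion strips any padding that wc adds to the count.
    size=$(( $(wc -c < "$log_file") ))
    if [ "$size" -gt "$limit_bytes" ]; then
        stamp=$(date +%Y%m%d%H%M%S)
        mv "$log_file" "$log_file.$stamp"
        : > "$log_file"            # create the new, empty log file
        echo "$log_file.$stamp"    # report the name of the rotated file
    fi
}
```

As with the mechanism described above, the check runs once at a defined moment rather than continuously, so a file can grow past the limit between checks.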
Use the opmnctl debug command to verify the status of a system process and whether any actions are pending. This command generates output that Oracle Support can use to help diagnose your OPMN problem.
The syntax for the
opmnctl debug command is:
opmnctl debug [comp=pm|ons] [interval=<secs> count=<num>]
Output is generated following execution of the
opmnctl debug command. Oracle recommends that you contact Oracle support to use the generated output to assist in diagnosis of your problem.
The attribute (<attr>) names for this command are comp, interval, and count. The value for comp can be either ons or pm, representing ONS and PM, respectively. If comp is not specified, then both ons and pm debug information is reported. For example, the following command outputs debug information for ONS.
opmnctl debug comp=ons
You can specify the interval in seconds and the number of requests sent to OPMN to assist in the debugging process. The values of <interval> and <count> must always be specified together, and both must be integers greater than 0. For example, the following command outputs debug information three times at an interval of 5 seconds.
opmnctl debug comp=pm interval=5 count=3
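
The argument rules above can be checked before the command is run. The following sketch only builds and prints the command line; it does not invoke opmnctl. The function name `build_debug_cmd` and its error messages are hypothetical.

```shell
#!/bin/sh
# Print an "opmnctl debug" command line after checking the argument rules.
# Arguments: comp (ons, pm, or empty), then optional interval and count.
build_debug_cmd() {
    comp=${1:-}
    interval=${2:-}
    count=${3:-}
    cmd="opmnctl debug"
    case $comp in
        "")     ;;                              # report both ons and pm
        ons|pm) cmd="$cmd comp=$comp" ;;
        *)      echo "comp must be ons or pm" >&2; return 1 ;;
    esac
    if [ -n "$interval" ] || [ -n "$count" ]; then
        # interval and count must be given together, as integers > 0
        for value in "$interval" "$count"; do
            case $value in
                ""|*[!0-9]*)
                    echo "interval and count must both be integers" >&2
                    return 1 ;;
            esac
            if [ "$value" -le 0 ]; then
                echo "interval and count must be greater than 0" >&2
                return 1
            fi
        done
        cmd="$cmd interval=$interval count=$count"
    fi
    echo "$cmd"
}
```

For example, `build_debug_cmd pm 5 3` prints `opmnctl debug comp=pm interval=5 count=3`, while `build_debug_cmd pm 5` fails because count is missing.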
Contact your local Oracle support to assist you in using the
opmnctl debug command to diagnose your OPMN problem.
Fusion Middleware Control Console provides a graphical interface that enables diagnosis of system components in your network and enterprise. Fusion Middleware Control Console features a log page. The log page enables you to view all of the system log files in one place and trace problems across multiple log files. Fusion Middleware Control Console uses an API that contacts OPMN.
You can create your own event scripts that record system process event activities. You can create a script that records events prior to the start or stop of system processes, as well as an unscheduled system crash.
Example B-1 shows a pre-start event script.
Example B-1 Pre-start Event Script
#!/bin/sh
echo
echo =---===----======---=-----=-----=------======----===---=
echo =---===----===== PRE-START EVENT SCRIPT =====----===---=
echo =---===----======---=-----=-----=------======----===---=

timeStamp="N/A"
instanceName="N/A"
componentId="N/A"
processType="N/A"
processSet="N/A"
processIndex="N/A"
stderrPath="N/A"   # not available w/pre-start unless part of restart
stdoutPath="N/A"   # not available w/pre-start unless part of restart
reason="N/A"
pid="N/A"          # only available with pre-stop, post-crash
startTime="N/A"    # only available with pre-stop, post-crash

while [ $# -gt 0 ]; do
  case $1 in
    -timeStamp)    timeStamp=$2; shift;;
    -instanceName) instanceName=$2; shift;;
    -componentId)  componentId=$2; shift;;
    -processType)  processType=$2; shift;;
    -processSet)   processSet=$2; shift;;
    -processIndex) processIndex=$2; shift;;
    -stderr)       stderrPath=$2; shift;;
    -stdout)       stdoutPath=$2; shift;;
    -reason)       reason=$2; shift;;
    -pid)          pid=$2; shift;;
    -startTime)    startTime=$2; shift;;
    *) echo "Option Not Recognized: [$1]"; shift;;
  esac
  shift
done

echo timeStamp=$timeStamp
echo instanceName=$instanceName
echo componentId=$componentId
echo processType=$processType
echo processSet=$processSet
echo processIndex=$processIndex
echo stderr=$stderrPath
echo stdout=$stdoutPath
echo reason=$reason
echo pid=$pid
echo startTime=$startTime
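
An event script can also write to a log file of its own instead of standard output. The following sketch reuses three of the option names from Example B-1 (-componentId, -reason, -pid) and appends one line per event. The function name `log_process_event`, the EVENT_LOG variable, and the default path /tmp/opmn-events.log are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Append one line describing a process event to a log file.
# EVENT_LOG is a hypothetical variable naming the file to append to.
log_process_event() {
    component="N/A"
    reason="N/A"
    pid="N/A"
    while [ $# -gt 1 ]; do
        case $1 in
            -componentId) component=$2 ;;
            -reason)      reason=$2 ;;
            -pid)         pid=$2 ;;
        esac
        # Every option is assumed to be followed by exactly one value,
        # so options that are not recorded are skipped with their value.
        shift 2
    done
    echo "component=$component pid=$pid reason=$reason" \
        >> "${EVENT_LOG:-/tmp/opmn-events.log}"
}
```

To use the function as a complete event script, end the file with `log_process_event "$@"` so that the options passed to the script are handed to the function.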
Note: The pre-start event script example, Example B-1, will not work on the Microsoft Windows operating system; however, you can create a script with a .bat suffix that provides similar functionality.
Use the full path to the
.bat file when adding the necessary configuration information to the
The environment variables used to launch the OPMN server are not inherited by system processes started by the OPMN server. OPMN sets the environment variables at the
ias-instance level, with the values extracted either from the
ias-instance configuration or from the OPMN run time environment.
See Also: Chapter 6 for more information about the common configuration of the
You can find more solutions on Oracle MetaLink (
http://metalink.oracle.com). If you do not find a solution for your problem, log a service request.
Oracle Fusion Middleware Release Notes, available on the Oracle Technology Network: