Oracle® Process Manager and Notification Server Administrator's Guide
10g Release 3 (10.1.3)
B15976-01

A OPMN Troubleshooting

This appendix describes some troubleshooting tips for OPMN. It features the following topics:

  • Problems and Solutions

  • Diagnosing OPMN Problems

  • Need More Help?

A.1 Problems and Solutions

This section describes some of the common problems encountered when using OPMN, along with their solutions.

A.1.1 Oracle Application Server Process Does Not Start

Problem

Unable to start an Oracle Application Server process using OPMN.

Solution

Try the following if you are unable to start an Oracle Application Server process using OPMN:

  • Verify, and if necessary correct, the command input. Confirm the spelling and the options of the command you are entering.


    Note:

    Do not use command line scripts or utilities from previous versions of Oracle9i Application Server or Oracle Application Server for starting OPMN or Oracle Application Server components.

  • Review the standard output log for the Oracle Application Server process. Output from the process console is located in the ORACLE_HOME/opmn/logs directory. For example, the standard output log for Oracle HTTP Server may be HTTP_Server~1.

  • Verify the dependency requirements for the Oracle Application Server process you are attempting to start.

  • Verify the element values for the Oracle Application Server component in the opmn.xml file. Use the opmnctl validate command to verify the configuration of the opmn.xml file. You may have misconfigured the opmn.xml file for the Oracle Application Server component you are attempting to start.
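
As a rough illustration of the log review step above, the following shell sketch lists the most recently modified files in a log directory, so that the console log of the process that just failed to start appears first. The function name newest_logs is hypothetical and not part of OPMN, and the usage line assumes that ORACLE_HOME is set in your environment.

```shell
#!/bin/sh
# newest_logs DIR [COUNT]
# Print the COUNT most recently modified entries in DIR (default 5),
# newest first. The function name is illustrative, not part of OPMN.
newest_logs() {
    dir=$1
    count=${2:-5}
    if [ ! -d "$dir" ]; then
        echo "No such directory: $dir" >&2
        return 1
    fi
    ls -t "$dir" | head -n "$count"
}

# Example (assumes ORACLE_HOME is set):
#   newest_logs "$ORACLE_HOME/opmn/logs" 3
```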

A.1.2 Determining if Oracle Application Server Processes are Dying or Unresponsive

Problem

Your Oracle Application Server processes are dying or unreachable.

Solution

If your Oracle Application Server processes are dying or unreachable:

  • Review the Oracle Application Server component-specific output in the ORACLE_HOME/opmn/logs directory.

    Look in the ORACLE_HOME/opmn/logs/opmn.log file for messages about Oracle Application Server processes, in particular process crashed or process unreachable messages. OPMN automatically restarts Oracle Application Server processes that die or become unresponsive.

  • Create event scripts for any pre-stop or post-crash events. The event scripts could be used to create a specific log file or send you an email about a failure.
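
The opmn.log search described above can be sketched as a small shell function. The search words crashed and unreachable are an assumption based on the message types named in this section; the exact wording in your opmn.log file may differ, so adjust the pattern as needed. The function name find_process_failures is hypothetical.

```shell
#!/bin/sh
# find_process_failures LOGFILE
# Print the lines of LOGFILE, with line numbers, that mention a crashed
# or unreachable process. The search words are an assumption; adjust
# them to the messages that actually appear in your opmn.log.
find_process_failures() {
    logfile=$1
    if [ ! -r "$logfile" ]; then
        echo "Cannot read: $logfile" >&2
        return 2
    fi
    grep -n -i -E 'crashed|unreachable' "$logfile"
}

# Example (assumes ORACLE_HOME is set):
#   find_process_failures "$ORACLE_HOME/opmn/logs/opmn.log"
```

The function returns a non-zero status when no matching line is found, which makes it usable in a conditional.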

A.1.3 opmnctl Command Execution Times Out

Problem

The time it takes to execute an opmnctl command depends on the type of Oracle Application Server process and the available computer hardware. Because of this, it may not be readily apparent how long an opmnctl command should take to complete.

The default start timeout for OC4J is approximately five minutes. If an OC4J process does not start up after an opmnctl command, OPMN will wait approximately an hour before timing out and aborting the request.

Solution

To verify successful execution of the opmnctl command, try the following:

  1. Increase the start element timeout attribute for the component that is not starting. Set the timeout in the opmn.xml file at a level that will allow OPMN to wait for the process to come up. This functionality is available with the startproc command, which will start all the relevant processes configured in opmn.xml.

  2. Check the start element in the opmn.xml file and set the retry attribute to a higher value.

  3. Look in the ORACLE_HOME/opmn/logs/ directory for the log of the Oracle Application Server process that is not starting.

  4. Review the component-specific log file for the Oracle Application Server component that is not starting. For example, ORACLE_HOME/opmn/logs/OC4J~home~default_group~1.
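
Before changing the timeout or retry attributes, it can help to see what the start elements currently contain. The following is a minimal sketch that prints every start element in a configuration file together with its line number. The function name show_start_settings is hypothetical, and the sketch assumes that each start element is written on a single line.

```shell
#!/bin/sh
# show_start_settings CONF
# Print each line of CONF that opens a <start> element, with its line
# number, so the current timeout and retry attributes can be reviewed.
# Assumes each <start> element is written on a single line.
show_start_settings() {
    conf=$1
    if [ ! -r "$conf" ]; then
        echo "Cannot read: $conf" >&2
        return 2
    fi
    grep -n -E '<start[ />]' "$conf"
}

# Example (assumes ORACLE_HOME is set):
#   show_start_settings "$ORACLE_HOME/opmn/conf/opmn.xml"
```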

A.1.4 Oracle Application Server Component Automatically Restarted by OPMN

Problem

An Oracle Application Server component is automatically restarted by OPMN.

Solution

If an Oracle Application Server component is automatically restarted by OPMN, try the following:

  • Review the message for the Oracle Application Server component in the ORACLE_HOME/opmn/logs/opmn.log file.

  • Verify that the ping timeout for the Oracle Application Server component is sufficient. An Oracle Application Server component that receives a lot of activity may require a longer timeout. Increase the timeout of the ping element for the Oracle Application Server component in the opmn.xml file.

A.1.5 Unexpected opmnctl start Behavior

Problem

Occasionally, there is unexpected behavior when you use the opmnctl start command to start OPMN: either only OPMN is started, or OPMN makes a best effort to start the Oracle Application Server processes that it manages. Typically, this unexpected behavior is due to turning off or rebooting your computer without first shutting down OPMN. When you restart your computer, all OPMN-managed processes are started.

Solution

Oracle recommends that you shut down OPMN before shutting down your computer. Use the opmnctl stopall command to stop OPMN and OPMN-managed processes.

On the Microsoft Windows operating system, you can use the Windows services control panel to stop OPMN and OPMN-managed processes.


Note:

OPMN keeps a record on disk of the expected status of the processes it manages. If a computer goes down while OPMN is running, upon restart OPMN will use the information cached on disk and make a best effort attempt to automatically restart all processes that were running at the time the system went down. This may catch some users off guard who start only OPMN and notice that processes managed by OPMN have also been started even though an explicit request to start those processes has not been issued. You can suppress this automatic process recovery by removing all files located in the ORACLE_HOME/opmn/logs/states directory before attempting to start OPMN.

The states directory and its contents should not be modified by the user if OPMN or any process managed by OPMN is running. Oracle recommends not modifying the /states directory.


A.1.6 Disabled Element in the opmn.xml File

Problem

Unable to start an Oracle Application Server process.

Solution

If you are unable to start an Oracle Application Server process, check whether an element in the Oracle Application Server opmn.xml file is disabled. If an element in the opmn.xml file is disabled, OPMN will generate an output message of "Missing" or "Disabled".

A.1.7 Unable to Start OC4J

Problem

If you have multiple Oracle Application Server installations on one host and you start them at the same time (for example, to start a cluster), OPMN may become unresponsive. You may receive an error message such as:

"failed to restart a managed process after the maximum retry limit"

This may occur when two Oracle homes on the same host use the same port ranges for RMI, JMS, and AJP ports. An OC4J instance in one Oracle home is trying to use the same port as an OC4J instance in a different Oracle home. Port allocation for all OC4J instances within Oracle Application Server is controlled by OPMN; there can be overlapping port ranges within a single opmn.xml file. However, when two OPMN processes on a host start at the same time, there is no coordination between them on port usage.

Solution

To coordinate port usage, assign unique port ranges to each Oracle home. The OPMN process in one Oracle home and the OPMN in a different Oracle home will not attempt to use the same port numbers when assigning OC4J ports, and will not attempt to bind to the same port.

It is also recommended that you increase the maximum number of retries for starting OC4J instances. If you have identical port ranges in two Oracle homes and increase the number of times OPMN attempts to restart a process, OPMN will eventually select a port that works. This technique ultimately does not eliminate the problem, because there is the possibility that OPMN will not find a port that works in the number of port connection attempts that you have specified in the opmn.xml file.
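
When assigning unique port ranges to each Oracle home, a quick overlap check can catch mistakes before OPMN is started. The following is a minimal sketch, assuming that ranges are written as LOW-HIGH and are inclusive. The function name ranges_overlap and the port numbers in the usage comment are hypothetical.

```shell
#!/bin/sh
# ranges_overlap LOW1-HIGH1 LOW2-HIGH2
# Return 0 (true) if the two inclusive port ranges share at least one
# port, and 1 (false) otherwise.
ranges_overlap() {
    lo1=${1%-*}; hi1=${1#*-}
    lo2=${2%-*}; hi2=${2#*-}
    # Two ranges overlap when each one starts before the other ends.
    [ "$lo1" -le "$hi2" ] && [ "$lo2" -le "$hi1" ]
}

# Example with hypothetical ranges from two Oracle homes:
#   if ranges_overlap 12401-12500 12450-12600; then
#       echo "Port ranges overlap; assign unique ranges"
#   fi
```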

A.1.8 Unable to Stop Component

Problem

If you are unable to stop Oracle Application Server components or OPMN-managed processes using the opmnctl stop or opmnctl stopall commands, the component or process was most likely not started using OPMN. The component or process might have been started using a startup script or utility.

Solution

Oracle Application Server components and OPMN-managed processes should never be started or stopped manually. Do not use command line scripts or utilities from previous versions of Oracle Application Server for starting and stopping Oracle Application Server components.

Use the Application Server Control Console and the opmnctl command line utility to start or stop Oracle Application Server components and OPMN-managed processes.

A.1.9 globalInitNLS Error

Problem

You may receive a globalInitNLS error when executing the opmnctl command. The following error message is displayed:

"globalInitNLS: NLS boot file not found or invalid -- default linked-in boot block used XML parser init: error 201."

Solution

This error occurs when the ORA_NLS33 environment variable is set. This environment variable should not be set.
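
A minimal sketch of how you might check for the variable in the shell from which you run opmnctl is shown below. The function name check_ora_nls33 is hypothetical.

```shell
#!/bin/sh
# check_ora_nls33
# Report whether ORA_NLS33 is set in the current environment.
# Returns 1 if the variable is set (even to an empty value), 0 otherwise.
check_ora_nls33() {
    if [ -n "${ORA_NLS33+set}" ]; then
        echo "ORA_NLS33 is set to '$ORA_NLS33'; unset it before running opmnctl"
        return 1
    fi
    echo "ORA_NLS33 is not set"
}

# To clear the variable in the current shell session:
#   unset ORA_NLS33
```

If the variable is exported from a login profile or a wrapper script, it must also be removed there, or it will return in the next session.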

A.1.10 Start Remote Hosts of a Cluster Independently

Problem

Starting a cluster of remote hosts using Application Server Control Console will result in an unknown status. This occurs because ONS is bound to the local host IP address and it is not reachable from remote hosts.

Solution

Oracle recommends starting each member of the cluster independently to effectively monitor and obtain the status from remote hosts. Additionally, make sure ONS is not bound to the local host IP address.

A.1.11 OPMN Start Up Consumes CPU Processing Capability

Problem

On some computers, when OPMN starts up, it consumes large amounts of CPU processing capability. This can vary from approximately 50% to 60% of your computer's CPU processing capabilities. On affected computers, the OPMN CPU processing consumption will continue until OPMN is shut down.

Solution

The following are some possible causes for the excessive CPU processing consumption:

  • The installation environment used multibyte text character sets such as Japanese.

  • The multi-cast address for all ONS servers is misconfigured in the opmn.xml file.

    ONS uses this address to discover all other instances in the cluster.

A.1.12 Error Messages During Start-up of OPMN

Problem

When trying to start OPMN using the opmnctl start or opmnctl startall commands you receive the following error messages:

pingwait exits with 1220384

or

pingwait exits with 1220396

These error messages are generated when there are syntax errors in the ORACLE_HOME/opmn/conf/opmn.xml file that need to be corrected.

Solution

If you encounter these error messages, do the following:

  • Run the following command (with the complete directory path to the opmn.xml file):

    prompt > opmnctl validate opmn.xml

  • Remove all empty tags from the opmn.xml file.

A.1.13 Disable, or Reconfigure, Firewall When Creating Topology Using Multi-Cast Address Configuration

Problem

When setting up a network of Oracle Application Server instances to form a topology using the multi-cast address configuration for all ONS servers, some of the instances are not recognized by OPMN.

Solution

If you are planning to network multiple Oracle Application Server instances to form a topology, by using the multi-cast address configuration for all ONS servers in the opmn.xml file, you must disable, or reconfigure, the firewall before initiating networking with other Oracle Application Server instances.

If the firewall is not disabled, or reconfigured, the multi-cast information for setting up the network may not get through and the topology will not be set up correctly. All of the OPMN ports must be allowed to accept incoming notifications.

A.2 Diagnosing OPMN Problems

There are several methods for troubleshooting any problems you may have using OPMN:

A.2.1 OPMN log Files

The OPMN log files enable you to troubleshoot difficulties you might have in execution and use of OPMN and OPMN-managed processes. OPMN and OPMN-managed processes generate log files during processing. You can review the following generated log files to verify successful or unsuccessful execution of an OPMN command:

  • ORACLE_HOME/opmn/logs/opmn.out: contains the standard output (stdout) and standard error (stderr) logs of OPMN. Also referred to as the OPMN "console log". After a certain point in OPMN initialization, nothing else will be written to this file. Only a small set of messages will ever appear in this file; therefore, this file may not be present if you conduct a search through the log file directories.

  • Process control log files (ORACLE_HOME/opmn/logs/): contain the standard output and standard error of OPMN managed processes. OPMN creates a log file for each component and assigns a unique concatenation of the Oracle Application Server component with a number. For example, the standard output log for OC4J may be OC4J~home~default_group~1. When a process terminates and is replaced by a new process, console log output from the previous process is preserved and the replacement process appends to the end of the console log file. The process specific console logs are the first and best resource for investigating problems related to starting and stopping components.

  • ORACLE_HOME/opmn/logs/opmn.log: tracks command execution and operation progress. It contains messages useful for monitoring the operations of the OPMN server. Output written to the opmn.log file contains the exit status of a child OPMN process. A status code of 4 indicates a normal reload of OPMN. All other status codes indicate an abnormal termination of the child OPMN process. The opmn.log file is configured using the <log> element in the opmn.xml file. Refer to Chapter 6, "opmn.xml Common Configuration" for more information.

  • ORACLE_HOME/opmn/logs/opmn.dbg: contains OPMN debug log messages (English only) for ONS and PM. Review the error codes and messages that are shown in the opmn.dbg file. The PM portion of OPMN generates and outputs the error messages in this file. The opmn.dbg file tracks command execution and operation progress. The level of detail that gets logged in the opmn.dbg can be modified by configuration of the <debug> element in the opmn.xml file.

    Refer to Chapter 6, "opmn.xml Common Configuration" for examples of debug levels.

    Use the opmn.dbg file to debug the ONS portion of OPMN or for early OPMN errors. The ONS portion of OPMN is initialized before PM. Therefore, errors that occur early in OPMN initialization will show up in the opmn.dbg file.

    Enable usage of the opmn.dbg file only after conferring with Oracle Support. The opmn.dbg file is used by Oracle Support to debug and diagnose OPMN issues. Messages that are contained in the opmn.dbg file are typically not readily comprehensible to the user.

A.2.1.1 opmn.log and opmn.dbg File Rotation

OPMN enables you to rotate the opmn.log and opmn.dbg files based on file size, a specific time of day, or both. You can enable rotation by configuring the rotation-size and rotation-hour attributes of the <log> and <debug> tags in the opmn.xml file. When the log file grows to the specified size, or the specified time of day is reached, the OPMN logging mechanism will close the file, rename the file with a unique time stamp suffix, and then create a new opmn.log or opmn.dbg file.

The OPMN console log file (opmn.out) is not rotated; this file is typically very small in size. Once OPMN passes a certain point of initialization, output is no longer generated to the console output file; therefore, only a relatively small set of messages will appear in this file.

A.2.1.2 Process Console log File Rotation

At process startup, before handing off an existing console log file to a managed process, OPMN checks the size against a configured limit (rotation-size attribute of the <log> tag). If the file size exceeds the limit, OPMN will rename the existing file to include a time stamp, and then create a new file for the managed process. If the rotation-size attribute is not configured, OPMN will not be able to rotate the process console log file.
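
The size check described above can be pictured with the following shell sketch. It only mimics the described behavior for illustration and is not OPMN's implementation. The function name rotate_if_large is hypothetical, and the time stamp format of the suffix is an assumption, since this section does not specify the format that OPMN uses.

```shell
#!/bin/sh
# rotate_if_large FILE LIMIT_BYTES
# If FILE is larger than LIMIT_BYTES, rename it with a time stamp suffix
# and create a new, empty FILE. This mimics the behavior described in
# the text; the suffix format here is an assumption.
rotate_if_large() {
    file=$1
    limit=$2
    # Nothing to rotate if the file does not exist yet.
    [ -f "$file" ] || return 0
    # Arithmetic expansion strips any padding that wc adds.
    size=$(( $(wc -c < "$file") ))
    if [ "$size" -gt "$limit" ]; then
        mv "$file" "$file.$(date +%Y%m%d%H%M%S)" && : > "$file"
    fi
}

# Example with a hypothetical file name and limit:
#   rotate_if_large /tmp/console.log 1500000
```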

A.2.2 opmnctl debug

Use the opmnctl debug command to verify the status of an Oracle Application Server process and whether any actions are pending. This command generates output that you can provide to your local Oracle support to help diagnose your OPMN problem.

The syntax for the opmnctl debug command is:

opmnctl [<scope>] debug [comp=pm|ons] [interval=<secs> count=<num>]

where <scope> is the optional scope for the request.

Output is generated following execution of the opmnctl debug command. Oracle recommends that you contact Oracle support to use the generated output to assist in diagnosis of your problem.

The attribute names for this command are comp, interval, and count. The value for comp can be either ons or pm, representing ONS and PM, respectively. If comp is not specified, then both ons and pm debug information is reported. For example, the following command outputs debug information for ONS.

prompt > opmnctl debug comp=ons

You can specify the interval in seconds and the number of requests sent to OPMN to assist in the debugging process. The interval and count values must always be specified together, and both must be integers greater than 0. For example, the following command outputs debug information 3 times at an interval of 5 seconds.

prompt > opmnctl debug comp=pm interval=5 count=3

Contact your local Oracle support to assist you in using the opmnctl debug command to diagnose your OPMN problem.
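
The argument rules described above (comp is either ons or pm; interval and count are given together as integers greater than 0) can be checked before the command is run. The following is a minimal sketch of such a check. The function name build_debug_cmd is hypothetical, and the function only prints the command line rather than executing it.

```shell
#!/bin/sh
# build_debug_cmd [COMP] [INTERVAL COUNT]
# Validate the arguments against the rules in the text and print the
# resulting opmnctl debug command line. Nothing is executed.
build_debug_cmd() {
    comp=$1
    interval=$2
    count=$3
    cmd="opmnctl debug"
    case $comp in
        "")     ;;                          # no comp: report both ons and pm
        ons|pm) cmd="$cmd comp=$comp" ;;
        *)      echo "comp must be ons or pm" >&2; return 1 ;;
    esac
    if [ -n "$interval" ] || [ -n "$count" ]; then
        # interval and count must be given together, as integers > 0.
        for n in "$interval" "$count"; do
            case $n in
                ""|*[!0-9]*)
                    echo "interval and count must both be integers greater than 0" >&2
                    return 1 ;;
            esac
            if [ "$n" -le 0 ]; then
                echo "interval and count must both be integers greater than 0" >&2
                return 1
            fi
        done
        cmd="$cmd interval=$interval count=$count"
    fi
    echo "$cmd"
}

# Example:
#   build_debug_cmd pm 5 3
```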

A.2.3 Oracle Enterprise Manager 10g Application Server Control Console

Application Server Control Console provides a graphical interface that enables diagnosis of Oracle Application Server components in your network and enterprise. Application Server Control Console features a log page. The log page enables you to view all of the Oracle Application Server log files in one place and trace problems across multiple log files. Application Server Control Console uses an API that contacts OPMN.

You can use Application Server Control Console to enable or disable Oracle Application Server components: You can disable components so they do not start when you start an Oracle Application Server instance.


See Also:

Oracle Application Server Administrator's Guide

A.2.4 Troubleshooting with Event Scripts

You can create your own event scripts that record Oracle Application Server process event activities. You can create a script that records events prior to the start or stop of Oracle Application Server processes, as well as after an unscheduled process crash.

Refer to the <event-scripts> element description in Chapter 6, "opmn.xml Common Configuration".

Example A-1 shows a pre-start event script.

Example A-1 Pre-start Event Script

#!/bin/sh
echo
echo =---===----======---=-----=-----=------======----===---=
echo =---===----===== PRE-START EVENT SCRIPT =====----===---=
echo =---===----======---=-----=-----=------======----===---=

timeStamp="N/A"
instanceName="N/A"
componentId="N/A"
processType="N/A"
processSet="N/A"
processIndex="N/A"
stderrPath="N/A"  # not available w/pre-start unless part of restart
stdoutPath="N/A"  # not available w/pre-start unless part of restart
reason="N/A"
pid="N/A"         # only available with pre-stop, post-crash
startTime="N/A"   # only available with pre-stop, post-crash

# Arguments arrive as "-name value" pairs; read each pair into its variable.
while [ $# -gt 0 ]; do
    case "$1" in
        -timeStamp)    timeStamp=$2;;
        -instanceName) instanceName=$2;;
        -componentId)  componentId=$2;;
        -processType)  processType=$2;;
        -processSet)   processSet=$2;;
        -processIndex) processIndex=$2;;
        -stderr)       stderrPath=$2;;
        -stdout)       stdoutPath=$2;;
        -reason)       reason=$2;;
        -pid)          pid=$2;;
        -startTime)    startTime=$2;;
        *) echo "Option Not Recognized: [$1]";;
    esac
    # Skip the option name and, if present, its value.
    shift
    if [ $# -gt 0 ]; then shift; fi
done

echo timeStamp=$timeStamp
echo instanceName=$instanceName
echo componentId=$componentId
echo processType=$processType
echo processSet=$processSet
echo processIndex=$processIndex
echo stderr=$stderrPath
echo stdout=$stdoutPath
echo reason=$reason
echo pid=$pid
echo startTime=$startTime


Note:

The pre-start event script example, Example A-1, will not work for the Microsoft Windows operating system; however, you can create a script, with a .bat suffix, with similar functionality.

Use the full path to the .bat file when adding the necessary configuration information to the opmn.xml file.
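
Building on Example A-1, the following is a minimal sketch of an event script body that appends one line per event to a log file of your choosing, for example from a post-crash script. The option names are the ones parsed in Example A-1. The function name record_event, the log file path in the usage comment, and the format of the logged line are hypothetical.

```shell
#!/bin/sh
# record_event LOGFILE [-name value]...
# Append a one-line summary of an event to LOGFILE. The option names
# match those parsed in Example A-1; the line format is illustrative.
record_event() {
    logfile=$1
    shift
    timeStamp="N/A"; componentId="N/A"; reason="N/A"; pid="N/A"
    while [ $# -gt 0 ]; do
        case $1 in
            -timeStamp)   timeStamp=$2 ;;
            -componentId) componentId=$2 ;;
            -reason)      reason=$2 ;;
            -pid)         pid=$2 ;;
        esac
        # Skip the option name and, if present, its value.
        shift
        if [ $# -gt 0 ]; then shift; fi
    done
    echo "$timeStamp component=$componentId pid=$pid reason=$reason" >> "$logfile"
}

# In an event script, pass the script's own arguments through
# (the log path is hypothetical):
#   record_event /tmp/opmn_events.log "$@"
```

Options that the function does not use, such as -processType, are skipped together with their values.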


A.2.5 opmn.xml Environment Variables

The environment variables used to launch the OPMN server are not inherited by the Oracle Application Server processes started by the OPMN server. OPMN sets the environment variables at the ias-instance level, with the values extracted either from the ias-instance configuration or from the OPMN run time environment.

A.3 Need More Help?

You can find more solutions on Oracle MetaLink (http://metalink.oracle.com). If you do not find a solution for your problem, log a service request.


See Also: