OPMN manages Essbase; therefore, the first set of logs to consult are the OPMN log files for Essbase. However, a failure reported in these log files may point to activity in the back-end Essbase log files that will help diagnose the problem.
To properly diagnose error conditions, use the Essbase and OPMN log files listed in Table 10 and chain the log entries, known as log chaining, across the various processes according to the time line in which they occurred. There are many possible scenarios of issues involving runtime failures that affect failover. Sample Scenario uses several logs to troubleshoot a failover issue.
The following files are key log files when Essbase runs in failover mode. See Table 10 for the location and description of these logs.
Essbase Server log
Essbase Server Lease Manager log
Essbase Agent Lease Manager log
OPMN server log
Essbase console log
OPMN Forward Ping log
The relational database that holds the lease information for Essbase is shut down. Two nodes are running in failover mode. Node behavior that causes a failure in this scenario:
Node 1 sequence of events:
OPMN attempts a number of times to start Essbase on Node 1:
<start timeout="600" retry="2"/> // This is the default//
Note: | OPMN attempts to start Essbase three times. |
Because the Lease database is down, Essbase fails to start on Node 1.
Node two sequence of events:
To achieve failover, OPMN attempts to start Essbase on Node 2 for the configured number of startup attempts.
Note: | OPMN attempts to start Essbase 3 times (since the configuration is symmetrical between the two nodes, the number of attempts will also be three on Node 2). |
Essbase will stay down on both Node 1 and Node 2.
To find out why OPMN cannot start Essbase, analyze these logs:
Essbase is unable to get the database lease. The Essbase Agent attempts up to six times (Six times is the default—attempts once, then attempts five more times) and after that, Essbase stops attempting and self-terminates.
These lease manager log entries appear on both nodes of the cluster as OPMN attempts to start Essbase on both nodes.
Chain these four logs with a timeline to understand the root of the issue.
A sample time sequence of these events:
A request to start Essbase is issued at 2010-03-26T03:32:09
OPMN attempts to start Essbase at 2010-03-26T03:32:11 (first time)
Essbase starts up at Fri Mar 26 03:32:12 2010 (first time)
Normal bootstrap continues but then it appears to have stopped sometime between Fri Mar 26 03:32:12 2010 and Fri Mar 26 03:32:26 2010
Lease Manager, running within the Agent, attempts to acquire the Agent lease at 2010-03-26T03:32:12 (first time)
Lease Manager continues to attempt the acquisition up to 6 times and finally terminates the Agent at 2010-03-26T03:32:22
The following are sample snippets from logs that you check to diagnose the issue in the above scenario:
Note: | The following snippets show only the relevant portions of these logs. Line items in bold indicate the multiple OPMN attempts to start Essbase that fail. |
Console output from OPMNCTL:
opmnctl startproc: starting opmn managed processes... opmn id=adc2170731.us.oracle.com:8001 0 of 1 processes started. ias-instance id=instance1 ias-component/process-type/process-set: <EssbaseClusterName>/EssbaseAgent/AGENT/ Error --> Process (index=0,uid=0,pid=0) Service fail over start failure Log: none --> Process (index=1,uid=1051800837,pid=12982) failed to start a managed process after the maximum retry limit Process terminated. Log: EPM_ORACLE_HOME/product/11.1.1/webtierIH/instances/instance1/diagnostics/logs/OPMN/opmn/console~<EssbaseClusterName>~EssbaseAgent~AGENT~1.log
Log 1: Essbase console log (console~<EssbaseClusterName>~EssbaseAgent~AGENT~1.log):
-------- 10/03/26 03:32:40 Start process -------- [Fri Mar 26 03:32:40 2010]Local/ESSBASE0///Info(1051283) Retrieving License Information Please Wait... [Fri Mar 26 03:32:40 2010]Local/ESSBASE0///Info(1051286) License information retrieved.
.... (product banner and startup info which abruptly stops after some time) ...
Log 2: opmn.log:
[2010-03-26T03:32:09][opmn][TRACE:1][667][OPMN][code:pm-requests]Request 5 Started. Command: /start?process-type=EssbaseAgent [2010-03-26T03:32:11][opmn][TRACE:1][662][OPMN][code:pm-process]Starting Process: <EssbaseClusterName>1~EssbaseAgent~AGENT~1 (1051800837:0) [2010-03-26T03:32:25][opmn][TRACE:1][662][OPMN][code:pm-process]Starting Process: <EssbaseClusterName>~EssbaseAgent~AGENT~1 (1051800837:0) [2010-03-26T03:32:39][opmn][TRACE:1][662][OPMN][code:pm-process]Starting Process: <EssbaseClusterName>~EssbaseAgent~AGENT~1 (1051800837:0) [2010-03-26T03:32:53][opmn][TRACE:1][][OPMN][code:pm-fos]Service-failover: <ias-instance id="instance1"> <ias-component id="<EssbaseClusterName>"> <process-type id="EssbaseAgent"> <process-set id="AGENT"> <process id="0" pid="0" status="_fail" index="0" log="" operation="request" result="failure">
.... (more diagnostic info) ...
Log 3: Essbase.log:
[Fri Mar 26 03:32:12 2010] Local/ESSBASE0///Info(1051283)
Retrieving License Information Please Wait...
[Fri Mar 26 03:32:12 2010]Local/ESSBASE0///Info(1051286)
License information retrieved.
[Fri Mar 26 03:32:12 2010]Local/ESSBASE0///Info(1051227)
No JVM location specified in essbase.cfg. Will try to load from library path
[Fri Mar 26 03:32:12 2010]Local/ESSBASE0///Info(1051216)
JVM Started Successfully !
[Fri Mar 26 03:32:12 2010]Local/ESSBASE0///Info(1051232)
Log 4: Lease Manager log: leasemanager_essbase_ClusterNode1.log:
[2010-03-26T03:32:12.182-10:32] ESSBASE / Info (1320004) Lease manager logging is initialized. [2010-03-26T03:32:12.243-10:32] ESSBASE / Info (1320045) Connection will be established to [leasedb] as [system] [2010-03-26T03:32:12.315-10:32] ESSBASE / Error (1320009) Attempt to connect to database failed with error [[DataDirect][ODBC 20101 driver][20101]13105].
...(more diagnostic information)...
[2010-03-26T03:32:22.944-10:32] ESSBASE / Error (1320009) Attempt to connect to database failed with error [[DataDirect][ODBC 20101 driver][20101]13105]. [2010-03-26T03:32:22.944-10:32] ESSBASE / Error (1320001) Failed to acquire the lease after [6] consecutive attempts. [2010-03-26T03:32:22.945-10:32] ESSBASE / Info (1320016) Lease is being surrendered. [2010-03-26T03:32:22.945-10:32] ESSBASE / Info (1320011) Preparing to shutdown abort. [2010-03-26T03:32:24.946-10:32] ESSBASE / Error (1320012) Terminating the process.