14 Troubleshooting JVM Diagnostics

This chapter describes the errors you may encounter while deploying and using JVM Diagnostics and the workaround steps you can follow to resolve each of them. It contains the following sections:

Agent Automated Deployment Errors

This section lists errors that occur during the automated deployment of the JVM Diagnostics Agent.

DeployAd4jAgentOnTarget Step Errors

Table 14-1 DeployAd4jAgentOnTarget Step Errors

Error Message Workaround Steps
New SOA Composite deployed on the SOA Server from JDeveloper are not displayed automatically in Enterprise Manager Grid Control.
war file <war file name> is already deployed. Use Option force to redploy

Navigate to the Agent Deployment page, check the Force checkbox and click Deploy.

Caused by: java.lang.SecurityException: User:
<username>, failed to be authenticated

This error occurs if the username or password for the WebLogic Administration server is incorrect.

Specify the correct credentials for the WebLogic Administration server and click Deploy on the Agent Deployment page.

Exception in thread "main" java.lang.NoClassDefFoundError: javax/enterprise/deploy/spi/exceptions/TargetException
Caused by: java.lang.ClassNotFoundException: javax.enterprise.deploy.spi.exceptions.TargetException
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
Could not find the main class: oracle.sysman.e2e.model.ad4j.util.RebuildWar. Program will exit.

This error may occur if:

  • The path to weblogic.jar (present on the machine on which the Management Agent is running) specified on the Agent Auto Deployment page is incorrect.

  • The user does not have Read permission on the directory in which the weblogic.jar file is present.


CreateTempDir Step Errors

Table 14-2 CreateTempDir Step Errors

Error Message Workaround Steps
RemoteOperationException: ERROR: Invalid username and/or password

If the Management Agent monitoring the domain is running on a Windows machine, you may encounter this error if you specify an incorrect username or password for this machine. This error may also occur if you do not have the Log on as a batch job privilege.

nmo set suid root

You may see this error if you do not have Read permissions on the binary and the entire path.

To resolve this error, run the command chmod 7755 nmo to change the permissions of emagent11g/bin/nmo to suid root.
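A quick way to confirm the result of the chmod step is to inspect the file mode. The following is a minimal Python sketch, not part of the product; the function names are hypothetical, and the path passed in would be your own emagent11g/bin/nmo location.

```python
import os
import stat


def has_setuid_bit(mode):
    """Return True if a numeric file mode includes the setuid bit."""
    return bool(mode & stat.S_ISUID)


def is_setuid_root(path):
    """Return True if `path` is owned by root (uid 0) and has the setuid bit set."""
    info = os.stat(path)
    return info.st_uid == 0 and has_setuid_bit(info.st_mode)
```

For example, is_setuid_root("emagent11g/bin/nmo") should return True after the permissions have been changed, assuming the file is owned by root.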

Exception in thread "main" java.lang.NoClassDefFoundError: javax/enterprise/deploy/spi/exceptions/TargetException
Caused by: java.lang.ClassNotFoundException: javax.enterprise.deploy.spi.exceptions.TargetException
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
Could not find the main class: oracle.sysman.e2e.model.ad4j.util.RebuildWar. Program will exit.

This error occurs if:

  • The path to weblogic.jar (present on the machine on which the Management Agent is running) specified on the Agent Auto Deployment page is incorrect.


Active State Error

Table 14-3 Active State Errors

Error Message Workaround Steps
Error: Ensure that Ad4j Manager is Active

You can deploy the JVM Diagnostics Agent only after the JVM Diagnostics Manager has been installed and is in an active state.


Secure Communication Errors

Table 14-4 Secure Communication Errors

Error Message Workaround Steps
JAM Agent: loadNative Exception loading [/tmp/libjamagent_nz.so.1] /tmp/libjamagent_nz.so.1: libnnz11.so: cannot open shared object file: No such file or directory.
JAM Agent: Please make sure that libnnz11 and libclntsh is in the LD_LIBRARY_PATH

This error occurs if the NZ libraries are not in LD_LIBRARY_PATH.

To resolve this issue, add the directories that contain the libnnz11.so and libclntsh.so.11.1 files to LD_LIBRARY_PATH.
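Before restarting the JVM Diagnostics Agent, you can check whether the libraries named in the error message are reachable through LD_LIBRARY_PATH. The following Python sketch is illustrative only; the function name is hypothetical, and the exact library file names on your system may differ from the ones shown in the comment.

```python
import os


def find_missing_libraries(required, search_path):
    """Return the names in `required` not found in any directory of `search_path`.

    `search_path` uses the platform path separator, like LD_LIBRARY_PATH.
    """
    directories = [d for d in search_path.split(os.pathsep) if d]
    missing = []
    for name in required:
        if not any(os.path.isfile(os.path.join(d, name)) for d in directories):
            missing.append(name)
    return missing


# Example call (library names taken from the error message and workaround text):
# find_missing_libraries(["libnnz11.so", "libclntsh.so.11.1"],
#                        os.environ.get("LD_LIBRARY_PATH", ""))
```

An empty list means every required library was found in at least one directory on the search path.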

Jam Communication : Open wallet error : 28759
Jam Communication : setup_auth failed 28759

This error occurs if an unrecognized wallet resource locator has been used to open the wallet.

To resolve this error, ensure that the wallet location is preceded by the file: prefix.

Jam Communication : Open wallet error : 28759 
Jam Communication : setup_auth failed 28759

This error occurs when the wallet file cannot be opened by the JVM Diagnostics Agent.

Ensure that the file path to the wallet is correct and the correct permissions have been provided. While specifying the wallet location in the web.xml file, do not specify the absolute path to the wallet.

Jam Communication : nz initialize SSL handshake error 28864
Jam Communication : nzos_Write error - 29031
JAM Console: Invalid connection from 10.228.234.246.55859

These errors occur if the JVM Diagnostics Manager has been launched in non-secure mode.

javax.crypto.IllegalBlockSizeException: Input length must be multiple of 8 when decrypting with padded cipher
at com.sun.crypto.provider.SunJCE_f.b(DashoA13*..)
at com.sun.crypto.provider.DESedeCipher.engineDoFinal(DashoA13*..)
at javax.crypto.Cipher.doFinal(DashoA13*..)
at jamagent.jamagent.decrypt(jamagent.java:725)
at jamagent.jamagent.decrypt(jamagent.java:741)
at jamagent.jamagent.<init>(jamagent.java:799)
at jamagent.jaminit.init(jaminit.java:50)

This error occurs if the encrypted password specified for the JVM Diagnostics Agent is incorrect. To resolve this error, enter the correct password in the web.xml file.

Jam Communication : Setting up context
Jam Communication : nz initialize SSL handshake error 29024
Jam Communication : nzos_Write error :- 29031

This error occurs if the wallet password is incorrect or the certificate for the JVM Diagnostics Agent has not been certified by the same CA as the JVM Diagnostics Manager.

Jam Communication : nz initialize SSL handshake error 28860
Jam Communication : nzos_Read error : 28865
Console Err code: -1716291680 Receiving data on port 3600 from 127.0.0.1:57187, FD 450: Broken pipe

JAM Agent: Conn reset by console
JVM Diagnostics Manager Error:
JAM Communication: nz initialize SSL handshake error 29049
JAM Communication: nzos_Read error: 28865
Console Err code: -1716291680 Receiving data on port 3600 from 127.0.0.1:58221, FD 450: Transport endpoint is not connected

This error occurs if the JVM Diagnostics Agent is running in non-secure mode and is trying to connect to a secure JVM Diagnostics Manager.


Cross Tier Functionality Errors

This section lists the errors that affect cross tier functionality.

Table 14-5 Cross Tier Functionality Errors

Error Message Workaround Steps
DBWait link not displayed on JVM Threads Real
Time Analysis page.
No data displayed in Top DBStates / SQLs tables.

Cross tier functionality errors may occur due to the following:

  • Incorrect database credentials

  • Database Agent errors

If the database credentials are incorrect:

  • Navigate to Middleware > JVM Diagnostics > Setup > Databases page and verify the credentials for the registered database to be monitored.

  • Enter the ID of the user who installed the database in the OS User field.

  • Specify the database application user credentials in the DB User field.

  • Specify the database system user credentials in the DB User (Explain Plan) field.

If database agent errors occur, ensure that the database agent is running on a machine on which the database is installed with the correct IP address and port number.

  • Navigate to the Threads > Real Time Analysis page for the JVM. If you see a thread that is in the DB Wait state, it should be a hyperlink. Click on the hyperlink to drill down to the database details and view the database session information including the SQL statements executed.

  • If the DB Wait state is not a hyperlink, navigate to the Middleware > JVM Diagnostics > Setup page, set the Cross Tier Log Level to 6, and send the AD4J Manager logs when you report the issue.


Trace Errors

This section lists errors that occur during tracing.

Table 14-6 Trace Errors

Error Message Workaround Steps
weblogic.transaction.internal.TimedOutException:
Transaction timed out after 30 seconds

This error occurs if the Poll Duration has a large value and results in a timeout.

This error does not affect the Trace functionality and can be ignored.


Deployment Script Execution Errors

This section lists the errors that occur when you run the deployment script.

Table 14-7 Script Execution Errors

Error Message Workaround Steps
ScriptException: Error occured while performing
deploy: The action you performed timed out after
600,000 milliseconds

This error occurs when you are deploying the JVM Diagnostics Agent. To resolve this issue, check whether the lock for the WebLogic Administration Console of the domain on which the JVM Diagnostics Manager is deployed has already been acquired. If it has been acquired, release it and run the script again.

  • Log in to the WebLogic Administration Console: http://<machine address>:<weblogic port>/console.

  • If you see the Activate Changes and Undo All Changes buttons in the left pane, click on these buttons to clear them. If the buttons are not cleared, click Undo All Changes and run the script again.


Deployment on 64-bit JVMs

This section lists the errors that occur when you try to deploy the JVM Diagnostics Manager on a 64-bit JVM.

Table 14-8 Deployment on 64-bit JVM Errors

Error Message Workaround Steps
JAM Console: Only 32 bit JVM is supported. The shared lib might not be loaded on this platform
>>java.library.path: /scratch/skbalakr/Oracle/Middleware/oms11g/lib
JAM Console: loadNative Exception loading [/tmp/libJamConsole.so.1] /tmp/libJamConsole.so.1: ld.so.1: java: fatal: /tmp/libJamConsole.so.1: wrong ELF class: ELFCLASS32 (Possible cause: architecture word width mismatch)
java.lang.UnsatisfiedLinkError: /tmp/libJamConsole.so.1: ld.so.1: java: fatal: /tmp/libJamConsole.so.1: wrong ELF class: ELFCLASS32 (Possible cause: architecture word width mismatch)
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1778)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1674)
at java.lang.Runtime.load0(Runtime.java:770)
at java.lang.System.load(System.java:1003)
at oracle.sysman.e2e.model.ad4j.remote.Jam.init(Jam.java:597)
at oracle.sysman.e2e.model.ad4j.remote.servlet.AD4JManagerServlet.init(AD4JManagerServlet.java:38)

This error occurs because the JVM Diagnostics Manager can be deployed only on 32-bit JVMs. To resolve this issue, open the <DOMAIN_HOME>/bin/startWebLogic.sh (or startWebLogic.cmd) file and change the value of the JAVA_VM variable from -d64 to -d32.
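The "wrong ELF class: ELFCLASS32" text in the message means that the shared library is a 32-bit binary being loaded by a 64-bit process. If you want to confirm the word width of a library such as /tmp/libJamConsole.so.1 yourself, the ELF header records it in its fifth byte. The following Python sketch is an illustration under that assumption; the function name is hypothetical.

```python
def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if `path` is not a recognizable ELF file."""
    with open(path, "rb") as handle:
        header = handle.read(5)
    # An ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # Byte 4 (EI_CLASS) is 1 for ELFCLASS32 and 2 for ELFCLASS64.
    return {1: 32, 2: 64}.get(header[4])
```

A result of 32 for the library combined with a JVM started with -d64 matches the mismatch reported in the error.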


JVM Diagnostics Manager Automated Deployment

This section lists the errors that occur during the automated deployment process.

Table 14-9 JVM Diagnostics Manager Automated Deployment Errors

Error Message Workaround Steps
Failed to deploy the application with status failed.
Current Status of your Deployment:
Deployment command type: deploy
Deployment State: failed
Deployment Message: weblogic.management.ManagementException: [Deployer:149007] New source location, /scratch/skbalakr/Oracle/Middleware/oms11g/ad4j/jammanager_dummy.ear, cannot be deployed to configured application, jammanager. The application source is at /scratch/skbalakr/Oracle/Middleware/oms11g/ad4j/jammanager.ear. Changing the source location is not allowed for a previously attempted deployment. Try deploying without specifying the source.
No stack trace available.
This Exception occurred at Tue May 11 06:40:50 UTC 2010.
weblogic.management.scripting.ScriptException: Error occured while performing deploy : Deployment Failed.

Automated deployment fails when you try to change the source location for a previously attempted deployment.

JAM Console: Only 32 bit JVM is supported. The shared lib might not be loaded on this platform
java.library.path: /scratch/skbalakr/Oracle/Middleware/oms11g/lib
JAM Console: loadNative Exception loading [/tmp/libJamConsole.so.1] /tmp/libJamConsole.so.1: ld.so.1: java: fatal: /tmp/libJamConsole.so.1: wrong ELF class: ELFCLASS32 (Possible cause: architecture word width mismatch)
java.lang.UnsatisfiedLinkError: /tmp/libJamConsole.so.1: ld.so.1: java: fatal: /tmp/libJamConsole.so.1: wrong ELF class: ELFCLASS32 (Possible cause: architecture word width mismatch)
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1778)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1674)
at java.lang.Runtime.load0(Runtime.java:770)

This error occurs if you try to run the deployment script on a 64-bit JVM.

To resolve this issue, open the <DOMAIN_HOME>/bin/startWebLogic.sh (or startWebLogic.cmd) file and change the value of the JAVA_VM variable from -d64 to -d32.


LoadHeap Errors

This section lists loadheap errors.

Table 14-10 LoadHeap Errors

Error Message Workaround Steps
glibc detected * free(): invalid next size (fast): 0x0965d090"
./loadheap.sh: line 237: 32357 Aborted ./bin/${bindir}/processlog in=$infile hdr=${sumdata} obj=${objdata} rel=${reldata} root=${rootdata} osum=${objsumdata} rrel=${rootrel} heap=${heap_id} skip=$skipgarbage db=$dbtype $*
Error processing file /tmp/heapdump6.txt

Check if the heapdump operation has been successfully completed. Open the heapdump6.txt file and check if there is a heapdump finished string at the end of the file. If you see this string, load the finished dump file.
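Heap dump files can be large, so opening them in an editor to look for the closing string is not always practical. The following Python sketch reads only the end of the file. It assumes the marker appears literally as "heapdump finished"; the exact text and case in your dump file may differ, so treat the default marker as a placeholder, and note that the function name is hypothetical.

```python
def heapdump_is_complete(path, marker="heapdump finished", tail_bytes=4096):
    """Return True if `marker` appears in the last `tail_bytes` bytes of the file."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)                       # jump to the end to learn the file size
        size = handle.tell()
        handle.seek(max(0, size - tail_bytes))  # rewind by at most tail_bytes
        tail = handle.read().decode("utf-8", errors="replace")
    return marker in tail
```

For example, heapdump_is_complete("/tmp/heapdump6.txt") returning False suggests that the dump was interrupted and should not be loaded.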

Heapdump already in progress, cannot take another
heapdump

Check if the heapdump operation has been successfully completed. Open the heapdump6.txt file and check if there is a heapdump finished string at the end of the file. If you see this string, restart the jamagent and run the heapdump operation again.

loadheap.sh created unusable unique indexes.

Run the loadheap/sql/cleanup.sql shipped with loadheap.zip to fix the unique indexes.


Errors on JVM Diagnostics UI Pages

This section lists the user interface errors.

Table 14-11 JVM Diagnostics UI Page Errors

Error Message Workaround Steps
JAM Console: Socket timed out after recv -- client adc2100083.us.oracle.com:7001 is not Active [0] secs
JAM Console jamlooptimeout = [3]
JAM CONSOLE: JVM 1 is not active
JAM Cons Err Processing Request: 128 JVM 1 is not active
jamDAL: jamreq returned 128
return status < 0 from jamDalInst.processRequest

To resolve this error, increase the Agent Request Timeout (secs) and Agent Loop Request Timeout (secs) values.

This page cannot be displayed at the current JVM
optimization level.

When the optimization level is set to 0, the Show Heap (Memory) operation is very time consuming and is not supported.

Method locals on Real Time Thread page not
displayed.

Method locals are not supported if the optimization level is set to 0.

Agent is up and running but is not displayed in
the real time pages.

If the log file shows JAMMANAGER: OLD AGENT or NULL POOl or wrong jamoptimization level, this indicates that the old jamagent/Dbagent is being used.

To resolve this issue, download the latest jamagent from Setup > Download page.

You do not have the necessary privileges to view
this page

Ensure that you have the required JVM Diagnostics Administrator or User privileges to view the JVM Diagnostics data.


Frequently Asked Questions

This section lists some of the questions you may have while using JVM Diagnostics. It includes the following:

  • Location of the JVM Diagnostics Logs

  • JVM Diagnostics Manager Status

  • JVM Diagnostics Agent Status

  • Monitoring Status

  • Running the create_jvm_diagnostic_db_user.sh Script

  • Usage of the Try Changing Threads Parameter

  • Significance of Optimization Levels

  • Manually Deploying the JVM Diagnostics Agent

  • Log Manager Level

  • Repository Space Requirements

Location of the JVM Diagnostics Logs

You can find the JVM Diagnostics logs in the following locations:

  • The JVM Diagnostics Manager Log file is located at $T_WORK/gc_inst/user_projects/domains/GCDomain/servers/EMAD4JMANAGER/logs/EMAD4JMANAGER.out

  • UI related errors are logged in:

    • $T_WORK/gc_inst/user_projects/domains/GCDomain/servers/EMGC_OMS1/logs/EMGC_OMS1.out

    • $T_WORK/gc_inst/user_projects/domains/GCDomain/servers/EMGC_OMS1/logs/EMGC_OMS1.log

  • Communication errors between the JVM Diagnostics Manager and the Console are logged in $T_WORK/gc_inst/em/EMGC_OMS1/sysman/log/emoms.log

JVM Diagnostics Manager Status

To check the status of the JVM Diagnostics Manager:

  • Click Middleware > JVM Diagnostics > Setup. In the Console Setup page, check if the status of the JVM Diagnostics Manager is Up or Down.

  • Check the status of all Managers at Middleware > JVM Diagnostics > Setup > JVMs & Managers page.

  • Check the JVM Diagnostics Agent log file to verify the connection between Agent and the Manager. If you see an error - JAM Agent ERROR: Cannot connect to Console:Connection refused, this indicates that the JVM Diagnostics Manager is not running.

  • Check if the message JAM Console: Agent connection from:[Hostname] is present in the JVM Diagnostics Manager log file. If this message appears, it indicates that the JVM Diagnostics Manager is running and is connected to the Agent.
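The two log checks above can be sketched as a small scan over log lines. This Python example is illustrative only: the function name and the returned labels are hypothetical, and the matched strings are the two messages quoted in the list.

```python
def classify_connection(log_lines):
    """Classify Manager/Agent connectivity from JVM Diagnostics log lines.

    Returns "manager_down", "connected", or "unknown", based on the most
    recent matching message in `log_lines`.
    """
    refused = "Cannot connect to Console:Connection refused"
    connected = "JAM Console: Agent connection from:"
    status = "unknown"
    for line in log_lines:
        if refused in line:
            status = "manager_down"
        elif connected in line:
            status = "connected"
    return status
```

You could pass it the lines of the Agent log or of the Manager log, for example with open(path) as f: classify_connection(f).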

JVM Diagnostics Agent Status

To check the status of the JVM Diagnostics Agent:

  • Click Middleware > JVM Diagnostics > Threads > Real Time Analysis to view the pool to which the JVM belongs. Check the JVM Status in the Connected JVMs table.

    • If the status is Not Active, this indicates that the Agent is not connected to the Manager. Check the agent logs to verify that it is running and that the IP address and port number of the Manager are correct.

    • If the status is No AD4J Agent Deployed, the JVM Diagnostics Agent must be deployed on that JVM.

  • Navigate to the Threads > Real Time Analysis page for the JVM. If the JVM Diagnostics Agent is running, the active threads data must be visible. If the JVM Diagnostics Agent is not running, you will see the message: JVM is inactive, Please try again after some time.

Monitoring Status

To verify if the JVM Diagnostics Manager is monitoring the data:

  1. Click Middleware > JVM Diagnostics > Setup. In the Console Setup page, verify that the Enable Monitoring checkbox is checked.

  2. Navigate to the Monitoring page under Setup and check if monitoring status is On for the Pool to which the JVM being monitored belongs.

  3. Navigate to the JVM Pools page under Setup and verify if the Poll Enabled checkbox has been checked for the Pool to which the JVM being monitored belongs. Monitoring should now be enabled.

Running the create_jvm_diagnostic_db_user.sh Script

You can run the create_jvm_diagnostic_db_user.sh script if you want to create less privileged users who can only load heaps using the loadHeap script.

Usage of the Try Changing Threads Parameter

This parameter should be used only when the JVM is highly active.

Significance of Optimization Levels

The JVM Diagnostics Agent supports three optimization levels:

  • Level 0 indicates that the JVM Diagnostics Agent is using a JVMTI based engine. This level is supported for JDK 6 series on almost all supported platforms.

  • Level 1 is a hybrid between level 0 and level 2. It is supported only for very few JDKs on selected platforms.

  • Level 2 uses a runtime object analysis technique for monitoring. This implementation is usually complex and time consuming and is supported only for selected JDKs and platforms.

Manually Deploying the JVM Diagnostics Agent

To deploy the JVM Diagnostics Agent manually, follow these steps:

  1. Create a JVM Pool with the same name as that of the WebLogic Server Domain target, but replace each '/' (slash) with '__' (two underscores). For example, if the WebLogic domain name is /sample_EMGC_DOMAIN/EMGC_DOMAIN, the JVM Pool should be __sample_EMGC_DOMAIN__EMGC_DOMAIN.

  2. Check the Poll Enabled checkbox to start monitoring this JVM Pool.

  3. Download the jamagent.war from the JVM Diagnostics > Setup > Download page.

  4. In the WEB-INF/web.xml file of the jamagent.war, update the value of the following properties and create the jamagent.war again.

    • oracle.ad4j.groupidprop – The name of WebLogic Server Domain target on which the Agent is to be deployed. This is shown on JVM Diagnostics Deployment page under the Target Name column.

    • oracle.ad4j.jvmidprop - The type of WebLogic Server Domain target on which the Agent is to be deployed. This is shown on JVM Diagnostics Deployment page under the Target Type column with the value weblogic_j2eeserver.

    • jampool – The name of the domain on which the Agent is being deployed, with each "/" (forward slash) replaced by "__" (double underscore).

  5. Deploy the rebundled jamagent.war on the WebLogic Server Domain manually.

  6. Navigate to the Middleware page and select the WebLogic Server Domain target. The WebLogic Server Domain Home page is displayed. Select the JVM Diagnostics option from the menu.
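The naming rule used in step 1 and for the jampool property is a plain character substitution. As a minimal Python sketch (the function name is hypothetical), it can be written as:

```python
def jvm_pool_name(domain_target_name):
    """Replace each '/' in a WebLogic domain target name with '__' (two underscores)."""
    return domain_target_name.replace("/", "__")


# Domain name taken from the example in step 1.
print(jvm_pool_name("/sample_EMGC_DOMAIN/EMGC_DOMAIN"))
# prints: __sample_EMGC_DOMAIN__EMGC_DOMAIN
```

Deriving the name this way avoids typing errors, since the pool name must match the converted domain name exactly.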

Log Manager Level

The default log manager level is 3. Increase it to a higher level only if you are encountering issues.

Repository Space Requirements

For monitoring data, Oracle recommends 100 MB per JVM per day with the default setting of a 24 hour purge interval. This amount can vary with runtime factors in your environment, such as the depth of call stacks. Check the tablespace growth periodically and adjust the allocated space if required, so that database growth due to standard monitoring occurs smoothly without sudden spikes. Tablespace sizing can be affected by the following:

  • Heap Dumps: Analyzing heaps requires a large amount of tablespace. As a standard practice, Oracle recommends tablespace of 5 times the size of the heap dump file being loaded. Since you know the size of your dump file, make sure that there is adequate space to accommodate the dump file before it is loaded into the database.

  • Thread Traces: While these are smaller than heaps, they are loaded into the database automatically when a user initiates a trace at the console. The size of these traces can vary dramatically depending on the number of active threads during the trace, the duration of the trace, and the sample interval of the trace. A trace should usually be under 100 MB, but if several thread traces have been initiated, they could fill up the database quickly. Before initiating the traces, you must ensure that there is adequate space in the database.
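The two sizing rules of thumb above reduce to simple arithmetic. The following Python sketch is only a rough estimator under those stated figures; the function names and the retention_days parameter are hypothetical, and actual growth depends on your environment.

```python
MB_PER_JVM_PER_DAY = 100   # monitoring figure quoted above for default settings
HEAP_DUMP_MULTIPLIER = 5   # tablespace needed relative to the heap dump file size


def monitoring_space_mb(jvm_count, retention_days=1):
    """Estimate monitoring tablespace in MB (default: 24 hour purge interval)."""
    return jvm_count * retention_days * MB_PER_JVM_PER_DAY


def heap_dump_space_mb(dump_file_mb):
    """Estimate tablespace in MB needed to load a heap dump of the given size."""
    return dump_file_mb * HEAP_DUMP_MULTIPLIER
```

For example, 10 monitored JVMs with the default purge interval give monitoring_space_mb(10) = 1000 MB, and loading a 2048 MB heap dump gives heap_dump_space_mb(2048) = 10240 MB.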