7 Troubleshooting Monitoring Deployments

A monitored instance in Oracle Enterprise Manager Cloud Control can fail to return information properly for a variety of reasons. These issues are typically caused by misconfiguration, incorrect permissions, or a configuration change within one of the products.

Table 7-1 shows OKM Cluster problems and possible solutions.

Table 7-1 Appliance Problems and Solutions

Problem | Solution

OKM Cluster does not appear to be Up after being added as an instance to be monitored.

An error could have been made when setting up the instance for monitoring, or a change in configuration on the KMA could be affecting communication. Check the following:

  • Is the designated DNS name or IP address correct in the asset's Monitoring Configuration?

  • Is the named asset accessible from the Management Agent system?
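The two checks above can be run from the Management Agent system with a short script. The following is a minimal sketch, not part of the plugin: the function name check_asset is hypothetical, and the TCP port is an assumption that you must supply, because this chapter does not state which port the KMA listens on.

```python
import socket


def check_asset(host, port, timeout=5.0):
    """Return (resolved_addresses, reachable) for a host and TCP port.

    host is the DNS name or IP address from the Monitoring Configuration.
    port is an assumption supplied by the caller; it is not given in this guide.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name does not resolve from this system.
        return [], False
    addresses = sorted({info[4][0] for info in infos})
    try:
        # Attempt a plain TCP connection and close it immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return addresses, True
    except OSError:
        # Resolved, but refused, unreachable, or timed out.
        return addresses, False
```

An empty address list suggests a problem with the designated DNS name. A resolved address with reachable set to False suggests a network or firewall problem between the Management Agent system and the asset. A successful TCP connection does not confirm that the monitoring credentials are accepted.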

If both settings are correct and the target still goes down, especially intermittently, the plugin timeout settings may need to be increased. To change the timeouts for the plugin, follow these steps:

  1. Log into the Management Agent system and navigate to the Management Agent installation directory.

  2. Navigate to the agent_inst subdirectory.

  3. Navigate to the subdirectory of the named asset.

  4. Modify the values for connectTimeout and/or transTimeout in the profile.cfg file located in this directory. No restart of the Management Agent is necessary; the changes take effect at the next polling cycle.
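Before editing, it can help to confirm where the two settings appear in the file. The following is a minimal sketch under stated assumptions: the function name find_timeout_lines is hypothetical, the path is built from the steps above, and because this guide does not describe the syntax of profile.cfg, the sketch only searches for the setting names and does not parse or change any values.

```python
from pathlib import Path

# Setting names taken from step 4 above.
TIMEOUT_NAMES = ("connectTimeout", "transTimeout")


def find_timeout_lines(agent_install_dir, asset_subdir):
    """Return (line_number, text) pairs from profile.cfg that mention a timeout name.

    agent_install_dir: the Management Agent installation directory (step 1).
    asset_subdir: the subdirectory of the named asset under agent_inst (step 3).
    """
    cfg = Path(agent_install_dir) / "agent_inst" / asset_subdir / "profile.cfg"
    matches = []
    with cfg.open(encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if any(name in line for name in TIMEOUT_NAMES):
                matches.append((number, line.rstrip("\n")))
    return matches
```

The returned line numbers show where to make the change in a text editor. Keep a copy of the original file before modifying it.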

OKM Cluster is Up, but some items are missing from Summary and/or Performance pages.

Metrics must be uploaded before they can be displayed in the Summary or Performance pages. An upload can take up to an entire collection cycle (up to 1 day for some metrics). If the KMA is not responding consistently, metric collection can be affected; see the previous Solution for setting the timeouts for the plugin. Metrics and information about system values are collected at varying intervals. If you need real-time information, refer to the Oracle Key Manager Management GUI.

A Metric Detail is not showing a change made on the system.

Metrics and information about system values are collected at varying intervals. In the worst case, a metric could be as much as 24 hours out of sync with a KMA. If you need real-time information, refer to the Oracle Key Manager Management GUI.

A certificate exported for the OKM Operator is not being accepted.

If this certificate was exported using a 3.0.x Solaris Oracle Key Manager Management GUI, the exported certificate file has a size of 0. Export the certificate again using an older GUI or a Windows Oracle Key Manager Management GUI.
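A quick way to recognize this condition is to check the size of the exported file. The following is a minimal sketch; the function name is_empty_export is hypothetical, and the path to the certificate file is whatever location you chose during export.

```python
import os


def is_empty_export(cert_path):
    """Return True if the exported certificate file exists but has a size of 0.

    Raises OSError (for example FileNotFoundError) if the file cannot be examined.
    """
    return os.path.getsize(cert_path) == 0
```

A result of True matches the symptom described above. A result of False only rules out this particular cause; it does not confirm that the certificate content is valid.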


Metric Collection Errors

Some common configuration problems manifest as Metric Collection Errors. In some cases, these error messages contain an "Internal Error" followed by a longer Java Exception. For complex problems, the Support Center uses the Java Exception. Table 7-2 lists common Metric Collection problems.

Table 7-2 Common Metric Collection Errors

Metric Collection Error | Problem/Solution

No java security providers configured on Management Agent: TLS_RSA_WITH_AES_256_CBC_SHA

Problem: The plugin cannot use AES256, which is required to communicate with OKM.

Possible causes: The Java runtime used by the Enterprise Manager Management Agent has not been configured to use AES256.

Solution: Follow the instructions in "Enabling Java Unlimited Cryptographic Strengths".

1000 Access denied

Problem: The credentials being used to monitor the system are being rejected by the KMA.

Possible causes:

  • The OKM User may not have an Operator role associated with it, or may be Disabled.

  • The certificate files associated with the Monitoring Credentials may be missing or inaccessible to the Management Agent system, or may be incorrect or corrupt.

Solution:

  1. Verify the User is configured correctly on the KMA.

  2. Follow the instructions in "Configuring the OKM Appliance".
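The "missing or inaccessible" cause can be checked directly on the Management Agent system. The following is a minimal sketch; the function name check_credential_files is hypothetical, and the list of paths is whatever certificate files you specified in the Monitoring Credentials. Run it as the operating system user that runs the Management Agent, because readability is evaluated for the current user.

```python
import os


def check_credential_files(paths):
    """Map each path to 'missing', 'unreadable', or 'ok' for the current user."""
    report = {}
    for path in paths:
        if not os.path.isfile(path):
            report[path] = "missing"
        elif not os.access(path, os.R_OK):
            report[path] = "unreadable"
        else:
            report[path] = "ok"
    return report
```

This sketch cannot detect an incorrect or corrupt certificate. If every file is reported as ok and the error persists, continue with the Solution steps above.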


OKM KMA Audit Logs

For detailed debug information, use the audit log on the target OKM KMA.

Note:

Determining why an OKM Agent is not properly collecting statistics from an OKM KMA can be more difficult and should be done only by service personnel. Contact the Oracle Support Center (see Appendix A, "Additional Resources").

Host Agent Logs

Most problems (other than version mismatches) occur during the collection of metrics or response information. Look for the following logs on the Enterprise Management Agent host. The Enterprise Management Agent host may differ from the OEM Management Service location. In the following file locations, %AGENT_HOME% indicates the home directory of the Agent, typically similar to /export/home/oracle/OracleHomes/agent10g. An asterisk (*) indicates there are several files with additional extensions. Table 7-3 lists host Agent log locations.

Table 7-3 Host Agent Logs

Agent Log Location | Description

%AGENT_HOME%/<targetName>/okmclient*.log

Contains errors that occurred below the Oracle Enterprise Manager Cloud Control framework during attempts to communicate with the OKM cluster. These files will typically contain fine-grained details on connection problems with an OKM KMA or failures that occur while retrieving information.

%AGENT_HOME%/sysman/log/emagent.trc*

These logs also contain connection exceptions and any information on using the data returned from the system to populate database tables.

%AGENT_HOME%/sysman/log/*

The remaining logs contain finer-grained details on various elements in the Cloud Control framework. About 90% of issues can be diagnosed with the logs listed above.


Note:

When interacting with the Support Center or developers within your own help desk, include the emagent.trc* files for reference.
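To gather the files for a support request, the patterns in Table 7-3 can be expanded with a short script. The following is a minimal sketch; the function name find_agent_logs is hypothetical, agent_home stands for the directory that %AGENT_HOME% refers to, and target_name stands for the <targetName> directory of the monitored asset.

```python
import glob
import os


def find_agent_logs(agent_home, target_name):
    """Return log paths matching the okmclient*.log and emagent.trc* patterns."""
    # Escape the directory parts so only the asterisks act as wildcards.
    home = glob.escape(agent_home)
    patterns = [
        os.path.join(home, glob.escape(target_name), "okmclient*.log"),
        os.path.join(home, "sysman", "log", "emagent.trc*"),
    ]
    found = []
    for pattern in patterns:
        # Sort within each pattern so the output order is stable.
        found.extend(sorted(glob.glob(pattern)))
    return found
```

The returned list can be passed to any archiving tool. The sketch deliberately leaves out the remaining files under sysman/log, which are needed less often.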