19803 - Communication Agent Stack Event Queue Utilization

Alarm Group:
CAF
Description:
The percent utilization of the Communication Agent Task stack queue is approaching its defined threshold capacity. If this problem persists and queue utilization rises above the defined threshold, new StackEvents (Query/Response/Relay) messages for the Task can be discarded based on the StackEvent priority and the Application's Global Congestion Threshold Enforcement Mode.
Severity:
Minor, Major, Critical
Instance:
<ComAgent StackTask Name>
HA Score:
Normal
Auto Clear Seconds:
0 (zero)
OID:
cAFQueueUtilNotify
Cause:

This alarm is raised when the ComAgentQueueUtil KPI exceeds the thresholds defined in the SysMetricThreshold table.

  • MINOR: ComAgentQueueUtil|CAF|-*|Current|19803|60|50|3000
  • MAJOR: ComAgentQueueUtil|CAF|**|Current|19803|80|70|3000
  • CRITICAL: ComAgentQueueUtil|CAF|*C|Current|19803|95|90|3000
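As a rough illustration of how these threshold rows could be read, the sketch below parses the three rows and applies them to a utilization sample. The field meanings are an assumption, not something this entry states: the sixth field is treated as the percentage at which a severity is raised and the seventh as the percentage at which it clears. The names `parse_row` and `next_severity` are hypothetical helpers, not part of any product API.

```python
from dataclasses import dataclass

# Rows copied from the threshold list above. Field meanings are ASSUMED:
# fields[5] = raise (onset) percent, fields[6] = clear (abatement) percent.
ROWS = {
    "MINOR": "ComAgentQueueUtil|CAF|-*|Current|19803|60|50|3000",
    "MAJOR": "ComAgentQueueUtil|CAF|**|Current|19803|80|70|3000",
    "CRITICAL": "ComAgentQueueUtil|CAF|*C|Current|19803|95|90|3000",
}
ORDER = ["MINOR", "MAJOR", "CRITICAL"]  # lowest to highest severity


@dataclass(frozen=True)
class Threshold:
    severity: str
    onset: int  # assumed: alarm raises when utilization exceeds this
    abate: int  # assumed: alarm clears when utilization is at or below this


def parse_row(severity: str, row: str) -> Threshold:
    """Split one pipe-delimited row and keep the two assumed percentages."""
    fields = row.split("|")
    return Threshold(severity, int(fields[5]), int(fields[6]))


THRESHOLDS = [parse_row(name, ROWS[name]) for name in ORDER]


def next_severity(current, util):
    """Return the severity after a new utilization sample, or None if clear.

    A level is active if the sample exceeds its onset value, or if that
    level was already asserted and the sample is still above its abate value.
    The highest active level wins.
    """
    result = None
    for t in THRESHOLDS:
        held = current is not None and ORDER.index(current) >= ORDER.index(t.severity)
        if util > t.onset or (held and util > t.abate):
            result = t.severity
    return result


if __name__ == "__main__":
    severity = None
    for sample in (55, 65, 85, 96, 92, 75, 55, 50):
        severity = next_severity(severity, sample)
        print(sample, severity)
```

Under these assumptions, a queue that rises to 65% raises the MINOR alarm, and the alarm stays asserted until utilization falls back to 50% or below, which avoids the alarm flapping around a single value.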
Diagnostic Information:

The percent utilization of the Communication Agent Task's queue is approaching its defined capacity. If this problem persists and queue utilization rises above the defined threshold, new StackEvents (Query/Response/Relay) messages for the Task can be discarded, based on the StackEvent priority and the Application's Global Congestion Threshold Enforcement Mode.

This alarm should not normally occur when no other congestion alarms are asserted. The alarm may occur for a variety of reasons:
  • An IP network or Adjacent node problem may exist that prevents messages from being transmitted into the network at the same pace that messages are being received from the network.
  • The Task thread may be experiencing a problem preventing it from processing events from its event queue.
  • Misconfiguration of Adjacent Node IP routing may result in too much traffic being distributed to the MP.
  • There may be an insufficient number of MPs configured to handle the network traffic load.

Recovery:

  1. Navigate to Main Menu, and then Alarms & Events to examine the alarm log.

    An IP network or Adjacent node problem may exist that prevents messages from being transmitted into the network at the same pace that messages are being received from the network. The Task thread may be experiencing a problem preventing it from processing events from its event queue. It is recommended to contact My Oracle Support for assistance.

  2. Navigate to Status & Manage, and then KPIs to monitor the ingress traffic rate of each MP.

    Each MP in the server site should be receiving approximately the same number of ingress transactions per second.

    It is recommended to contact My Oracle Support for assistance.

  3. If the MP ingress rate is approximately the same, there may be an insufficient number of MPs configured to handle the network traffic load.

    If all MPs are in a congestion state, then the offered load to the server site is exceeding its capacity.

    It is recommended to contact My Oracle Support for assistance.
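The comparison in steps 2 and 3 can be sketched as a simple check over per-MP ingress rates read from the KPI screen. This is only an illustration: the MP names, the sample rates, the function `imbalanced_mps`, and the 20% tolerance used to stand in for "approximately the same" are all hypothetical and are not defined by this entry.

```python
from statistics import median


def imbalanced_mps(rates, tolerance=0.2):
    """Return the names of MPs whose ingress rate deviates from the median.

    rates     -- mapping of MP name to ingress transactions per second
    tolerance -- ASSUMED fraction of the median treated as "approximately
                 the same"; 0.2 is an arbitrary illustrative value
    """
    mid = median(rates.values())
    if mid == 0:
        return []  # no traffic, so nothing to compare
    return sorted(
        name for name, rate in rates.items()
        if abs(rate - mid) / mid > tolerance
    )


if __name__ == "__main__":
    # Hypothetical readings copied by hand from the KPI display.
    sample = {"MP1": 1000, "MP2": 1050, "MP3": 2000}
    print(imbalanced_mps(sample))
```

An empty result corresponds to the situation in step 3, where the ingress rates are balanced and the remaining explanation is insufficient MP capacity. A non-empty result points toward the routing misconfiguration described in the diagnostic information.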