3.9.100 22224 - Average Hold Time Limit Exceeded
- Alarm Group:
- DIAM
- Description:
-
The average transaction hold time has exceeded its configured limits.
This alarm is generated when KPI #10098 (TmAvgRspTime) exceeds Peer CNDRA-wide engineering attributes associated with average hold time, defined in the DraWorker profile assigned to the DraWorker server. KPI #10098 is defined as the average time (in milliseconds) from when the routing layer (DRL) receives a request message from a downstream peer to the time that an answer response is sent to that downstream peer. The source measurement of KPI #10098 is the TmResponseTimeDownstreamMp (10093) measurement.
This alarm indicates that the average response time (TmAvgRspTime) for messages forwarded by the Relay Agent is larger than the limit defined for the deployment by the DraWorker profile assignment. One of these problems could exist:
- The IP network may be experiencing problems that are adding propagation delays to the forwarded request message and the answer response.
- Verify that IP network connectivity exists between the MP server and the adjacent nodes.
- View the event history logs for additional events or alarms from this MP server.
- One or more upstream nodes may be experiencing traffic overload.
- One or more MPs may be experiencing traffic overload.
- View the KPI Routing Recv Msgs/Sec.
- View the CPU utilization of MPs.
- Severity:
- Minor, Major, Critical
- Instance:
- N/A
- HA Score:
- Normal
- Auto Clear Seconds:
- 0 (zero)
- OID:
- eagleXgDiameterAvgHoldTimeLimitExceededNotify
- Cause:
-
Alarm 22224 is generated when KPI #10098 (TmAvgRspTime) exceeds Peer CNDRA-wide engineering attributes associated with average hold time, defined in the DraWorker profile assigned to the DraWorker server. KPI #10098 is defined as the average time (in milliseconds) from when the routing layer (DRL) receives a request message from a downstream peer to the time that an answer response is sent to that downstream peer. The source measurement of KPI #10098 is the TmResponseTimeDownstreamMp (10093) measurement.
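The KPI definition above amounts to a simple mean over completed transactions. A minimal sketch, assuming each transaction is recorded as a pair of millisecond timestamps (request received from the downstream peer, answer sent back to it). The function name and the input shape are hypothetical and are not part of the product:

```python
def average_hold_time_ms(transactions):
    """Mean time in ms between receiving a request and sending its answer.

    `transactions` is an iterable of (request_received_ms, answer_sent_ms)
    pairs. This input shape is an assumption made for illustration.
    """
    durations = [sent - received for received, sent in transactions]
    if not durations:
        return 0.0  # no completed transactions in the interval
    return sum(durations) / len(durations)
```

For example, two transactions that took 100 ms and 200 ms give an average hold time of 150 ms.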
The alarm thresholds are configurable for:
- Average hold time minor alarm onset threshold
- Average hold time minor alarm abatement threshold
- Average hold time major alarm onset threshold
- Average hold time major alarm abatement threshold
- Average hold time critical alarm onset threshold
- Average hold time critical alarm abatement threshold
The severity of the alarm (Minor, Major, or Critical) is determined by the onset and abatement thresholds of each severity level. When the average hold time first exceeds an alarm onset threshold, a minor, major, or critical alarm is raised. When the average hold time subsequently exceeds a higher onset threshold, or drops below an abatement threshold but remains above the minor alarm abatement threshold, the alarm severity changes to match the highest onset threshold crossed by the current average hold time.
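The onset and abatement behavior described above is a form of hysteresis. The following is a minimal sketch of that logic, not the product's implementation. The names `Threshold` and `next_severity` are hypothetical, as are the threshold values in the usage note. One assumption is made where the text is silent: if the value falls below the current level's abatement threshold and is above the minor abatement threshold but below the minor onset threshold, the sketch keeps the alarm at minor.

```python
from dataclasses import dataclass

SEVERITIES = ["minor", "major", "critical"]  # ascending order


@dataclass(frozen=True)
class Threshold:
    onset_ms: float      # alarm is raised when the average exceeds this
    abatement_ms: float  # alarm abates when the average drops below this


def next_severity(current, avg_hold_ms, thresholds):
    """Return the new severity (None means no alarm).

    `current` is None or one of SEVERITIES; `thresholds` maps each
    severity name to a Threshold. Illustrative sketch only.
    """
    # Highest severity whose onset threshold the current value exceeds.
    by_onset = None
    for name in SEVERITIES:
        if avg_hold_ms > thresholds[name].onset_ms:
            by_onset = name

    if current is None:
        return by_onset  # raise (or stay clear)

    # Escalate when a higher onset threshold has been crossed.
    if by_onset is not None and SEVERITIES.index(by_onset) > SEVERITIES.index(current):
        return by_onset

    # Hold the current severity until its abatement threshold is undercut.
    if avg_hold_ms >= thresholds[current].abatement_ms:
        return current

    # Below the minor abatement threshold: the alarm clears.
    if avg_hold_ms < thresholds["minor"].abatement_ms:
        return None

    # Otherwise fall back to the highest onset still exceeded
    # (assumed to be at least minor, see the note above).
    return by_onset or "minor"
```

With hypothetical thresholds of 100/80 ms (minor), 200/180 ms (major), and 300/280 ms (critical) for onset/abatement, a major alarm holds at an average of 190 ms because that is still above the major abatement threshold, and it drops to minor at 150 ms.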
- Diagnostic Information:
-
If Alarm #22224 is raised, it indicates that the average response time (TmAvgRspTime) for messages forwarded by the Relay Agent is larger than the limit defined for the deployment by the DraWorker profile assignment. One of the following problems could exist:
- The IP network may be experiencing problems that are adding propagation delays to the forwarded request message and the answer response.
- Verify that IP network connectivity exists between the MP server and the adjacent nodes.
- View the event history logs for additional events or alarms from this MP server.
- One or more upstream nodes may be experiencing traffic overload.
- One or more MPs may be experiencing traffic overload.
- View the KPI Routing Recv Msgs/Sec.
- View the CPU utilization of MPs.
Recovery: