4 Metrics, Alerts and KPIs
This section describes the metrics, alerts, and KPIs applicable to the InterWorking and Mediation Function.
IWF Metrics
The following are IWF Metrics:
Table 4-1 New Mediation Metrics for 1.5 Release
SL.No | Prometheus state Metric Name | Metric Description | Dimensions | Example | Metric Type |
---|---|---|---|---|---|
1 | ociwf_med_total_rule_count | Total number of rules configured. Condition: Counted as soon as the mediation service comes up. | | ociwf_med_total_rule_count{app="nf-mediation",nfInstanceId="IWF1",vendor="oracle",kubernetes_namespace="medsvc"} | Gauge |
2 | ociwf_med_active_rule_count | Total number of rules invoked for a particular message. Condition: When rules are executed. | | ociwf_med_active_rule_count_total{app="nf-mediation",nfInstanceId="IWF1",vendor="oracle",kubernetes_namespace="medsvc"} | Gauge |
3 | ociwf_med_individual_rule_exec_count_total | Total number of times a particular rule is invoked. Condition: When rules are executed. | | sum(ociwf_med_individual_rule_exec_count_total{app="nf-mediation"}) by (ruleName,ruleGroupName) | Counter |
4 | ociwf_med_http_req_total | Total number of ingress messages to the NF. Condition: Whenever a message lands on nf-mediation. | | ociwf_med_req_total{msgType="consumerRequest",app="nf-mediation",ruleGroupName="scp-agenda"} | Counter |
5 | ociwf_med_http_rsp_total | Total number of egress messages. Condition: When the NF sends a response. | | ociwf_med_rsp_total{msgType="consumerRequest",ruleGroupName="scp-agenda",statusCode=~"2.*"} | Counter |
6 | ociwf_med_test_req_total | Total number of incoming messages to the NF in test mode. Condition: Whenever test mode is enabled. | | ociwf_med_test_req_total{msgType="consumerRequest",app="nf-mediation-test",ruleGroupName="scp-agenda"} | Counter |
7 | ociwf_med_test_rsp_total | Total number of responses in test mode. Condition: When a response is sent by the NF. Note: Test mode does not send a response; here the count indicates that test mode executed the message properly. | | ociwf_med_test_rsp_total{msgType="consumerRequest"} | Counter |
8 | ociwf_med_msg_forwarded_to_test_mode_total | Total number of requests forwarded to test mode. | | sum(ociwf_med_msg_forwarded_to_test_mode_total) | Counter |
9 | ociwf_med_rule_update_status | Indicates whether rule reloading failed or succeeded. Value = 0: failed. Value = 1: successful. | | ociwf_med_rule_update_status{app="nf-mediation"} | Gauge |
10 | ociwf_med_individual_rule_processing_time | Processing time of each rule invoked. | | | Histogram |
11 | ociwf_med_msg_processing_time | Processing time of a message that lands on mediation. | | | Histogram |
12 | ociwf_med_forward_to_ext_total | Total number of messages forwarded to external. Condition: When mediation works in proxy mode. | | | Counter |
13 | ociwf_med_incoming_rsp_from_ext_total | Total number of responses from external. Condition: When mediation works in proxy mode. | | | Counter |
14 | ociwf_med_incoming_d2h_req_total | Total number of requests from the D2H service as part of protocol translation mode. | | | Counter |
15 | ociwf_med_outgoing_rsp_to_d2h_total | Total number of responses to the D2H service as part of protocol translation mode. | | | Counter |
16 | ociwf_med_forward_to_h2d_total | Total number of messages forwarded to H2D as part of protocol translation mode. | | | Counter |
17 | ociwf_med_incoming_rsp_from_h2d_total | Total number of responses received from H2D as part of protocol translation as a service. | | | Counter |
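The Example column above follows the usual Prometheus selector pattern: a metric name, optionally followed by label matchers in braces, optionally wrapped in an aggregation. As a minimal sketch of that pattern, the following Python helpers assemble such query strings. The helper names `selector` and `total_by` are hypothetical and are not part of OCIWF or Prometheus; only the metric and label names are taken from the table.

```python
def selector(metric, **labels):
    """Build a PromQL selector such as name{key="value",...}."""
    if not labels:
        return metric
    # Sort the labels so the generated string is deterministic.
    pairs = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{metric}{{{pairs}}}"


def total_by(metric, by, **labels):
    """Wrap a selector in sum(...) by (...), as in the example for row 3."""
    return f"sum({selector(metric, **labels)}) by ({','.join(by)})"


if __name__ == "__main__":
    print(selector("ociwf_med_rule_update_status", app="nf-mediation"))
    print(total_by("ociwf_med_individual_rule_exec_count_total",
                   ["ruleName", "ruleGroupName"], app="nf-mediation"))
```

The generated strings can be pasted into the Prometheus expression browser; label values such as `nf-mediation` depend on the deployment.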
Table 4-2 IWF Metric Reference
Metric Details | IWF Micro Service | Metric |
---|---|---|
Number of incoming requests to DP from PDRA | DP | pdraIngressCounter |
Number of successfully converted messages by D2H | D2H | iwfd2h_conversionSuccess_messages_total |
Number of successful conversion from http to diameter | H2D | iwfh2d_h2dConversion_Success_total |
Number of successful conversion from diameter to http | H2D | iwfh2d_d2hConversion_Success_total |
Number of successful outgoing responses from PcfDiscovery | PcfDiscovery | pcfDiscSuccessResponseCounter |
Number of successful outgoing responses from BSF | PcfDiscovery | bsfSucessResponseCounter |
Number of responses received from mediation | D2H | iwfd2h_mediation_response_total |
Number of responses from mediation to D2H | Mediation | iwfd2h_mediation_response_total |
Number of Response | All | IWF Egress Request rate |
Number of requests sent to mediation from D2H | D2H | iwfd2h_mediation_outgoing_total |
Number of requests sent to D2H from DP | DP | iwfdiameterproxy1_d2h_outgoing_total |
Number of Request | All | IWF Ingress Request rate |
Number of outgoing responses from H2D to mediation | Mediation | iwfmediation_h2d_response_total |
Number of outgoing responses from mediation | Mediation | iwfmediation_outgoing_response_total |
Number of outgoing requests from mediation | Mediation | iwfmediation_outgoing_request_total |
Number of messages going from PDRA to PCF | DP | pdraPcfEgressCounter |
Number of Incoming responses to PcfDiscovery | PcfDiscovery | pcfDiscRequestCounter |
Number of incoming responses to mediation | Mediation | iwfmediation_incoming_response_total |
Number of incoming responses to BSF | PcfDiscovery | bsfRequestCounter |
Number of incoming requests to mediation | Mediation | iwfmediation_incoming_request_total |
Number of incoming requests from mediation to H2D | Mediation | iwfmediation_h2d_outgoing_total |
Number of incoming requests from D2H to mediation | Mediation | iwfd2h_mediation_incoming_total |
Number of incoming messages to D2H from diameter Proxy Service | D2H | iwfd2h_incoming_messages_total |
Number of failures while sending message to mediation from D2H | D2H | iwfd2h_mediation_error_total |
Number of failed outgoing responses from PcfDiscovery | PcfDiscovery | pcfDiscFailureResponseCounter |
Number of failed outgoing responses from BSF | PcfDiscovery | bsfFailureResponseCounter |
Number of failed responses (responses other than 200 OK) from PCF Discovery to DP | DP | failedResponseToPdraCounter |
Number of diameter requests sent to diameter proxy | H2D | iwfh2d_diameter_request_outgoing_total |
Number of Diameter requests sent to Diameter peer | DP | iwfdiameterproxy1_diameter_outgoing_total |
Number of Diameter requests received from H2D | DP | iwfdiameterproxy1_h2d_incoming_total |
Number of Diameter requests received from Diameter peer | DP | iwfdiameterproxy1_diameter_incoming_total |
Number of errors that occurred while sending diameter requests | H2D | iwfh2d_diameter_error_total |
Number of diameter answers received from diameter proxy | H2D | iwfh2d_diameter_response_incoming_total |
Number of 200 OK responses from PCF Discovery to DP | DP | successResponseToPdraCounter |
The table Mediation Metric Reference provides the information about Mediation Metrics.
Table 4-3 Mediation Metric Reference
Metric | Metric Description | Mediation Micro Service |
---|---|---|
Mediation Ingress Request rate | Number of Request | All |
Mediation Egress Request rate | Number of Response | All |
iwfmediation_incoming_request_total | Number of incoming requests to iwf mediation | iwf-mediation |
iwfmediation_outgoing_response_total | Number of outgoing responses from iwf mediation | iwf-mediation |
iwfd2h_mediation_incoming_total | Number of incoming requests from D2H to iwf mediation | iwf-mediation |
iwfd2h_mediation_response_total | Number of responses from iwf mediation to D2H | iwf-mediation |
iwfmediation_h2d_outgoing_total | Number of incoming requests from iwf mediation to H2D | iwf-mediation |
iwfmediation_h2d_response_total | Number of outgoing responses from H2D to iwf mediation | iwf-mediation |
iwfmediation_outgoing_request_total | Number of outgoing requests from iwf mediation | iwf-mediation |
iwfmediation_incoming_response_total | Number of incoming responses to iwf mediation | iwf-mediation |
nfmediation_incoming_request_total | Number of incoming requests to nf mediation | nf-mediation |
nfmediation_outgoing_response_total | Number of outgoing responses from nf mediation | nf-mediation |
Alerts
The following are the alerts of IWF:
IWF Alerts
Table 4-4 IWF Alerts
SLNo | Alert Name | Severity | OID used for SNMP Traps | Metric Applicable | Threshold | Description |
---|---|---|---|---|---|---|
1 | NFMediationIngressTrafficRateAboveMinorThreshold | Info | | rate(ociwf_nf_med_http_req_total{app="nf-mediation"}[2m]) | 70% of max. MPS | Notifies the user when the traffic rate threshold is reached. |
2 | NFMediationIngressTrafficRateAboveMajorThreshold | Warning | | rate(ociwf_nf_med_http_req_total{app="nf-mediation"}[2m]) | 80% of max. MPS | Notifies the user when the traffic rate threshold is reached. |
3 | NFMediationIngressTrafficRateAboveCriticalThreshold | Critical | | rate(ociwf_nf_med_http_req_total{app="nf-mediation"}[2m]) | 95% of max. MPS | Notifies the user when the traffic rate threshold is reached. |
4 | NFMediation Response Failure | Info | | rate(ociwf_nf_med_http_rsp_total{statusCode!="200",app="nf-mediation"}[2m]) | 100 | Notifies the user that there is a failure in the response execution. |
5 | NFMediationTest Response Failure | Info | | rate(ociwf_nf_med_test_rsp_total{statusCode!="200",app="nf-mediation-test"}[2m]) | 100 | Notifies the user that there is a failure in the response execution. |
6 | NFMediationPodMemoryUsage | Warning | | sum(container_memory_usage_bytes{namespace="medsvc",container_name="nf-mediation"}) by (pod_name,namespace) | 70% | Notifies the user that the NFMediation memory usage per pod has reached the threshold value. |
7 | NFMediationPodTestMemoryUsage | Warning | | sum(container_memory_usage_bytes{namespace="medsvc",container_name="nf-mediation-test"}) by (pod_name,namespace) | 70% | Notifies the user that the NFMediation Test memory usage per pod has reached the threshold value. |
8 | NFMediationPodCPUUsage | Warning | | sum(container_cpu_usage_seconds_total{namespace="medsvc",container_name="nf-mediation"}) by (pod_name,namespace) | 70% | Notifies the user that the NFMediation CPU usage per pod has reached the threshold value. |
9 | NFMediationPodTestCPUUsage | Warning | | sum(container_cpu_usage_seconds_total{namespace="medsvc",container_name="nf-mediation-test"}) by (pod_name,namespace) | 70% | Notifies the user that the NFMediation Test CPU usage per pod has reached the threshold value. |
10 | NFMediationRuleUpdateFailure | Critical | | ociwf_med_rule_update_status{app="nf-mediation"} | 0 | Notifies the user that the rule update in the configmap failed. |
11 | NFMediationTestRuleUpdateFailure | Critical | | ociwf_med_rule_update_status{app="nf-mediation-test"} < 1 | 0 | Notifies the user that the rule update in the configmap failed for test mode. |
12 | IWFMediationPodCPUUsage | Warning | | sum(container_cpu_usage_seconds_total{container="iwf-mediation"}) by (pod_name,namespace) | 70% | Notifies the user that the IWFMediation CPU usage per pod has reached the threshold value. |
13 | IWFMediationPodMemoryUsage | Warning | | sum(container_memory_usage_bytes{container="iwf-mediation"}) by (pod_name,namespace) | 70% | Notifies the user that the IWFMediation memory usage per pod has reached the threshold value. |
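To illustrate how the three ingress traffic thresholds in Table 4-4 relate to each other, the following Python sketch maps a measured ingress rate to the highest severity it would trigger. It is only an illustration of the arithmetic. The function name `ingress_severity` and the maximum MPS value used in the example are assumptions, and the comparison is taken to be strictly "above" because of the alert names.

```python
# Thresholds from Table 4-4, expressed as fractions of the maximum MPS.
# Ordered from most to least severe so the first match wins.
THRESHOLDS = [
    (0.95, "Critical"),
    (0.80, "Warning"),
    (0.70, "Info"),
]


def ingress_severity(rate_mps, max_mps):
    """Return the highest severity whose threshold is exceeded, or None."""
    if max_mps <= 0:
        raise ValueError("max_mps must be positive")
    ratio = rate_mps / max_mps
    for fraction, severity in THRESHOLDS:
        if ratio > fraction:
            return severity
    return None


if __name__ == "__main__":
    # 1000 MPS is a hypothetical capacity, not a documented OCIWF limit.
    for rate in (500, 750, 850, 960):
        print(rate, ingress_severity(rate, max_mps=1000))
```

In Prometheus itself each alert rule is evaluated independently, so a rate above the critical threshold also satisfies the minor and major rules; the sketch reports only the most severe one.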
IWF Alert Configuration
Follow the steps below for IWF Alert configuration in Prometheus:
Note:
- The default namespace for OCIWF is ociwf; update it as per the deployment.
- The OCIWF-config-1.5.0.0.0.zip file can be downloaded from OHC. After downloading, unzip the OCIWF-config-1.5.0.0.0.zip package to get the IWFAlertrules-1.5.0.yaml file.
Procedure
- Take a backup of the current Prometheus configuration map:
kubectl get configmaps _NAME_-server -o yaml -n _Namespace_ > /tmp/tempConfig.yaml
- Check for the OCIWF alert file name in the Prometheus configuration map and add it:
sed -i '/etc\/config\/alertsiwf/d' /tmp/tempConfig.yaml
sed -i '/rule_files:/a\ \- /etc/config/alertsiwf' /tmp/tempConfig.yaml
- Update the configuration map with the updated OCIWF alert file name:
kubectl replace configmap _NAME_-server -f /tmp/tempConfig.yaml
- Add the OCIWF alert rules to the configuration map under the OCIWF alert file name:
kubectl patch configmap _NAME_-server -n _Namespace_ --type merge --patch "$(cat ~/iwfAlertrules.yaml)"
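The two sed commands in the procedure first delete any existing reference to the alert file and then add a fresh entry after the rule_files: line, so that running the step again does not create duplicates. The following Python sketch shows the same delete-then-append logic on an in-memory string. It is an illustration only, not a replacement for the commands. The function name `add_alert_file` is hypothetical, and copying the indentation of the rule_files: line is an assumption of this sketch, since the sed command inserts fixed text.

```python
def add_alert_file(config_text, rule_path="/etc/config/alertsiwf"):
    """Drop old references to rule_path, then list it under rule_files:."""
    out = []
    for line in config_text.splitlines():
        # Mirrors the first sed command: delete lines mentioning the file.
        if rule_path in line:
            continue
        out.append(line)
        # Mirrors the second sed command: append an entry after rule_files:.
        if "rule_files:" in line:
            # Assumption: reuse the indentation of the rule_files: line.
            indent = line[: len(line) - len(line.lstrip())]
            out.append(f"{indent}- {rule_path}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # A made-up fragment shaped like a Prometheus server configuration map.
    sample = (
        "data:\n"
        "  prometheus.yml: |\n"
        "    rule_files:\n"
        "    - /etc/config/rules\n"
        "    scrape_configs: []\n"
    )
    print(add_alert_file(sample), end="")
```

Because existing references are removed before the new entry is added, applying the function twice gives the same result as applying it once.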
Note:
The Prometheus server automatically reloads the updated configuration map after approximately 20 seconds. Refresh the Prometheus GUI to confirm that the OCIWF alerts have been reloaded.
IWF KPIs
The following are IWF KPIs:
SL.NO | KPI Name | KPI Details | Metric Used |
---|---|---|---|
1 | OCIWF Ingress Request | Rate of HTTP requests received at OCIWF Ingress Gateway | oc_ingressgateway_http_requests |
2 | OCIWF Incoming Request per Agenda Group | Rate of HTTP requests received at OCIWF service per Agenda Group | ociwf_med_http_req_total |
3 | OCIWF 2xx Response per Agenda Group | Rate of 2xx HTTP response from OCIWF per Agenda Group | ociwf_med_http_rsp_total |
4 | OCIWF 4xx Response per Agenda Group | Rate of 4xx HTTP response from OCIWF per Agenda Group | ociwf_med_http_rsp_total |
5 | OCIWF 5xx Response per Agenda Group | Rate of 5xx HTTP response from OCIWF per Agenda Group | ociwf_med_http_rsp_total |
6 | OCIWF CPU Usage per service | CPU utilization per service | container_cpu_usage_seconds_total |
7 | OCIWF Memory consumed per service | Memory Consumed per service | container_memory_usage_bytes |
8 | OCIWF Processing time | OCIWF Processing Time | ociwf_med_msg_processing_time |
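KPIs 3 to 5 are all derived from the same counter, filtered by response status class and grouped per Agenda Group. The Python sketch below builds one plausible PromQL expression for each class. Several details are assumptions rather than documented KPI definitions: the function name `response_rate_query` is hypothetical, the 2m window is borrowed from the alert expressions in Table 4-4, and the Agenda Group is assumed to correspond to the ruleGroupName label shown in the Table 4-1 examples.

```python
def response_rate_query(status_class, window="2m"):
    """Build a per-agenda-group response rate query for 2xx, 4xx, or 5xx."""
    if status_class not in ("2", "4", "5"):
        raise ValueError("status_class must be '2', '4', or '5'")
    # PromQL regex matchers are anchored, so "2.." matches 200 through 299.
    return (
        f'sum(rate(ociwf_med_http_rsp_total{{statusCode=~"{status_class}.."}}[{window}]))'
        " by (ruleGroupName)"
    )


if __name__ == "__main__":
    for status_class in ("2", "4", "5"):
        print(response_rate_query(status_class))
```

The rate() function is applied before sum() so that counter resets on individual pods are handled per series before aggregation.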