4 Metrics, Alerts and KPIs

This section describes the metrics, alerts, and KPIs applicable to the Interworking and Mediation Function (IWF).

IWF Metrics

The following are IWF Metrics:

Table 4-1 New Mediation Metrics for 1.5 Release

| SL.No | Prometheus Metric Name | Metric Description | Dimensions | Example | Metric Type |
|---|---|---|---|---|---|
| 1 | ociwf_med_total_rule_count | Total number of rules configured. Condition: counted as soon as the mediation service comes up. | app (nf-mediation, nf-mediation-test, iwf-mediation, iwf-mediation-test); nfInstanceId; vendor; kubernetes_namespace | ociwf_med_total_rule_count{app="nf-mediation",nfInstanceId="IWF1",vendor="oracle",kubernetes_namespace="medsvc"} | Gauge |
| 2 | ociwf_med_active_rule_count | Total number of rules invoked for a particular message. Condition: when rules are executed. | app (nf-mediation, nf-mediation-test, iwf-mediation, iwf-mediation-test); nfInstanceId; vendor; kubernetes_namespace | ociwf_med_active_rule_count_total{app="nf-mediation",nfInstanceId="IWF1",vendor="oracle",kubernetes_namespace="medsvc"} | Gauge |
| 3 | ociwf_med_individual_rule_exec_count_total | Total number of times a particular rule is invoked. Condition: when rules are executed. | app (nf-mediation, nf-mediation-test, iwf-mediation, iwf-mediation-test); nfInstanceId; vendor; kubernetes_namespace; ruleName; ruleGroupName | sum(ociwf_med_individual_rule_exec_count_total{app="nf-mediation"}) by (ruleName,ruleGroupName) | Counter |
| 4 | ociwf_med_http_req_total | Total number of ingress messages to the NF. Condition: whenever a message lands on nf-mediation. | app (nf-mediation, iwf-mediation); nfInstanceId; vendor; kubernetes_namespace; ruleGroupName; msgType (consumerRequest, producerResponse) | ociwf_med_http_req_total{msgType="consumerRequest",app="nf-mediation",ruleGroupName="scp-agenda"} | Counter |
| 5 | ociwf_med_http_rsp_total | Total number of egress messages. Condition: when the NF sends a response. | app (nf-mediation, nf-mediation-test, iwf-mediation, iwf-mediation-test); nfInstanceId; vendor; kubernetes_namespace; ruleGroupName; statusCode (supported codes: 200, 500, 503); msgType (consumerRequest, producerResponse) | ociwf_med_http_rsp_total{msgType="consumerRequest",ruleGroupName="scp-agenda",statusCode=~"2.*"} | Counter |
| 6 | ociwf_med_test_req_total | Total number of incoming messages to the NF in test mode. Condition: whenever test mode is enabled. | app (nf-mediation-test, iwf-mediation-test); nfInstanceId; vendor; kubernetes_namespace; ruleGroupName; msgType (consumerRequest, producerResponse) | ociwf_med_test_req_total{msgType="consumerRequest",app="nf-mediation-test",ruleGroupName="scp-agenda"} | Counter |
| 7 | ociwf_med_test_rsp_total | Total number of responses by test mode. Test mode does not send a response; here a response means that test mode processed the message successfully. Condition: when a response is sent by the NF. | app (nf-mediation, nf-mediation-test, iwf-mediation, iwf-mediation-test); nfInstanceId; vendor; kubernetes_namespace; ruleGroupName; statusCode (supported codes: 200, 500, 503); msgType (consumerRequest, producerResponse) | ociwf_med_test_rsp_total{msgType="consumerRequest"} | Counter |
| 8 | ociwf_med_msg_forwarded_to_test_mode_total | Total number of requests forwarded to test mode. | app (nf-mediation, iwf-mediation) | sum(ociwf_med_msg_forwarded_to_test_mode_total) | Counter |
| 9 | ociwf_med_rule_update_status | Whether rule reloading failed or succeeded. Value 0 = failed; value 1 = successful. | app (nf-mediation, nf-mediation-test, iwf-mediation, iwf-mediation-test); vendor; nfInstanceId; kubernetes_namespace | ociwf_med_rule_update_status{app="nf-mediation"} | Gauge |
| 10 | ociwf_med_individual_rule_processing_time | Processing time of every rule invoked. | app (nf-mediation, nf-mediation-test, iwf-mediation, iwf-mediation-test); vendor; nfInstanceId; kubernetes_namespace; ruleName; ruleGroupName |  | Histogram |
| 11 | ociwf_med_msg_processing_time | Processing time of a message that lands on mediation. | app (nf-mediation, nf-mediation-test, iwf-mediation, iwf-mediation-test); vendor; nfInstanceId; kubernetes_namespace |  | Histogram |
| 12 | ociwf_med_forward_to_ext_total | Total number of messages forwarded to external. Condition: when mediation works in proxy mode. | app (iwf-mediation, iwf-mediation-test); vendor; nfInstanceId; kubernetes_namespace |  | Counter |
| 13 | ociwf_med_incoming_rsp_from_ext_total | Total number of responses from external. Condition: when mediation works in proxy mode. | app (iwf-mediation, iwf-mediation-test); vendor; nfInstanceId; kubernetes_namespace |  | Counter |
| 14 | ociwf_med_incoming_d2h_req_total | Total number of requests from the D2H service as part of protocol translation mode. | app (iwf-mediation, iwf-mediation-test); vendor; nfInstanceId; kubernetes_namespace |  | Counter |
| 15 | ociwf_med_outgoing_rsp_to_d2h_total | Total number of responses to the D2H service as part of protocol translation mode. | app (iwf-mediation, iwf-mediation-test); vendor; nfInstanceId; kubernetes_namespace |  | Counter |
| 16 | ociwf_med_forward_to_h2d_total | Total number of messages forwarded to H2D as part of protocol translation mode. | app (iwf-mediation, iwf-mediation-test); vendor; nfInstanceId; kubernetes_namespace |  | Counter |
| 17 | ociwf_med_incoming_rsp_from_h2d_total | Total number of responses received from H2D as part of protocol translation as a service. | app (iwf-mediation, iwf-mediation-test); vendor; nfInstanceId; kubernetes_namespace |  | Counter |
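
The example expressions in Table 4-1 can also be evaluated outside the Prometheus GUI. The following is a minimal sketch, assuming a Prometheus server reachable at a base URL that you supply. The `http://localhost:9090` default and the helper names `prom_query_url`, `rate_by`, and `prom_instant_query` are illustrative assumptions and are not part of OCIWF. The sketch relies only on the standard Prometheus instant-query endpoint `/api/v1/query`.

```shell
# Hypothetical helpers; the names and the default URL are assumptions for illustration.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Build the instant-query endpoint URL, tolerating a trailing slash on the base URL.
prom_query_url() {
  printf '%s/api/v1/query\n' "${1%/}"
}

# Build a per-second rate query for a counter, aggregated by the given labels.
# Usage: rate_by METRIC SELECTOR WINDOW LABELS
rate_by() {
  printf 'sum(rate(%s{%s}[%s])) by (%s)\n' "$1" "$2" "$3" "$4"
}

# Send an instant query to Prometheus (requires curl and network access).
prom_instant_query() {
  curl -sS --get "$(prom_query_url "$PROM_URL")" --data-urlencode "query=$1"
}

# Per-rule invocation rate over the last 2 minutes, grouped as in the table example.
RULE_RATE_QUERY="$(rate_by ociwf_med_individual_rule_exec_count_total 'app="nf-mediation"' 2m 'ruleName,ruleGroupName')"
echo "$RULE_RATE_QUERY"
# Against a live server: prom_instant_query "$RULE_RATE_QUERY"
```

Counters such as `ociwf_med_individual_rule_exec_count_total` only ever increase, so wrapping them in `rate()` is usually more informative than reading the raw value; gauges such as `ociwf_med_rule_update_status` can be queried directly.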

Table 4-2 IWF Metric Reference

| Metric Details | IWF Microservice | Metric |
|---|---|---|
| Number of the incoming request to DP from PDRA | DP | pdraIngressCounter |
| Number of successfully converted messages by D2H | D2H | iwfd2h_conversionSuccess_messages_total |
| Number of successful conversion from http to diameter | H2D | iwfh2d_h2dConversion_Success_total |
| Number of successful conversion from diameter to http | H2D | iwfh2d_d2hConversion_Success_total |
| Number of success Outgoing responses from PcfDiscovery | PcfDiscovery | pcfDiscSuccessResponseCounter |
| Number of success Outgoing responses from BSF | PcfDiscovery | bsfSucessResponseCounter |
| Number of responses received from mediation | D2H | iwfd2h_mediation_response_total |
| Number of responses from mediation to D2H | Mediation | iwfd2h_mediation_response_total |
| Number of Response | All | IWF Egress Request rate |
| Number of requests sent to mediation from D2H | D2H | iwfd2h_mediation_outgoing_total |
| Number of requests sent to D2H from DP | DP | iwfdiameterproxy1_d2h_outgoing_total |
| Number of Request | All | IWF Ingress Request rate |
| Number of outgoing responses from H2D to mediation | Mediation | iwfmediation_h2d_response_total |
| Number of outgoing responses from mediation | Mediation | iwfmediation_outgoing_response_total |
| Number of outgoing requests from mediation | Mediation | iwfmediation_outgoing_request_total |
| Number of messages going from PDRA to PCF | DP | pdraPcfEgressCounter |
| Number of Incoming responses to PcfDiscovery | PcfDiscovery | pcfDiscRequestCounter |
| Number of incoming responses to mediation | Mediation | iwfmediation_incoming_response_total |
| Number of incoming responses to BSF | PcfDiscovery | bsfRequestCounter |
| Number of incoming requests to mediation | Mediation | iwfmediation_incoming_request_total |
| Number of incoming requests from mediation to H2D | Mediation | iwfmediation_h2d_outgoing_total |
| Number of incoming requests from D2H to mediation | Mediation | iwfd2h_mediation_incoming_total |
| Number of incoming messages to D2H from diameter Proxy Service | D2H | iwfd2h_incoming_messages_total |
| Number of failures while sending message to mediation from D2H | D2H | iwfd2h_mediation_error_total |
| Number of failure Outgoing responses from PcfDiscovery | PcfDiscovery | pcfDiscFailureResponseCounter |
| Number of failure Outgoing responses from BSF | PcfDiscovery | bsfFailureResponseCounter |
| Number of failed response (responses other than 200OK) from PCF Discovery to DP | DP | failedResponseToPdraCounter |
| Number of diameter requests sent to diameter proxy | H2D | iwfh2d_diameter_request_outgoing_total |
| Number of Diameter requests sent to Diameter peer | DP | iwfdiameterproxy1_diameter_outgoing_total |
| Number of Diameter requests received from H2D | DP | iwfdiameterproxy1_h2d_incoming_total |
| Number of Diameter requests received from Diameter peer | DP | iwfdiameterproxy1_diameter_incoming_total |
| Number of diameter request send error occurred | H2D | iwfh2d_diameter_error_total |
| Number of diameter answers received from diameter proxy | H2D | iwfh2d_diameter_response_incoming_total |
| Number of 200OK responses from PCF Discovery to DP | DP | successResponseToPdraCounter |

Table 4-3 provides information about the Mediation metrics.

Table 4-3 Mediation Metric Reference

| Metric | Metric Description | Mediation Microservice |
|---|---|---|
| Mediation Ingress Request rate | Number of Request | All |
| Mediation Egress Request rate | Number of Response | All |
| iwfmediation_incoming_request_total | Number of incoming requests to iwf mediation | iwf-mediation |
| iwfmediation_outgoing_response_total | Number of outgoing responses from iwf mediation | iwf-mediation |
| iwfd2h_mediation_incoming_total | Number of incoming requests from D2H to iwf mediation | iwf-mediation |
| iwfd2h_mediation_response_total | Number of responses from iwf mediation to D2H | iwf-mediation |
| iwfmediation_h2d_outgoing_total | Number of incoming requests from iwf mediation to H2D | iwf-mediation |
| iwfmediation_h2d_response_total | Number of outgoing responses from H2D to iwf mediation | iwf-mediation |
| iwfmediation_outgoing_request_total | Number of outgoing requests from iwf mediation | iwf-mediation |
| iwfmediation_incoming_response_total | Number of incoming responses to iwf mediation | iwf-mediation |
| nfmediation_incoming_request_total | Number of incoming requests to nf mediation | nf-mediation |
| nfmediation_outgoing_response_total | Number of outgoing responses from nf mediation | nf-mediation |

Alerts

The following are the alerts of IWF:

IWF Alerts

The following are IWF Alerts:

Table 4-4 IWF Alerts

| SL.No | Alert Name | Severity | OID Used for SNMP Traps | Metric | Applicable Threshold | Description |
|---|---|---|---|---|---|---|
| 1 | NFMediationIngressTrafficRateAboveMinorThreshold | Info | 1.3.6.1.4.1.323.5.3.47.1.2.1001 | rate(ociwf_nf_med_http_req_total{app="nf-mediation"}[2m]) | 70% of max MPS | Notifies the user when the traffic rate reaches the threshold. |
| 2 | NFMediationIngressTrafficRateAboveMajorThreshold | Warning | 1.3.6.1.4.1.323.5.3.47.1.2.1001 | rate(ociwf_nf_med_http_req_total{app="nf-mediation"}[2m]) | 80% of max MPS | Notifies the user when the traffic rate reaches the threshold. |
| 3 | NFMediationIngressTrafficRateAboveCriticalThreshold | Critical | 1.3.6.1.4.1.323.5.3.47.1.2.1001 | rate(ociwf_nf_med_http_req_total{app="nf-mediation"}[2m]) | 95% of max MPS | Notifies the user when the traffic rate reaches the threshold. |
| 4 | NFMediation Response Failure | Info | 1.3.6.1.4.1.323.5.3.47.1.2.2001 | rate(ociwf_nf_med_http_rsp_total{statusCode!="200",app="nf-mediation"}[2m]) | 100 | Notifies the user of a failure in response execution. |
| 5 | NFMediationTest Response Failure | Info | 1.3.6.1.4.1.323.5.3.47.1.2.2002 | rate(ociwf_nf_med_test_rsp_total{statusCode!="200",app="nf-mediation-test"}[2m]) | 100 | Notifies the user of a failure in response execution. |
| 6 | NFMediationPodMemoryUsage | Warning | 1.3.6.1.4.1.323.5.3.47.1.2.3001 | sum(container_memory_usage_bytes{namespace="medsvc",container_name="nf-mediation"}) by (pod_name,namespace) | 70% | Notifies the user that the NFMediation per-pod memory usage threshold is reached. |
| 7 | NFMediationPodTestMemoryUsage | Warning | 1.3.6.1.4.1.323.5.3.47.1.2.3002 | sum(container_memory_usage_bytes{namespace="medsvc",container_name="nf-mediation-test"}) by (pod_name,namespace) | 70% | Notifies the user that the NFMediation Test per-pod memory usage threshold is reached. |
| 8 | NFMediationPodCPUUsage | Warning | 1.3.6.1.4.1.323.5.3.47.1.2.4001 | sum(container_cpu_usage_seconds_total{namespace="medsvc",container_name="nf-mediation"}) by (pod_name,namespace) | 70% | Notifies the user that the NFMediation per-pod CPU usage threshold is reached. |
| 9 | NFMediationPodTestCPUUsage | Warning | 1.3.6.1.4.1.323.5.3.47.1.2.4002 | sum(container_cpu_usage_seconds_total{namespace="medsvc",container_name="nf-mediation-test"}) by (pod_name,namespace) | 70% | Notifies the user that the NFMediation Test per-pod CPU usage threshold is reached. |
| 10 | NFMediationRuleUpdateFailure | Critical | 1.3.6.1.4.1.323.5.3.47.1.2.5001 | ociwf_med_rule_update_status{app="nf-mediation"} | 0 | Notifies the user that the rule update in the configmap failed. |
| 11 | NFMediationTestRuleUpdateFailure | Critical | 1.3.6.1.4.1.323.5.3.47.1.2.5002 | ociwf_med_rule_update_status{app="nf-mediation-test"} < 1 | 0 | Notifies the user that the rule update in the configmap failed for test mode. |
| 12 | IWFMediationPodCPUUsage | Warning | 1.3.6.1.4.1.323.5.3.47.1.2.6001 | sum(container_cpu_usage_seconds_total{container="iwf-mediation"}) by (pod_name,namespace) | 70% | Notifies the user that the IWFMediation per-pod CPU usage threshold is reached. |
| 13 | IWFMediationPodMemoryUsage | Warning | 1.3.6.1.4.1.323.5.3.47.1.2.7001 | sum(container_memory_usage_bytes{container="iwf-mediation"}) by (pod_name,namespace) | 70% | Notifies the user that the IWFMediation per-pod memory usage threshold is reached. |
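
The ingress traffic thresholds in Table 4-4 are expressed as percentages of the maximum MPS, so they have to be turned into absolute numbers for a given deployment. The following is a minimal sketch of that arithmetic, assuming a maximum of 1000 MPS. The `MAX_MPS` value, the group name, and the output file name are hypothetical, and the emitted file only illustrates the standard Prometheus alerting-rule layout. The alert rules file delivered in the OCIWF configuration package remains the authoritative definition.

```shell
# Assumed capacity; replace with the max MPS dimensioned for your deployment.
MAX_MPS=1000

# Thresholds from Table 4-4: 70% (minor), 80% (major), 95% (critical).
MINOR=$(( MAX_MPS * 70 / 100 ))
MAJOR=$(( MAX_MPS * 80 / 100 ))
CRITICAL=$(( MAX_MPS * 95 / 100 ))

# Write an illustrative rule file to a scratch directory (hypothetical file name).
RULE_DIR="$(mktemp -d)"
RULE_FILE="$RULE_DIR/sample-iwf-traffic-rules.yaml"
EXPR='rate(ociwf_nf_med_http_req_total{app="nf-mediation"}[2m])'

cat > "$RULE_FILE" <<EOF
groups:
  - name: sample-iwf-traffic
    rules:
      - alert: NFMediationIngressTrafficRateAboveMinorThreshold
        expr: $EXPR >= $MINOR
        labels:
          severity: info
      - alert: NFMediationIngressTrafficRateAboveMajorThreshold
        expr: $EXPR >= $MAJOR
        labels:
          severity: warning
      - alert: NFMediationIngressTrafficRateAboveCriticalThreshold
        expr: $EXPR >= $CRITICAL
        labels:
          severity: critical
EOF
cat "$RULE_FILE"
```

In this simplified form the minor and major alerts keep firing once the critical threshold is crossed; bounding each expression to its own range is one way to avoid that, if it matters for your notification routing.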

IWF Alert Configuration

Follow the steps below for IWF Alert configuration in Prometheus:

Note:

  1. The default namespace for OCIWF is ociwf. Update it to match your deployment.
  2. Download the OCIWF-config-1.5.0.0.0.zip file from OHC and unzip it to obtain the IWFAlertrules-1.5.0.yaml file.

Procedure

  1. Take a backup of the current Prometheus configuration map:
    kubectl get configmaps _NAME_-server -o yaml -n _Namespace_ > /tmp/tempConfig.yaml
  2. Remove any existing OCIWF alert file entry from the Prometheus configuration map, then add the entry under rule_files:
    sed -i '/etc\/config\/alertsiwf/d' /tmp/tempConfig.yaml
    sed -i '/rule_files:/a\ \- /etc/config/alertsiwf' /tmp/tempConfig.yaml
  3. Replace the configuration map with the updated file:
    kubectl replace configmap _NAME_-server -f /tmp/tempConfig.yaml
  4. Add the OCIWF alert rules to the configuration map under the OCIWF alert file name:
    kubectl patch configmap _NAME_-server -n _Namespace_ --type merge \
    --patch "$(cat ~/iwfAlertrules.yaml)"
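
Step 2 deletes any existing `alertsiwf` entry before appending a new one, so repeating the procedure does not leave duplicate entries under `rule_files`. The following minimal sketch replays the two `sed` commands on a scratch file that stands in for `/tmp/tempConfig.yaml`. The sample file content is a simplified, hypothetical excerpt, the function name `add_alert_file` is illustrative, and GNU `sed` is assumed (for `-i` without a backup suffix).

```shell
# Scratch stand-in for /tmp/tempConfig.yaml; the content below is a simplified,
# hypothetical excerpt of a Prometheus configuration map.
WORK_DIR="$(mktemp -d)"
CFG="$WORK_DIR/tempConfig.yaml"
cat > "$CFG" <<'EOF'
data:
  prometheus.yml: |
    rule_files:
    - /etc/config/rules
    - /etc/config/alerts
EOF

# The same two commands as step 2, pointed at the scratch file (GNU sed assumed).
add_alert_file() {
  sed -i '/etc\/config\/alertsiwf/d' "$1"
  sed -i '/rule_files:/a\ \- /etc/config/alertsiwf' "$1"
}

# Run the step twice, then count how many alertsiwf entries the file contains.
add_alert_file "$CFG"
add_alert_file "$CFG"
ENTRY_COUNT="$(grep -c '/etc/config/alertsiwf' "$CFG")"
echo "alertsiwf entries: $ENTRY_COUNT"
cat "$CFG"
```

Inspecting the printed file before running step 3 is a cheap way to confirm that the new entry sits directly under `rule_files:` and that the existing entries are untouched.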

Note:

The Prometheus server automatically reloads the updated configuration map after approximately 20 seconds. Refresh the Prometheus GUI to confirm that the OCIWF alerts have been reloaded.
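
As an alternative to refreshing the GUI, the reload can be checked from the command line through the standard Prometheus rules endpoint `/api/v1/rules`. The following is a minimal sketch under stated assumptions: the helper names `rules_contain_alert` and `wait_for_alert`, the server URL, and the retry defaults are hypothetical, and the match is a plain-text search rather than real JSON parsing.

```shell
# Succeeds if the rules JSON on stdin contains a rule or group named $1.
# A plain-text match is used for brevity; a JSON parser would be stricter.
rules_contain_alert() {
  grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Poll the Prometheus rules endpoint until the alert appears or attempts run out.
# Usage: wait_for_alert PROM_URL ALERT_NAME [ATTEMPTS] [DELAY_SECONDS]
wait_for_alert() {
  attempts="${3:-6}"
  delay="${4:-5}"
  while [ "$attempts" -gt 0 ]; do
    if curl -sS "${1%/}/api/v1/rules" | rules_contain_alert "$2"; then
      return 0
    fi
    attempts=$(( attempts - 1 ))
    sleep "$delay"
  done
  return 1
}

# Offline check of the matcher against a hand-written, abbreviated response.
SAMPLE='{"status":"success","data":{"groups":[{"name":"sample","rules":[{"name":"NFMediationRuleUpdateFailure","type":"alerting"}]}]}}'
if printf '%s\n' "$SAMPLE" | rules_contain_alert NFMediationRuleUpdateFailure; then
  echo "alert rule found"
fi
# Against a live server: wait_for_alert http://localhost:9090 NFMediationRuleUpdateFailure
```

The default of six attempts five seconds apart is chosen only so that the polling window comfortably covers the reload delay of roughly 20 seconds mentioned in the note.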

IWF KPIs

The following are IWF KPIs:

| SL.No | KPI Name | KPI Details | Metric Used |
|---|---|---|---|
| 1 | OCIWF Ingress Request | Rate of HTTP requests received at OCIWF Ingress Gateway | oc_ingressgateway_http_requests |
| 2 | OCIWF Incoming Request per Agenda Group | Rate of HTTP requests received at OCIWF service per Agenda Group | ociwf_med_http_req_total |
| 3 | OCIWF 2xx Response per Agenda Group | Rate of 2xx HTTP responses from OCIWF per Agenda Group | ociwf_med_http_rsp_total |
| 4 | OCIWF 4xx Response per Agenda Group | Rate of 4xx HTTP responses from OCIWF per Agenda Group | ociwf_med_http_rsp_total |
| 5 | OCIWF 5xx Response per Agenda Group | Rate of 5xx HTTP responses from OCIWF per Agenda Group | ociwf_med_http_rsp_total |
| 6 | OCIWF CPU Usage per service | CPU utilization per service | container_cpu_usage_seconds_total |
| 7 | OCIWF Memory consumed per service | Memory consumed per service | container_memory_usage_bytes |
| 8 | OCIWF Processing time | OCIWF processing time | ociwf_med_msg_processing_time |
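
KPIs 3 to 5 all draw on `ociwf_med_http_rsp_total` and differ only in the status-code class they select. The following minimal sketch builds one PromQL expression per class. It assumes that the Agenda Group is carried in the `ruleGroupName` label (as the `scp-agenda` values in the Table 4-1 examples suggest) and that a 2-minute rate window, as used in the alert expressions, is appropriate. The helper name `rsp_rate_query` is hypothetical.

```shell
# Build a per-agenda-group response-rate query for one status class (2, 4, or 5).
# Assumption: the agenda group is carried in the ruleGroupName label.
rsp_rate_query() {
  printf 'sum(rate(ociwf_med_http_rsp_total{statusCode=~"%s.."}[2m])) by (ruleGroupName)\n' "$1"
}

# Print one labelled query per KPI row.
for class in 2 4 5; do
  echo "${class}xx: $(rsp_rate_query "$class")"
done
```

Because Table 4-1 lists only 200, 500, and 503 as supported `statusCode` values, the 4xx expression may return no series in practice.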