3 Metrics, KPIs and Alerts
SEPP Metrics
This section provides information about the SEPP metrics.
SL.No. | Prometheus Stat Metrics Name | Metrics Description (including pegging condition) | Dimensions |
---|---|---|---|
1. | sepp_n32c_handshake_failure | Pegged when the N32c handshake procedure fails; the corresponding alarm is raised. | |
2. | sepp_cn32c_handshake_requests_total | Total number of requests sent over n32c for the handshake procedure. Condition: When SEPP initiates a handshake procedure request towards the peer SEPP. | |
3. | sepp_cn32c_handshake_response_total | Total number of responses received over n32c for the handshake procedure. Condition: When SEPP receives a handshake procedure response from the peer SEPP. The response can be a success or a failure, depending on the responseCode. | |
4. | sepp_cn32c_handshake_initiation_req_total | Total number of handshake initiation requests received from config-mgr. Condition: When a handshake initiation request is received from config-mgr. | |
5. | sepp_cn32c_handshake_delete_req_total | Total number of handshake context delete requests received from config-mgr. Condition: When a handshake context delete request is received from config-mgr. | |
S.No. | Prometheus Stat Metrics Name | Metrics Description (including pegging condition) | Dimensions |
---|---|---|---|
1. | sepp_pn32c_handshake_requests_total | Total number of requests received over n32c for the handshake procedure. Condition: When a handshake procedure request is received from the peer SEPP. | |
2. | sepp_pn32c_handshake_response_total | Total number of responses sent over n32c for the handshake procedure. Condition: When SEPP sends a response to a received handshake procedure request. The response can be a success or a failure, based on the response code. | |
S.No. | Prometheus Stat Metrics Name | Metrics Description (including pegging condition) | Dimensions |
---|---|---|---|
1. | sepp_cn32f_requests_failure | Total number of requests that failed to be sent from cn32f to the remote SEPP. Condition: When an error or exception on the cn32f side prevents the request from being sent to pn32f. | |
2. | sepp_cn32f_response_failure | Total number of responses that failed to be sent from the cn32f pod to the NF. Condition: When an error or exception on the cn32f side prevents the response from being sent to the NF. | |
3. | sepp_cn32f_requests | Total number of requests received from the NF. Condition: When a request is received on the InboundInterface of cn32f. | |
4. | sepp_cn32f_response | Total number of responses received from the remote SEPP. Condition: When a response is received on the OutboundInterface of cn32f. | |
Microservice Name: pN32f
S.No | Prometheus Stat Metric Name | Metric Description (including pegging condition) | Dimensions |
---|---|---|---|
1 | sepp_pn32f_requests_rx | Number of requests received from the peer SEPP. Condition: When a request reaches pn32f from the peer SEPP. | |
2 | sepp_pn32f_requests_tx | Number of requests transmitted to the NF. Condition: When pn32f transmits a request to the NF. | |
3 | sepp_pn32f_responses_rx | Number of responses received from EGW. Condition: When a response reaches pn32f from EGW. | |
4 | sepp_pn32f_responses_tx | Number of responses transmitted to cSepp. Condition: When pn32f transmits a response to cSepp. | |
Note:
The dashboard templates are not shared in the deliverables, so dashboards must be created manually as needed.
SEPP KPIs
This section provides information about the SEPP KPIs.
KPI Details | SEPP Microservice | Metrics Used | Service Operation | Response Code |
---|---|---|---|---|
cn32c Handshake Success rate | cn32c | (sum(ocsepp_cn32c_handshake_response_total)/sum(ocsepp_cn32c_handshake_requests_total))*100 | n32c Handshake | 200 |
cn32f Routing Success Rate | cn32f | (sum(ocsepp_cn32f_response_total)/sum(ocsepp_cn32f_requests_total))*100 | n32f message forward | All |
cn32f Requests Rate Per Remote SEPP | cn32f | sum(irate(ocsepp_cn32f_requests_total[2m]))by(PEER_DOMAIN, PEER_FQDN, PLMN_ID) | n32f message forward | All |
cn32f Response Rate Per Remote SEPP | cn32f | sum(irate(ocsepp_cn32f_response_total[2m])) by(PEER_DOMAIN, PEER_FQDN, PLMN_ID) | n32f message forward | All |
cn32c Handshake Failure Per Remote SEPP | cn32c | sum(ocsepp_n32c_handshake_failure)by(peer_domain, peer_fqdn, peer_plmn_id) | n32c Handshake | 4xx and 5xx |
pn32c Handshake Success rate | pn32c | (sum(ocsepp_pn32c_handshake_response_total)/sum(ocsepp_pn32c_handshake_requests_total))*100 | n32c Handshake | 200 |
pn32f Routing Success Rate | pn32f | (sum(ocsepp_pn32f_response_total)/sum(ocsepp_pn32f_requests_total))*100 | n32f message forward | All |
pn32f Requests Rate | pn32f | sum(irate(ocsepp_pn32f_requests_rx_total[2m])) | n32f message forward | All |
pn32f Response Rate | pn32f | sum(irate(ocsepp_pn32f_response_rx_total[2m])) | n32f message forward | All |
pn32c Handshake Failure Per Remote SEPP | pn32c | sum(ocsepp_n32c_handshake_failure)by(peer_domain, peer_fqdn, peer_plmn_id) | n32c Handshake | 4xx and 5xx |
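As a worked example of the success-rate KPIs in the table above, the following sketch applies the (responses / requests) * 100 formula to sample counter totals. The function name success_rate and the sample values are hypothetical and only illustrate the arithmetic; in a deployment the totals come from Prometheus. Printing NaN for a zero request count is also an assumption of this sketch.

```shell
#!/bin/sh
# success_rate RESPONSES REQUESTS   (hypothetical helper, not part of SEPP)
# Prints (RESPONSES / REQUESTS) * 100 with two decimal places.
# Prints "NaN" when REQUESTS is zero, to avoid dividing by zero.
success_rate() {
    awk -v resp="$1" -v req="$2" 'BEGIN {
        if (req == 0) { print "NaN"; exit }
        printf "%.2f\n", (resp / req) * 100
    }'
}

# Hypothetical totals: 95 handshake responses for 100 handshake requests.
success_rate 95 100
```

A value well below 100 for a handshake KPI suggests that some requests never received a response and is a reason to check the corresponding failure metric.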
SEPP Alerts
This section provides information about the SEPP alerts and their configuration.
Table 3-1 SEPP Alerts
SL.no. | Alert | Severity | oid | Dimensions | Description | Remarks |
---|---|---|---|---|---|---|
1 | SEPPCn32cHandshakeFailureAlert | Major | 1.3.6.1.4.1.323.5.3.46.1.2.2001 | reason | Handshake procedure has failed on Consumer sepp | SEPP-1.4 |
2 | SEPPPn32cHandshakeFailureAlert | Major | 1.3.6.1.4.1.323.5.3.46.1.2.3001 | reason | Handshake procedure has failed on Producer sepp | SEPP-1.4 |
3 | SEPPN32fRoutingFailure | warning | 1.3.6.1.4.1.323.5.3.46.1.2.4001 | error_msg | N32f service not able to forward message | SEPP-1.4 |
4 | SEPPPodMemoryUsageAlert | warning | 1.3.6.1.4.1.323.5.3.46.1.2.4003 | | Pod memory usage is above threshold (70%) | SEPP-1.4 |
5 | SEPPPodCpuUsageAlert | warning | 1.3.6.1.4.1.323.5.3.46.1.2.4002 | | Pod CPU usage is above threshold (70%) | SEPP-1.4 |
SEPP Alert Configuration
This section describes the measurement-based alert rules configuration for SEPP. The Alert Manager triggers alerts by evaluating the conditions in the alert rules against the Prometheus measurement values reported by the microservices.
Steps for SEPP Alert Configuration in Prometheus
- Find the config map used to configure alerts in the Prometheus server by executing the following command:
kubectl get configmap -n <Namespace>
where <Namespace> is the Prometheus server namespace used in the helm install command.
- Take a backup of the current config map of the Prometheus server by executing the following command:
kubectl get configmaps _NAME_-server -o yaml -n _Namespace_ > /tmp/t_mapConfig.yaml
where _Namespace_ is the Prometheus server namespace used in the helm install command.
For example, assuming the chart name is "prometheus-alert", "_NAME_-server" becomes "prometheus-alert-server". Execute the following command to back up the config map:
kubectl get configmaps prometheus-alert-server -o yaml -n prometheus-alert2 > /tmp/t_mapConfig.yaml
- Check if alertssepp is present in the t_mapConfig.yaml file by executing the following command:
cat /tmp/t_mapConfig.yaml | grep alertssepp
- If alertssepp is present, delete the alertssepp entry from the t_mapConfig.yaml file by executing the following command:
sed -i '/etc\/config\/alertssepp/d' /tmp/t_mapConfig.yaml
- If alertssepp is not present, add the alertssepp entry to the t_mapConfig.yaml file by executing the following command:
sed -i '/rule_files:/a\ \- /etc/config/alertssepp' /tmp/t_mapConfig.yaml
- Reload the config map with the modified file by executing the following command:
kubectl replace configmap <Name> -f /tmp/t_mapConfig.yaml
- Add the seppAlertRules.yaml file to the Prometheus config map, under the file name of the SEPP alert file, by executing the following command:
kubectl patch configmap <Name> -n <Namespace> --type merge --patch "$(cat <PATH>/seppAlertRules.yaml)"
- Restart prometheus-server pod.
- Verify the alerts in prometheus GUI.
Note:
The Prometheus server automatically reloads the updated config map after a short delay (approximately 20 seconds).
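The check, delete, and add steps for the alertssepp entry can be tried safely on a sample file before touching the real backup. The sketch below is illustrative only: the sample rule_files content and the helper name ensure_alertssepp are hypothetical, and it assumes GNU sed (for the -i option and the one-line a\ form). It also assumes the intended end state is exactly one alertssepp entry, so it removes any existing entry and then adds a fresh one, using the same sed expressions as the steps above.

```shell
#!/bin/sh
# Work on a throwaway sample file so /tmp/t_mapConfig.yaml is not touched.
# The rule_files content below is a made-up stand-in for the real config map.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
rule_files:
- /etc/config/rules
- /etc/config/alerts
EOF

# ensure_alertssepp FILE   (hypothetical helper)
# Deletes any existing alertssepp entry, then appends a single entry after
# the rule_files: line, so repeated runs never create duplicates.
ensure_alertssepp() {
    if grep -q 'alertssepp' "$1"; then
        sed -i '/etc\/config\/alertssepp/d' "$1"
    fi
    sed -i '/rule_files:/a\ \- /etc/config/alertssepp' "$1"
}

ensure_alertssepp "$cfg"
ensure_alertssepp "$cfg"   # a second run still leaves exactly one entry
cat "$cfg"
```

Once the output looks right on the sample, the same two sed commands can be run against /tmp/t_mapConfig.yaml as described in the steps.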