6 Alerts
This section describes the supported alerts and how to configure them.
Note:
The performance and capacity of the SCP system may vary based on the call model, feature or interface configuration, network conditions, and underlying CNE and hardware environment.
You can configure alerts in Prometheus and in the ScpAlertrules.yaml file.
Table 6-1 Alert Levels or Severity Types
Alert Levels / Severity Types | Definition |
---|---|
Critical | Indicates a severe issue that poses a significant risk to safety, security, or operational integrity. It requires immediate response to address the situation and prevent serious consequences. Raised for conditions that may affect the service of SCP. |
Major | Indicates a more significant issue that has an impact on operations or poses a moderate risk. It requires prompt attention and action to mitigate potential escalation. Raised for conditions that may affect the service of SCP. |
Minor | Indicates a situation that is low in severity and does not pose an immediate risk to safety, security, or operations. It requires attention but does not demand urgent action. Raised for conditions that may affect the service of SCP. |
Info or Warn (Informational) | Provides general information or updates that are not related to immediate risks or actions. These alerts are for awareness and do not typically require any specific response. WARN and INFO alerts may not impact the service of SCP. |
Caution:
User, computer and application, and character encoding settings may cause issues when copying and pasting commands or any content from a PDF. The PDF reader version also affects the copy-paste functionality. It is recommended to verify the pasted content when hyphens or any special characters are part of the copied content.
Note:
- kubectl commands might vary based on the platform deployment. Replace kubectl with the Kubernetes environment-specific command line tool to configure Kubernetes resources through the kube-api server. The instructions provided in this document are as per the Oracle Communications Cloud Native Environment (OCCNE) version of the kube-api server.
- The alert file can be customized as required by the deployment environment. For example, the namespace can be added as a filter criterion to the alert expression to filter alerts only for a specific namespace.
6.1 System level alerts
This section lists the system level alerts.
6.1.1 SCPNotificationPodMemoryUsage
Table 6-2 SCPNotificationPodMemoryUsage
Field | Description |
---|---|
Description | Notifies when the Notification service pod memory usage is above the threshold. The threshold value is 85% of the allocated memory (4 GB): 3.4 GB. |
Summary | Memory usage is above 85% for podname: {{$labels.kubernetes_pod_name}}, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}} with current value {{ $value }},scpfqdn: {{$labels.ocscp_fqdn}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Major |
Conditions | sum(container_memory_usage_bytes{image!="",pod=~".*scpc-notification.+"}) by (kubernetes_pod_name,kubernetes_namespace, instance) > 3650722201 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.3001 |
Metric Used | ocscp_nrf_notifications_requests_nf_total |
Recommended Actions | Cause: A high notification rate or a very large NF profile size in notifications. Diagnostic Information: Monitor the notification metric: ocscp_nrf_notifications_requests_nf_total. Notification usage reduces after some time when it crosses 2.5 GB or 3 GB. Recovery: This alert is cleared automatically when the scpc-notification pod memory usage reduces below the defined threshold. Reduce the notification rate. These notifications are generated by NRF and can be controlled through NRF. For any assistance, contact My Oracle Support. |
6.1.2 SCPWorkerPodMemoryUsage
Table 6-3 SCPWorkerPodMemoryUsage
Field | Description |
---|---|
Description | Notifies when the per-pod memory usage of Worker is above the threshold. The threshold value is 85% of the allocated memory (16 GB): 13.6 GB. |
Summary | Memory usage is above 85% for podname: {{$labels.kubernetes_pod_name}}, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}} with current value {{ $value }},scpfqdn: {{$labels.ocscp_fqdn}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | major |
Conditions | sum(container_memory_usage_bytes{image!="",pod=~".*scp-worker.+"}) by (kubernetes_pod_name,kubernetes_namespace, instance) > 14602888806 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.7004 |
Metric Used | |
Recommended Actions | Cause: A high traffic rate, alternate routing, a large number or size of routing rules, or network or producer NF latency. Diagnostic Information: Monitor traffic rate, alerts, and latency on the KPI Dashboard. Check the traffic rates of the following metrics if they are too high: Check the upstream response time by using the following metric and determine whether upstream is taking too long to respond: ocscp_metric_upstream_service_time_total. Check the following platform metric for current memory usage by the scp-worker pod: container_memory_usage_bytes. Recovery: This alert is cleared automatically when the scp-worker pod memory usage reduces below the defined threshold. Reduce the traffic rate and improve the latency. For any assistance, contact My Oracle Support. |
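The byte values in the Conditions rows of the two memory usage alerts can be reproduced from the stated 85% threshold. The following is a minimal sketch, assuming that the allocated memory is counted in binary units (1 GiB = 1024^3 bytes) and that the result is truncated to a whole number of bytes; the function name is hypothetical.

```python
GIB = 1024 ** 3  # bytes in one GiB (assumed unit of the allocated memory)


def memory_threshold_bytes(allocated_gib: int, percent: int = 85) -> int:
    """Return percent of the allocated memory, in bytes, truncated to an integer."""
    # Integer arithmetic avoids floating-point rounding; // truncates the fraction.
    return allocated_gib * GIB * percent // 100


notification_threshold = memory_threshold_bytes(4)   # scpc-notification pod
worker_threshold = memory_threshold_bytes(16)        # scp-worker pod
print(notification_threshold, worker_threshold)
```

Under these assumptions the results match the values used in the alert expressions (3650722201 and 14602888806). If a deployment allocates a different amount of memory to a pod, the same calculation gives the value to place in the alert file.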
6.1.3 SCPInstanceDown
Table 6-4 SCPInstanceDown
Field | Description |
---|---|
Description | Notifies when any pod in the ocscp release is down. Provides information such as pod name, instance ID, and app name. |
Summary | Pod with podname: {{$labels.kubernetes_pod_name}} is Down , timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Critical |
Conditions | kube_pod_status_ready{pod =~ '.*scp-worker.*|.*scpc-notification.*|.*scpc-subscription.*|.*scpc-configuration.*|.*scpc-audit.*|.*scp-nrfproxy.*|.*scp-load-manager.*|.*scp-cache.*|.*scpc-alternate-resolution.*|.*scp-mediation.*',condition =~ 'true'} !=1 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.7006 |
Recommended Actions | Cause: When the following issues occur: Diagnostic Information: Recovery: This alert is cleared automatically when the inactive pod becomes active. Recover DB services if down. Collect the application logs and contact My Oracle Support for any assistance. |
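To check whether a given pod is covered by the SCPInstanceDown condition, the pod-name pattern can be tested outside Prometheus. The sketch below copies the pattern from the Conditions row and uses a full match, because Prometheus regex matchers are anchored at both ends. The pod names in the usage lines are hypothetical examples, not names taken from a real deployment.

```python
import re

# Pod-name pattern copied from the SCPInstanceDown condition.
POD_PATTERN = re.compile(
    r".*scp-worker.*|.*scpc-notification.*|.*scpc-subscription.*"
    r"|.*scpc-configuration.*|.*scpc-audit.*|.*scp-nrfproxy.*"
    r"|.*scp-load-manager.*|.*scp-cache.*|.*scpc-alternate-resolution.*"
    r"|.*scp-mediation.*"
)


def is_monitored(pod_name: str) -> bool:
    """Return True if the pod name would be selected by the alert condition."""
    # Prometheus regex matchers are fully anchored, hence fullmatch.
    return POD_PATTERN.fullmatch(pod_name) is not None


# Hypothetical pod names used only for illustration.
print(is_monitored("ocscp-scp-worker-5c9d7-x2k4q"))
print(is_monitored("occne-prometheus-server-0"))
```

A pod whose name does not contain one of the listed service names is not selected by the expression, so its readiness does not influence this alert.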
6.2 Application level alerts
This section lists the application level alerts.
6.2.1 SCPCcaFeatureEnabledWithoutHttps
Table 6-5 SCPCcaFeatureEnabledWithoutHttps
Field | Description |
---|---|
Severity | Info |
Condition | ocscp_worker_cca_validation_feature_enabled_without_https > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.9022 |
Description | An alert is raised when the CCA validation feature is enabled without enabling HTTPS. |
Recommended Actions | Cause: The CCA validation feature is enabled without enabling HTTPS. Diagnostic Information: Deploy SCP as an HTTPS deployment. Recovery: The alert is cleared automatically if either the CCA feature is disabled or the deployment is changed to HTTPS. For any assistance, contact My Oracle Support. |
6.2.2 SCPIngressTrafficRateAboveMinorThreshold
Table 6-6 SCPIngressTrafficRateAboveMinorThreshold
Field | Details |
---|---|
Description | This alert notifies that the ingress traffic rate is at or above 9800 mps and below 11200 mps, based on the user-configured minor threshold value. It also includes the locality and the current traffic rate value. |
Summary | namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}},scpfqdn: {{$labels.ocscp_fqdn}},scpauthority:{{$labels.ocscp_authority}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Current Ingress Traffic Rate is {{ $value | printf "%.2f" }} mps which is above 70 Percent of Max MPS(14000) |
Severity | minor |
Condition | sum(rate(ocscp_metric_http_rx_req_total{app_kubernetes_io_name="scp-worker"}[2m])) by (kubernetes_namespace,ocscp_locality,kubernetes_pod_name,ocscp_fqdn,ocscp_authority) >= 9800 < 11200 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7001 |
Metric Used | ocscp_metric_http_rx_req_total |
Recommended Action | Cause: When the Consumer NF sends more traffic than expected. Diagnostic Information: Monitor the ingress traffic to pod using the KPI Dashboard. Refer to the rate of the ocscp_metric_http_rx_req_total metric on the Grafana GUI. Recovery: This alert is cleared automatically when the ingress traffic reduces below the minor threshold or exceeds the major threshold. If this alert is not cleared, then check the Consumer NF for an uneven distribution of traffic per connection or for any other issue. For any assistance, contact My Oracle Support. |
Note:
The alert expression is configured for the SCP profile (12 vCPUs and 16 Gi of memory). For a smaller resource profile (for example, 8 vCPUs and 12 Gi of memory), the per-worker pod MPS should be evaluated based on the dimensioning sheet, and the alert file must be updated accordingly.
6.2.3 SCPIngressTrafficRateAboveMajorThreshold
Table 6-7 SCPIngressTrafficRateAboveMajorThreshold
Field | Details |
---|---|
Description | This alert notifies that the ingress traffic rate is at or above 11200 mps and below 13300 mps, based on the user-configured major threshold value. It also includes the locality and the current traffic rate value. |
Summary | namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}},scpfqdn: {{$labels.ocscp_fqdn}},scpauthority:{{$labels.ocscp_authority}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Current Ingress Traffic Rate is {{ $value | printf "%.2f" }} mps which is above 80 Percent of Max MPS(14000) |
Severity | major |
Condition | sum(rate(ocscp_metric_http_rx_req_total{app_kubernetes_io_name="scp-worker"}[2m])) by (kubernetes_namespace,ocscp_locality,kubernetes_pod_name,ocscp_fqdn,ocscp_authority) >= 11200 < 13300 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7001 |
Metric Used | ocscp_metric_http_rx_req_total |
Recommended Action | Cause: When the Consumer NF sends more traffic than expected. Diagnostic Information: Monitor the ingress traffic to pod using the KPI Dashboard. Refer to the rate of the ocscp_metric_http_rx_req_total metric on the Grafana GUI. Recovery: This alert is cleared automatically when the ingress traffic reduces below the major threshold or exceeds the critical threshold. If this alert is not cleared, then check the Consumer NF for an uneven distribution of traffic per connection or for any other issue. If this alert continues for a long duration, then reduce the ingress traffic from consumer to pod. For any assistance, contact My Oracle Support. |
Note:
The alert expression is configured for the SCP profile (12 vCPUs and 16 Gi of memory). For a smaller resource profile (for example, 8 vCPUs and 12 Gi of memory), the per-worker pod MPS should be evaluated based on the dimensioning sheet, and the alert file must be updated accordingly.
6.2.4 SCPIngressTrafficRateAboveCriticalThreshold
Table 6-8 SCPIngressTrafficRateAboveCriticalThreshold
Field | Details |
---|---|
Description | This alert notifies that the ingress traffic rate is at or above 13300 mps, based on the user-configured critical threshold value. It also includes the locality and the current traffic rate value. |
Summary | namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}},scpauthority:{{$labels.ocscp_authority}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Current Ingress Traffic Rate is {{ $value | printf "%.2f" }} mps which is above 95 Percent of Max MPS(14000) |
Severity | critical |
Condition | sum(rate(ocscp_metric_http_rx_req_total{app_kubernetes_io_name="scp-worker"}[2m])) by (kubernetes_namespace,ocscp_locality,kubernetes_pod_name,ocscp_fqdn,ocscp_authority) >= 13300 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7001 |
Metric Used | ocscp_metric_http_rx_req_total |
Recommended Action | Cause: When the Consumer NF sends more traffic than expected. Diagnostic Information: Monitor the ingress traffic to pod using the KPI Dashboard. Refer to the rate of the ocscp_metric_http_rx_req_total metric on the Grafana GUI. Recovery: This alert is cleared automatically when the ingress traffic reduces below the critical threshold. If this alert is not cleared, then check the Consumer NF for an uneven distribution of traffic per connection or for any other issue. If this alert continues for a long duration, then reduce the ingress traffic from consumer to pod. For any assistance, contact My Oracle Support. |
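The minor, major, and critical ingress traffic alerts use thresholds of 70, 80, and 95 percent of the Max MPS value (14000) quoted in the alert summaries. The following minimal sketch, with hypothetical function names, derives the three thresholds from a maximum MPS value and classifies a measured rate. It may help when recalculating the expressions for a different resource profile; the maximum MPS for such a profile must come from the dimensioning sheet and is not assumed here.

```python
MAX_MPS = 14000  # Max MPS value quoted in the alert summaries

# Severity bands as percentages of the maximum, highest first.
SEVERITY_PERCENT = (("critical", 95), ("major", 80), ("minor", 70))


def thresholds(max_mps: int = MAX_MPS) -> dict:
    """Return the lower bound (in mps) of each severity band."""
    return {name: max_mps * pct // 100 for name, pct in SEVERITY_PERCENT}


def ingress_severity(rate_mps: float, max_mps: int = MAX_MPS):
    """Return the severity whose band contains rate_mps, or None below minor."""
    # Bands are checked from highest to lowest, so the first hit wins.
    for name, limit in thresholds(max_mps).items():
        if rate_mps >= limit:
            return name
    return None


print(thresholds())
print(ingress_severity(12000))
```

Because each band has an upper bound in the alert expression (for example, `>= 9800 < 11200`), only one of the three alerts is active for a given pod at any time, which the first-match loop mirrors.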
Note:
The alert expression is configured for the SCP profile (12 vCPUs and 16 Gi of memory). For a smaller resource profile (for example, 8 vCPUs and 12 Gi of memory), the per-worker pod MPS should be evaluated based on the dimensioning sheet, and the alert file must be updated accordingly.
6.2.5 SCPRoutingFailedForProducer
Table 6-9 SCPRoutingFailedForProducer
Field | Description |
---|---|
Severity | Info |
Conditions | increase(ocscp_metric_routing_attempt_fail_total{app_kubernetes_io_name="scp-worker"}[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.7005 |
Description | Notifies that routing failed for a producer. Provides details such as NFService Type, NFType, Locality, producer FQDN, and value. |
Recommended Actions | Cause: When routing fails to select a producer NF due to unavailability of routing rules for an NF service or producer. Diagnostic Information: Recovery: This alert is cleared automatically when the routing is complete for a producer NF or no more traffic is received in the next Prometheus scrape interval. Check if the NF is deregistered. Register the NF to create routing rules if rules do not exist. For any assistance, contact My Oracle Support. |
6.2.6 SCPAuditErrorResponse
Table 6-10 SCPAuditErrorResponse
Field | Description |
---|---|
Severity | Info |
Conditions | ocscp_audit_error_response > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.4001 |
Description | Alert is raised when the Audit module receives a 3xx, 4xx, or 5xx error from NRF. This alert is labeled with specific nftype, nrfRegionOrSetId, and auditmethod. Note: Alert is cleared on the next audit cycle. |
Recommended Actions | Cause: When the configured NRF sends error responses, is down, or is not reachable. Diagnostic Information: Recovery: The alert is cleared automatically during the next audit cycle and when no more errors are received. Collect audit and worker service logs and contact My Oracle Support for any assistance. |
6.2.7 SCPAuditEmptyNFArrayResponse
Table 6-11 SCPAuditEmptyNFArrayResponse
Field | Description |
---|---|
Description | Alert is generated when the Audit module receives a 2xx response with an empty NFInstance array from NRF. Alert is labeled with specific nftype, nrfRegionOrSetId, and auditmethod. Alert is cleared if Audit receives a success response with a non-empty NFInstance array or on the next audit cycle if the topology source is changed to LOCAL. |
Summary | SCP Audit received Empty NF Array Response for nfType {{$labels.nfType}}, nrfRegionOrSetId: {{$labels.nrfRegionOrSetId}}, auditmethod: {{$labels.auditmethod}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}},scpfqdn: {{$labels.ocscp_fqdn}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Critical |
Conditions | ocscp_audit_2xx_empty_nf_array > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.4002 |
Recommended Actions | Cause: When NRF does not have any NF registered or due to any error condition on NRF. Diagnostic Information: Check if NRF contains any registered NF and validate as required. For more information, refer to NRF documents. Recovery: This alert is cleared automatically if Audit receives a success response with a non-empty NFInstance array or during the next audit cycle when the topology source is changed to LOCAL. Register an NF with NRF or change the topology source to LOCAL. For any assistance, contact My Oracle Support. |
6.2.8 DuplicateLocalityFoundInForeignNF
Table 6-12 DuplicateLocalityFoundInForeignNF
Field | Description |
---|---|
Severity | Major |
Conditions | ocscp_notification_duplicate_foreign_location > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.3002 |
Description | Alert is raised when an unknown NF or SCP is registered with duplicate locality from the present region. |
Recommended Actions | Cause: When SCP discovers a duplicate locality of an NF from an unknown region. Diagnostic Information: Check logs for the received NF notification by running the following command: Check the following metric to get the NFInstanceId information for which this alert is raised: ocscp_notification_duplicate_foreign_location (nfInstanceId). From the metric, get the NF Instance ID, Locality, and serving_scope. Check the NF Profile of the corresponding NF in the unknown region as identified by the serving_scope. Check and correct the locality in the NF Profile so that it aligns with the localities of that unknown region, which must be different from the locality of the SCP that reported this alert. Recovery: This alert is cleared automatically if the unknown NF or SCP is deregistered or sends a registration update with the correct locality. Re-register the NF with correct locality information. Collect logs for the notification and audit services. For any assistance, contact My Oracle Support. |
6.2.9 ForeignNFLocalityNotServed
Table 6-13 ForeignNFLocalityNotServed
Field | Description |
---|---|
Severity | Critical |
Conditions | ocscp_notification_foreign_nf_locality_unserved > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.3003 |
Description | Alert is raised when a Foreign Producer NF's locality is not served by any SCP. |
Recommended Actions | Cause: When SCP discovers an unknown producer NF whose locality is not served by any SCP. Diagnostic Information: Check logs for the received NF notification by running the following command: Note: Use the complete name of the notification pod in the following command: Check the following metric to get the NFInstanceId information for which this alert is raised: ocscp_notification_foreign_nf_locality_unserved (nfInstanceId). Recovery: This alert is cleared automatically if the unknown NF is deregistered or a registration update is received with a locality served by SCP. Re-register the NF with correct locality information. For any assistance, contact My Oracle Support. |
6.2.10 UnknownLocalityFoundInForeignNF
Table 6-14 UnknownLocalityFoundInForeignNF
Field | Description |
---|---|
Severity | critical |
Conditions | ocscp_notification_foreign_nf_locality_absent > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.3004 |
Description | Alert is raised when a Foreign Producer NF's locality is unknown. |
Recommended Actions | Cause: When SCP discovers an unknown producer NF without locality information. Diagnostic Information: Check logs for the received NF notification by running the following command: Use the complete name of the notification pod in the following command: Check the following metric to get the NFInstanceId information for which this alert is raised: ocscp_notification_foreign_nf_locality_absent (nfInstanceId). Recovery: This alert is cleared automatically if the unknown NF is deregistered or a registration update is received with a locality known to SCP. Re-register the NF with correct locality information. For any assistance, contact My Oracle Support. |
6.2.11 SCPUpstreamResponseTimeout
Table 6-15 SCPUpstreamResponseTimeout
Field | Description |
---|---|
Description | Alert is raised when Upstream connection to a producer NF fails |
Summary | SCP SUpstream Response Timeout for nfservicetype: {{$labels.ocscp_nf_service_type}}, nftype {{$labels.ocscp_nf_type}}, responsecode {{$labels.ocscp_response_code}}, nfclustername {{$labels.ocscp_producer_host}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | info |
Conditions | increase(ocscp_metric_upstream_response_timeout_total[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.7011 |
Recommended Actions | Cause: When a producer NF is down, not reachable, or latency is high. Diagnostic Information: Check whether the producer NF is up and network connectivity to the producer NF is established by using one of the following steps: Check the upstream response time by using the following metric and determine if upstream is taking too long to respond: ocscp_metric_upstream_service_time (producer FQDN). Recovery: This alert is cleared automatically in the next scrape interval if the system does not observe any error. For any assistance, contact My Oracle Support. |
6.2.12 SCPSingleNfInstanceAvailableForNFType
Table 6-16 SCPSingleNfInstanceAvailableForNFType
Field | Description |
---|---|
Severity | Major |
Conditions | ocscp_no_nf_instance == 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.3005 |
Description | Alert is raised when there is a single NFInstance available with SCP for an NFType. |
Recommended Actions |
Cause: When the
Diagnostic Information: Check all SCP NRFs for the specific NFType in the alert to see if only one NFInstance is available. For information about registered NFs, see Oracle Communications Cloud Native Core, Network Repository Function User Guide. Check the number of NFs of a particular type by using the API or CNC Console of SCP. For information about procedures to check the NFs available with SCP, see "Configuring Service Communication Proxy using the CNC Console" in Oracle Communications Cloud Native Core, Service Communication Proxy User Guide. Recovery: This alert is cleared automatically in the next scrape interval if more than one NFInstance is available for the NFType specified in the alert. For any assistance, contact My Oracle Support. |
6.2.13 SCPNoNfInstanceForNFType
Table 6-17 SCPNoNfInstanceForNFType
Field | Description |
---|---|
Severity | Critical |
Conditions | ocscp_no_nf_instance == 1 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.3006 |
Description | Alert is raised when there is no NFInstance available with SCP for an NFType. |
Recommended Actions |
Cause: When the
Diagnostic Information: Check all SCP NRFs for the specific NFType in the alert to see if no NFInstance is available. For information about registered NFs, see Oracle Communications Cloud Native Core, Network Repository Function User Guide. Check the number of NFs of a particular type by using the API or CNC Console of SCP. For information about procedures to check the NFs available with SCP, see "Configuring Service Communication Proxy using the CNC Console" in Oracle Communications Cloud Native Core, Service Communication Proxy User Guide. Recovery: This alert is cleared automatically in the next scrape interval if at least one NFInstance is available for the NFType specified in the alert. For any assistance, contact My Oracle Support. |
6.2.14 SCPIngressTrafficRateExceededConfiguredLimit
Table 6-18 SCPIngressTrafficRateExceededConfiguredLimit
Alert Parameters | Value |
---|---|
Description | Ingress traffic rate exceeds configured rate limit for consumer fqdn: {{$labels.ocscp_consumer_fqdn}} |
Summary | 'Ingress traffic rate exceeds configured rate limit for consumer fqdn: ocscpconsumerfqdn = {{$labels.ocscp_consumer_host}},consumernfinstanceid = {{$labels.ocscp_consumer_nf_instance_id}}, consumernftype = {{$labels.ocscp_consumer_nf_type}}, configuredingressrate = {{$labels.ocscp_configured_ingress_rate}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}},scp_fqdn: ' {{$labels.scp_fqdn}} ',timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} and value = {{ $value }} ' |
Severity | Critical |
Condition | This alert is raised when the ingress traffic rate exceeds the configured rate for consumer FQDN. increase(ocscp_metric_ingress_rate_limiting_throttle_req_total[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7012 |
Metric Used | ocscp_metric_ingress_rate_limiting_throttle_req_total |
Recommended Actions | Cause: When the ingress traffic rate exceeds the configured rate limit for the consumer FQDN. Diagnostic Information: Recovery: This alert is cleared when no more requests get suppressed due to ingress rate limiting in the next scrape interval. For any assistance, contact My Oracle Support. |
6.2.15 SCPIngressTrafficRoutedWithoutRateLimitTreatment
Table 6-19 SCPIngressTrafficRoutedWithoutRateLimitTreatment
Alert Parameters | Value |
---|---|
Description | Ingress traffic routed without rate limit treatment |
Summary | 'Ingress traffic routed without rate limit treatment: consumernftype = {{$labels.ocscp_consumer_nf_type}},consumernfinstanceid = {{$labels.ocscp_consumer_nf_instance_id}}, consumerfqdn = {{$labels.ocscp_consumer_host}}, cause = {{$labels.ocscp_cause}} ,namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}},scp_fqdn: ' {{$labels.scp_fqdn}} ', timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} and value = {{ $value }} ' |
Severity | Major |
Condition | This alert is raised when the ingress traffic routes without rate limiting treatment. increase(ocscp_metric_ingress_rate_limiting_not_applied_req_total[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7013 |
Metric Used | ocscp_metric_ingress_rate_limiting_not_applied_req_total |
Recommended Actions | Cause: When the ingress traffic routes without rate limiting treatment. Diagnostic Information: Recovery: This alert is cleared when no more requests get routed without ingress rate limiting treatment in the next scrape interval. For any assistance, contact My Oracle Support. |
6.2.16 SCPEgressTrafficRateExceededConfiguredLimit
Table 6-20 SCPEgressTrafficRateExceededConfiguredLimit
Field | Description |
---|---|
Description | Alert is raised when the egress traffic rate exceeds the configured rate. |
Summary | Egress traffic rate exceeds configured rate limit: consumernftype = {{$labels.ocscp_consumer_nf_type}},consumernfinstanceid = {{$labels.ocscp_consumer_nf_instance_id}}, consumerfqdn = {{$labels.ocscp_consumer_host}}, producernftype = {{$labels.ocscp_nf_type}}, producernfservicetype = {{$labels.ocscp_nf_service_type}}, producernfinstanceid = {{$labels.ocscp_nf_instance_id}}, producerfqdn = {{$labels.ocscp_producer_host}}, configuredegressrate = {{$labels.ocscp_configured_egress_rate}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}},scpfqdn: {{$labels.ocscp_fqdn}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} and value = {{ $value }} |
Severity | Critical |
Conditions | increase(ocscp_metric_egress_rate_limiting_throttle_req_total{app_kubernetes_io_name="scp-worker"}[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.7014 |
Recommended Actions | Cause: When the egress traffic rate exceeds the configured rate. Diagnostic Information: Check the egress traffic rate by using the following metric: ocscp_metric_http_tx_req_total. Check the egress rate limit configuration as described in Oracle Communications Cloud Native Core, Service Communication Proxy REST Specification Guide. Recovery: This alert is cleared when no more requests get suppressed due to egress rate limiting in the next scrape interval. For any assistance, contact My Oracle Support. |
6.2.17 SCPEgressTrafficRoutedWithoutRateLimitTreatment
Table 6-21 SCPEgressTrafficRoutedWithoutRateLimitTreatment
Field | Description |
---|---|
Description | Alert is raised when egress traffic routes without rate limiting. |
Summary | Egress traffic routed without rate limiting: producernftype = {{$labels.ocscp_nf_type}}, producernfservicetype = {{$labels.ocscp_nf_service_type}}, producernfinstanceid = {{$labels.ocscp_nf_instance_id}}, producerfqdn = {{$labels.ocscp_producer_host}}, cause = {{$labels.ocscp_cause}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, scpfqdn: {{$labels.ocscp_fqdn}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} and value = {{ $value }} |
Severity | Major |
Conditions | increase(ocscp_metric_egress_rate_limiting_not_applied_req_total{app_kubernetes_io_name="scp-worker"}[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.7015 |
Recommended Actions | Cause: When the egress traffic routes without rate limiting treatment. Diagnostic Information: Check the egress rate limiting configurations for the untreated producer FQDN. Obtain the producer FQDN by using the following metric: ocscp_metric_egress_rate_limiting_not_applied_req_total (ocscp_producer_fqdn). Check the egress rate limit configuration as described in Oracle Communications Cloud Native Core, Service Communication Proxy REST Specification Guide. Recovery: This alert is cleared when no more requests get routed without egress rate limiting treatment in the next scrape interval. For any assistance, contact My Oracle Support. |
6.2.18 SCPNotificationRejectTopologySourceLocal
Table 6-22 SCPNotificationRejectTopologySourceLocal
Field | Description |
---|---|
Severity | Info |
Conditions | increase(ocscp_notifications_rejected_topologysource_local_total[15m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.3007 |
Description | Alert is raised when SCP rejects a notification from NRF due to topology source set to LOCAL for NF Type. |
Recommended Actions | Cause: When NF Topology Source Info is set to LOCAL. Diagnostic Information: Check the topology source information of an NF Type. For information about the topology source APIs, see Oracle Communications Cloud Native Core, Service Communication Proxy User Guide. Recovery: This alert is cleared automatically after 15 minutes when NF Topology Source Info is set to NRF from LOCAL. For any assistance, contact My Oracle Support. |
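The 15-minute clearing time follows from the `[15m]` range in the condition: the expression stays above zero for as long as a counter increment lies inside the look-back window. The sketch below is a simplified model of that behavior, not the Prometheus implementation. It assumes evenly spaced samples and ignores counter resets and the extrapolation that the real `increase()` function applies; the function name and sample data are hypothetical.

```python
def window_increase(samples, now, window_s):
    """Approximate the counter increase over the last window_s seconds.

    samples is a list of (timestamp_seconds, counter_value) sorted by time.
    """
    in_window = [value for ts, value in samples if now - window_s <= ts <= now]
    if len(in_window) < 2:
        return 0
    # Simplification: last minus first sample, no reset handling or extrapolation.
    return in_window[-1] - in_window[0]


# Hypothetical counter scraped every 60 s: one rejection is counted at t = 60 s.
samples = [(t, 0 if t < 60 else 1) for t in range(0, 1201, 60)]

WINDOW = 15 * 60
print(window_increase(samples, 300, WINDOW))   # increment inside the window
print(window_increase(samples, 1200, WINDOW))  # increment has left the window
```

In this model the condition `> 0` holds while the increment is inside the window and stops holding once it falls out, which is why the alert clears about 15 minutes after the last rejected notification.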
6.2.19 SCPNotificationProcessingFailureForNF
Table 6-23 SCPNotificationProcessingFailureForNF
Field | Description |
---|---|
Description | Alert is raised when Notification processing has failed on SCP. |
Summary | SCP Notification Processing failure for nfInstanceId: {{$labels.nfInstanceId}}, nfType: {{$labels.nfType}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}},scpfqdn: {{$labels.ocscp_fqdn}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Major |
Conditions | increase(ocscp_notification_nf_profile_processing_failure_total[15m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.3008 |
Recommended Actions | Cause: When Notification processing has failed on SCP. Diagnostic Information: Check notification pod logs for any errors by running the following command: Sample logs: To get the list of pods, run the following command: For information about the topology source APIs, see Oracle Communications Cloud Native Core, Service Communication Proxy REST Specification Guide. Recovery: This alert is cleared automatically after 15 minutes. For any assistance, contact My Oracle Support. |
6.2.20 SCPSubscriptionFailureForNFType
Table 6-24 SCPSubscriptionFailureForNFType
Field | Description |
---|---|
Severity | Critical |
Conditions | increase(ocscp_subscription_nf_failure_total[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.2001 |
Description | Alerts is raised when SCP subscription to NRF has failed. This alert is labeled with specific nftype, nrfRegionOrSetId, and auditmethod. |
Recommended Actions |
Cause: The subscription for an NF type with NRF fails.
Diagnostic Information: Check whether NRF is up. To check the NRF status, see the Oracle Communications Cloud Native Core, Network Repository Function User Guide. Check whether NRF is reachable by using one of the following steps:
If NRF is up, check the scp-worker logs for any error response from NRF. If there are error responses, monitor the NRF logs: kubectl logs <pod name> -n <namespace>. Sample logs:
Recovery: This alert is cleared automatically when NRF is up and running or when the errors behind the received error responses are corrected. For any assistance, contact My Oracle Support. |
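The condition above counts subscription failures from every namespace that exposes the metric. To restrict an alert to one deployment, a namespace matcher can be added to the metric selector in the alert expression. The following Python sketch shows that string transformation only. It assumes that the metric carries the kubernetes_namespace label referenced by the Summary templates in this section; the helper name and the namespace value scpsvc are hypothetical.

```python
import re


def add_namespace_filter(expr: str, metric: str, namespace: str) -> str:
    """Add kubernetes_namespace="<namespace>" to the selector of `metric`.

    Simple text rewrite of the first occurrence of the metric name; it is
    not a PromQL parser.
    """
    matcher = f'kubernetes_namespace="{namespace}"'
    # Match the exact metric name, optionally followed by {existing matchers}.
    pattern = re.compile(re.escape(metric) + r'(?![\w:])(\{([^}]*)\})?')

    def repl(match):
        existing = match.group(2)
        if existing is None or existing.strip() == "":
            return f"{metric}{{{matcher}}}"
        return f"{metric}{{{existing},{matcher}}}"

    return pattern.sub(repl, expr, count=1)


print(add_namespace_filter(
    "increase(ocscp_subscription_nf_failure_total[2m]) > 0",
    "ocscp_subscription_nf_failure_total",
    "scpsvc",
))
```

The same helper appends the matcher to a selector that already has label matchers, such as ocscp_worker_queue_alert{severity="MINOR"}.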
6.2.21 SCPReSubscriptionFailureForNFType
Table 6-25 SCPReSubscriptionFailureForNFType
Field | Description |
---|---|
Severity | Critical |
Conditions | increase(ocscp_patch_subscription_nf_failure_total[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.2002 |
Description | This alert is raised when the SCP re-subscription to NRF fails. The alert is labeled with the specific nftype, nrfRegionOrSetId, and auditmethod. |
Recommended Actions |
Cause: The re-subscription for an NF type with NRF fails.
Diagnostic Information: Check whether NRF is up. To check the NRF status, see the Oracle Communications Cloud Native Core, Network Repository Function User Guide. Check whether NRF is reachable by using one of the following steps:
Check the scp-worker logs for any error response from NRF. If there are error responses, monitor the NRF logs: kubectl logs <pod name> -n <namespace>.
Recovery: This alert is cleared automatically when NRF is up and running or when the errors behind the received error responses are corrected. For any assistance, contact My Oracle Support. |
6.2.22 SCPNrfRegistrationFailureForRegionOrSetId
Table 6-26 SCPNrfRegistrationFailureForRegionOrSetId
Field | Description |
---|---|
Severity | Major |
Conditions | increase(ocscp_nrf_registration_failure_total[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.2003 |
Description | This alert is raised when SCP registration with NRF fails. The alert is labeled with the specific nftype, nrfRegionOrSetId, and auditmethod. |
Recommended Actions |
Cause: The registration for an NF type with NRF fails.
Diagnostic Information: Check whether NRF is up. To check the NRF status, see the Oracle Communications Cloud Native Core, Network Repository Function User Guide. Check whether NRF is reachable by using one of the following steps:
Check the scp-worker logs for any error response from NRF. If there are error responses, monitor the NRF logs: kubectl logs <pod name> -n <namespace>. Sample logs:
Recovery: This alert is cleared automatically when NRF is up and running or when the errors behind the received error responses are corrected. For any assistance, contact My Oracle Support. |
6.2.23 SCPNrfHeartbeatFailureForRegionOrSetId
Table 6-27 SCPNrfHeartbeatFailureForRegionOrSetId
Field | Description |
---|---|
Description | This alert is raised when the SCP heartbeat to NRF fails. |
Summary | SCP Heartbeat to NRF Failure for Region Or SetId: {{$labels.nrfRegionOrSetId}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}},scpfqdn: {{$labels.ocscp_fqdn}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Major |
Conditions | increase(ocscp_subscription_nrf_heartbeat_failures_total[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.2004 |
Recommended Actions |
Cause: The heartbeat for an NF type with NRF fails.
Diagnostic Information: Check whether NRF is up. To check the NRF status, see the Oracle Communications Cloud Native Core, Network Repository Function User Guide. Check whether NRF is reachable by using one of the following steps:
Check the scp-worker logs for any error response from NRF. If there are error responses, monitor the NRF logs: kubectl logs <pod name> -n <namespace>.
Recovery: This alert is cleared automatically when NRF is up and running or when the errors behind the received error responses are corrected. For any assistance, contact My Oracle Support. |
6.2.24 SCPDBOperationFailure
Table 6-28 SCPDBOperationFailure
Field | Description |
---|---|
Severity | Warning |
Conditions | increase(ocscp_db_operation_failure_total[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.2005 |
Description | Alert is raised for any DB operation failures. |
Recommended Actions |
Cause: When the SCP DB operation fails. Diagnostic Information: Check whether the DB service is up. Check the status/age of the mysql pod by using the following command:
Recovery: This alert is cleared automatically when the DB service is up and running. For any assistance, contact My Oracle Support. |
6.2.25 SCPGeneratedErrorsResponseForNFService
Table 6-29 SCPGeneratedErrorsResponseForNFService
Field | Description |
---|---|
Severity | Info |
Conditions | increase(ocscp_metric_scp_generated_response_total[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.7016 |
Description | This alert is raised for the NF type for which an SCP-generated error response is triggered. |
Recommended Actions |
Cause: SCP generates an error response for an NF service type. Diagnostic Information: Monitor the scp-worker logs to determine why SCP generated the error responses. Check for the error reason in the logs: kubectl logs <pod name> -n <namespace>. Recovery: This alert is cleared automatically when the cause of the error responses at scp-worker is corrected. For any assistance, contact My Oracle Support. |
6.2.26 SCPCircuitBreakingAppliedForNF
Table 6-30 SCPCircuitBreakingAppliedForNF
Field | Description |
---|---|
Severity | Info |
Conditions | ocscp_circuit_breaking_applied > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.7017 |
Description | Alert is raised for NF when circuit breaking is applied. |
Recommended Actions |
Cause: Circuit breaking is applied for a producer NF FQDN based on the configured http2MaxRequests value. Diagnostic Information: Monitor the scp-worker logs for the number of error responses generated when outstanding requests exceed the configured http2MaxRequests value: kubectl logs <pod name> -n <namespace>. Check the latency from SCP to the upstream producer by using the following metric: ocscp_metric_upstream_service_time_total(ocscp_producer_host or ocscp_nf_end_point). Recovery: This alert is cleared automatically when http2MaxRequests is configured above the traffic level at the worker or when the traffic drops below the configured http2MaxRequests value. For any assistance, contact My Oracle Support. |
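The relationship between outstanding requests, upstream latency, and http2MaxRequests can be illustrated with a small model. The following Python sketch is not SCP's implementation. It assumes that a request to a producer FQDN is rejected once the number of outstanding requests to that FQDN has reached the configured limit; the class and method names are hypothetical.

```python
from collections import defaultdict


class OutstandingRequestBreaker:
    """Toy model: limit outstanding requests per producer FQDN."""

    def __init__(self, http2_max_requests: int):
        self.limit = http2_max_requests
        self.outstanding = defaultdict(int)

    def try_send(self, producer_fqdn: str) -> bool:
        """Admit the request, or return False when the limit is reached."""
        if self.outstanding[producer_fqdn] >= self.limit:
            return False  # circuit breaking applies; an error is returned
        self.outstanding[producer_fqdn] += 1
        return True

    def on_response(self, producer_fqdn: str) -> None:
        """A response arrived, so one request is no longer outstanding."""
        if self.outstanding[producer_fqdn] > 0:
            self.outstanding[producer_fqdn] -= 1


breaker = OutstandingRequestBreaker(http2_max_requests=2)
fqdn = "producer.example.com"  # illustrative FQDN
print([breaker.try_send(fqdn) for _ in range(3)])  # third request is rejected
breaker.on_response(fqdn)
print(breaker.try_send(fqdn))  # admitted again after a response
```

In this model, a slow producer keeps requests outstanding for longer, so the limit is reached at a lower request rate. This is why the upstream latency is worth checking alongside the configured value.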
6.2.27 SCPUpgradeStarted
Table 6-31 SCPUpgradeStarted
Field | Description |
---|---|
Severity | Info |
Conditions | When the SCP upgrade process for an SCP microservice starts. |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.6001 |
Description | Alert is raised when the SCP upgrade process for an SCP microservice starts. |
Recommended Actions |
Cause: When SCP upgrade is performed for a particular microservice. Diagnostic Information: Not applicable. Recovery: This alert is cleared automatically in 5 minutes when the customAlertExpiryEnabled parameter is set to false in the
For any assistance, contact My Oracle Support. |
6.2.28 SCPUpgradeFailed
Table 6-32 SCPUpgradeFailed
Field | Description |
---|---|
Severity | Critical |
Conditions | When any SCP microservice upgrade fails during the upgrade process. |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.6002 |
Description | Alert is raised when any SCP microservice upgrade fails. |
Recommended Actions |
Cause: When any SCP microservice upgrade fails during the upgrade process. Diagnostic Information: Monitor new hook-jobs that might have failed after multiple attempts. Also, monitor any failed log. Run the following command to check the pod of hook-job:
Run the following command to check the logs:
Recovery: This alert is cleared automatically in 5 minutes when the customAlertExpiryEnabled parameter is set to false in the
For any assistance, contact My Oracle Support. |
6.2.29 SCPUpgradeSuccessful
Table 6-33 SCPUpgradeSuccessful
Field | Description |
---|---|
Severity | Info |
Conditions | When any SCP microservice upgrade is completed. |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.6003 |
Description | Alert is raised when any SCP microservice upgrade is completed. |
Recommended Actions |
Cause: When any SCP microservice upgrade is completed. Diagnostic Information: Not applicable. Run the following command to check the pod of hook-job:
Run the following command to check the logs:
Recovery: This alert is cleared automatically in 5 minutes when the customAlertExpiryEnabled parameter is set to false in the
For any assistance, contact My Oracle Support. |
6.2.30 SCPRollbackStarted
Table 6-34 SCPRollbackStarted
Field | Description |
---|---|
Severity | Info |
Conditions | When the rollback process for an SCP microservice starts. |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.6004 |
Description | Alert is raised when the rollback process for an SCP microservice starts. |
Recommended Actions |
Cause: When the rollback process for an SCP microservice starts. Diagnostic Information: Not applicable. Recovery: This alert is cleared automatically in 5 minutes when the customAlertExpiryEnabled parameter is set to false in the
For any assistance, contact My Oracle Support. |
6.2.31 SCPRollbackFailed
Table 6-35 SCPRollbackFailed
Field | Description |
---|---|
Severity | Critical |
Conditions | When any SCP microservice rollback fails during the rollback process. |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.6005 |
Description | Alert is raised when any SCP microservice rollback fails. |
Recommended Actions |
Cause: When any SCP microservice rollback fails during the rollback process. Diagnostic Information: Monitor new hook-jobs that might have failed after multiple attempts. Also, monitor any failed log. Run the following command to check the pod of hook-job:
Run the following command to check the logs:
Recovery: This alert is cleared automatically in 5 minutes when the customAlertExpiryEnabled parameter is set to false in the
For any assistance, contact My Oracle Support. |
6.2.32 SCPRollbackSuccessful
Table 6-36 SCPRollbackSuccessful
Field | Description |
---|---|
Severity | Info |
Conditions | When any SCP microservice rollback is completed. |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.6006 |
Description | Alert is raised when any SCP microservice rollback is completed. |
Recommended Actions |
Cause: When any SCP microservice rollback is completed. Diagnostic Information: Not applicable. Recovery: This alert is cleared automatically in 5 minutes when the customAlertExpiryEnabled parameter is set to false in the
For any assistance, contact My Oracle Support. |
6.2.33 ScpWorkerPodCpuUtilizationAboveWarnThreshold
Table 6-37 ScpWorkerPodCpuUtilizationAboveWarnThreshold
Field | Details |
---|---|
Description | CPU utilization of SCP worker at warn level |
Summary | CPU utilization of SCP worker at warn level. namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} and value = {{ $value }} |
Severity | Warning |
Condition | This alert is raised when CPU utilization of scp-worker reaches the WARN level.
increase(ocscp_worker_pod_overload_control_cpu_utilization_total{ocscp_threshold_level="WARN"}[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7018 |
Metric Used | ocscp_worker_pod_overload_control_cpu_utilization_warn |
Recommended Action |
Cause: When CPU utilization of scp-worker reaches the WARN level. Diagnostic Information:
Recovery:
For any assistance, contact My Oracle Support. |
6.2.34 ScpWorkerPodCpuUtilizationAboveMinorThreshold
Table 6-38 ScpWorkerPodCpuUtilizationAboveMinorThreshold
Field | Details |
---|---|
Description | CPU utilization of SCP worker at minor level |
Summary | Worker CPU utilization lead to minor level.namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Minor |
Condition | This alert is raised when CPU utilization of scp-worker reaches the MINOR level.
increase(ocscp_worker_pod_overload_control_cpu_utilization_total{ocscp_threshold_level="MINOR"}[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7019 |
Metric Used | ocscp_worker_pod_overload_control_cpu_utilization_minor |
Recommended Action |
Cause: When CPU utilization of SCP-Worker reaches the MINOR level. Diagnostic Information:
Recovery:
For any assistance, contact My Oracle Support. |
6.2.35 ScpWorkerPodCpuUtilizationAboveMajorThreshold
Table 6-39 ScpWorkerPodCpuUtilizationAboveMajorThreshold
Field | Details |
---|---|
Description | CPU utilization of SCP worker at major level |
Summary | Worker CPU utilization lead to major level. namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Major |
Condition | This alert is raised when CPU utilization of scp-worker reaches the MAJOR level.
increase(ocscp_worker_pod_overload_control_cpu_utilization_total{ocscp_threshold_level="MAJOR"}[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7020 |
Metric Used | ocscp_worker_pod_overload_control_cpu_utilization_major |
Recommended Action |
Cause: When CPU utilization of SCP-Worker reaches the MAJOR level. Diagnostic Information:
Recovery:
For any assistance, contact My Oracle Support. |
6.2.36 ScpWorkerPodCpuUtilizationAboveCriticalThreshold
Table 6-40 ScpWorkerPodCpuUtilizationAboveCriticalThreshold
Field | Details |
---|---|
Description | CPU utilization of SCP worker at critical level |
Summary | Worker CPU utilization lead to critical level. namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Critical |
Condition | This alert is raised when CPU utilization of scp-worker reaches the CRITICAL level.
increase(ocscp_worker_pod_overload_control_cpu_utilization_total{ocscp_threshold_level="CRITICAL"}[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7021 |
Metric Used | ocscp_worker_pod_overload_control_cpu_utilization_critical |
Recommended Action |
Cause: When CPU utilization of scp-worker reaches the CRITICAL level. Diagnostic Information:
Recovery:
For any assistance, contact My Oracle Support. |
6.2.37 SCPUnhealthyPeerSCPDetected
Table 6-41 SCPUnhealthyPeerSCPDetected
Field | Details |
---|---|
Description | Next hop SCP is marked unhealthy |
Summary | 'Next hop SCP is marked unhealthy. peerscphost: {{$labels.peerScpName}}, scpFqdn: {{$labels.scpFqdn}} , namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} and value = {{ $value }} ' |
Severity | Info |
Condition | This alert is raised when peer SCP is marked as unhealthy.
ocscp_peer_scp_unhealthy > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7022 |
Metric Used | ocscp_peer_scp_unhealthy |
Recommended Action |
Cause: The peer SCP is marked as unhealthy because of consecutive failure responses. Diagnostic Information:
Recovery: This alert is automatically cleared after the degradation time is over. Degradation time = Number of consecutive degradations multiplied by configured base ejection. For any assistance, contact My Oracle Support. |
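The recovery rule above is a simple product. As a worked example, the following Python sketch computes the degradation time. It assumes that the configured base ejection is a duration expressed in seconds; the function and parameter names are illustrative and are not SCP configuration keys.

```python
def degradation_time_seconds(consecutive_degradations: int,
                             base_ejection_seconds: float) -> float:
    """Degradation time = consecutive degradations x base ejection."""
    if consecutive_degradations < 0 or base_ejection_seconds < 0:
        raise ValueError("inputs must be non-negative")
    return consecutive_degradations * base_ejection_seconds


# Example values (assumed): 3 consecutive degradations, 30 s base ejection.
print(degradation_time_seconds(3, 30))
```

Under these assumed values, each further consecutive degradation extends the time before the alert clears by one base ejection interval.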
6.2.38 SCPDnsSrvQueryFailure
Table 6-42 SCPDnsSrvQueryFailure
Field | Details |
---|---|
Description | DNS SRV Query failed with cause {{$labels.cause}} |
Summary | 'DNS SRV Query failed with cause {{$labels.cause}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Critical |
Condition | This alert is raised when the DNS server lookup for SRV fails due to network, servfail, or timed-out errors.
ocscp_alternate_resolution_dnssrv_rx_error_res == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.8001 |
Metric Used | ocscp_alternate_resolution_dnssrv_rx_error_res |
Recommended Action |
Cause: The dnsSRVAlternateRouting flag is set to true and the DNS SRV lookup fails due to network, servfail, or timed-out errors. Diagnostic Information: Check the DNS SRV server status and restore it to normal. Recovery: This alert is automatically cleared when SCP performs a successful DNS SRV query. For any assistance, contact My Oracle Support. |
6.2.39 SCPProducerOverloadThrottled
Table 6-43 SCPProducerOverloadThrottled
Field | Details |
---|---|
Description | Producer is in Throttled Overload state |
Summary | Producer is in Throttled Overload state.ocscp_peer_fqdn: {{$labels.ocscp_peer_fqdn}}, ocscp_peer_nf_instance_id: {{$labels.ocscp_peer_nf_instance_id}}, ocscp_peer_service_instance_id: {{$labels.ocscp_peer_service_instance_id}}, scpfqdn: {{$labels.ocscp_fqdn}}, podname: {{$labels.kubernetes_pod_name}}, namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Info |
Condition | ocscp_load_manager_peer_load_throttled_threshold == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7023 |
Metric Used | ocscp_load_manager_peer_load_throttled_threshold |
Recommended Action |
Cause: When the load of producer NF is higher than the throttled threshold configured for the service. Diagnostic Information:
Recovery: This alert clears automatically when the NF profile is deregistered or changed with load less than the throttled abatement threshold. For any assistance, contact My Oracle Support. |
6.2.40 SCPProducerOverloadAlternateRouted
Table 6-44 SCPProducerOverloadAlternateRouted
Field | Details |
---|---|
Description | Producer is in Alternate Route Overload state |
Summary | Producer is in Alternate Route Overload state.ocscp_peer_fqdn: {{$labels.ocscp_peer_fqdn}}, ocscp_peer_nf_instance_id: {{$labels.ocscp_peer_nf_instance_id}}, ocscp_peer_service_instance_id: {{$labels.ocscp_peer_service_instance_id}}, scpfqdn: {{$labels.ocscp_fqdn}}, podname: {{$labels.kubernetes_pod_name}}, namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Info |
Condition | ocscp_load_manager_peer_load_alternateRoute_threshold == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7024 |
Metric Used | ocscp_load_manager_peer_load_alternateRoute_threshold |
Recommended Action |
Cause: When the load of producer NF is higher than alternate routing threshold configured for the service. Diagnostic Information:
Recovery: This alert clears automatically when the NF profile is deregistered or changed with load less than the alternate routing abatement threshold. For any assistance, contact My Oracle Support. |
6.2.41 SCPSeppNotConfigured
Table 6-45 SCPSeppNotConfigured
Field | Details |
---|---|
Description | SEPP is not configured for PLMN |
Summary | 'SEPP is not configured for PLMN. plmnid: {{$labels.plmn_id}}, scpfqdn: {{$labels.scp_fqdn}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when Security Edge Protection Proxy (SEPP) is not configured.
ocscp_metric_sepp_not_configured_current == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7025 |
Metric Used | ocscp_metric_sepp_not_configured_current |
Recommended Action |
Cause: When SEPP routing related rules are not configured at SCP for selected PLMN in the inter-PLMN routing. Diagnostic Information:
Recovery: This alert clears automatically when the SEPP related routing rules are created at SCP for selected PLMN in the inter-PLMN routing. For any assistance, contact My Oracle Support. |
6.2.42 SCPSeppRoutingFailed
Table 6-46 SCPSeppRoutingFailed
Field | Details |
---|---|
Description | Routing towards SEPP failed |
Summary | 'Routing towards SEPP failed. sepp_fqdn: {{$labels.ocscp_sepp_fqdn}}, scpfqdn: {{$labels.scp_fqdn}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Minor |
Condition | This alert is raised when routing towards SEPP fails.
ocscp_metric_sepp_routing_attempt_fail_current == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7026 |
Metric Used | ocscp_metric_sepp_routing_attempt_fail_current |
Recommended Action |
Cause: Inter-PLMN routing failed for the selected SEPP instances. Diagnostic Information:
Recovery: This alert clears automatically when routing is successful for selected SEPP. For any assistance, contact My Oracle Support. |
6.2.43 SCPGlobalEgressRLRemoteParticipantConnectivityFailure
Table 6-47 SCPGlobalEgressRLRemoteParticipantConnectivityFailure
Field | Details |
---|---|
Description | SCP Global Egress RL Remote Participant Connectivity Failure for participant |
Summary | 'SCP Global Egress RL Remote Participant Connectivity Failure for participant: {{$labels.scp_remote_coh_cluster_name}}, scp_fqdn: {{$labels.scp_fqdn}}, scp_local_coh_cluster_name: {{$labels.scp_local_coh_cluster_name}}, scp_remote_coh_cluster_fqdn: {{$labels.scp_remote_coh_cluster_fqdn }}, scp_remote_coh_cluster_port: {{$labels.scp_remote_coh_cluster_port }}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when the remote participant SCP connection is not established or goes down.
ocscp_global_egress_rl_remote_participant_connectivity_failure == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9001 |
Metric Used | ocscp_global_egress_rl_bucketkey_not_rate_controlled_total |
Recommended Action |
Cause: The remote participant SCP connection is not established or goes down. Diagnostic Information:
Recovery: This alert clears automatically if the connection is established with the remote participant SCP. For any assistance, contact My Oracle Support. |
6.2.44 SCPGlobalEgressRLRemoteParticipantWithDuplicateNFInstanceId
Table 6-48 SCPGlobalEgressRLRemoteParticipantWithDuplicateNFInstanceId
Field | Details |
---|---|
Description | This alert is raised when a duplicate remote coherence participant is found. |
Summary | SCP Global Egress RL Remote Participant Configured With Duplicate NFInstanceId for participant: {{$labels.scp_remote_coh_cluster_name}}, scp_fqdn: {{$labels.ocscp_fqdn}}, scp_nf_instance_id: {{$labels.scp_nf_instance_id}}, scp_local_coh_cluster_name: {{$labels.scp_local_coh_cluster_name}}, scp_remote_coh_cluster_fqdn: {{$labels.scp_remote_coh_cluster_fqdn }}, scp_remote_coh_cluster_port: {{$labels.scp_remote_coh_cluster_port }}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Major |
Condition | ocscp_global_egress_rl_remote_participant_is_duplicate == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9002 |
Metric Used | ocscp_global_egress_rl_remote_participant_is_duplicate |
Recommended Action |
Cause: Duplicate configuration of remote coherence participants with local SCP. Diagnostic Information:
Recovery:
For any assistance, contact My Oracle Support. |
6.2.45 SCPMediationConnectivityFailure
Table 6-49 SCPMediationConnectivityFailure
Field | Details |
---|---|
Description | SCP Mediation Connectivity Failed, scp_fqdn |
Summary | 'SCP Mediation Connectivity Failed, scp_fqdn: {{$labels.scp_fqdn}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when the Mediation connection is not established or a request to Mediation is not successful.
ocscp_mediation_http_not_reachable == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9002 |
Metric Used | ocscp_mediation_http_not_reachable |
Recommended Action |
Cause: The remote Mediation connection is not established or request to Mediation is not successful. Diagnostic Information:
Recovery:
For any assistance, contact My Oracle Support. |
6.2.46 SCPNotificationQueuesUtilizationAboveMinorThreshold
Table 6-50 SCPNotificationQueuesUtilizationAboveMinorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Notification Queues Utilization Above Minor Threshold' |
Summary | 'SCP Notification Queues Utilization Above Minor Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Minor |
Condition | This alert is raised when the queues in the notification service are utilized above 65% of the maximum size (user-configured minor threshold value).
ocscp_notification_queue_alert{severity="MINOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.3009 |
Metric Used | ocscp_notification_queue_alert |
Recommended Action |
Cause: The Notification module is getting more traffic than expected. Diagnostic Information:
Recovery: This alert clears automatically when notification traffic goes below the minor threshold or exceeds the major threshold. For any assistance, contact My Oracle Support. |
6.2.47 SCPNotificationQueuesUtilizationAboveMajorThreshold
Table 6-51 SCPNotificationQueuesUtilizationAboveMajorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Notification Queues Utilization Above Major Threshold' |
Summary | 'SCP Notification Queues Utilization Above Major Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when the queues in the notification service are utilized above 75% of the maximum size (user-configured major threshold value).
ocscp_notification_queue_alert{severity="MAJOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.3010 |
Metric Used | ocscp_notification_queue_alert |
Recommended Action |
Cause: The Notification module is getting more traffic than expected. Diagnostic Information:
Recovery: This alert clears automatically when notification traffic goes below the major threshold or exceeds the critical threshold. For any assistance, contact My Oracle Support. |
6.2.48 SCPNotificationQueuesUtilizationAboveCriticalThreshold
Table 6-52 SCPNotificationQueuesUtilizationAboveCriticalThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Notification Queues Utilization Above Critical Threshold' |
Summary | 'SCP Notification Queues Utilization Above Critical Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Critical |
Condition | This alert is raised when the queues in the notification service are utilized above 85% of the maximum size (user-configured critical threshold value).
ocscp_notification_queue_alert{severity="CRITICAL"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.3011 |
Metric Used | ocscp_notification_queue_alert |
Recommended Action |
Cause: The Notification module is getting more traffic than expected. Diagnostic Information:
Recovery: This alert clears automatically when notification traffic goes below the critical threshold. For any assistance, contact My Oracle Support. |
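The three notification queue alerts form a ladder: the Recovery rows indicate that a lower-severity alert clears when utilization either drops below its own threshold or crosses the next one. The following Python sketch models that behavior, assuming that exactly one severity level is active at a time and using the 65%, 75%, and 85% values stated in the Condition rows as defaults. The function name is illustrative.

```python
from typing import Optional


def queue_alert_level(utilization_pct: float,
                      minor: float = 65.0,
                      major: float = 75.0,
                      critical: float = 85.0) -> Optional[str]:
    """Return the single active severity for a queue utilization percentage."""
    if utilization_pct > critical:
        return "CRITICAL"
    if utilization_pct > major:
        return "MAJOR"
    if utilization_pct > minor:
        return "MINOR"
    return None  # below the minor threshold: no queue alert is active


for pct in (60, 70, 80, 90):
    print(pct, queue_alert_level(pct))
```

The same ladder applies to the scp-nrfproxy and scp-worker queue alerts, which use the same default percentages with their own metrics.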
6.2.49 SCPNrfProxyQueuesUtilizationAboveMinorThreshold
Table 6-53 SCPNrfProxyQueuesUtilizationAboveMinorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Nrfproxy Queues Utilization Above Minor Threshold' |
Summary | 'SCP Nrfproxy Queues Utilization Above Minor Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Minor |
Condition | This alert is raised when the task queues in the scp-nrfproxy service are utilized above 65% of the maximum size (user-configured minor threshold value).
ocscp_nrfproxy_queue_alert{severity="MINOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9010 |
Metric Used | ocscp_nrfproxy_queue_alert |
Recommended Action |
Cause: NrfProxy task queues are getting filled and the traffic is more than expected. Diagnostic Information:
Recovery: This alert clears automatically when the traffic goes below the minor threshold or exceeds the major threshold. For any assistance, contact My Oracle Support. |
6.2.50 SCPNrfProxyQueuesUtilizationAboveMajorThreshold
Table 6-54 SCPNrfProxyQueuesUtilizationAboveMajorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Nrfproxy Queues Utilization Above Major Threshold' |
Summary | 'SCP Nrfproxy Queues Utilization Above Major Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when the task queues in the scp-nrfproxy service are utilized above 75% of the maximum size (user-configured major threshold value).
ocscp_nrfproxy_queue_alert{severity="MAJOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9011 |
Metric Used | ocscp_nrfproxy_queue_alert |
Recommended Action |
Cause: NrfProxy task queues are getting filled and the traffic is more than expected. Diagnostic Information:
Recovery: This alert clears automatically when the traffic goes below the major threshold or exceeds the critical threshold. For any assistance, contact My Oracle Support. |
6.2.51 SCPNrfProxyQueuesUtilizationAboveCriticalThreshold
Table 6-55 SCPNrfProxyQueuesUtilizationAboveCriticalThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Nrfproxy Queues Utilization Above Critical Threshold' |
Summary | 'SCP Nrfproxy Queues Utilization Above Critical Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Critical |
Condition | This alert is raised when the task queues in the scp-nrfproxy service are utilized above 85% of the maximum size (user-configured critical threshold value).
ocscp_nrfproxy_queue_alert{severity="CRITICAL"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9012 |
Metric Used | ocscp_nrfproxy_queue_alert |
Recommended Action |
Cause: NrfProxy task queues are getting filled and the traffic is more than expected. Diagnostic Information:
Recovery: This alert clears automatically when the traffic goes below the critical threshold. For any assistance, contact My Oracle Support. |
6.2.52 SCPWorkerQueuesUtilizationAboveMinorThreshold
Table 6-56 SCPWorkerQueuesUtilizationAboveMinorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Worker Queues Utilization Above Minor Threshold' |
Summary | 'SCP Worker Queues Utilization Above Minor Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Minor |
Condition | This alert is raised when the task queues in the scp-worker service are utilized above 65% of the maximum size (user-configured minor threshold value). ocscp_worker_queue_alert{severity="MINOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9007 |
Metric Used | ocscp_worker_queue_alert |
Recommended Action |
Cause: Worker task queues are getting filled and the traffic is more than expected. Diagnostic Information:
Recovery: This alert clears automatically when the traffic goes below minor threshold or above major threshold. For any assistance, contact My Oracle Support. |
6.2.53 SCPWorkerQueuesUtilizationAboveMajorThreshold
Table 6-57 SCPWorkerQueuesUtilizationAboveMajorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Worker Queues Utilization Above Major Threshold' |
Summary | 'SCP Worker Queues Utilization Above Major Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when the task queues in the scp-worker service are utilized above 75% of the maximum size (user-configured major threshold value). ocscp_worker_queue_alert{severity="MAJOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9008 |
Metric Used | ocscp_worker_queue_alert |
Recommended Action |
Cause: Worker task queues are getting filled and the traffic is more than expected. Diagnostic Information:
Recovery: This alert clears automatically when the traffic goes below major threshold or goes above critical threshold. For any assistance, contact My Oracle Support. |
6.2.54 SCPWorkerQueuesUtilizationAboveCriticalThreshold
Table 6-58 SCPWorkerQueuesUtilizationAboveCriticalThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Worker Queues Utilization Above Critical Threshold' |
Summary | 'SCP Worker Queues Utilization Above Critical Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Critical |
Condition | This alert is raised when the task queues in the scp-worker service are utilized above 85% of the maximum size (user-configured critical threshold value). ocscp_worker_queue_alert{severity="CRITICAL"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9009 |
Metric Used | ocscp_worker_queue_alert |
Recommended Action |
Cause: Worker task queues are getting filled and the traffic is more than expected. Diagnostic Information:
Recovery: This alert clears automatically when the traffic goes below critical threshold. For any assistance, contact My Oracle Support. |
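The three worker queue alerts above split utilization into bands: each alert clears either when utilization drops below its own threshold or when it crosses into the next band. The following is a minimal Python sketch of that banding, assuming the 65%, 75%, and 85% values quoted in the Condition rows. The function name `queue_alert_severity` and the `THRESHOLDS` table are hypothetical and are not part of SCP.

```python
from typing import Optional

# Threshold percentages quoted in the Condition rows above. They are
# user-configurable, so treat these values as illustrative only.
THRESHOLDS = (("CRITICAL", 85.0), ("MAJOR", 75.0), ("MINOR", 65.0))


def queue_alert_severity(used: int, max_size: int) -> Optional[str]:
    """Return the single severity band a queue falls into, or None."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    utilization = 100.0 * used / max_size
    # Check the highest band first so that only one severity is reported.
    for severity, threshold in THRESHOLDS:
        if utilization > threshold:
            return severity
    return None
```

Under these assumptions, a queue holding 80 of 100 entries falls into the MAJOR band only, which matches the Recovery text stating that the minor alert clears once utilization goes above the major threshold.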
6.2.55 SCPCacheQueuesUtilizationAboveMinorThreshold
Table 6-59 SCPCacheQueuesUtilizationAboveMinorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Cache Queues Utilization Above Minor Threshold' |
Summary | 'SCP Cache Queues Utilization Above Minor Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Minor |
Condition | This alert is raised when the task queues in the
scp-cache service are utilized above 65% of their maximum size (the
user-configured minor threshold value).
ocscp_cache_queue_alert{severity="MINOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.13002 |
Metric Used | ocscp_cache_queue_utilization |
Recommended Action |
Cause: The cache task queues are getting filled and the traffic is higher than expected. Diagnostic Information:
Recovery: The alert is cleared automatically when the traffic falls below the minor threshold or goes above the major threshold. For any assistance, contact My Oracle Support. |
6.2.56 SCPCacheQueuesUtilizationAboveMajorThreshold
Table 6-60 SCPCacheQueuesUtilizationAboveMajorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Cache Queues Utilization Above Major Threshold' |
Summary | 'SCP Cache Queues Utilization Above Major Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when the task queues in the
scp-cache service are utilized above 75% of their maximum size (the
user-configured major threshold value).
ocscp_cache_queue_alert{severity="MAJOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.13001 |
Metric Used | ocscp_cache_queue_utilization |
Recommended Action |
Cause: The cache task queues are getting filled and the traffic is higher than expected. Diagnostic Information:
Recovery: The alert is cleared automatically when the traffic falls below the major threshold or goes above the critical threshold. For any assistance, contact My Oracle Support. |
6.2.57 SCPCacheQueuesUtilizationAboveCriticalThreshold
Table 6-61 SCPCacheQueuesUtilizationAboveCriticalThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Cache Queues Utilization Above Critical Threshold' |
Summary | 'SCP Cache Queues Utilization Above Critical Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Critical |
Condition | This alert is raised when the task queues in the
scp-cache service are utilized above 85% of their maximum size (the
user-configured critical threshold value).
ocscp_cache_queue_alert{severity="CRITICAL"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.13000 |
Metric Used | ocscp_cache_queue_utilization |
Recommended Action |
Cause: The cache task queues are getting filled and the traffic is higher than expected. Diagnostic Information:
Recovery: The alert is cleared automatically when the traffic falls below the critical threshold. For any assistance, contact My Oracle Support. |
6.2.58 SCPLoadManagerQueuesUtilizationAboveMinorThreshold
Table 6-62 SCPLoadManagerQueuesUtilizationAboveMinorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Load Manage Queues Utilization Above Minor Threshold' |
Summary | SCP Load Manager Queues Utilization Above Minor Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Minor |
Condition | This alert is raised when the task queues in the scp-load-manager service are utilized above 65% of their maximum size (the user-configured minor threshold value). ocscp_load_manager_queue_alert{severity="MINOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.11002 |
Metric Used | ocscp_load_manager_queue_utilization |
Recommended Action |
Cause: The load manager task queues are getting filled and the traffic is higher than expected. Diagnostic Information:
Recovery: The alert is cleared automatically when the traffic falls below the minor threshold. For any assistance, contact My Oracle Support. |
6.2.59 SCPLoadManagerQueuesUtilizationAboveMajorThreshold
Table 6-63 SCPLoadManagerQueuesUtilizationAboveMajorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Load Manage Queues Utilization Above Major Threshold' |
Summary | 'SCP Load Manager Queues Utilization Above Major Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when the task queues in the
scp-load-manager service are utilized above 75% of their maximum size
(the user-configured major threshold value).
ocscp_load_manager_queue_alert{severity="MAJOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.11001 |
Metric Used | ocscp_load_manager_queue_utilization |
Recommended Action |
Cause: The load manager task queues are getting filled and the traffic is higher than expected. Diagnostic Information:
Recovery: The alert is cleared automatically when the traffic falls below the major threshold. For any assistance, contact My Oracle Support. |
6.2.60 SCPLoadManagerQueuesUtilizationAboveCriticalThreshold
Table 6-64 SCPLoadManagerQueuesUtilizationAboveCriticalThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: SCP Load Manage Queues Utilization Above Critical Threshold' |
Summary | 'SCP Load Manager Queues Utilization Above Critical Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Critical |
Condition | This alert is raised when the task queues in the
scp-load-manager service are utilized above 85% of their maximum size
(the user-configured critical threshold value).
ocscp_load_manager_queue_alert{severity="CRITICAL"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.11000 |
Metric Used | ocscp_load_manager_queue_utilization |
Recommended Action |
Cause: The load manager task queues are getting filled and the traffic is higher than expected. Diagnostic Information:
Recovery: The alert is cleared automatically when the traffic falls below the critical threshold. For any assistance, contact My Oracle Support. |
6.2.61 SCPProducerNfSetUnhealthy
Table 6-65 SCPProducerNfSetUnhealthy
Field | Details |
---|---|
Description | All producer NFs in NF set are marked unhealthy |
Summary | All producer NFs in NF set are marked unhealthy. nfSet: {{$labels.ocscp_nf_setid}}, scpFqdn: {{$labels.ocscp_fqdn}} , namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Info |
Condition | ocscp_metric_nf_set_unhealthy == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7027 |
Metric Used | ocscp_metric_nf_set_unhealthy |
Recommended Action |
Cause: All the producer NFs are marked unhealthy because of consecutive failure responses. Diagnostic Information:
Recovery: This alert is automatically cleared after the degradation time is over. Degradation time = Number of consecutive degradations multiplied by configured base ejection. For any assistance, contact My Oracle Support. |
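The degradation time formula in the Recovery text can be illustrated with a short Python sketch. The function name and the 30-second base ejection value in the usage note are assumptions chosen for illustration; the actual base ejection value comes from the SCP configuration.

```python
from datetime import timedelta


def degradation_time(consecutive_degradations: int,
                     base_ejection: timedelta) -> timedelta:
    """Degradation time = consecutive degradations x configured base ejection."""
    if consecutive_degradations < 0:
        raise ValueError("consecutive_degradations must not be negative")
    return consecutive_degradations * base_ejection
```

For example, with a hypothetical base ejection of 30 seconds, `degradation_time(3, timedelta(seconds=30))` yields 90 seconds before the alert would be expected to clear.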
6.2.62 SCPPeerSeppUnhealthy
Table 6-66 SCPPeerSeppUnhealthy
Field | Details |
---|---|
Description | Peer Sepp is marked unhealthy |
Summary | Peer Sepp is marked unhealthy. seppFqdn: {{$labels.ocscp_sepp_fqdn}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Info |
Condition | ocscp_sepp_unhealthy == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7028 |
Metric Used | ocscp_sepp_unhealthy |
Recommended Action |
Cause: The peer SEPP is marked unhealthy because of consecutive failure responses. Diagnostic Information:
Recovery: This alert is automatically cleared after the degradation time is over. Degradation time = Number of consecutive degradations multiplied by configured base ejection. For any assistance, contact My Oracle Support. |
6.2.63 SCPMicroServiceUnreachable
Table 6-67 SCPMicroServiceUnreachable
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.namespace}}, podname: {{$labels.pod}}: SCP communication between the micro-services indicated by source and destination has failed' |
Summary | 'SCP communication between the micro-services indicated by source and destination has failed: {{$labels.instance}}, namespace: {{$labels.namespace}}, source:{{$labels.source}}, destination: {{$labels.destination}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Critical |
Condition | This alert is raised when the communication between SCP
microservices indicated by source and destination has failed.
ocscp_metric_svc_unreachable==1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7029 |
Metric Used | ocscp_metric_svc_unreachable |
Recommended Action |
Cause: Communication between SCP microservices has failed. Diagnostic Information: Verify whether endpoints of all the services are in Running and Ready state. If not, restart the services. Recovery: This alert clears automatically when the required services are in Running and Ready state. For any assistance, contact My Oracle Support. |
6.2.64 SCPTrafficFeedSendFailed
Table 6-68 SCPTrafficFeedSendFailed
Field | Details |
---|---|
Description | Sending messages to Traffic Feed failed. Cause : {{$labels.ocscp_cause}} |
Summary | Sending messages to Traffic Feed failed, cause: {{$labels.ocscp_cause}}, scp_fqdn: {{$labels.ocscp_fqdn}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Minor |
Condition | increase(ocscp_metric_trafficfeed_failed_total{app_kubernetes_io_name="scp-worker"}[1h]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9003 |
Metric Used | ocscp_metric_trafficfeed_attempted_total |
Recommended Action |
Cause: Sending messages to the traffic feed failed. Diagnostic Information:
Recovery: This alert clears automatically after 24 hours if sending messages to the traffic feed stops failing. For any assistance, contact My Oracle Support. |
6.2.65 SCPTrafficFeedKafkaClusterUnhealthy
Table 6-69 SCPTrafficFeedKafkaClusterUnhealthy
Field | Details |
---|---|
Description | 'Kafka cluster is marked unhealthy, Cause : {{$labels.ocscp_cause}}' |
Summary | 'Kafka cluster is marked unhealthy, cause: {{$labels.ocscp_cause}}, scp_fqdn: {{$labels.ocscp_fqdn}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Critical |
Condition | This alert is raised when the Kafka cluster is
unhealthy.
ocscp_metric_trafficfeed_cluster_unhealthy == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9026 |
Metric Used | ocscp_metric_trafficfeed_cluster_unhealthy |
Recommended Action |
Cause: The Kafka cluster is unhealthy. Diagnostic Information:
Recovery: This alert clears when the Kafka cluster recovers from the failure condition. For any assistance, contact My Oracle Support. |
6.2.66 SCPTrafficFeedPartitionUnhealthy
Table 6-70 SCPTrafficFeedPartitionUnhealthy
Field | Details |
---|---|
Description | 'Kafka partition {{$labels.kafka_partition_id}} is marked unhealthy, Cause : {{$labels.ocscp_cause}}' |
Summary | 'Kafka cluster is marked unhealthy, cause: {{$labels.ocscp_cause}}, partition_id: {{$labels.kafka_partition_id}}, topic: {{$labels.topic}}, scp_fqdn: {{$labels.ocscp_fqdn}}, namespace: {{$labels.namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when the Kafka partition is
unhealthy.
ocscp_metric_trafficfeed_partition_unhealthy == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9025 |
Metric Used | ocscp_metric_trafficfeed_partition_unhealthy |
Recommended Action |
Cause: The Kafka partition is unhealthy. Diagnostic Information:
Recovery: This alert clears when the Kafka partition recovers from the failure condition. For any assistance, contact My Oracle Support. |
6.2.67 SCPServiceMeshFailure
Table 6-71 SCPServiceMeshFailure
Field | Details |
---|---|
Description | 'SCP servicemesh failure encountered' |
Summary | 'SCP servicemesh failure encountered for nfservicetype: {{$labels.ocscp_nf_service_type}}, nftype: {{$labels.ocscp_nf_type}}, nfinstanceid: {{$labels.ocscp_nf_instance_id}}, serviceinstanceid: {{$labels.ocscp_service_instance_id}}, producerfqdn: {{$labels.ocscp_producer_host}}, responsecode: {{$labels.ocscp_response_code}} serverheader:{{$labels.ocscp_server_header}}, namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Info |
Condition | This alert is raised when a service mesh failure occurs. increase(ocscp_metric_sidecarproxy_failures_total[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.7030 |
Metric Used | ocscp_metric_sidecarproxy_failures_total |
Recommended Action |
Cause: Service mesh failure observed at SCP. Diagnostic Information:
Recovery: This alert clears automatically after 2 minutes if there is no service mesh failure observed by SCP with the same dimensions. For any assistance, contact My Oracle Support. |
6.2.68 SCPHealthCheckFailedForPeerSCP
Table 6-72 SCPHealthCheckFailedForPeerSCP
Field | Details |
---|---|
Description | 'SCP HealthCheck failed for peer SCP' |
Summary | 'SCP HealthCheck failed for peer SCP. namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Info |
Condition | This alert is raised when peer SCP or inter-SCP
becomes unhealthy due to health check status and outlier
detection.
ocscp_interscp_health_check_status_failed == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9023 |
Metric Used | ocscp_interscp_health_check_status_failed |
Recommended Action |
Cause: The peer SCP is unhealthy and cannot receive SBI message requests, as determined by health check and outlier detection. Diagnostic Information:
Recovery: This alert clears automatically if SCP-C decides SCP-P is healthy or available based on the current and previous status of outlier detection and health check. For any assistance, contact My Oracle Support. |
6.2.69 SCPHealthCheckFailed
Table 6-73 SCPHealthCheckFailed
Field | Details |
---|---|
Description | 'SCP HealthCheck failed' |
Summary | 'SCP HealthCheck failed. namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Info |
Condition | This alert is raised when SCP is unhealthy because
the overall average load of SCP is greater than the configured
threshold.
ocscp_health_check_status_failed == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9024 |
Metric Used | ocscp_health_check_status_failed |
Recommended Action |
Cause: SCP is unhealthy and cannot receive SBI message requests because of its overall average load. Diagnostic Information: Check whether the overall average load of SCP is greater than the configured threshold value. Recovery: This alert clears automatically when the overall average load of SCP is less than the configured threshold value. For any assistance, contact My Oracle Support. |
6.2.70 ScpWorkerPodPendingTransUtilizationAboveMinorThreshold
Table 6-74 ScpWorkerPodPendingTransUtilizationAboveMinorThreshold
Field | Details |
---|---|
Description | Worker Pending Transaction lead to minor level |
Summary | 'Worker Pending Transaction lead to minor level.namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Minor |
Condition | This alert is raised when pending transaction
utilization of SCP-Worker reaches MINOR level.
ocscp_worker_pod_overload_control_pendingTrans_utilization_minor > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9014 |
Metric Used | ocscp_worker_pod_overload_control_pendingTrans_utilization_minor |
Recommended Action |
Cause: The pending transaction utilization of SCP-Worker has reached the MINOR level. Diagnostic Information:
Recovery: This alert clears automatically when pending transaction utilization is below MINOR threshold level. For any assistance, contact My Oracle Support. |
6.2.71 ScpWorkerPodPendingTransUtilizationAboveMajorThreshold
Table 6-75 ScpWorkerPodPendingTransUtilizationAboveMajorThreshold
Field | Details |
---|---|
Description | Worker Pending Transaction lead to major level |
Summary | 'Worker Pending Transaction lead to major level. namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when pending transaction
utilization of SCP-Worker reaches MAJOR level.
ocscp_worker_pod_overload_control_pendingTrans_utilization_major > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9015 |
Metric Used | ocscp_worker_pod_overload_control_pendingTrans_utilization_major |
Recommended Action |
Cause: The pending transaction utilization of SCP-Worker has reached the MAJOR level. Diagnostic Information:
Recovery: This alert clears automatically when pending transaction utilization is below MAJOR threshold level. For any assistance, contact My Oracle Support. |
6.2.72 ScpWorkerPodPendingTransUtilizationAboveCriticalThreshold
Table 6-76 ScpWorkerPodPendingTransUtilizationAboveCriticalThreshold
Field | Details |
---|---|
Description | Worker Pending Transaction lead to critical level |
Summary | 'Worker Pending Transaction lead to critical level. namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Critical |
Condition | This alert is raised when pending transaction
utilization of SCP-Worker reaches CRITICAL level.
ocscp_worker_pod_overload_control_pendingTrans_utilization_critical > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9016 |
Metric Used | ocscp_worker_pod_overload_control_pendingTrans_utilization_critical |
Recommended Action |
Cause: The pending transaction utilization of SCP-Worker has reached the CRITICAL level. Diagnostic Information:
Recovery: This alert clears automatically when pending transaction utilization is below CRITICAL threshold level. For any assistance, contact My Oracle Support. |
6.2.73 ScpWorkerPodPendingTransUtilizationAboveWarnThreshold
Table 6-77 ScpWorkerPodPendingTransUtilizationAboveWarnThreshold
Field | Details |
---|---|
Description | Worker Pending Transaction lead to Warn level |
Summary | 'Worker Pending Transaction lead to warn level. namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Warn |
Condition | This alert is raised when pending transaction
utilization of SCP-Worker reaches WARN level.
ocscp_worker_pod_overload_control_pendingTrans_utilization_warn > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9017 |
Metric Used | ocscp_worker_pod_overload_control_pendingTrans_utilization_warn |
Recommended Action |
Cause: The pending transaction utilization of SCP-Worker has reached the WARN level. Diagnostic Information:
Recovery: This alert clears automatically when pending transaction utilization is below WARN threshold level. For any assistance, contact My Oracle Support. |
6.2.74 ScpWorkerPodResourceUtilizationAboveMinorThreshold
Table 6-78 ScpWorkerPodResourceUtilizationAboveMinorThreshold
Field | Details |
---|---|
Description | Worker overload control lead to minor level |
Summary | 'Worker overload control lead to minor level.namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Minor |
Condition | This alert is raised when overload control resource
utilization of SCP-Worker reaches MINOR level.
ocscp_worker_pod_overload_control_resource_utilization_minor > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9018 |
Metric Used | ocscp_worker_pod_overload_control_resource_utilization_minor |
Recommended Action |
Cause: The overload control resource utilization of SCP-Worker has reached the MINOR level. Diagnostic Information:
Recovery: This alert clears automatically when overload control resource utilization is below MINOR threshold level. For any assistance, contact My Oracle Support. |
6.2.75 ScpWorkerPodResourceUtilizationAboveMajorThreshold
Table 6-79 ScpWorkerPodResourceUtilizationAboveMajorThreshold
Field | Details |
---|---|
Description | Worker overload control lead to major level |
Summary | 'Worker overload control lead to major level. namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when overload control resource
utilization of SCP-Worker reaches MAJOR level.
ocscp_worker_pod_overload_control_resource_utilization_major > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9019 |
Metric Used | ocscp_worker_pod_overload_control_resource_utilization_major |
Recommended Action |
Cause: The overload control resource utilization of SCP-Worker has reached the MAJOR level. Diagnostic Information:
Recovery: This alert clears automatically when overload control resource utilization is below MAJOR threshold level. For any assistance, contact My Oracle Support. |
6.2.76 ScpWorkerPodResourceUtilizationAboveWarnThreshold
Table 6-80 ScpWorkerPodResourceUtilizationAboveWarnThreshold
Field | Details |
---|---|
Description | 'Worker overload control lead to Warn level' |
Summary | 'Worker overload control lead to warn level. namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Warn |
Condition | This alert is raised when overload control resource
utilization of SCP-Worker reaches WARN level.
ocscp_worker_pod_overload_control_resource_utilization_warn > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9021 |
Metric Used | ocscp_worker_pod_overload_control_resource_utilization_warn |
Recommended Action |
Cause: The overload control resource utilization of SCP-Worker has reached the WARN level. Diagnostic Information:
Recovery: This alert clears automatically when overload control resource utilization is below WARN threshold level. For any assistance, contact My Oracle Support. |
6.2.77 ScpWorkerPodResourceUtilizationAboveCriticalThreshold
Table 6-81 ScpWorkerPodResourceUtilizationAboveCriticalThreshold
Field | Details |
---|---|
Description | Worker overload control lead to critical level |
Summary | 'Worker overload control lead to critical level. namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Critical |
Condition | This alert is raised when overload control resource
utilization of SCP-Worker reaches CRITICAL level.
ocscp_worker_pod_overload_control_resource_utilization_critical > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.9020 |
Metric Used | ocscp_worker_pod_overload_control_resource_utilization_critical |
Recommended Action |
Cause: The overload control resource utilization of SCP-Worker has reached the CRITICAL level. Diagnostic Information:
Recovery: This alert clears automatically when overload control resource utilization is below CRITICAL threshold level. For any assistance, contact My Oracle Support. |
6.2.78 SCPDNSSRVNRFMigrationTaskFailure
Table 6-82 SCPDNSSRVNRFMigrationTaskFailure
Field | Description |
---|---|
Severity | Critical |
Condition | ocscp_configuration_dnssrv_nrf_migration_task_failure == 1 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15001 |
Description | An alert is raised to notify that migration from static to DNS has failed. |
Recommended Actions |
Cause:
Diagnostic Information: Verify that all DNS SRV configurations are correct and that all SCP pods are up and running.
Recovery:
For any assistance, contact My Oracle Support.
|
6.2.79 SCPDNSSRVNRFNonMigrationTaskFailure
Table 6-83 SCPDNSSRVNRFNonMigrationTaskFailure
Field | Description |
---|---|
Severity | Critical |
Condition | ocscp_configuration_dnssrv_nrf_non_migration_task_failure == 1 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15003 |
Description | An alert is raised to notify that the non-migrated task has failed. |
Recommended Actions |
Cause:
Diagnostic Information: Verify that all DNS SRV configurations are correct and that all SCP pods are up and running.
Recovery:
For any assistance, contact My Oracle Support.
|
6.2.80 SCPDNSSRVNRFDuplicateTargetDetected
Table 6-84 SCPDNSSRVNRFDuplicateTargetDetected
Field | Description |
---|---|
Severity | Critical |
Condition | ocscp_configuration_dnssrv_nrf_duplicate_target_detected == 1 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15002 |
Description | An alert is raised to notify that a duplicate target NRF has been detected in the DNS SRV records. |
Recommended Actions |
Cause: This alert is raised when a duplicate target FQDN is received from the DNS SRV for different NRF SRV FQDNs. In this case, the target FQDNs received against the first NRF SRV FQDN in the scpc-configuration service from the scpc-alternate-resolution service are processed, the target FQDNs received against subsequent NRF SRV FQDNs are ignored, and this alert is raised.
Diagnostic Information: Verify that all DNS SRV configurations are correct and that all SCP pods are up and running.
Recovery:
For any assistance, contact My Oracle Support.
|
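The first-wins handling described in the Cause text can be sketched in Python as follows. This is an illustration under stated assumptions, not the SCP implementation: the function name `resolve_targets`, the input shape (an ordered mapping of NRF SRV FQDN to resolved target FQDNs), and the example FQDNs in the usage note are all hypothetical.

```python
from typing import Dict, List, Tuple


def resolve_targets(
    srv_records: Dict[str, List[str]],
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Keep each target FQDN for the first NRF SRV FQDN that returned it.

    Returns (owner, duplicates), where owner maps a target FQDN to the
    NRF SRV FQDN it was accepted for, and duplicates lists the ignored
    (NRF SRV FQDN, target FQDN) pairs that would trigger the alert.
    """
    owner: Dict[str, str] = {}
    duplicates: List[Tuple[str, str]] = []
    # Dicts preserve insertion order, so iteration follows configuration order.
    for srv_fqdn, targets in srv_records.items():
        for target in targets:
            if target in owner and owner[target] != srv_fqdn:
                duplicates.append((srv_fqdn, target))
            else:
                owner.setdefault(target, srv_fqdn)
    return owner, duplicates
```

For instance, if `srv-a.example.com` and `srv-b.example.com` both resolve to `nrf1.example.com`, the target is kept for `srv-a.example.com` and the pair `("srv-b.example.com", "nrf1.example.com")` is reported as a duplicate.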
6.2.81 SCPHighResponseTimeFromProducer
Table 6-85 SCPHighResponseTimeFromProducer
Field | Description |
---|---|
Description | Notifies when more than 200 messages per second have a response delay from the producer NF of 50 seconds or more. |
Summary | More than 200 responses received by the SCP have a response time exceeding 50,000 milliseconds. Instance name: {{$labels.instance}}, Namespace: {{$labels.namespace}}, Pod name: {{$labels.pod}}, Timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Info |
Condition | (sum(rate(ocscp_metric_upstream_service_time_total{ocscp_upstream_service_time="50000ms"}[2m])) by (kubernetes_namespace) + sum(rate(ocscp_metric_upstream_service_time_total{ocscp_upstream_service_time=">50000ms"}[2m])) by (kubernetes_namespace)) > 200 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15004 |
Metric Used | ocscp_metric_upstream_service_time_total |
Recommended Actions |
Cause: More than 200 messages per second have an upstream response time of 50 seconds or more. Diagnostic Information: Monitor the metric ocscp_metric_upstream_service_time_total with ocscp_upstream_service_time="50000ms" and ocscp_upstream_service_time=">50000ms". Recovery: The alert is cleared automatically when the number of responses with delays exceeding 50 seconds falls below 200 messages per second. If the alert does not clear, check for any producer NFs or specific service request types that are taking more than 50 seconds to respond and take corrective actions if necessary. Immediate action may not be required because this alert is informational. However, a high number of requests with long response delays could lead to performance degradation at the SCP. For any assistance, contact My Oracle Support. |
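The Condition expression adds the per-second rates of the two slowest response-time buckets and compares the sum against 200. The Python sketch below mirrors that arithmetic on raw counter samples. It is a simplification: `simple_rate` ignores counter resets and the extrapolation that PromQL's `rate()` applies, and both function names and the sample format are hypothetical.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def simple_rate(samples: Sequence[Sample]) -> float:
    """Per-second increase between the first and last sample in the window."""
    if len(samples) < 2:
        return 0.0
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        return 0.0
    return (v1 - v0) / (t1 - t0)


def high_response_time_alert(at_50s: Sequence[Sample],
                             over_50s: Sequence[Sample],
                             threshold: float = 200.0) -> bool:
    """True when the combined rate of both buckets exceeds the threshold."""
    return simple_rate(at_50s) + simple_rate(over_50s) > threshold
```

Over a 2-minute window, counters that grow by 12,000 and 13,200 give rates of 100 and 110 responses per second, so the combined rate of 210 exceeds the threshold of 200.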
6.2.82 SCPCGroupVersionDetectionFailed
Table 6-86 SCPCGroupVersionDetectionFailed
Field | Description |
---|---|
Severity | critical |
Condition | ocscp_worker_cgroup_version_detection_failed == 1 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15005 |
Description | Notify that cgroup version detection has failed. |
Recommended Actions |
Cause: SCP is unable to detect the cgroup version from the underlying kernel with the command "stat -fc %T /sys/fs/cgroup/". The expected value is either tmpfs or cgroup2fs. Diagnostic Information:
Recovery:
For any assistance, contact My Oracle Support.
|
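The detection command quoted in the Cause text can be run by hand to see what the kernel reports. The sketch below is illustrative only: the helper name is hypothetical, and it assumes the usual convention that tmpfs corresponds to cgroup v1 and cgroup2fs to cgroup v2. How SCP itself interprets the result is not described in this guide.

```shell
# Map the filesystem type reported for /sys/fs/cgroup to a cgroup version.
# Assumed convention: tmpfs means cgroup v1, cgroup2fs means cgroup v2.
cgroup_version_from_fstype() {
  case "$1" in
    tmpfs)     echo "v1" ;;
    cgroup2fs) echo "v2" ;;
    *)         echo "unknown" ;;
  esac
}

# Run the same stat command that the Cause text quotes. If stat fails
# (for example, the path is not mounted), the type is left empty.
fstype="$(stat -fc %T /sys/fs/cgroup/ 2>/dev/null || true)"
echo "filesystem type: ${fstype:-none} -> cgroup $(cgroup_version_from_fstype "$fstype")"
```

An "unknown" result corresponds to the situation in which this alert would be expected. To check a running pod, the same stat command can be run inside the container with the command line tool for your Kubernetes environment.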
6.2.83 SCPCPUUsageFileReadFailed
Table 6-87 SCPCPUUsageFileReadFailed
Field | Description |
---|---|
Severity | critical |
Condition | ocscp_worker_cpu_usage_file_read_failed == 1 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15006 |
Description | Notify that the CPU usage file read operation failed within the detected cgroup version. |
Recommended Actions |
Cause: SCP encountered a failure in performing a read operation for the CPU usage file within the detected cgroup version. The file path is determined based on the detected cgroup version. Diagnostic Information:
Recovery:
For any assistance, contact My Oracle Support.
|
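Because the file path depends on the cgroup version, a quick readability check can help narrow down this alert. This guide does not name the file that SCP reads, so the paths below are assumptions: they are the conventional kernel locations for CPU usage accounting under cgroup v1 and v2. The helper names are hypothetical.

```shell
# Return a conventional CPU usage file for a cgroup version.
# Assumption: these standard kernel paths stand in for the file SCP reads.
cpu_usage_file_for() {
  case "$1" in
    v1) echo "/sys/fs/cgroup/cpuacct/cpuacct.usage" ;;
    v2) echo "/sys/fs/cgroup/cpu.stat" ;;
    *)  return 1 ;;
  esac
}

# Report whether a file exists and is readable by the current user.
check_readable() {
  if [ -r "$1" ]; then
    echo "readable: $1"
  else
    echo "NOT readable: $1"
  fi
}

for v in v1 v2; do
  check_readable "$(cpu_usage_file_for "$v")"
done
```

On a host that uses only one cgroup version, the path for the other version is normally reported as not readable.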
6.2.84 SCPIgnoreUnknownService
Table 6-88 SCPIgnoreUnknownService
Field | Description |
---|---|
Severity | Info |
Condition | increase(ocscp_ignore_unknown_service_total[24h]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15000 |
Description | An alert is raised to notify that SCP ignored an unknown service in the NF profile. |
Recommended Actions |
Cause: SCP has received the NF profile with an unknown service and processed the profile by ignoring this unknown service. Diagnostic Information: Check the received NF profile for the unknown services. Recovery: If the unknown services are not present in the NF profile in the next scrape interval, the alert is cleared. For any assistance, contact My Oracle Support. |
6.2.85 SCPWorkerSSLCertificateOnCriticalExpiry
Table 6-89 SCPWorkerSSLCertificateOnCriticalExpiry
Field | Description |
---|---|
Severity | Critical |
Condition | ocscp_metric_ssl_certificate_expire_total == 1 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15010 |
Description | An alert is raised whenever the SCP SSL certificate is about to expire, based on the configured threshold values. |
Recommended Actions |
Cause: The SSL certificate expiration is approaching the configured critical expiry time. Diagnostic Information:
Recovery: The SCP SSL secret needs to be updated with renewed SSL certificates. For any assistance, contact My Oracle Support. |
6.2.86 SCPWorkerSSLCertificateOnMajorExpiry
Table 6-90 SCPWorkerSSLCertificateOnMajorExpiry
Field | Description |
---|---|
Severity | major |
Condition | ocscp_metric_ssl_certificate_expire_total == 2 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15011 |
Description | An alert is raised whenever the SCP SSL certificate is about to expire, based on the configured threshold values. |
Recommended Actions |
Cause: The SSL certificate expiration is approaching the configured major expiry time. Diagnostic Information:
Recovery: The SCP SSL secret needs to be updated with renewed SSL certificates. For any assistance, contact My Oracle Support. |
6.2.87 SCPWorkerSSLCertificateOnMinorExpiry
Table 6-91 SCPWorkerSSLCertificateOnMinorExpiry
Field | Description |
---|---|
Severity | minor |
Condition | ocscp_metric_ssl_certificate_expire_total == 3 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15012 |
Description | An alert is raised whenever the SCP SSL certificate is about to expire, based on the configured threshold values. |
Recommended Actions |
Cause: The SSL certificate expiration is approaching the configured minor expiry time. Diagnostic Information:
Recovery: The SCP SSL secret needs to be updated with renewed SSL certificates. For any assistance, contact My Oracle Support. |
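Before renewing the SSL secret, it can be useful to confirm how close a certificate is to expiry. The sketch below uses standard openssl x509 options on a PEM file that has already been extracted from the secret. The function names, the file name, and the 30-day window are hypothetical. The thresholds that SCP applies for the critical, major, and minor alerts are configured values that are not listed here.

```shell
# Print the notAfter date of a PEM certificate.
cert_end_date() {
  openssl x509 -noout -enddate -in "$1"
}

# Succeed (exit 0) when the certificate expires within the given number of
# days. openssl -checkend exits 0 when the certificate is still valid after
# N seconds, so the result is inverted here.
expires_within_days() {
  [ -r "$1" ] || return 2   # unreadable file: report an error, not "expiring"
  ! openssl x509 -noout -checkend "$(( $2 * 86400 ))" -in "$1" >/dev/null
}

# Hypothetical usage. The file name and window are placeholders:
#   cert_end_date ./scp-worker-cert.pem
#   expires_within_days ./scp-worker-cert.pem 30 && echo "renew soon"
```

A certificate that is reported as expiring within the window is a candidate for renewal as described in the Recovery text above.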
6.2.88 SCPIngressConnectionEstablishmentFailure
Table 6-92 SCPIngressConnectionEstablishmentFailure
Field | Description |
---|---|
Severity | info |
Condition | increase(ocscp_worker_https_ingress_connection_failure_total[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15007 |
Description | An alert is raised whenever any ingress HTTPS connection fails. |
Recommended Actions |
Cause: An ingress HTTPS connection establishment fails. Diagnostic Information: This alert may be raised if any connection establishment fails or if a handshake fails. Recovery: The alert is cleared if there are no ingress HTTPS connection failures in the next scrape interval (2 minutes). For any assistance, contact My Oracle Support. |
6.2.89 SCPEgressConnectionEstablishmentFailure
Table 6-93 SCPEgressConnectionEstablishmentFailure
Field | Description |
---|---|
Severity | info |
Condition | increase(ocscp_worker_https_egress_connection_failure_total[2m]) > 0 |
OID used for SNMP Traps | 1.3.6.1.4.1.323.5.3.35.1.2.15008 |
Description | An alert is raised whenever any egress HTTPS connection fails. |
Recommended Actions |
Cause: An egress HTTPS connection establishment fails while sending the request to producer NFs. Diagnostic Information: This alert may be raised if any connection establishment fails or if a handshake fails. Recovery: The alert is cleared if there are no egress HTTPS connection failures in the next scrape interval (2 minutes). For any assistance, contact My Oracle Support. |
6.2.90 SCPNrfProxyOauthQueuesUtilizationAboveCriticalThreshold
Table 6-94 SCPNrfProxyOauthQueuesUtilizationAboveCriticalThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.namespace}}, podname: {{$labels.pod}}: SCP NrfProxy Oauth Queues Utilization Above Critical Threshold' |
Summary | 'SCP NrfProxy Oauth Queues Utilization Above Critical Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Critical |
Condition | This alert is raised when SCP NrfProxy Oauth Queues Utilization is above the critical threshold. ocscp_nrfproxy_oauth_queue_alert{severity="CRITICAL"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.14000 |
Metric Used | ocscp_nrfproxy_oauth_queue_alert |
Recommended Action |
Cause: When NrfProxy Task queues are filled, the traffic exceeds the limit. Diagnostic Information:
Recovery: This alert clears automatically when the traffic decreases below the critical threshold. For any assistance, contact My Oracle Support. |
6.2.91 SCPNrfProxyOauthQueuesUtilizationAboveMajorThreshold
Table 6-95 SCPNrfProxyOauthQueuesUtilizationAboveMajorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.namespace}}, podname: {{$labels.pod}}: SCP NrfProxyOauth Queues Utilization Above Major Threshold' |
Summary | 'SCP NrfProxy Oauth Queues Utilization Above Major Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Major |
Condition | This alert is raised when SCP NrfProxy Oauth Queues Utilization is above the major threshold. ocscp_nrfproxy_oauth_queue_alert{severity="MAJOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.14001 |
Metric Used | ocscp_nrfproxy_oauth_queue_alert |
Recommended Action |
Cause: When NrfProxy Task queues are filled, the traffic exceeds the limit. Diagnostic Information:
Recovery: This alert clears automatically when the traffic decreases below the Major threshold. For any assistance, contact My Oracle Support. |
6.2.92 SCPNrfProxyOauthQueuesUtilizationAboveMinorThreshold
Table 6-96 SCPNrfProxyOauthQueuesUtilizationAboveMinorThreshold
Field | Details |
---|---|
Description | 'instancename: {{$labels.instance}}, namespace: {{$labels.namespace}}, podname: {{$labels.pod}}: SCP NrfProxyOauth Queues Utilization Above Minor Threshold' |
Summary | 'SCP NrfProxy Oauth Queues Utilization Above Minor Threshold, instancename: {{$labels.instance}}, namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}' |
Severity | Minor |
Condition | This alert is raised when SCP NrfProxy Oauth Queues Utilization is above the minor threshold. ocscp_nrfproxy_oauth_queue_alert{severity="MINOR"} == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.14002 |
Metric Used | ocscp_nrfproxy_oauth_queue_alert |
Recommended Action |
Cause: When NrfProxy Task queues are filled, the traffic exceeds the limit. Diagnostic Information:
Recovery: This alert clears automatically when the traffic decreases below the minor threshold. For any assistance, contact My Oracle Support. |
6.2.93 ScpSelfOCIThresholdAboveWarn
Table 6-97 ScpSelfOCIThresholdAboveWarn
Field | Details |
---|---|
Description | An alert is raised whenever the configured self-OCI for SCP reaches a warn level. |
Summary | SCP load level lead to Warn level. namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | warning |
Condition |
increase(ocscp_worker_self_oci_above_oci_conveyance_threshold_total{ocscp_oci_threshold_level="WARN"}[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.15013 |
Metric Used | ocscp_worker_self_oci_above_oci_conveyance_threshold_total |
Recommended Action |
Cause: Whenever the configured self-OCI for SCP reaches a warn level. Diagnostic Information: SCP may be experiencing high load. Recovery: The alert will clear automatically when the load on SCP decreases and falls below the configured warn level. For any assistance, contact My Oracle Support. |
6.2.94 ScpSelfOCIThresholdAboveMinor
Table 6-98 ScpSelfOCIThresholdAboveMinor
Field | Details |
---|---|
Description | An alert is raised whenever the configured self-OCI for SCP reaches a minor level. |
Summary | SCP load level lead to Minor level. namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | minor |
Condition |
increase(ocscp_worker_self_oci_above_oci_conveyance_threshold_total{ocscp_oci_threshold_level="MINOR"}[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.15014 |
Metric Used | ocscp_worker_self_oci_above_oci_conveyance_threshold_total |
Recommended Action |
Cause: Whenever the configured self-OCI for SCP reaches a minor level. Diagnostic Information: SCP may be experiencing high load. Recovery: The alert will clear automatically when the load on SCP decreases and falls below the configured minor level. For any assistance, contact My Oracle Support. |
6.2.95 ScpSelfOCIThresholdAboveMajor
Table 6-99 ScpSelfOCIThresholdAboveMajor
Field | Details |
---|---|
Description | An alert is raised whenever the configured self-OCI for SCP reaches a major level. |
Summary | SCP load level lead to Major level. namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | major |
Condition |
increase(ocscp_worker_self_oci_above_oci_conveyance_threshold_total{ocscp_oci_threshold_level="MAJOR"}[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.15015 |
Metric Used | ocscp_worker_self_oci_above_oci_conveyance_threshold_total |
Recommended Action |
Cause: Whenever the configured self-OCI for SCP reaches a major level. Diagnostic Information: SCP may be experiencing high load. Recovery: The alert will clear automatically when the load on SCP decreases and falls below the configured major level. For any assistance, contact My Oracle Support. |
6.2.96 ScpSelfOCIThresholdAboveCritical
Table 6-100 ScpSelfOCIThresholdAboveCritical
Field | Details |
---|---|
Description | This alert will be activated whenever the configured self-OCI for SCP reaches a critical level. |
Summary | SCP load level lead to Critical level. namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | critical |
Condition |
increase(ocscp_worker_self_oci_above_oci_conveyance_threshold_total{ocscp_oci_threshold_level="CRITICAL"}[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.15016 |
Metric Used | ocscp_worker_self_oci_above_oci_conveyance_threshold_total |
Recommended Action |
Cause: Whenever the configured self-OCI for SCP reaches a critical level. Diagnostic Information: SCP may be experiencing high load. Recovery: The alert will clear automatically when the load on SCP decreases and falls below the configured critical level. For any assistance, contact My Oracle Support. |
6.2.97 SCPWorkerSSLPartialSecret
Table 6-101 SCPWorkerSSLPartialSecret
Field | Details |
---|---|
Description | 'SCP Worker SSL Secret is Invalid or Partial' |
Summary | SCP Worker SSL Secret is Invalid or Partial, ocscp_worker_fqdn: {{$labels.ocscp_producer_host}}, scpfqdn: {{$labels.ocscp_fqdn}}, namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}, secretName: {{$labels.secretName}} {{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Major |
Condition | This alert is raised when any SCP-Worker secret is
patched with partial data.
ocscp_worker_ssl_partial_secret == 1 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.15017 |
Metric Used | ocscp_worker_ssl_partial_secret |
Recommended Action |
Cause: An SCP-Worker secret is patched with partial data. Diagnostic Information: Verify whether the patched secret contains all the required data. Recovery: The alert is cleared automatically when a valid secret that contains all the required data is patched. For any assistance, contact My Oracle Support. |
6.2.98 SCPWorkerAndNFTimeSyncFailure
Table 6-102 SCPWorkerAndNFTimeSyncFailure
Field | Details |
---|---|
Description | 'Consumer NF and SCP are not time synchronized' |
Summary | Consumer NF and SCP are not time synchronized: consumernftype = {{$labels.ocscp_consumer_nf_type}},consumernfinstanceid = {{$labels.ocscp_consumer_nf_instance_id}}, namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, scpfqdn: {{$labels.ocscp_fqdn}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} |
Severity | Major |
Condition |
This alert is raised when the timestamp_headers_support feature is enabled and there is a time synchronization difference between SCP-Worker and the consumer NF that sent the request. increase(ocscp_worker_timestamp_headers_validation_fail_total{ocscp_validation_failure="TIME_SYNC_FAILURE"}[2m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.35.1.2.15019 |
Metric Used | ocscp_worker_timestamp_headers_validation_fail_total |
Recommended Action |
Cause: The timestamp_headers_support feature is enabled and there is a time synchronization difference between SCP-Worker and the consumer NF that sent the request. Diagnostic Information: SCP may have received a request from a consumer NF that is not time synchronized with SCP while timestamp_headers_support is enabled. Recovery: The alert is cleared automatically when no such failure occurs over the next scrape interval. For any assistance, contact My Oracle Support. |
6.3 Configuring Alerts
This section describes how to configure alerts.
6.3.1 Applying Alerts Rule to CNE without Prometheus Operator
SCP Helm Chart Release Name: _NAME_
Prometheus NameSpace: _Namespace_
- Run the following command to check the name of the config map used by Prometheus:
$ kubectl get configmap -n <_Namespace_>
Example:
$ kubectl get configmap -n prometheus-alert2
NAME                                  DATA   AGE
lisa-prometheus-alert2-alertmanager   1      146d
lisa-prometheus-alert2-server         4      146d
- Take a backup of the current config map of Prometheus. This command saves the configmap in the provided file. In the following command, the configmap is stored in the /tmp/tempConfig.yaml file:
$ kubectl get configmaps <_NAME_>-server -o yaml -n <_Namespace_> > /tmp/tempConfig.yaml
Example:
$ kubectl get configmaps lisa-prometheus-alert2-server -o yaml -n prometheus-alert2 > /tmp/tempConfig.yaml
- Check and delete the "alertsscp" rule if it is already configured in the Prometheus config map. If configured, this step removes the "alertsscp" rule. This step is optional when configuring the alerts for the first time.
$ sed -i '/etc\/config\/alertsscp/d' /tmp/tempConfig.yaml
- Add the "alertsscp" rule in the configmap dump file under the 'rule_files' tag.
$ sed -i '/rule_files:/a\ \- /etc/config/alertsscp' /tmp/tempConfig.yaml
- Update the configmap using the following command. Ensure that you use the same configmap name that was used to take a backup of the Prometheus configmap.
$ kubectl replace configmap <_NAME_>-server -f /tmp/tempConfig.yaml
Example:
$ kubectl replace configmap lisa-prometheus-alert2-server -f /tmp/tempConfig.yaml
- Run the following command to patch the configmap with a new "alertsscp" rule:
Note:
The patch file is SCPAlertrules.yaml, which is provided in the ocscp_csar_23_2_0_0_0.zip folder delivered with SCP.
$ kubectl patch configmap <_NAME_>-server -n <_Namespace_> --type merge --patch "$(cat ~/SCPAlertrules.yaml)"
Example:
$ kubectl patch configmap lisa-prometheus-alert2-server -n prometheus-alert2 --type merge --patch "$(cat ~/SCPAlertrules.yaml)"
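The two sed steps in this procedure (remove any existing alertsscp entry, then add it under rule_files) can be rehearsed on a scratch file before touching the real configmap dump. The following is a minimal sketch. The sample file content and the function name are made up for illustration, and sed -i is used in its GNU form, as in the steps above.

```shell
# Rehearse the delete-then-append edit on a scratch file, not on the real dump.
# The three sample lines below are illustrative, not real Prometheus config.
tmp="$(mktemp)"
printf 'rule_files:\n- /etc/config/rules\n- /etc/config/alerts\n' > "$tmp"

apply_alertsscp_rule() {
  # Remove any existing alertsscp entry so that a re-run does not duplicate it.
  sed -i '/etc\/config\/alertsscp/d' "$1"
  # Append the entry on the line that follows rule_files:
  sed -i '/rule_files:/a\ \- /etc/config/alertsscp' "$1"
}

apply_alertsscp_rule "$tmp"
apply_alertsscp_rule "$tmp"   # running it again still leaves a single entry
grep -c 'alertsscp' "$tmp"
cat "$tmp"
```

Running the delete step first is what makes the procedure safe to repeat: the count printed by grep stays at one entry however many times the function is applied.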
Note:
Prometheus takes about 20 seconds to apply the updated Config map.
6.3.2 Applying Alerts Rule to CNE with Prometheus Operator
6.3.3 Configuring Service Communication Proxy Alert using the SCPAlertrules.yaml file
Note:
Default NameSpace is scpsvc for Service Communication Proxy. You can update the NameSpace as per the deployment.
To access the scpAlertsrules_<scp release number>.yaml file from the Scripts folder of ocscp_csar_25_1_1_0_0_0.zip, download the SCP package from My Oracle Support as described in "Downloading the SCP Package" in Oracle Communications Cloud Native Core, Service Communication Proxy Installation, Upgrade, and Fault Recovery Guide.
Alerts Details
Description and summary for alerts are added by the Prometheus alert manager.
- SCPIngress Traffic Rate Above Threshold
- Has three threshold levels: Minor (above 9800 mps to 11200 mps), Major (11200 mps to 13300 mps), and Critical (above 13300 mps). These values are configurable.
- In the description, information is presented similar to: "Ingress Traffic Rate at Locality: <Locality of scp> is above <threshold level (minor/major/critical)> threshold (i.e. <value of threshold>)"
- In Summary: "Namespace: <Namespace of scp deployment at that Locality>, Pod: <SCP-worker Pod name>: Current Ingress Traffic Rate is <Current rate of Ingress traffic> mps which is above 70 Percent of Max MPS(<upper limit of ingress traffic rate per pod>)"
Note:
Ingress traffic rate is per scp-worker pod in a namespace at a particular SCP locality. Currently, 14000 mps is the upper limit per scp-worker pod.
- SCP Routing Failed For Service
- Identifies the NF Service Type and NF Type for which routing failed at a particular locality.
- Description: "Routing failed for service"
- Summary: "Routing failed for service: NFService Type = <Message NF Service Type>, NFType = <Message NF Type>, Locality = <SCP Locality where Routing Failed> and value = <Accumulated failure till now, of such message for NFType and NFService Type>"
Note:
The value field currently does not provide the number of failures in a particular time interval. Instead, it provides the total number of routing failures.
- SCP Pod Memory Usage: Type of alert is
SCPWorkerPodMemoryUsage.
- Pod memory usage for SCP Pods (Soothsayer and Worker) deployed at a particular node instance is provided.
- The Soothsayer pod threshold is 8 GB
- The Worker pod threshold is 16 Gi
- Summary: "Instance: <Node Instance name>, NameSpace: <Namespace of SCP deployment>, Pod: <(Soothsayer/Worker) Pod name>: <Soothsayer/Worker> Pod High Memory usage detected"
- Summary: "Instance: <Node Instance name>, Namespace: <Namespace of SCP deployment>, Pod: <(Soothsayer/Worker) Pod name>: Memory usage is above <threshold value>G (current value is: <current value of memory usage>)"
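The default ingress traffic thresholds listed under SCPIngress Traffic Rate Above Threshold line up with fixed percentages of the 14000 mps per-pod limit, which is where the "70 Percent of Max MPS" in the summary text comes from. The sketch below works through that arithmetic. The function name is hypothetical, and treating a rate of exactly 11200 mps as Minor is an assumption, because the text above gives 11200 as both the end of the Minor range and the start of the Major range.

```shell
MAX_MPS=14000   # documented upper limit per scp-worker pod

# Print the severity for an integer ingress rate (messages per second) using
# the default thresholds. Assumption: each level starts strictly above its
# threshold value.
ingress_severity() {
  rate="$1"
  if   [ "$rate" -gt 13300 ]; then echo "critical"
  elif [ "$rate" -gt 11200 ]; then echo "major"
  elif [ "$rate" -gt 9800 ];  then echo "minor"
  else echo "none"
  fi
}

# Each default threshold as a percentage of the per-pod limit.
for t in 9800 11200 13300; do
  echo "$t mps = $(( t * 100 / MAX_MPS ))% of $MAX_MPS mps"
done
# prints 70%, 80%, and 95% respectively

ingress_severity 12000   # prints "major"
```

If the thresholds are changed in the alert file, the percentages and the wording of the summary would need to be adjusted to match.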
6.3.4 Configuring Alert Manager for SNMP Notifier
Grouping of alerts is based on:
- podname
- alertname
- severity
- namespace
- nfServiceType
- nfServiceInstanceId
- Take a backup of the current config map of Alertmanager by running the following command:
kubectl get configmaps <NAME-alertmanager> -oyaml -n <Namespace> > /tmp/bkupAlertManagerConfig.yaml
Example:
kubectl get configmaps occne-prometheus-alertmanager -oyaml -n occne-infra > /tmp/bkupAlertManagerConfig.yaml
- Edit Configmap to add subroute for SCP Trap OID:
kubectl edit configmaps <NAME-alertmanager> -n <Namespace>
Example:
kubectl edit configmaps occne-prometheus-alertmanager -n occne-infra
- Add the subroute under 'route' in configmap:
routes:
- receiver: default-receiver
  group_interval: 1m
  group_wait: 10s
  repeat_interval: 9y
  group_by: [podname, alertname, severity, namespace, nfservicetype, nfserviceinstanceid, servingscope, nftype]
  match_re:
    oid: ^1.3.6.1.4.1.323.5.3.35.(.*)
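To see which alerts the oid pattern in this subroute selects, the pattern can be tried against OIDs from the alert tables in this chapter. The sketch below uses grep -E as a stand-in for the regular expression matching in Alertmanager, which is an approximation that is assumed to behave the same for a pattern this simple. The function and variable names are hypothetical.

```shell
# Pattern from the subroute above. The dots are unescaped, so each one
# matches any character, exactly as in the documented value.
SCP_OID_PATTERN='^1.3.6.1.4.1.323.5.3.35.(.*)'

# Succeed when the OID would be selected by the pattern.
oid_matches_scp_route() {
  printf '%s\n' "$1" | grep -Eq "$SCP_OID_PATTERN"
}

# One SCP alert OID and the mediation alert OID from this chapter.
for oid in 1.3.6.1.4.1.323.5.3.35.1.2.15004 1.3.6.1.4.1.323.5.3.47.1.2.1005; do
  if oid_matches_scp_route "$oid"; then
    echo "$oid -> SCP subroute"
  else
    echo "$oid -> not matched"
  fi
done
```

The first OID matches and the second does not, because the mediation alert OID sits under the ...323.5.3.47 branch rather than ...323.5.3.35.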
MIB Files for SCP
- ocscp_mib_tc_25.1.100.mib: This is the SCP top-level MIB file, where the objects and their data types are defined.
- ocscp_mib_25.1.100.mib: This file fetches the objects from the top-level MIB file. Based on the alert notification, these objects can be selected for display.
Note:
MIB files are packaged with ocscp_csar_23_2_0_0_0.zip. You can download the file from MOS as described in Oracle Communications Cloud Native Core, Service Communication Proxy Installation, Upgrade, and Fault Recovery Guide.
6.4 Configuring SCP Alerts for OCI
OCI supports metric expressions written in Monitoring Query Language (MQL). Therefore, configuring SCP alerts in the OCI observability platform requires the ocscp_oci_alertrules_25.1.100.zip file. For more information, see Oracle Communications Cloud Native Core, OCI Deployment Guide.
6.5 Mediation Alerts
This section provides detailed information on all mediation alerts, including their descriptions, severity, and recommended actions.
6.5.1 NFMediationUserDefinedVariablesMaxSizeLimitExceeded
Table 6-103 NFMediationUserDefinedVariablesMaxSizeLimitExceeded
Field | Details |
---|---|
Description |
This alert is raised when the total size of all user-defined variables and their values exceeds the configured size. |
Summary | namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Currently user defined variables maximum size limit exceeded the configured limit {{ $value | printf "%.2f" }} times |
Severity | minor |
Condition |
increase(ocscp_med_udvs_max_size_limit_exceeded_total[5m]) > 0 |
OID | 1.3.6.1.4.1.323.5.3.47.1.2.1005 |
Metric Used | ocscp_med_udvs_max_size_limit_exceeded_total |
Recommended Action |
Cause: This alert is raised when the total size of all user-defined variables and their values exceeds the configured limit. Diagnostic Information: The size of the user-defined variables list (med-user-defined-var-list) during mediation may have exceeded the configured limit. Recovery: Reduce the size or number of user-defined variables to bring the size within the limit. For any assistance, contact My Oracle Support. |