6 NSSF Metrics, KPIs, and Alerts

This chapter includes information about Metrics, KPIs, and Alerts for Oracle Communications Cloud Native Core, Network Slice Selection Function.

Note:

The performance and capacity of the NSSF system may vary based on the call model, feature or interface configuration, and the underlying CNE and hardware environment.

6.1 NSSF Metrics

This section includes information about dimensions, common attributes, and metrics for NSSF.

Metric Types

The following table describes the NSSF metric types used to measure the health and performance of NSSF and its core functionalities:

Table 6-1 Metric Type

Metric Type	Suffix	Description
Counter	_total	Represents the total number of occurrences of an event or traffic, such as the total amount of traffic received and transmitted by NSSF.
Gauge	NA	Represents a single numerical value that can go up or down. This metric type is used to measure parameters such as SCP load values and memory usage.
Histogram	_max, _bucket, _count, or _sum	Represents samples of observations (such as request durations or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values.
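As an illustration of how Counter values are typically consumed, the following minimal Python sketch computes the increase of a `_total` counter across a series of scraped samples. The function name `counter_increase` and the sample values are hypothetical. The reset handling (treating a drop in value as a restart from zero) follows a common Prometheus-style convention and is an assumption, not NSSF-specific behavior.

```python
def counter_increase(samples):
    """Return the total increase of a monotonically increasing counter.

    samples: counter values in scrape order (oldest first).
    A value lower than its predecessor is treated as a counter reset
    (for example, after a pod restart), so the new value is counted
    as the increase since the reset. This is an assumed convention.
    """
    if len(samples) < 2:
        return 0.0
    total = 0.0
    previous = samples[0]
    for current in samples[1:]:
        if current >= previous:
            total += current - previous
        else:
            # Counter restarted from zero between the two scrapes.
            total += current
        previous = current
    return total


if __name__ == "__main__":
    # Hypothetical scrapes of a *_total counter, with one reset.
    print(counter_increase([100, 150, 20, 50]))
```

A Gauge, by contrast, is read as-is: the latest sample is the current value, and no increase needs to be derived.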

Latency Metrics Format Change for NSSF Microservices

With the migration from Spring Boot to Micronaut, support for the latency metric *_latency_seconds has been deprecated. The following *_latency_seconds_[suffix] metrics continue to be supported and can be used in place of *_latency_seconds:

*_latency_seconds_max
*_latency_seconds_bucket
*_latency_seconds_count
*_latency_seconds_sum

This update applies to the metrics of all NSSF associated microservices, as well as Ingress and Egress Gateway microservices.

Note:

Support for the metric *_latency_seconds continues to be present only in Ingress and Egress Gateway.
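Where a single average latency value was previously read from *_latency_seconds, a comparable figure can be derived from the _sum and _count series. The following Python sketch shows the arithmetic, assuming two scrapes of the same series taken at the start and end of a window. The function name and the sample numbers are hypothetical.

```python
def mean_latency_seconds(sum_start, count_start, sum_end, count_end):
    """Return the mean latency in seconds over a scrape window.

    sum_*   : values of *_latency_seconds_sum at window start and end
    count_* : values of *_latency_seconds_count at window start and end
    Returns None when no observations were recorded in the window.
    """
    observations = count_end - count_start
    if observations <= 0:
        return None
    return (sum_end - sum_start) / observations


if __name__ == "__main__":
    # Hypothetical scrapes: 300 requests took 6 seconds in total.
    print(mean_latency_seconds(10.0, 100, 16.0, 400))
```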

Dimensions

The following table describes different types of metric dimensions:

Table 6-2 Dimensions

Dimension	Description	Values
AMF Instance Id	NF-Id of AMF	NA
authority	Used in Gateway metrics. Indicates the destination address.	NA
BackendSvc	Used in Gateway metrics. Indicates the address of destination.	NA
BackendSvcAddressType	Used in Gateway metrics. Indicates the IP type (IPv4/IPv6) of the destination from the Egress Gateway.	IPv4, IPv6
CauseCode	It specifies the cause code of an error response.	Cause Code of the error response. For example, "SUBSCRIPTION_NOT_FOUND"
certificateName	Determines the certificate name inside a specific secret that is configured via persistent configuration when oauth is enabled.	NA
ClientCertIdentity	Certificate identity of the client.	SAN=127.0.0.1,localhost CN=localhost, N/A if data is not available
ConfigurationType	Determines the type of configuration in place for OAuth Client in Egress Gateway. If nrfClientQueryEnabled Helm parameter in oauthClient Helm configurations at Egress Gateway is false then the ConfigurationType is STATIC, else DYNAMIC.	STATIC, DYNAMIC
configVersion	Indicates the configuration version that the Ingress or Egress Gateway is currently maintaining.	Value received from config server (1, 2...)
ConsumerNFInstanceId	NF instance id of the NF service consumer.	NA
ConsumerNFType	The NF type of the NF service consumer.	NRF, UDM, AMF, SMF, AUSF, NEF, PCF, SMSF, NSSF, UDR, LMF, GMLC, 5G_EIR, SEPP, UPF, N3IWF, AF, UDSF, BSF, CHF, NWDAF
DestinationHost	Used in Gateway metrics. Indicates the destination IP address or FQDN of the host.	NA
destinationHostAddressType	Used in Gateway metrics. Indicates the destination IP type (IPv4 or IPv6) from Egress Gateway.	IPv4, IPv6
Direction	Indicates the direction of connection established, that is, whether it is incoming or outgoing.	ingress, egressOut
dnsResolvedType	Used in Gateway metrics. Indicates the actual DNS resolved IP type (IPv4 or IPv6) of the destination.	IPv4, IPv6
duration_type	Used in NSSF cache metrics to denote the type of duration measured by the timer metric.	total_cache_processing_duration, cache_fetch_all_records_db_query_duration, cache_stale_fetch_db_query_duration, populate_tainssaimap_cache_duration
egressRoutingMode	Used in Gateway metrics. Indicates the value of the egressRoutingMode configured in Egress Gateway.	IPv4, IPv6, IPv4_IPv6, IPv6_IPv4, None
error_reason	Indicates the reason for the failure response received. If a message is sent in the response, the dimension is filled with that message; otherwise, it is filled with the exception class. For a successful response, it is filled with "no_error".	"no_error" (In case successful response is received) "java.nio.channels.ClosedChannelException" "unable to find valid certification path to requested target" "SSL handshake failed due to invalid SNI"
ErrorOriginator	Captures the ErrorOriginator.	ServiceProducer, Nrf, IngresGW, None
ERRORTYPE	Determines the type of error.	DB_ERROR/MISSING_CONFIGURATION/UNKNOWN
Host	Specifies the IP address or FQDN and port of the Ingress Gateway.	NA
HttpVersion	Specifies the HTTP protocol version.	HTTP/1.1, HTTP/2.0
id	Determines the keyid or instance id that is configured via persistent configuration when oauth is enabled.	NA
InstanceIdentifier	Prefix of the pod configured in helm when there are multiple instances in same deployment.	Prefix configured in helm, UNKNOWN
issuer	NF instance ID of NRF	NA
managed_object	The managed object of NSSF	plmnconfig, supportedslicesmapping, barredslicesmapping, nsiprofiles, systemoptions, grsites, deleteConfiguration, allconfig, unsupported_managed_object, configuredNssai, supportedplmnlist
Message Type	This specifies the type of NS-Selection query message.	INITIAL_REGISTRATION/PDU_SESSION/UE_CONFIG_UPDATE
Method	HTTP method	POST/PUT/PATCH/DELETE/GET/OPTIONS
NegotiatedTLSVersion	Denotes the TLS version used for communication between the server and the client.	TLSv1.2, TLSv1.3
NFServiceType	Name of the service within the NF.	For example, if the path is /nxxx-yyy/vz/......., nxxx-yyy is the NFServiceType. UNKNOWN if the NFServiceType cannot be extracted from the path.
NFType	Specifies the name of the NF type.	For example, if the path is /nxxx-yyy/vz/......., XXX (upper case) is the NFType. UNKNOWN if the NFType cannot be extracted from the path.
NrfUri	URI of the Network Repository Function Instance.	For example: nrf-stubserver.ocnssf-site:8080
number_of_records	The number of records fetched or updated.	Integer values
Operation	NSAvailability Operation	UPDATE/DELETE/SUBSCRIBE/UNSUBSCRIBE
Port	Port number	Integer values
quantile	Captures the latency values with ranges as 10ms, 20ms, 40ms, 80ms, 100ms, 200ms, 500ms, 1000ms and 5000ms.	Integer values
query_type	Type of DB read query	applypolicy_reg/applypolicy_pdu/evaluate_amfset/evaluate_resolution
reason	The reason contains the human readable message for oauth validation failure.	NA
receivedAddressType	Used in Gateway metrics. Indicates the IP type (IPv4/IPv6) of the remote client connected to the Ingress Gateway.	IPv4, IPv6
releaseVersion	Indicates the current release version of Ingress or Egress gateway.	Picked from helm chart {{ .Chart.Version }}
ResponseCode	HTTP response code.	Bad Request, Internal Server Error etc. (HttpStatus.*)
retryCount	The attempt number to send a notification.	Depends on the helm parameter httpMaxRetries (1, 2...)
Route_Path	Path predicate or Header predicate that matches the current request.	NA
Scheme	Specifies the HTTP protocol scheme.	HTTP, HTTPS, UNKNOWN
scope	NF service name(s) of the NF service producer(s), separated by whitespaces.	NA
secretName	Determines the secret name that is configured via persistent configuration when oauth is enabled	NA
serialNumber	Indicates the type of the certificate.	serialNumber=4661 is used for RSA and serialNumber=4662 is used for ECDSA
Source	Determines if the configuration is done by the operator or fetched from AMF.	OperatorConfig/LearnedConfigAMF
Status	HTTP response code	NA
StatusCode	Status code of NRF access token request.	Bad Request, Internal Server Error etc. (HttpStatus.*)
status_code	Status Code of any response	All supported HTTP status code like 2xx, 4xx, 5xx, etc.
subject	NF instance ID of service consumer	NA
Subscription_removed	The dimension indicates the status of a subscription upon receiving a 404 response from the AMF after a notification is sent.	"false": The subscription was not deleted. This value applies if the feature is disabled, indicating no deletion attempt was made. "true": The subscription was successfully deleted. This value applies if the feature is enabled and the deletion process completed successfully. "error": The subscription was not deleted due to internal issues, such as a database error, despite the feature being enabled and a deletion attempt being made.
Subscription-Id	Subscription ID	NA
TargetNFInstanceId	NF instance ID of the NF service producer	NA
TargetNFType	The NF type of the NF service producer.	NRF, UDM, AMF, SMF, AUSF, NEF, PCF, SMSF, NSSF, UDR, LMF, GMLC, 5G_EIR, SEPP, UPF, N3IWF, AF, UDSF, BSF, CHF, NWDAF
type	For NSSF Cache Metrics, the type denotes the type of cache operation done	cache_create, cache_update, cache_refresh
updated	Indicates whether the configuration is updated or not.	True, False
VirtualFqdn	FQDN that shall be used by the alternate service for the DNS lookup	Valid FQDN

Common Attributes

The following table includes information about common attributes for NSSF.

Table 6-3 Common Attributes

Attribute	Description
application	The name of the application that the microservice is a part of.
eng_version	The engineering version of the application.
microservice	The name of the microservice.
namespace	The namespace in which microservice is running.
node	The name of the worker node that the microservice is running on.

6.1.1 NSSF Success Metrics

This section provides details about the NSSF success metrics.

Table 6-4 nssf_subscription_to_nrf_successful

Field	Details
Description	Indicates if subscription to NRF for nfType NSSF is successful in case of GR enabled NSSF setup.
Type	Gauge. Note: The value is 1 when the subscription is successful, and 0 when the subscription fails and is being retried.
Service Operation	NSConfig
Dimension	retryCount

Table 6-5 ocnssf_nsselection_rx_total

Field	Details
Description	Count of request messages received by NSSF for the Nnssf_NSSelection service.
Type	Counter
Service Operation	NSSelection
Dimension	AMF Instance Id Message Type plmn

Table 6-6 ocnssf_nsselection_success_tx_total

Field	Details
Description	Count of success response messages sent by NSSF for requests for the Nnssf_NSSelection service.
Type	Counter
Service Operation	NSSelection
Dimension	AMF Instance Id Message Type plmn
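The two NSSelection counters above can be combined into a success ratio per Message Type. The following Python sketch is one way to do that, assuming the increase of each counter over the same time window has already been computed per Message Type value. The function name and the input numbers are hypothetical.

```python
def success_ratio_by_message_type(rx_increase, success_increase):
    """Return the NSSelection success ratio for each Message Type.

    rx_increase      : Message Type -> increase of ocnssf_nsselection_rx_total
    success_increase : Message Type -> increase of ocnssf_nsselection_success_tx_total
    Both increases must cover the same time window.
    A Message Type with no received requests maps to None.
    """
    ratios = {}
    for message_type, received in rx_increase.items():
        if received <= 0:
            ratios[message_type] = None
            continue
        succeeded = success_increase.get(message_type, 0)
        ratios[message_type] = succeeded / received
    return ratios


if __name__ == "__main__":
    # Hypothetical increases over one window.
    received = {"INITIAL_REGISTRATION": 200, "PDU_SESSION": 50, "UE_CONFIG_UPDATE": 0}
    succeeded = {"INITIAL_REGISTRATION": 190, "PDU_SESSION": 50}
    print(success_ratio_by_message_type(received, succeeded))
```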

Table 6-7 ocnssf_nssaiavailability_rx_total

Field	Details
Description	Count of request messages received by NSSF for the Nnssf_NSSAIAvailability service.
Type	Counter
Service Operation	NSAvailability
Dimension	NfId Method Message Type

Table 6-8 ocnssf_nssaiavailability_success_tx_total

Field	Details
Description	Count of success response messages sent by NSSF for requests for the Nnssf_NSSAIAvailability service.
Type	Counter
Service Operation	NSAvailability
Dimension	NfId Method Message Type

Table 6-9 ocnssf_nssaiavailability_options_rx_total

Field	Details
Description	Count of HTTP OPTIONS requests received at the NSAvailability service.
Type	Counter
Service Operation	NSAvailability
Dimension	Method Message Type

Table 6-10 ocnssf_nssaiavailability_options_tx_status_ok_total

Field	Details
Description	Count of HTTP OPTIONS responses with status 200 OK.
Type	Counter
Service Operation	NSAvailability
Dimension	Method Message Type

Table 6-11 ocnssf_nssaiavailability_notification_indirect_communication_rx_total

Field	Details
Description	Count of request notification messages sent by NSSF using indirect communication.
Type	Counter
Service Operation	NSAvailability
Dimension	Method Message Type

Table 6-12 ocnssf_nssaiavailability_notification_indirect_communication_tx_total

Field	Details
Description	Count of notification response messages received by NSSF using indirect communication.
Type	Counter
Service Operation	NSAvailability
Dimension	Method Message Type

Table 6-13 ocnssf_nssaiavailability_notification_success_response_rx_total

Field	Details
Description	Count of success notification response messages received by NSSF for requests for the Nnssf_NSSAIAvailability service.
Type	Counter
Service Operation	NSSubscription
Dimension	Method Message Type

Table 6-14 ocnssf_nssaiavailability_notification_tx_total

Field	Details
Description	Count of notification messages sent by NSSF as part of Nnssf_NSSAIAvailability service.
Type	Counter
Service Operation	NSSubscription
Dimension	Method Message Type

Table 6-15 ocnssf_notification_trigger_rx_total

Field	Details
Description	Count of notification triggers received by NSSF.
Type	Counter
Service Operation	NSSubscription
Dimension	Trigger type

Table 6-16 ocnssf_notification_trigger_tx_success_total

Field	Details
Description	Count of success notification trigger responses sent by NsSubscription.
Type	Counter
Service Operation	NSSubscription
Dimension	Trigger type returnCode

Table 6-17 ocnssf_indirect_communication_request_rx_total

Field	Details
Description	Count of subscription creation requests received from AMF when indirect communication was enabled.
Type	Counter
Service Operation	NSSubscription
Dimension	method bindingHeaderPresent

Table 6-18 ocnssf_indirect_communication_response_tx_success_total

Field	Details
Description	Count of subscription creation responses sent by NSSF when indirect communication was enabled.
Type	Counter
Service Operation	NSSubscription
Dimension	method returnCode bindingHeaderPresent

Table 6-19 ocnssf_subscription_request_rx_total

Field	Details
Description	Count of subscription creation requests received from AMF
Type	Counter
Service Operation	NSSubscription
Dimension	method

Table 6-20 ocnssf_subscription_response_tx_success_total

Field	Details
Description	Count of subscription creation responses sent by NSSF.
Type	Counter
Service Operation	NSSubscription
Dimension	method returnCode

Table 6-21 stale_records_computation_DQD_amf_tai_slice_availability_data_seconds_count

Field	Details
Description	This metric captures the total number of observations or events recorded for the timer that measures the availability fetch query for computing stale records. The timer starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-22 stale_records_computation_DQD_amf_tai_slice_availability_data_seconds_max

Field	Details
Description	This metric shows the longest (maximum) observed duration in seconds. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-23 stale_records_computation_DQD_amf_tai_slice_availability_data_seconds_sum

Field	Details
Description	This metric is the sum of all observed durations. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-24 stale_records_computation_JL_amf_tai_slice_availability_data_seconds_count

Field	Details
Description	This metric captures the total number of observations or events recorded for the timer that measures code execution to compute stale records to be deleted. The timer starts after the fetch availability query.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-25 stale_records_computation_JL_amf_tai_slice_availability_data_seconds_max

Field	Details
Description	This metric shows the longest (maximum) observed duration in seconds. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-26 stale_records_computation_JL_amf_tai_slice_availability_data_seconds_sum

Field	Details
Description	This metric is the sum of all observed durations. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-27 total_stale_records_APD_amf_tai_slice_availability_data_seconds_count

Field	Details
Description	This metric captures the total number of observations or events recorded for the timer. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-28 total_stale_records_APD_amf_tai_slice_availability_data_seconds_max

Field	Details
Description	This metric shows the longest (maximum) observed duration in seconds. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-29 total_stale_records_APD_amf_tai_slice_availability_data_seconds_sum

Field	Details
Description	This metric is the sum of all observed durations. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-30 count_amf_tai_slice_availability_data_stale_records_deleted_total

Field	Details
Description	Counter for deleted stale records. It is pegged when the application deletes stale records.
Type	Counter
Service Operation	NsAuditor
Dimension	NA

Table 6-31 stale_records_computation_DQD_nssai_subscriptions_seconds_count

Field	Details
Description	This metric captures the total number of observations or events recorded for the timer for subscription fetch query for computing stale records. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-32 stale_records_computation_DQD_nssai_subscriptions_seconds_max

Field	Details
Description	This metric shows the longest (maximum) observed duration in seconds for subscription fetch query for computing stale records. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-33 stale_records_computation_DQD_nssai_subscriptions_seconds_sum

Field	Details
Description	This metric is the sum of all observed durations. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-34 stale_records_computation_JL_nssai_subscriptions_seconds_count

Field	Details
Description	This metric captures the total number of observations or events recorded for the timer for code execution to compute subscription stale records to be deleted. It starts after fetch subscription query.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-35 stale_records_computation_JL_nssai_subscriptions_seconds_max

Field	Details
Description	This metric shows the longest (maximum) observed duration in seconds for code execution to compute subscription stale records to be deleted. It starts after fetch subscription query.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-36 stale_records_computation_JL_nssai_subscriptions_seconds_sum

Field	Details
Description	This metric is the sum of all observed durations for code execution to compute subscription stale records to be deleted. It starts after fetch subscription query.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-37 total_stale_records_APD_nssai_subscriptions_seconds_count

Field	Details
Description	This metric captures the total number of observations or events recorded for the timer for whole process including fetch subscription query, code execution and delete stale records query execution time. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-38 total_stale_records_APD_nssai_subscriptions_seconds_max

Field	Details
Description	This metric shows the longest (maximum) observed duration in seconds for whole process including fetch subscription query, code execution and delete stale records query execution time. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-39 total_stale_records_APD_nssai_subscriptions_seconds_sum

Field	Details
Description	This metric is the sum of all observed durations for whole process including fetch subscription query, code execution and delete stale records query execution time. It starts as soon as the scheduler starts the task.
Type	Timer
Service Operation	NsAuditor
Dimension	NA

Table 6-40 count_nssai_subscriptions_stale_records_deleted_total

Field	Details
Description	Counter for deleted stale subscription records. It is pegged when the application deletes stale subscription records.
Type	Counter
Service Operation	NsAuditor
Dimension	NA

Table 6-41 ocnssf_nsselection_latency_seconds_count

Field	Details
Description	This metric captures the total number of observations or events recorded for the timer.
Type	Histogram
Service Operation	NsSelection
Dimension	NA

Table 6-42 ocnssf_nsselection_latency_seconds_max

Field	Details
Description	This metric shows the longest (maximum) observed duration in seconds.
Type	Histogram
Service Operation	NsSelection
Dimension	NA

Table 6-43 ocnssf_nsselection_latency_seconds_sum

Field	Details
Description	This metric is the sum of all observed durations.
Type	Histogram
Service Operation	NsSelection
Dimension	NA

Table 6-44 ocnssf_nsselection_latency_seconds_bucket

Field	Details
Description	This metric records the count of events (or observations) whose value is less than or equal to a specific duration threshold (bucket).
Type	Histogram
Service Operation	NsSelection
Dimension	NA

Table 6-45 ocnssf_nsavailability_latency_seconds_count

Field	Details
Description	This metric captures the total number of observations or events recorded for the timer.
Type	Histogram
Service Operation	NsAvailability
Dimension	NA

Table 6-46 ocnssf_nsavailability_latency_seconds_max

Field	Details
Description	This metric shows the longest (maximum) observed duration in seconds.
Type	Histogram
Service Operation	NsAvailability
Dimension	NA

Table 6-47 ocnssf_nsavailability_latency_seconds_sum

Field	Details
Description	This metric is the sum of all observed durations.
Type	Histogram
Service Operation	NsAvailability
Dimension	NA

Table 6-48 ocnssf_nsavailability_latency_seconds_bucket

Field	Details
Description	This metric records the count of events (or observations) whose value is less than or equal to a specific duration threshold (bucket).
Type	Histogram
Service Operation	NsAvailability
Dimension	NA
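Because each _bucket series is cumulative (it counts every observation at or below its threshold), a latency percentile can be estimated from the bucket counts. The following Python sketch uses linear interpolation within the bucket that contains the target rank, similar in spirit to what Prometheus-style tooling does. The function name, the bucket boundaries, and the counts are hypothetical and are not taken from an NSSF deployment.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    buckets: list of (upper_bound_seconds, cumulative_count) pairs sorted by
             upper bound, ending with a (math.inf, total_count) bucket.
    Returns None when the histogram holds no observations.
    """
    total = buckets[-1][1]
    if total == 0:
        return None
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # No finite upper edge: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                return bound
            # Linear interpolation inside the matching bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    # Hypothetical cumulative counts for 10 ms, 20 ms, 40 ms, and +Inf buckets.
    sample = [(0.01, 50), (0.02, 80), (0.04, 95), (math.inf, 100)]
    print(estimate_quantile(0.9, sample))
```

The estimate is only as fine-grained as the bucket boundaries, so a percentile that falls in a wide bucket carries more uncertainty than one that falls in a narrow bucket.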

6.1.2 NSSF Error Metrics

This section provides details about the NSSF error metrics.

Table 6-49 ocnssf_configuration_database_read_error_total

Field	Details
Description	Count of errors encountered when trying to read the configuration database.
Type	Counter
Service Operation	NSSelection
Dimension	None

Table 6-50 ocnssf_state_data_write_error_total

Field	Details
Description	Count of errors encountered when trying to write to the state database.
Type	Counter
Service Operation	NSAvailability
Dimension	None

Table 6-51 ocnssf_nsselection_unsupported_plmn_total

Field	Details
Description	Count of request messages that did not find mcc and mnc in the PLMN list.
Type	Counter
Service Operation	NSSelection
Dimension	None

Table 6-52 ocnssf_nssaiavailability_notification_error_response_rx_total

Field	Details
Description	Count of failure notification response messages received by NSSF for requests by the Nnssf_NSSAIAvailability service.
Type	Counter
Service Operation	NSSubscription
Dimension	MessageType Method ResponseCode CauseCode retryCount

Table 6-53 ocnssf_nsavailability_unsupported_plmn_total

Field	Details
Description	Count of request messages with unsupported PLMN received by NSSF for the ocnssf_NSAvailability service.
Type	Counter
Service Operation	NSAvailability
Dimension	Message Type Method

Table 6-54 ocnssf_nsavailability_invalid_location_url_total

Field	Details
Description	Count of invalid location header.
Type	Counter
Service Operation	NSAvailability
Dimension	Message Type Method

Table 6-55 ocnssf_nssaiavailability_submod_error_response_tx_total

Field	Details
Description	Count of error response messages sent by NSSF for HTTP PATCH for subscription (SUBMOD) requests for ocnssf_NSSAIAvailability service.
Type	Counter
Service Operation	NSAvailability
Dimension	ReturnCode SubscriptionId Method

Table 6-56 ocnssf_nssaiavailability_submod_unimplemented_op_total

Field	Details
Description	Count of HTTP PATCH request messages received by NSSF for ocnssf_NSSAIAvailability service for which the PATCH operation (op) is not implemented.
Type	Counter
Service Operation	NSAvailability
Dimension	ReturnCode SubscriptionId Method

Table 6-57 ocnssf_nssaiavailability_submod_patch_apply_error_total

Field	Details
Description	Count of HTTP PATCH request messages received by OCNSSF for ocnssf_NSSAIAvailability service for which applying the PATCH returned an error.
Type	Counter
Service Operation	NSAvailability
Dimension	ReturnCode SubscriptionId Method

Table 6-58 ocnssf_nssaiavailability_notification_delete_on_subscription_not_found_total

Field	Details
Description	Pegged when a 404 response with cause SUBSCRIPTION_NOT_FOUND is received from the AMF after a notification is sent.
Type	Counter
Service Operation	NsSubscription
Dimension	Subscription_Removed

Table 6-59 ocnssf_nssaiavailability_notification_db_error_total

Field	Details
Description	Triggered when DB error or exception occurs when trying to delete NssaiSubscription.
Type	Counter
Service Operation	NsSubscription
Dimension	None

Table 6-60 ocnssf_nssaiavailability_indirect_communication_subscription_failure_total

Field	Details
Description	Count of failures when subscription messages are sent by NSSF using indirect communication.
Type	Counter
Service Operation	NSAvailability
Dimension	Message Type Method

Table 6-61 ocnssf_nssaiavailability_indirect_communication_notification_failure_total

Field	Details
Description	Count of failures when notification messages are sent by NSSF using indirect communication.
Type	Counter
Service Operation	NSSubscription
Dimension	ReturnCode Message Type Method

Table 6-62 ocnssf_notification_trigger_tx_failure_total

Field	Details
Description	Count of failure notification trigger responses sent by NsSubscription.
Type	Counter
Service Operation	NSSubscription
Dimension	Trigger type returnCode

Table 6-63 ocnssf_indirect_communication_response_tx_failure_total

Field	Details
Description	Count of failure responses to subscription creation requests sent by NSSF when indirect communication was enabled.
Type	Counter
Service Operation	NSSubscription
Dimension	method returnCode bindingHeaderPresent exceptionType

Table 6-64 ocnssf_subscription_response_tx_failure_total

Field	Details
Description	Count of failure responses to subscription creation requests sent by NSSF.
Type	Counter
Service Operation	NSSubscription
Dimension	method returnCode exceptionType

Table 6-65 ocnssf_nssaiavailability_error_tx_total

Field	Details
Description	Count of error response messages sent by NSSF for requests for the Nnssf_NSSAIAvailability service.
Type	Counter
Service Operation	NSAvailability
Dimension	AMF Instance Id Method Message Type ReturnCode

Table 6-66 ocnssf_nssaiavailability_options_tx_status_unsupportedmediatype_total

Field	Details
Description	Count of HTTP OPTIONS responses with status 415 Unsupported Media Type.
Type	Counter
Service Operation	NSAvailability
Dimension	Message Type Method

Table 6-67 ocnssf_nsavailability_unsupported_plmn_total

Field	Details
Description	Count of request messages with unsupported PLMN received by NSSF for the ocnssf_NSAvailability service.
Type	Counter
Service Operation	NSAvailability
Dimension	AMF Instance Id Message Type Method

6.1.3 NSSF OAuth Metrics

This section provides details about the NSSF OAuth metrics.

Table 6-68 oc_oauth_nrf_request_total

Field	Details
Description	This is pegged in the OAuth client implementation if the request is sent to NRF for requesting the OAuth token. OAuth client implementation is used in Egress gateway.
Type	Counter
Dimension	ConsumerNFInstanceId ConsumerNFType TargetNFType TargetNFInstanceId scope NrfFqdn

Table 6-69 oc_oauth_nrf_response_success_total

Field	Details
Description	This is pegged in the OAuth client implementation if an OAuth token is successfully received from the NRF. OAuth client implementation is used in Egress gateway.
Type	Counter
Dimension	ConsumerNFInstanceId ConsumerNFType TargetNFType TargetNFInstanceId scope StatusCode NrfFqdn

Table 6-70 oc_oauth_nrf_response_failure_total

Field	Details
Description	This is pegged in the OAuthClientFilter in Egress gateway whenever GetAccessTokenFailedException is captured.
Type	Counter
Dimension	ConsumerNFInstanceId ConsumerNFType TargetNFType TargetNFInstanceId scope StatusCode NrfFqdn

Table 6-71 oc_oauth_nrf_response_failure_total

Field	Details
Description	This is pegged in the OAuthClientFilter in Egress gateway whenever GetAccessTokenFailedException is captured.
Type	Counter
Dimension	ConsumerNFInstanceId ConsumerNFType TargetNFType TargetNFInstanceId Scope StatusCode ErrorOriginator NrfFqdn

Table 6-72 oc_oauth_request_failed_internal_total

Field	Details
Description	This is pegged in the OAuthClientFilter in Egress gateway whenever InternalServerErrorException is captured.
Type	Counter
Dimension	ConsumerNFInstanceId ConsumerNFType TargetNFType TargetNFInstanceId scope StatusCode ErrorOriginator NrfFqdn

Table 6-73 oc_oauth_token_cache_total

Field	Details
Description	This is pegged in the OAuth Client Implementation if the OAuth token is found in the cache.
Type	Counter
Dimension	ConsumerNFInstanceId ConsumerNFType TargetNFType TargetNFInstanceId scope

Table 6-74 oc_oauth_request_invalid_total

Field	Details
Description	This is pegged in the OAuthClientFilter in Egress gateway whenever a BadAccessTokenRequestException/JsonProcessingException is captured.
Type	Counter
Dimension	ConsumerNFInstanceId ConsumerNFType TargetNFType TargetNFInstanceId scope StatusCode ErrorOriginator

Table 6-75 oc_oauth_validation_successful_total

Field	Details
Description	This is pegged in OAuth validator implementation if the received OAuth token is validated successfully. OAuth validator implementation is used in Ingress gateway.
Type	Counter
Dimension	issuer subject scope

Table 6-76 oc_oauth_validation_failure_total

Field	Details
Description	This is pegged in the OAuth validator implementation if the validation of the received OAuth token fails. The OAuth validator implementation is used in the Ingress Gateway.
Type	Counter
Dimension	issuer subject scope reason

Table 6-77 oc_oauth_cert_expiryStatus

Field	Details
Description	Metric used to peg expiry date of the certificate. This metric is further used for raising alarms if certificate expires within 30 days or 7 days.
Type	Gauge
Dimension	id certificateName secretName

Table 6-78 oc_oauth_cert_loadStatus

Field	Details
Description	Metric used to peg whether given certificate can be loaded from secret or not. If it is loadable then "0" is pegged otherwise "1" is pegged. This metric is further used for raising alarms when certificate is not loadable.
Type	Gauge
Dimension	id certificateName secretName
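The alarm conditions described for the two certificate gauges above can be sketched as simple predicates. The following Python example assumes the certificate expiry has already been converted to a timezone-aware datetime. How the gauge encodes the expiry date is not stated here, so that conversion is left out. The function names are hypothetical, and the 30-day and 7-day windows come from the oc_oauth_cert_expiryStatus description.

```python
from datetime import datetime, timedelta, timezone


def expiry_window_days(expiry, now):
    """Return 7 or 30 if the certificate expires within that many days.

    Returns None when the expiry is more than 30 days away.
    An already expired certificate falls into the 7-day window.
    """
    remaining = expiry - now
    if remaining <= timedelta(days=7):
        return 7
    if remaining <= timedelta(days=30):
        return 30
    return None


def is_certificate_loadable(load_status):
    """oc_oauth_cert_loadStatus is 0 when loadable and 1 otherwise."""
    return load_status == 0


if __name__ == "__main__":
    # Hypothetical reference time and expiry date.
    reference = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(expiry_window_days(reference + timedelta(days=20), reference))
    print(is_certificate_loadable(1))
```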

Table 6-79 oc_oauth_request_failed_cert_expiry

Field	Details
Description	Metric used to keep track of number of requests with keyId in token that failed due to certificate expiry. Pegged whenever oAuth Validator module throws oauth custom exception due to certificate expiry for an incoming request.
Type	Metric
Dimension	target nf type target nf instance id consumer nf instance id nrf instance id service name of nf producer service key id

Table 6-80 oc_oauth_keyid_count

Field	Details
Description	Metric used to keep track of number of requests received with keyId in token. Pegged whenever a request with an access token containing kid in header comes to oAuth Validator. This is independent of whether the validation failed or was successful.
Type	Metric
Dimension	target nf type target nf instance id consumer nf instance id nrf instance id service name of nf producer service key id

6.1.4 NsConfig Metrics

This section provides details about the NSSF Managed Object (MO) metrics.

Table 6-81 ocnssf_nsconfig_config_added_total

Field	Details
Description	Count of managed objects that were added by NSConfig. Trigger Condition: Operator configuration of the Managed Object. This is pegged when an HTTP POST request is received by NSSF's NsConfig.
Type	Counter
Service Operation	POST operation on plmnConfig, supportedSlicesMapping, barredSlicesMapping, nsiProfileConfig, grSites
Dimension	managed_object

Table 6-82 ocnssf_nsconfig_config_updated_total

Field	Details
Description	Count of managed objects updated by NSConfig. Trigger Condition: Operator configuration of the Managed Object. This is pegged when an HTTP PUT request is received by NSSF's NsConfig.
Type	Counter
Service Operation	PUT operation on plmnConfig, supportedSlicesMapping, barredSlicesMapping, nsiProfileConfig, grSites, systemOptions, allConfig
Dimension	managed_object

Table 6-83 ocnssf_nsconfig_config_deleted_total

Field	Details
Description	Count of managed objects deleted by NSConfig. Trigger Condition: Operator configuration of the Managed Object. This is pegged when an HTTP DELETE or PUT request is received by NSSF's NsConfig.
Type	Counter
Service Operation	DELETE operation on plmnConfig, supportedSlicesMapping, barredSlicesMapping, nsiProfileConfig, grSites, deleteConfig and PUT operation on allConfig
Dimension	managed_object

Table 6-84 ocnssf_nsconfig_requests_rx_total

Field	Details
Description	Count of requests received by NSConfig. Trigger Condition: Operator configuration of the Managed Object. This is pegged when any HTTP request is received by NSSF's NsConfig.
Type	Counter
Service Operation	This is pegged when any HTTP request is received by NSSF's NsConfig.
Dimension	managed_object method

Table 6-85 ocnssf_nsconfig_response_tx_total

Field	Details
Description	Count of requests for which NSConfig returned a successful response. Trigger Condition: Operator configuration of the Managed Object.
Type	Counter
Service Operation	This is pegged when any HTTP request is received by NSSF's NsConfig.
Dimension	managed_object method status_code

Table 6-86 ocnssf_nsconfig_error_response_tx_total

Field	Details
Description	Count of requests for which NSConfig returned an error response. Trigger Condition: Operator configuration of the Managed Object.
Type	Counter
Service Operation	This is pegged when any HTTP request is received by NSSF's NsConfig.
Dimension	managed_object method status_code

6.1.5 Perf-info metrics for Overload Control

This section provides details about Perf-info metrics for overload control.

Table 6-87 cgroup_cpu_nanoseconds

Field	Details
Description	Reports the total CPU time (in nanoseconds) on each CPU core for all the tasks in the cgroup.
Type	Gauge
Dimension	NA

Table 6-88 cgroup_memory_bytes

Field	Details
Description	Reports the memory usage.
Type	Gauge
Dimension	NA

Table 6-89 load_level

Field	Details
Description	Provides information about the overload manager load level.
Type	Gauge
Dimension	service namespace

6.1.6 Egress Gateway Metrics

This section provides details about Egress Gateway metrics.

Table 6-90 oc_egressgateway_incoming_ip_type

Field	Details
Description	Tracks the IP type of the active incoming connections from the NSSF microservices to the Egress Gateway.
Type	Gauge
Dimension	host receivedAddressType

Table 6-91 oc_egressgateway_outgoing_ip_type

Field	Details
Description	Tracks the IP type of the active outgoing connections from Egress Gateway to the destination.
Type	Gauge
Dimension	destinationHost destinationHostAddressType

Table 6-92 oc_egressgateway_dualstack_ip_rejected_total

Field	Details
Description	Counts the IP rejections caused by a mismatch between the IP type configured in `egressRoutingMode` and the IP type returned by DNS resolution.
Type	Gauge
Dimension	authority egressRoutingMode dnsResolvedType

Table 6-93 oc_egressgateway_connection_failure_total

Field	Details
Description	Metric to capture failure when the destination is not reachable by Egress Gateway. Here, the destination is producer NF.
Type	Counter
Service Operation	Egress Gateway
Dimensions	Host Port InstanceIdentifier Direction error_reason

Table 6-94 oc_egressgateway_outgoing_tls_connections

Field	Details
Description	Number of TLS connections received on the Egress Gateway and their negotiated TLS versions. The versions can be TLSv1.3 or TLSv1.2.
Type	Gauge
Service Operation	Egress Gateway
Dimension	NegotiatedTLSVersion Host Direction InstanceIdentifier

Table 6-95 oc_fqdn_alternate_route_total

Field	Details
Description	Tracks the number of registration, deregistration, and GET calls received for a given scheme and FQDN. Note: Registration does not reflect the number of active registrations. It captures the number of registration requests received.
Type	Counter
Service Operation	Egress Gateway
Dimension	type: Register/Deregister/GET binding_value: <scheme>+<FQDN>

Table 6-96 oc_dns_srv_lookup_total

Field	Details
Description	Tracks the number of times a DNS SRV lookup was done for a given scheme and FQDN.
Type	Counter
Service Operation	Egress Gateway
Dimension	binding_value: <scheme>+<FQDN>

Table 6-97 oc_alternate_route_resultset

Field	Details
Description	Provides the number of alternate routes known for a given scheme and FQDN. Whenever a DNS SRV lookup or static configuration is done, this metric provides the number of known alternate routes for a given pair. For example, <"http", "abc.oracle.com">: 2.
Type	Gauge
Service Operation	Egress Gateway
Dimension	binding_value: <scheme>+<FQDN>

Table 6-98 oc_configclient_request_total

Field	Details
Description	This metric is pegged whenever a polling request is made from config client to the server for configuration updates.
Type	Counter
Service Operation	Egress Gateway
Dimension	Tags: releaseVersion, configVersion. releaseVersion tag indicates the current chart version of alternate route service deployed. configVersion tag indicates the current configuration version of alternate route service.

Table 6-99 oc_configclient_response_total

Field	Details
Description	This metric is pegged whenever a response is received from the server to client.
Type	Counter
Service Operation	Egress Gateway
Dimension	Tags: releaseVersion, configVersion, updated. releaseVersion tag indicates the current chart version of alternate route service deployed. configVersion tag indicates the current configuration version of alternate route service. updated tag indicates whether there is a configuration update or not.

Table 6-100 oc_egressgateway_peer_health_status

Field	Details
Description	It defines Egress Gateway peer health status. This metric is set to 1, if a peer is unhealthy. This metric is reset to 0, when it becomes healthy again.
Type	Gauge
Service Operation	Egress Gateway
Dimension	peer vfqdn

Table 6-101 oc_egressgateway_peer_health_ping_request_total

Field	Details
Description	It defines Egress Gateway peer health ping request. This metric is incremented every time Egress Gateway sends a health ping towards a peer.
Type	Counter
Service Operation	Egress Gateway
Dimension	peer vfqdn statusCode cause

Table 6-102 oc_egressgateway_peer_health_ping_response_total

Field	Details
Description	It defines Egress Gateway peer health ping response. This metric is incremented every time Egress Gateway receives a health ping response (irrespective of success or failure) from a peer.
Type	Counter
Service Operation	Egress Gateway
Dimension	peer vfqdn statusCode cause

Table 6-103 oc_egressgateway_peer_health_status_transitions_total

Field	Details
Description	It defines Egress Gateway peer health status transitions. Egress Gateway increments this metric every time a peer transitions from available to unavailable or unavailable to available.
Type	Counter
Service Operation	Egress Gateway
Dimension	peer vfqdn from to
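
The relationship between the peer health status gauge (1 when unhealthy, 0 when healthy) and the status transition counter can be sketched as follows. This is a minimal illustration, not the Egress Gateway implementation: the class name, the assumption that a peer starts out healthy, and the `available`/`unavailable` values for the `from` and `to` dimensions are hypothetical.

```python
from collections import Counter


class PeerHealthTracker:
    """Toy model of a per-peer status gauge plus a transition counter."""

    def __init__(self) -> None:
        self.status = {}              # peer -> 0 (healthy) or 1 (unhealthy)
        self.transitions = Counter()  # (peer, from, to) -> number of transitions

    def report(self, peer: str, healthy: bool) -> None:
        new = 0 if healthy else 1
        old = self.status.get(peer, 0)  # assumption: a new peer starts healthy
        if new != old:
            if new == 1:
                key = (peer, "available", "unavailable")
            else:
                key = (peer, "unavailable", "available")
            self.transitions[key] += 1
        self.status[peer] = new


tracker = PeerHealthTracker()
tracker.report("peer-1", healthy=False)  # available -> unavailable
tracker.report("peer-1", healthy=False)  # no change, counter not incremented
tracker.report("peer-1", healthy=True)   # unavailable -> available
print(tracker.status, dict(tracker.transitions))
```

The gauge only reflects the latest state, while the counter records every change, which is why both are useful when a peer flaps between states.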

Table 6-104 oc_egressgateway_peer_count

Field	Details
Description	It defines Egress Gateway peer count, that is, the number of peers in a given peer set.
Type	Gauge
Service Operation	Egress Gateway
Dimension	peerset

Table 6-105 oc_egressgateway_peer_available_count

Field	Details
Description	It defines Egress Gateway available peer count, that is, the number of available peers in a given peer set.
Type	Gauge
Service Operation	Egress Gateway
Dimension	peerset

Table 6-106 oc_egressgateway_user_agent_consumer

Field	Details
Description	This metric is pegged whenever the feature is enabled and the User-Agent header is generated.
Type	Counter
Service Operation	Egress Gateway
Dimension	ConsumerNfInstanceId: ID of consumer NF (NSSF) as configured in Egress Gateway.

Table 6-107 oc_egressgateway_ip_addresses_fetch_failure

Field	Details
Description	This metric is pegged when an exception occurs while fetching IP addresses of the services from kube-api-server.
Type	Counter
Service Operation	Egress Gateway
Dimension	NA

6.1.7 Ingress Gateway Metrics

This section provides details about Ingress Gateway metrics.

Table 6-108 oc_ingressgateway_incoming_ip_type

Field	Details
Description	Tracks the IP type of the active incoming connections from the client to Ingress Gateway.
Type	Gauge
Dimension	host receivedAddressType

Table 6-109 oc_ingressgateway_outgoing_ip_type

Field	Details
Description	Tracks the IP type of the active outgoing connections from Ingress Gateway to the backend services.
Type	Gauge
Dimension	BackendSvc BackendSvcAddressType

Table 6-110 oc_ingressgateway_connection_failure_total

Field	Details
Description	Captures connection failures when the connection to the destination service fails. For Ingress Gateway, the destination service is a backend microservice of the NF.
Type	Counter
Service Operation	Ingress Gateway
Dimensions	Host Port InstanceIdentifier Direction error_reason ErrorOriginator

Table 6-111 oc_ingressgateway_incoming_tls_connections

Field	Details
Description	Number of TLS connections received on the Ingress Gateway and their negotiated TLS versions. The versions can be TLSv1.3 or TLSv1.2.
Type	Gauge
Service Operation	Ingress Gateway
Dimension	NegotiatedTLSVersion Host Direction InstanceIdentifier

Table 6-112 oc_ingressgateway_ip_addresses_fetch_failure

Field	Details
Description	This metric is pegged when an exception occurs while fetching IP addresses of the services from kube-api-server.
Type	Counter
Service Operation	Ingress Gateway
Dimension	NA

6.1.8 NSSF Common metrics

This section provides details about the NSSF common metrics.

Table 6-113 security_cert_x509_expiration_seconds

Field	Details
Description	Indicates the time to certificate expiry in epoch seconds.
Type	Histogram
Dimension	serialNumber

Table 6-114 http_requests_total

Field	Details
Description	This is pegged as soon as the request reaches the Ingress or Egress gateway in the first custom filter of the application.
Type	Counter
Dimension	direction: ingress or egress method: the method from the request line uri: the URI from the request line http_version: the HTTP version from the request line host: the value of the Host header field NFType NFServiceType HttpVersion Scheme Route_path InstanceIdentifier ClientCertIdentity

Table 6-115 http_responses_total

Field	Details
Description	Responses received or sent from the microservice.
Type	Counter
Dimension	Status Method Route_path NFType NFServiceType Host HttpVersion Scheme InstanceIdentifier ClientCertIdentity

Table 6-116 http_request_bytes

Field	Details
Description	Size of requests, including header and body. Grouped in 100 byte buckets.
Type	Histogram
Dimension	direction method uri http_version

Table 6-117 http_response_bytes

Field	Details
Description	Size of responses, including header and body. Grouped in 100 byte buckets.
Type	Histogram
Dimension	direction http_version

Table 6-118 bandwidth_bytes

Field	Details
Description	Amount of ingress and egress traffic sent and received by the microservice.
Type	Counter
Dimension	direction

Table 6-119 request_latency_seconds

Field	Details
Description	This metric is pegged in the last custom filter of the Ingress or Egress gateway while the response is being sent back to the consumer NF. It tracks the amount of time taken for processing the request. It starts as soon as the request reaches the first custom filter of the application and lasts till the response is sent back to the consumer NF from the last custom filter of the application.
Type	Histogram
Dimension	quantile InstanceIdentifier Route_path Method

Table 6-120 connection_failure_total

Field	Details
Description	This metric is pegged by the Jetty client when the destination is not reachable by Ingress or Egress gateway. For Ingress gateway, the destination service is a back-end microservice of the NF. TLS connection failures when connecting to Ingress gateway are also pegged, with the direction set to ingress. For Egress gateway, the destination is the producer NF.
Type	Counter
Dimension	Host Port InstanceIdentifier Direction error_reason

Table 6-121 request_processing_latency_seconds

Field	Details
Description	This metric is pegged in the last custom filter of the Ingress or Egress gateway while the response is being sent back to the consumer NF. This metric captures the amount of time taken for processing of the request only within Ingress or Egress gateway. It starts as soon as the request reaches the first custom filter of the application and lasts till the request is forwarded to the destination.
Type	Timer
Dimension	quantile InstanceIdentifier Route_path Method

Table 6-122 jetty_request_stat_metrics_total

Field	Details
Description	This metric is pegged for every event that occurs when a request is sent to Ingress or Egress gateway.
Type	Counter
Dimension	event client_type InstanceIdentifier

Table 6-123 jetty_response_stat_metrics_total

Field	Details
Description	This metric is pegged for every event that occurs when a response is received by Ingress or Egress gateway.
Type	Counter
Dimension	event client_type InstanceIdentifier

Table 6-124 server_latency_seconds

Field	Details
Description	This metric is pegged in the Jetty response listener and captures the amount of time taken for processing of the request by the Jetty client.
Type	Timer
Dimension	quantile InstanceIdentifier Method

Table 6-125 roundtrip_latency_seconds

Field	Details
Description	This metric is pegged in Netty outbound handler that captures the amount of time taken for processing of the request by netty server.
Type	Timer
Dimension	quantile InstanceIdentifier Method

Table 6-126 oc_configclient_request_total

Field	Details
Description	This metric is pegged whenever config client is polling for configuration update from common configuration server.
Type	Counter
Dimension	Release version Config version

Table 6-127 oc_configclient_response_total

Field	Details
Description	This metric is pegged whenever config client receives a response from the common configuration server.
Type	Counter
Dimension	Release version Config version Updated

Table 6-128 incoming_connections

Field	Details
Description	This metric pegs active incoming connections from client to Ingress or Egress gateway.
Type	Gauge
Dimension	Direction Host InstanceIdentifier

Table 6-129 outgoing_connections

Field	Details
Description	This metric pegs active outgoing connections from Ingress gateway or Egress gateway to the destination.
Type	Gauge
Dimension	Direction Host InstanceIdentifier

Table 6-130 sbitimer_timezone_mismatch

Field	Details
Description	This metric is pegged in Ingress and Egress gateways when sbiTimerTimezone is set to ANY and the time zone is not specified in the header.
Type	Gauge
Dimension	Route_path Method

Table 6-131 nrfclient_nrf_operative_status

Field	Details
Description	The current operative status of the NRF Instance. Note: The HealthCheck mechanism is an important component that allows monitoring and managing the health of NRF services. When enabled, it makes periodic HTTP requests to NRF services to check their availability and updates their status accordingly so that the metric nrfclient_nrf_operative_status updates properly. When disabled, for each NRF route, it is checked whether the retry time has expired. If so, the health state is reset to "HEALTHY", and the retry time is cleared.
Type	Gauge
Dimension	NrfUri - URI of the NRF Instance
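
The behavior described for a disabled HealthCheck (reset the health state to "HEALTHY" and clear the retry time once the retry time has expired) can be sketched as follows. This is a minimal illustration under stated assumptions: the `NrfRoute` type, its field names, and the "UNHEALTHY" state value are hypothetical and are not taken from the NRF client implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NrfRoute:
    uri: str
    health: str = "HEALTHY"           # "UNHEALTHY" is an assumed counterpart state
    retry_at: Optional[float] = None  # epoch seconds; None means no retry pending


def refresh_route(route: NrfRoute, now: float) -> NrfRoute:
    """Reset the route to HEALTHY once its retry time has expired."""
    if route.retry_at is not None and now >= route.retry_at:
        route.health = "HEALTHY"
        route.retry_at = None
    return route


route = NrfRoute(uri="http://nrf.example", health="UNHEALTHY", retry_at=100.0)
refresh_route(route, now=99.0)   # retry time not yet expired, state unchanged
refresh_route(route, now=100.0)  # retry time expired, state reset
print(route)
```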

Table 6-132 nrfclient_dns_lookup_request_total

Field	Details
Description	Total number of times a DNS lookup request is sent to the alternate route service.
Type	Counter
Dimension	Scheme VirtualFqdn

Table 6-133 oc_certificatemanagement_tls_certificate_info

Field	Details
Description	This metric pegs the status as 1 if any file in the configured secret fails to load. The name of the file and the reason are pegged in the dimensions. When the secret is loaded again, the status of the current entry is changed to 0. If the certificate is loaded without any issue, there will be no entry in Prometheus and the previous entry value will be changed to 0.
Type	Gauge
Dimension	Name SecretName Reason (CERT_CORRUPT, CERT_MISSING, CERT_EXPIRED) Service (IngressGateway)

Note: From this release, the Status dimension is renamed as Reason and the CertificateName dimension is renamed as Name.

Table 6-134 oc_certificatemanagement_tls_secret_status

Field	Details
Description	This metric is used to determine whether the configured secret is available in the namespace. If the secret is available in the namespace, the metric is pegged with value 0. If the secret is unavailable or if it is removed, then the metric is pegged with value 1.
Type	Gauge
Dimension	SecretName Namespace Service

Table 6-135 oc_certificatemanagement_tls_certs_reload_failure

Field	Details
Description	This metric is pegged when any exception occurs while reloading certificates. For example, failure in the creation of a keystore. If there is an error, the metric is pegged with value 1. If the reload is successful, the metric is pegged with value 0.
Type	Gauge
Dimension

Table 6-136 cgiu_jetty_ip_address_fetch_failure

Field	Details
Description	This metric is pegged when an exception occurs while fetching IP addresses of the services from kube-api-server.
Type	Counter
Dimension	NA

6.1.9 NSSF Cache Metrics

Table 6-137 ocnssf_cache_latency_seconds_count

Field	Details
Description	This metric shows the time taken, in seconds, to create, update, or rebuild the cache. Note: Currently, only the cache rebuild duration is reported by this metric.
Type	Histogram
Dimension	type duration_type number_of_records

Table 6-138 ocnssf_cache_latency_seconds_max

Field	Details
Description	This metric shows the time taken, in seconds, to create, update, or rebuild the cache. Note: Currently, only the cache rebuild duration is reported by this metric.
Type	Histogram
Dimension	type duration_type number_of_records

Table 6-139 ocnssf_cache_latency_seconds_sum

Field	Details
Description	This metric shows the time taken, in seconds, to create, update, or rebuild the cache. Note: Currently, only the cache rebuild duration is reported by this metric.
Type	Histogram
Dimension	type duration_type number_of_records

Table 6-140 ocnssf_cache_latency_seconds_bucket

Field	Details
Description	This metric shows the time taken, in seconds, to create, update, or rebuild the cache. Note: Currently, only the cache rebuild duration is reported by this metric.
Type	Histogram
Dimension	type duration_type number_of_records
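
The `_sum` and `_count` series of a histogram are commonly combined to obtain a mean observation. The following minimal sketch shows that arithmetic for the cache latency metrics; the function name is hypothetical, and the inputs are assumed to be sample values already read from ocnssf_cache_latency_seconds_sum and ocnssf_cache_latency_seconds_count for the same label set.

```python
from typing import Optional


def mean_cache_latency_seconds(latency_sum: float, latency_count: float) -> Optional[float]:
    """Return the mean latency in seconds, or None when nothing was observed."""
    if latency_count <= 0:
        return None
    return latency_sum / latency_count


# Example: 4 cache rebuilds that took 12 seconds in total.
print(mean_cache_latency_seconds(12.0, 4))
```

Returning None for a zero count avoids a division by zero before the first cache rebuild has been recorded.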

Table 6-141 ocnssf_cache_rebuild_success_total

Field	Details
Description	This metric shows the number of successful cache rebuilds for NsSelection and NsSubscription. Note: Currently, only cache rebuild successes are reported by this metric.
Type	Counter
Dimension

Table 6-142 ocnssf_cache_rebuild_error_total

Field	Details
Description	This metric shows the number of failed cache rebuilds for NsSelection and NsSubscription. Note: Currently, only cache rebuild failures are reported by this metric.
Type	Counter
Dimension

Table 6-143 cache_create_success_total

Description	Count of fresh cache created successfully in selection service for required tables
Type	Counter
Service Operation	NSSF Service Name
Dimensions	table_name no_of_records query_duration cache_update_duration

Table 6-144 cache_update_success_total

Description	Count of cache updated successfully in selection service for required tables
Type	Counter
Service Operation	NSSF Service Name
Dimensions	table_name no_of_records query_duration cache_update_duration

Table 6-145 cache_create_error_total

Description	Count of fresh cache creation errors in selection service for required tables
Type	Counter
Service Operation	NSSF Service Name
Dimensions	table_name error_cause query_duration

Table 6-146 cache_update_error_total

Description	Count of cache update errors in selection service for required tables
Type	Counter
Service Operation	NSSF Service Name
Dimensions	table_name error_cause query_duration

6.2 NSSF KPIs

This section includes information about KPIs for Oracle Communications Cloud Native Core, Network Slice Selection Function.

The following are the NSSF KPIs:

6.2.1 NSSelection KPIs

Table 6-147 NSSF NSSelection Initial Registration Success Rate

Field	Details
Description	Percentage of NSSelection Initial registration messages with success response
Expression	(sum(ocnssf_nsselection_success_tx_total{message_type="registration"})/sum(ocnssf_nsselection_rx_total{message_type="registration"}))*100

Table 6-148 NSSF NSSelection PDU establishment success rate

Field	Details
Description	Percentage of NSSelection PDU establishment messages with success response
Expression	(sum(ocnssf_nsselection_success_tx_total{message_type="pdu_session"})/sum(ocnssf_nsselection_rx_total{message_type="pdu_session"}))*100

Table 6-149 NSSF NSSelection UE-Config Update success rate

Field	Details
Description	Percentage of NSSelection UE-Config Update messages with success response
Expression	(sum(ocnssf_nsselection_success_tx_total{message_type="ue_config_update"})/sum(ocnssf_nsselection_rx_total{message_type="ue_config_update"}))*100
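
The three NSSelection success rate KPIs share one pattern and differ only in the message_type value. As a minimal sketch, the expression can be generated from that value; the helper name is hypothetical, while the metric names and label come from the tables above.

```python
def nsselection_success_rate_expr(message_type: str) -> str:
    """Build the success-rate PromQL expression for one message_type value."""
    selector = f'{{message_type="{message_type}"}}'
    return (
        f"(sum(ocnssf_nsselection_success_tx_total{selector})"
        f"/sum(ocnssf_nsselection_rx_total{selector}))*100"
    )


for message_type in ("registration", "pdu_session", "ue_config_update"):
    print(nsselection_success_rate_expr(message_type))
```

Generating the expressions from a single template keeps the parentheses and label matchers identical across all three KPIs.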

Table 6-150 4xx Responses (NSSelection)

Field	Details
Description	Rate of 4xx response for NSSelection
Expression	sum(increase(oc_ingressgateway_http_responses_total{Status=~"4.*",Uri=~".*nnssf-nsselection.*",Method="GET"}[5m]))

Table 6-151 5xx Responses (NSSelection)

Field	Details
Description	Rate of 5xx response for NSSelection
Expression	sum(increase(oc_ingressgateway_http_responses_total{Status=~"5.*",Uri=~".*nnssf-nsselection.*",Method="GET"}[5m]))

6.2.2 NSAvailability KPIs

Table 6-152 NSSF NSAvailability PUT success rate

Field	Details
Description	Percentage of NSAvailability UPDATE PUT messages with success response
Expression	(sum(ocnssf_nssaiavailability_success_tx_total{message_type="availability_update",method="PUT"})/sum(ocnssf_nssaiavailability_rx_total{message_type="availability_update",method="PUT"}))*100

Table 6-153 NSSF NSAvailability PATCH success rate

Field	Details
Description	Percentage of NSAvailability UPDATE PATCH messages with success response
Expression	(sum(ocnssf_nssaiavailability_success_tx_total{message_type="availability_update",method="PATCH"})/sum(ocnssf_nssaiavailability_rx_total{message_type="availability_update",method="PATCH"}))*100

Table 6-154 NSSF NSAvailability Delete success rate

Field	Details
Description	Percentage of NSAvailability Delete messages with success response
Expression	(sum(ocnssf_nssaiavailability_success_tx_total{message_type="availability_update",method="DELETE"})/sum(ocnssf_nssaiavailability_rx_total{message_type="availability_update",method="DELETE"}))*100

Table 6-155 NSSF NSAvailability Subscribe success rate

Field	Details
Description	Percentage of NSAvailability Subscribe messages with success response
Expression	(sum(ocnssf_nssaiavailability_success_tx_total{message_type="availability_subscribe",method="POST"})/sum(ocnssf_nssaiavailability_rx_total{message_type="availability_subscribe",method="POST"}))*100

Table 6-156 NSSF NSAvailability Unsubscribe success rate

Field	Details
Description	Percentage of NSAvailability Unsubscribe messages with success response
Expression	(sum(ocnssf_nssaiavailability_success_tx_total{message_type="availability_subscribe",method="DELETE"})/sum(ocnssf_nssaiavailability_rx_total{message_type="availability_subscribe",method="DELETE"}))*100

Table 6-157 4xx Responses (NSAvailability)

Field	Details
Description	Rate of 4xx response for NSAvailability
Expression	sum(increase(oc_ingressgateway_http_responses_total{Status=~"4.*",Uri=~".*nnssf-nsavailability.*",Method="GET"}[5m]))

Table 6-158 5xx Responses (NSAvailability)

Field	Details
Description	Rate of 5xx response for NSAvailability
Expression	sum(increase(oc_ingressgateway_http_responses_total{Status=~"5.*",Uri=~".*nnssf-nsavailability.*",Method="GET"}[5m]))

6.2.3 Ingress Gateway KPIs

Table 6-159 NSSF Ingress Request

Field	Details
Description	Rate of HTTP requests received at NSSF Ingress Gateway
Expression	oc_ingressgateway_http_requests

6.3 NSSF Alerts

This section includes information about alerts for Oracle Communications Network Slice Selection Function.

Note:

The performance and capacity of the NSSF system may vary based on the call model, feature or interface configuration, network conditions, and underlying CNE and hardware environment.

You can configure alerts in Prometheus and in the ocnssf_alert_rules_25.2.200.yaml file.

The following table describes the various severity types of alerts generated by NSSF:

Table 6-160 Alerts Levels or Severity Types

Alerts Levels / Severity Types	Definition
Critical	Indicates a severe issue that poses a significant risk to safety, security, or operational integrity. It requires immediate response to address the situation and prevent serious consequences. Raised for conditions that may affect the service of NSSF.
Major	Indicates a more significant issue that has an impact on operations or poses a moderate risk. It requires prompt attention and action to mitigate potential escalation. Raised for conditions that may affect the service of NSSF.
Minor	Indicates a situation that is low in severity and does not pose an immediate risk to safety, security, or operations. It requires attention but does not demand urgent action. Raised for conditions that may affect the service of NSSF.
Info or Warn (Informational)	Provides general information or updates that are not related to immediate risks or actions. These alerts are for awareness and do not typically require any specific response. WARN and INFO alerts may not impact the service of NSSF.

Caution:

User, computer and applications, and character encoding settings may cause an issue when copy-pasting commands or any content from PDF. The PDF reader version also affects the copy-pasting functionality. It is recommended to verify the pasted content when the hyphens or any special characters are part of the copied content.

Note:

kubectl commands might vary based on the platform deployment. Replace kubectl with Kubernetes environment-specific command line tool to configure Kubernetes resources through kube-api server. The instructions provided in this document are as per the Oracle Communications Cloud Native Environment (OCCNE) version of kube-api server.
The alert file can be customized as required by the deployment environment. For example, namespace can be added as a filtered criteria to the alert expression to filter alerts only for a specific namespace.
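
Adding a namespace as a filter criterion amounts to adding a label matcher to the metric selector used in the alert expression. The following minimal sketch renders such a selector; the helper name is hypothetical, and the label name (for example, namespace or kubernetes_namespace) and its value are assumptions that depend on how the monitoring system labels the metrics.

```python
def selector(metric: str, **labels: str) -> str:
    """Render a PromQL selector with equality matchers in sorted label order."""
    matchers = ",".join(f'{name}="{value}"' for name, value in sorted(labels.items()))
    return f"{metric}{{{matchers}}}"


# Example: restrict the 'up' metric to one namespace (label and value are assumed).
print(selector("up", namespace="ocnssf"))
```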

6.3.1 Alert Configuration

This section describes how to configure alert rules for the NSSF in Prometheus. It provides guidance on setting up measurement-based alert rules, where the alerting system evaluates metrics reported by NSSF microservices against specified rule conditions to generate alerts as needed.

Prometheus Alert Configuration

In a Prometheus environment, NSSF alert rules are configured based on metrics reported by NSSF components. The alerting workflow monitors these metrics and issues notifications when the defined conditions are met.

For more information about configuring NSSF alerts in Prometheus, see the “Alert Configuration” section in Oracle Communications Cloud Native Core, Network Slice Selection Function Installation, Upgrade, and Fault Recovery Guide.

6.3.2 System Level Alerts

This section lists the system level alerts.

6.3.2.1 OcnssfNfStatusUnavailable

Table 6-161 OcnssfNfStatusUnavailable

Field	Details
Description	'OCNSSF services unavailable'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : All OCNSSF services are unavailable.'
Severity	Critical
Condition	All the NSSF services are unavailable, either because the NSSF is getting deployed or purged. The NSSF services considered are nssfselection, nssfsubscription, nssfavailability, nssfconfiguration, appinfo, ingressgateway, and egressgateway.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9001
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared automatically when the NSSF services start becoming available. Steps: Check for service-specific alerts that may be causing the issues with service exposure. Run the following command to check if the pod’s status is in “Running” state: `kubectl -n <namespace> get pod` If it is not in running state, capture the pod logs and events. Run the following command to fetch the events: `kubectl get events --sort-by=.metadata.creationTimestamp -n <namespace>` Refer to the application logs on Kibana and check for database related failures such as connectivity, invalid secrets, and so on. The logs can be filtered based on the services. Run the following command to check Helm status and make sure there are no errors: `helm status <helm release name of the desired NF> -n <namespace>` If it is not in “STATUS: DEPLOYED”, then capture the logs and events again. If the issue persists, capture all the outputs from the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.
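
The commands in the recommended actions above can be collected into argument lists for scripted data capture. This is a minimal sketch: the function name and the example namespace and release values are hypothetical, and the commands themselves are the ones listed in the table.

```python
from typing import List


def troubleshooting_commands(namespace: str, helm_release: str) -> List[List[str]]:
    """Return the pod, events, and Helm status commands as argv lists."""
    return [
        ["kubectl", "-n", namespace, "get", "pod"],
        ["kubectl", "get", "events", "--sort-by=.metadata.creationTimestamp", "-n", namespace],
        ["helm", "status", helm_release, "-n", namespace],
    ]


# Example values are placeholders; replace them with the deployment's own.
for argv in troubleshooting_commands("ocnssf", "ocnssf-release"):
    print(" ".join(argv))
```

On a host where kubectl and helm are installed, each list can be passed to `subprocess.run` to capture the output for a support request. Building the arguments as lists avoids the copy-paste issues with hyphens and special characters noted earlier in this section.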

6.3.2.2 OcnssfPodsRestart

Table 6-162 OcnssfPodsRestart

Field	Details
Description	'Pod <Pod Name> has restarted.'
Summary	'kubernetes_namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : A Pod has restarted'
Severity	Major
Condition	A pod belonging to any of the NSSF services has restarted.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9002
Metric Used	'kube_pod_container_status_restarts_total' Note: This is a Kubernetes metric. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared automatically if the specific pod is up. Steps: Refer to the application logs on Kibana and filter based on the pod name. Check for database related failures such as connectivity, Kubernetes secrets, and so on. Run the following command to check orchestration logs for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <desired full pod name> -n <namespace>` Check the database status. For more information, see "Oracle Communications Cloud Native Core, cnDBTier User Guide". If the issue persists, capture all the outputs from the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.3 OcnssfSubscriptionServiceDown

Table 6-163 OcnssfSubscriptionServiceDown

Field	Details
Description	'OCNSSF Subscription service <ocnssf-nssubscription> is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : NssfSubscriptionServiceDown service down'
Severity	Critical
Condition	The NssfSubscription service is unavailable.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9003
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the NssfSubscription service is available. Steps: Check if NfService specific alerts are generated to understand which service is down. If the OcnssfSubscriptionServiceDown alert is generated, run the following command to check the orchestration logs of the ocnssf-nssubscription service and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Run the following command to check if the pod’s status is in the “Running” state: `kubectl -n <namespace> get pod` If it is not in the running state, capture the pod logs and events. Run the following command to fetch events: `kubectl get events --sort-by=.metadata.creationTimestamp -n <namespace>` Refer to the application logs on Kibana and filter based on the above service names. Check for ERROR and WARNING logs for each of these services. Check the database status. For more information, see "Oracle Communications Cloud Native Core, cnDBTier User Guide". Refer to the application logs on Kibana and check for the service status of the nssfConfig service. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.4 OcnssfSelectionServiceDown

Table 6-164 OcnssfSelectionServiceDown

Field	Details
Description	'OCNSSF Selection service <ocnssf-nsselection> is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : OcnssfSelectionServiceDown service down'
Severity	Critical
Condition	None of the pods of the NSSFSelection microservice is available.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9004
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the ocnssf-nsselection service is available. Steps: Run the following command to check the orchestration logs of the ocnssf-nsselection service and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Refer to the application logs on Kibana and filter based on the ocnssf-nsselection service name. Check for ERROR and WARNING logs. Check the database status. For more information, see "Oracle Communications Cloud Native Core, cnDBTier User Guide". Depending on the failure reason, take the appropriate resolution steps. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.5 OcnssfAvailabilityServiceDown

Table 6-165 OcnssfAvailabilityServiceDown

Field	Details
Description	'Ocnssf Availability service ocnssf-nsavailability is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : NssfAvailability service down'
Severity	Critical
Condition	None of the pods of the NssfAvailability microservice is available.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9005
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the ocnssf-nsavailability service is available. Steps: Run the following command to check the orchestration logs of the ocnssf-nsavailability service and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Refer to the application logs on Kibana and filter based on the ocnssf-nsavailability service name. Check for ERROR and WARNING logs. Check the database status. For more information, see "Oracle Communications Cloud Native Core, cnDBTier User Guide". Depending on the failure reason, take the appropriate resolution steps. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.6 OcnssfConfigurationServiceDown

Table 6-166 OcnssfConfigurationServiceDown

Field	Details
Description	'OCNSSF Config service nssfconfiguration is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : OcnssfConfigServiceDown service down'
Severity	Critical
Condition	None of the pods of the NssfConfiguration microservice is available.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9006
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the nssfconfiguration service is available. Steps: Run the following command to check the orchestration logs of the nssfconfiguration service and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Refer to the application logs on Kibana and filter based on the nssfconfiguration service name. Check for ERROR and WARNING logs related to thread exceptions. Check the database status. For more information, see "Oracle Communications Cloud Native Core, cnDBTier User Guide". Depending on the failure reason, take the appropriate resolution steps. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.7 OcnssfAppInfoServiceDown

Table 6-167 OcnssfAppInfoServiceDown

Field	Details
Description	'OCNSSF Appinfo service appinfo is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : Appinfo service down'
Severity	Critical
Condition	None of the pods of the App Info microservice is available.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9025
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the app-info service is available. Steps: Run the following command to check the orchestration logs of the appinfo service and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Refer to the application logs on Kibana and filter based on the appinfo service name. Check for ERROR and WARNING logs related to thread exceptions. Depending on the failure reason, take the appropriate resolution steps. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.8 OcnssfIngressGatewayServiceDown

Table 6-168 OcnssfIngressGatewayServiceDown

Field	Details
Description	'Ocnssf Ingress-Gateway service ingressgateway is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : OcnssfIngressGwServiceDown service down'
Severity	Critical
Condition	None of the pods of the Ingress-Gateway microservice is available.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9007
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the ingressgateway service is available. Steps: Run the following command to check the orchestration logs of the ingress-gateway service and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Refer to the application logs on Kibana and filter based on the ingress-gateway service name. Check for ERROR and WARNING logs related to thread exceptions. Depending on the failure reason, take the appropriate resolution steps. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.9 OcnssfEgressGatewayServiceDown

Table 6-169 OcnssfEgressGatewayServiceDown

Field	Details
Description	'OCNSSF Egress service egressgateway is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : OcnssfEgressGwServiceDown service down'
Severity	Critical
Condition	None of the pods of the Egress-Gateway microservice is available.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9008
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the egressgateway service is available. Note: The threshold is configurable in the alerts.yaml file. Steps: Run the following command to check the orchestration logs of the egress-gateway service and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Refer to the application logs on Kibana and filter based on the egress-gateway service name. Check for ERROR and WARNING logs related to thread exceptions. Depending on the failure reason, take the appropriate resolution steps. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.10 OcnssfOcpmConfigServiceDown

Table 6-170 OcnssfOcpmConfigServiceDown

Field	Details
Description	'OCNSSF OCPM Config service is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : Ocnssf OCPM Config service down'
Severity	Critical
Condition	None of the pods of the ConfigService is available.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9027
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the ConfigService is available. Note: The threshold is configurable in the alerts.yaml file. Steps: Run the following command to check the orchestration logs of the ConfigService and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Refer to the application logs on Kibana and filter based on the ConfigService name. Check for ERROR and WARNING logs related to thread exceptions. Depending on the failure reason, take the appropriate resolution steps. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.11 OcnssfPerfInfoServiceDown

Table 6-171 OcnssfPerfInfoServiceDown

Field	Details
Description	'OCNSSF PerfInfo service is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : Ocnssf PerfInfo service down'
Severity	Critical
Condition	None of the pods of the PerfInfo service is available.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9026
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the PerfInfo service is available. Note: The threshold is configurable in the alerts.yaml file. Steps: Run the following command to check the orchestration logs of the PerfInfo service and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Refer to the application logs on Kibana and filter based on the PerfInfo service name. Check for ERROR and WARNING logs related to thread exceptions. Depending on the failure reason, take the appropriate resolution steps. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.12 OcnssfNrfClientManagementServiceDown

Table 6-172 OcnssfNrfClientManagementServiceDown

Field	Details
Description	'OCNSSF NrfClient Management service is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : Ocnssf NrfClient Management service down'
Severity	Critical
Condition	None of the pods of the NrfClientManagement service is available.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9024
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the NrfClientManagement service is available. Note: The threshold is configurable in the alerts.yaml file. Steps: Run the following command to check the orchestration logs of the NrfClientManagement service and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Refer to the application logs on Kibana and filter based on the NrfClientManagement service name. Check for ERROR and WARNING logs related to thread exceptions. Depending on the failure reason, take the appropriate resolution steps. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.13 OcnssfAlternateRouteServiceDown

Table 6-173 OcnssfAlternateRouteServiceDown

Field	Details
Description	'OCNSSF Alternate Route service is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : Ocnssf Alternate Route service down'
Severity	Critical
Condition	None of the pods of the Alternate Route service is available.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9023
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the Alternate Route service is available. Note: The threshold is configurable in the alerts.yaml file. Steps: Run the following command to check the orchestration logs of the Alternate Route service and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Refer to the application logs on Kibana and filter based on the Alternate Route service name. Check for ERROR and WARNING logs related to thread exceptions. Depending on the failure reason, take the appropriate resolution steps. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.2.14 OcnssfAuditorServiceDown

Table 6-174 OcnssfAuditorServiceDown

Field	Details
Description	'OCNSSF NsAuditor service is down'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }} : Ocnssf NsAuditor service down'
Severity	Critical
Condition	None of the pods of the NsAuditor service is available.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9022
Metric Used	'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use the similar metric as exposed by the monitoring system.
Recommended Actions	The alert is cleared when the NsAuditor service is available. Note: The threshold is configurable in the alerts.yaml file. Steps: Run the following command to check the orchestration logs of the NsAuditor service and check for liveness or readiness probe failures: `kubectl get po -n <namespace>` Note the full name of the pod that is not running, and use it in the following command: `kubectl describe pod <specific desired full pod name> -n <namespace>` Refer to the application logs on Kibana and filter based on the NsAuditor service name. Check for ERROR and WARNING logs related to thread exceptions. Depending on the failure reason, take the appropriate resolution steps. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.
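
The service-down alerts in this section share one condition: none of the pods of a microservice reports as available through the Prometheus 'up' metric. The following is a minimal sketch of that check, assuming the 'up' samples have already been grouped by service. The function name and the grouping are illustrative assumptions, and treating a service with no samples as down is a choice made for this sketch, not something the alert definitions state.

```python
def down_services(up_by_service):
    """Return the services for which no pod reports up == 1.

    up_by_service maps a service name to a list of 0/1 'up' samples,
    one sample per scraped pod of that service.
    """
    down = []
    for service, samples in up_by_service.items():
        # Assumption: an empty sample list (no scraped pods) also counts as down.
        if not any(value == 1 for value in samples):
            down.append(service)
    return sorted(down)


# Made-up samples: one service has every pod down, one is partially up.
samples = {
    "ocnssf-nsselection": [1, 1],
    "ocnssf-nsavailability": [0, 0],
    "ocnssf-nssubscription": [1, 0],
}
print(down_services(samples))
```

A service with at least one available pod is not reported, which matches the "none of the pods" wording of the conditions above.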

6.3.2.15 OcnssfTotalIngressTrafficRateAboveMinorThreshold

Table 6-175 OcnssfTotalIngressTrafficRateAboveMinorThreshold

Field	Details
Description	'Ingress traffic Rate is above the configured minor threshold i.e. 64000 requests per second (current value is: {{ $value }})'
Summary	'timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }}: Traffic Rate is above 80 Percent of Max requests per second(80000)'
Severity	Minor
Condition	The total Ocnssf Ingress Message rate has crossed the configured minor threshold of 64000 TPS. Default value of this alert trigger point in NssfAlertValues.yaml is when Ocnssf Ingress Rate crosses 80 % of 80000 (Maximum ingress request rate).
OID	1.3.6.1.4.1.323.5.3.40.1.2.9009
Metric Used	'oc_ingressgateway_http_requests_total'
Recommended Actions	The alert is cleared either when the total Ingress Traffic rate falls below the Minor threshold or when the total traffic rate crosses the Major threshold, in which case the OcnssfTotalIngressTrafficRateAboveMajorThreshold alert shall be raised. Note: The threshold is configurable in the alerts.yaml file. Steps: Reassess the reason why the NSSF is receiving additional traffic, for example, the mated site NSSF is unavailable in the georedundancy scenario. If this is unexpected, contact My Oracle Support. Refer to Grafana to determine which service is receiving high traffic. Refer to the Ingress Gateway section in Grafana to determine an increase in 4xx and 5xx error codes. Check Ingress Gateway logs on Kibana to determine the reason for the errors.

6.3.2.16 OcnssfTotalIngressTrafficRateAboveMajorThreshold

Table 6-176 OcnssfTotalIngressTrafficRateAboveMajorThreshold

Field	Details
Description	'Ingress traffic Rate is above the configured major threshold i.e. 72000 requests per second (current value is: {{ $value }})'
Summary	'timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }}: Traffic Rate is above 90 Percent of Max requests per second(80000)'
Severity	Major
Condition	The total Ocnssf Ingress Message rate has crossed the configured major threshold of 72000 TPS. Default value of this alert trigger point in NssfAlertValues.yaml is when Ocnssf Ingress Rate crosses 90 % of 80000 (Maximum ingress request rate).
OID	1.3.6.1.4.1.323.5.3.40.1.2.9010
Metric Used	'oc_ingressgateway_http_requests_total'
Recommended Actions	The alert is cleared when the total Ingress traffic rate falls below the major threshold or when the total traffic rate crosses the critical threshold, in which case the OcnssfTotalIngressTrafficRateAboveCriticalThreshold alert shall be raised. Note: The threshold is configurable in the alerts.yaml file. Steps: Reassess the reason why the NSSF is receiving additional traffic, for example, the mated site NSSF is unavailable in the georedundancy scenario. If this is unexpected, contact My Oracle Support. Refer to Grafana to determine which service is receiving high traffic. Refer to the Ingress Gateway section in Grafana to determine an increase in 4xx and 5xx error codes. Check Ingress Gateway logs on Kibana to determine the reason for the errors.

6.3.2.17 OcnssfTotalIngressTrafficRateAboveCriticalThreshold

Table 6-177 OcnssfTotalIngressTrafficRateAboveCriticalThreshold

Field	Details
Description	'Ingress traffic Rate is above the configured critical threshold i.e. 76000 requests per second (current value is: {{ $value }})'
Summary	'timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }}: Traffic Rate is above 95 Percent of Max requests per second(80000)'
Severity	Critical
Condition	The total Ocnssf Ingress Message rate has crossed the configured critical threshold of 76000 TPS. Default value of this alert trigger point in NssfAlertValues.yaml is when Ocnssf Ingress Rate crosses 95 % of 80000 (Maximum ingress request rate).
OID	1.3.6.1.4.1.323.5.3.40.1.2.9011
Metric Used	'oc_ingressgateway_http_requests_total'
Recommended Actions	The alert is cleared when the Ingress traffic rate falls below the critical threshold. Note: The threshold is configurable in the alerts.yaml file. Steps: Reassess the reason why the NSSF is receiving additional traffic, for example, the mated site NSSF is unavailable in the georedundancy scenario. If this is unexpected, contact My Oracle Support. Refer to Grafana to determine which service is receiving high traffic. Refer to the Ingress Gateway section in Grafana to determine an increase in 4xx and 5xx error codes. Check Ingress Gateway logs on Kibana to determine the reason for the errors.
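
The three ingress traffic alerts share one maximum rate (80000 requests per second) and differ only in the percentage applied to it: 80, 90, and 95 percent give 64000, 72000, and 76000. The following is a minimal sketch of that arithmetic. It assumes a strict greater-than comparison and a request rate that has already been derived from 'oc_ingressgateway_http_requests_total'; the constant and function names are illustrative and are not taken from the alert rules file.

```python
MAX_INGRESS_RATE = 80000  # maximum ingress request rate stated in the alert conditions

# Percentages from the alert summaries, checked from the highest severity down.
THRESHOLDS = [("Critical", 95), ("Major", 90), ("Minor", 80)]


def ingress_severity(rate):
    """Return the severity whose threshold the rate exceeds, or None."""
    for severity, percent in THRESHOLDS:
        # Assumption: the alert fires only when the rate is strictly above the threshold.
        if rate > MAX_INGRESS_RATE * percent / 100:
            return severity
    return None


for rate in (60000, 70000, 75000, 79000):
    print(rate, ingress_severity(rate))
```

Checking the highest threshold first mirrors the clearing behavior described above, where a lower-severity alert clears once the next threshold is crossed.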

6.3.2.18 OcnssfTransactionErrorRateAbove1Percent

Table 6-178 OcnssfTransactionErrorRateAbove1Percent

Field	Details
Description	Transaction Error rate is above 1 Percent of Total Transactions
Summary	Transaction Error Rate detected above 1 Percent of Total Transactions
Severity	Warning
Condition	The number of failed transactions has crossed the warning threshold of 1 percent of the total transactions.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9012
Metric Used	oc_ingressgateway_http_responses_total
Recommended Actions	The alert is cleared when the number of failed transactions falls below the 1% threshold of the total transactions or when the failed transactions cross the 10% threshold, in which case the OcnssfTransactionErrorRateAbove10Percent alert shall be raised. Steps: Check the service-specific metrics to understand the specific service request errors. For example: ocnssf_nsselection_success_tx_total with statusCode ~= 2xx. Verify the metrics per service, per method. For example, NSSelection request errors can be deduced from the following metrics: Metrics="oc_ingressgateway_http_responses_total" Method="GET" NFServiceType="ocnssf-nsselection" Route_path="/nnssf-nsselection/v2/**" Status="503 SERVICE_UNAVAILABLE" If guidance is required, contact My Oracle Support.

6.3.2.19 OcnssfTransactionErrorRateAbove10Percent

Table 6-179 OcnssfTransactionErrorRateAbove10Percent

Field	Details
Description	'Transaction Error rate is above 10 Percent of Total Transactions (current value is {{ $value }})'
Summary	'timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }}: Transaction Error Rate detected above 10 Percent of Total Transactions'
Severity	Minor
Condition	The number of failed transactions has crossed the minor threshold of 10 percent of the total transactions.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9013
Metric Used	'oc_ingressgateway_http_responses_total'
Recommended Actions	The alert is cleared when the number of failed transactions falls below the 10% threshold of the total transactions or when the failed transactions cross the 25% threshold, in which case the OcnssfTransactionErrorRateAbove25Percent alert shall be raised. Steps: Check the service-specific metrics to understand the specific service request errors. For example: ocnssf_nsselection_success_tx_total with statusCode ~= 2xx. Verify the metrics per service, per method. For example, NSSelection request errors can be deduced from the following metrics: Metrics="oc_ingressgateway_http_responses_total" Method="GET" NFServiceType="ocnssf-nsselection" Route_path="/nnssf-nsselection/v2/**" Status="503 SERVICE_UNAVAILABLE" If guidance is required, contact My Oracle Support.

6.3.2.20 OcnssfTransactionErrorRateAbove25Percent

Table 6-180 OcnssfTransactionErrorRateAbove25Percent

Field	Details
Description	'Transaction Error rate is above 25 Percent of Total Transactions (current value is {{ $value }})'
Summary	'timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }}: Transaction Error Rate detected above 25 Percent of Total Transactions'
Severity	Major
Condition	The number of failed transactions has crossed the major threshold of 25 percent of the total transactions.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9014
Metric Used	'oc_ingressgateway_http_responses_total'
Recommended Actions	The alert is cleared when the number of failed transactions falls below the 25% threshold of the total transactions or when the number of failed transactions crosses the 50% threshold, in which case the OcnssfTransactionErrorRateAbove50Percent alert shall be raised. Steps: Check the service-specific metrics to understand the specific service request errors. For example: ocnssf_nsselection_success_tx_total with statusCode ~= 2xx. Verify the metrics per service, per method. For example, NSSelection request errors can be deduced from the following metrics: Metrics="oc_ingressgateway_http_responses_total" Method="GET" NFServiceType="ocnssf-nsselection" Route_path="/nnssf-nsselection/v2/**" Status="503 SERVICE_UNAVAILABLE" If guidance is required, contact My Oracle Support.

6.3.2.21 OcnssfTransactionErrorRateAbove50Percent

Table 6-181 OcnssfTransactionErrorRateAbove50Percent

Field	Details
Description	'Transaction Error rate is above 50 Percent of Total Transactions (current value is {{ $value }})'
Summary	'timestamp: {{ with query "time()" }}{{ . \| first \| value \| humanizeTimestamp }}{{ end }}: Transaction Error Rate detected above 50 Percent of Total Transactions'
Severity	Critical
Condition	The number of failed transactions has crossed the critical threshold of 50 percent of the total transactions.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9015
Metric Used	'oc_ingressgateway_http_responses_total'
Recommended Actions	The alert is cleared when the number of failed transactions is below 50 percent of the total transactions. Steps: Check the service-specific metrics to understand the specific service request errors. For example: ocnssf_nsselection_success_tx_total with statusCode ~= 2xx. Verify the metrics per service, per method. For example, NSSelection request errors can be deduced from the following metrics: Metrics="oc_ingressgateway_http_responses_total" Method="GET" NFServiceType="ocnssf-nsselection" Route_path="/nnssf-nsselection/v2/**" Status="503 SERVICE_UNAVAILABLE" If guidance is required, contact My Oracle Support.
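
The four transaction error rate alerts form a ladder of 1, 10, 25, and 50 percent, with severities Warning, Minor, Major, and Critical. The following is a minimal sketch of how a failed and total response count maps onto that ladder. The function name is hypothetical, the comparison is assumed to be strictly greater-than, and the sketch does not say which status codes count as failed, because that is defined by the alert expression.

```python
# Thresholds in percent, paired with the severities listed in the tables above.
ERROR_RATE_LEVELS = [(50, "Critical"), (25, "Major"), (10, "Minor"), (1, "Warning")]


def error_rate_severity(failed, total):
    """Return the severity for the given failed/total transaction counts, or None."""
    if total == 0:
        # No traffic means there is no error rate to evaluate.
        return None
    percent = 100.0 * failed / total
    for threshold, severity in ERROR_RATE_LEVELS:
        # Assumption: an alert fires only when the rate is strictly above its threshold.
        if percent > threshold:
            return severity
    return None


for failed in (5, 20, 150, 300, 600):
    print(failed, error_rate_severity(failed, 1000))
```

Only the highest matching level is returned, which follows the clearing rule where each alert gives way to the next one once its threshold is crossed.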

6.3.3 Application Level Alerts

This section lists the application level alerts.

6.3.3.1 OcnssfOverloadThresholdBreachedL1

Table 6-182 OcnssfOverloadThresholdBreachedL1

Field	Details
Description	'Overload Level of {{$labels.app_kubernetes_io_name}} service is L1'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: Overload Level of {{$labels.app_kubernetes_io_name}} service is L1'
Severity	Warning
Condition	NSSF services have breached their configured Level L1 threshold for any of the following metrics: CPU, svc_failure_count, svc_pending_count, and memory.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9016
Metric Used	load_level
Recommended Actions	The alert is cleared when the Ingress Traffic rate falls below the configured L1 threshold. Note: The thresholds can be configured using REST API. Steps: Reassess the reasons leading to NSSF receiving additional traffic. If this is unexpected, contact My Oracle Support. 1. Refer to the alert to determine which service is receiving high traffic. It may be due to a sudden spike in traffic. For example: When one mated site goes down, the NFs move to the given site. 2. Check the service pod logs on Kibana to determine the reason for the errors. 3. If this is expected traffic, then the threshold levels may be reevaluated as per the call rate and reconfigured as mentioned in Oracle Communications Cloud Native Core, Network Slice Selection Function REST Specification Guide.

6.3.3.2 OcnssfOverloadThresholdBreachedL2

Table 6-183 OcnssfOverloadThresholdBreachedL2

Field	Details
Description	'Overload Level of {{$labels.app_kubernetes_io_name}} service is L2'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: Overload Level of {{$labels.app_kubernetes_io_name}} service is L2'
Severity	Minor
Condition	NSSF Services have breached their configured threshold of Level L2 for any of the aforementioned metrics. Thresholds are configured for CPU, svc_failure_count, svc_pending_count, and memory.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9017
Metric Used	load_level
Recommended Actions	The alert is cleared when the Ingress Traffic rate falls below the configured L2 threshold. Note: The thresholds can be configured using REST API. Steps: Reassess the reasons leading to NSSF receiving additional traffic. If this is unexpected, contact My Oracle Support. 1. Refer to the alert to determine which service is receiving high traffic. It may be due to a sudden spike in traffic. For example, when one mated site goes down, the NFs move to the given site. 2. Check the service pod logs on Kibana to determine the reason for the errors. 3. If this is expected traffic, the threshold levels may be reevaluated as per the call rate and reconfigured as described in Oracle Communications Cloud Native Core, Network Slice Selection Function REST Specification Guide.

6.3.3.3 OcnssfOverloadThresholdBreachedL3

Table 6-184 OcnssfOverloadThresholdBreachedL3

Field	Details
Description	'Overload Level of {{$labels.app_kubernetes_io_name}} service is L3'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: Overload Level of {{$labels.app_kubernetes_io_name}} service is L3'
Severity	Major
Condition	NSSF Services have breached their configured threshold of Level L3 for any of the aforementioned metrics. Thresholds are configured for CPU, svc_failure_count, svc_pending_count, and memory.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9018
Metric Used	load_level
Recommended Actions	The alert is cleared when the Ingress Traffic rate falls below the configured L3 threshold. Note: The thresholds can be configured using REST API. Steps: Reassess the reasons leading to NSSF receiving additional traffic. If this is unexpected, contact My Oracle Support. 1. Refer to the alert to determine which service is receiving high traffic. It may be due to a sudden spike in traffic. For example, when one mated site goes down, the NFs move to the given site. 2. Check the service pod logs on Kibana to determine the reason for the errors. 3. If this is expected traffic, the threshold levels may be reevaluated as per the call rate and reconfigured as described in Oracle Communications Cloud Native Core, Network Slice Selection Function REST Specification Guide.

6.3.3.4 OcnssfOverloadThresholdBreachedL4

Table 6-185 OcnssfOverloadThresholdBreachedL4

Field	Details
Description	'Overload Level of {{$labels.app_kubernetes_io_name}} service is L4'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: Overload Level of {{$labels.app_kubernetes_io_name}} service is L4'
Severity	Critical
Condition	NSSF Services have breached their configured threshold of Level L4 for any of the aforementioned metrics. Thresholds are configured for CPU, svc_failure_count, svc_pending_count, and memory.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9019
Metric Used	load_level
Recommended Actions	The alert is cleared when the Ingress Traffic rate falls below the configured L4 threshold. Note: The thresholds can be configured using REST API. Steps: Reassess the reasons leading to NSSF receiving additional traffic. If this is unexpected, contact My Oracle Support. 1. Refer to the alert to determine which service is receiving high traffic. It may be due to a sudden spike in traffic. For example, when one mated site goes down, the NFs move to the given site. 2. Check the service pod logs on Kibana to determine the reason for the errors. 3. If this is expected traffic, the threshold levels may be reevaluated as per the call rate and reconfigured as described in Oracle Communications Cloud Native Core, Network Slice Selection Function REST Specification Guide.
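
The four overload alerts differ only in level, severity, and OID. As a quick reference, that mapping can be sketched in Python as follows. The helper name describe_overload_level is hypothetical, and passing the level as a string such as "L1" is an assumption made for illustration; it does not describe how the load_level metric is encoded.

```python
# Severity and OID suffix for each overload level, copied from the alert tables.
OID_PREFIX = "1.3.6.1.4.1.323.5.3.40.1.2"

OVERLOAD_ALERTS = {
    "L1": ("Warning", 9016),
    "L2": ("Minor", 9017),
    "L3": ("Major", 9018),
    "L4": ("Critical", 9019),
}


def describe_overload_level(level: str) -> dict:
    """Return alert name, severity, and OID for a level (hypothetical helper)."""
    severity, suffix = OVERLOAD_ALERTS[level]
    return {
        "alert": f"OcnssfOverloadThresholdBreached{level}",
        "severity": severity,
        "oid": f"{OID_PREFIX}.{suffix}",
    }
```

For example, describe_overload_level("L3") returns the Major severity together with the OID ending in 9018.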

6.3.3.5 OcnssfScpMarkedAsUnavailable

Table 6-186 OcnssfScpMarkedAsUnavailable

Field	Details
Description	'An SCP has been marked unavailable'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : One of the SCP has been marked unavailable'
Severity	Major
Condition	One of the SCPs has been marked unhealthy.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9020
Metric Used	'oc_egressgateway_peer_health_status'
Recommended Actions	This alert is cleared when the unavailable SCPs become available.

6.3.3.6 OcnssfAllScpMarkedAsUnavailable

Table 6-187 OcnssfAllScpMarkedAsUnavailable

Field	Details
Description	'All SCPs have been marked unavailable'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : All SCPs have been marked as unavailable'
Severity	Critical
Condition	All SCPs have been marked unavailable.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9021
Metric Used	'oc_egressgateway_peer_count and oc_egressgateway_peer_available_count'
Recommended Actions	NF clears the critical alarm when at least one SCP peer in a peer set becomes available, even if all other SCP or SEPP peers in that peer set are still unavailable.

6.3.3.7 OcnssfTLSCertificateExpireMinor

Table 6-188 OcnssfTLSCertificateExpireMinor

Field	Details
Description	'TLS certificate to expire in 6 months.'
Summary	'namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : TLS certificate to expire in 6 months'
Severity	Minor
Condition	This alert is raised when the TLS certificate is about to expire in six months.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9028
Metric Used	security_cert_x509_expiration_seconds
Recommended Actions	The alert is cleared when the TLS certificate is renewed. For more information about certificate renewal, see the "Creating Private Keys and Certificate" section in the Oracle Communications Cloud Native Core, Network Slice Selection Function Installation, Upgrade, and Fault Recovery Guide.

6.3.3.8 OcnssfTLSCertificateExpireMajor

Table 6-189 OcnssfTLSCertificateExpireMajor

Field	Details
Description	'TLS certificate to expire in 3 months.'
Summary	'namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : TLS certificate to expire in 3 months'
Severity	Major
Condition	This alert is raised when the TLS certificate is about to expire in three months.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9029
Metric Used	security_cert_x509_expiration_seconds
Recommended Actions	The alert is cleared when the TLS certificate is renewed. For more information about certificate renewal, see the "Creating Private Keys and Certificate" section in the Oracle Communications Cloud Native Core, Network Slice Selection Function Installation, Upgrade, and Fault Recovery Guide.

6.3.3.9 OcnssfTLSCertificateExpireCritical

Table 6-190 OcnssfTLSCertificateExpireCritical

Field	Details
Description	'TLS certificate to expire in one month.'
Summary	'namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : TLS certificate to expire in 1 month'
Severity	Critical
Condition	This alert is raised when the TLS certificate is about to expire in one month.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9030
Metric Used	security_cert_x509_expiration_seconds
Recommended Actions	The alert is cleared when the TLS certificate is renewed. For more information about certificate renewal, see the "Creating Private Keys and Certificate" section in the Oracle Communications Cloud Native Core, Network Slice Selection Function Installation, Upgrade, and Fault Recovery Guide.
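
The three certificate expiry alerts form a ladder of six months, three months, and one month. A minimal sketch of that ladder follows. It assumes that security_cert_x509_expiration_seconds holds the expiry time as a Unix timestamp, that a month is approximated as 30 days, and that the comparison is inclusive; none of these is confirmed by this guide, and the function name expiry_severity is hypothetical.

```python
import time
from typing import Optional

SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: one month is approximated as 30 days.
# Ordered from most to least severe so the first match wins.
THRESHOLDS = [
    (30 * SECONDS_PER_DAY, "Critical"),  # one month
    (90 * SECONDS_PER_DAY, "Major"),     # three months
    (180 * SECONDS_PER_DAY, "Minor"),    # six months
]


def expiry_severity(expiration_epoch: float, now: Optional[float] = None) -> Optional[str]:
    """Return the most severe matching level, or None if expiry is further away."""
    now = time.time() if now is None else now
    remaining = expiration_epoch - now
    for limit, severity in THRESHOLDS:
        if remaining <= limit:
            return severity
    return None
```

Passing now explicitly keeps the function deterministic, which is convenient when checking the thresholds against known certificate dates.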

6.3.3.10 OcnssfNrfInstancesInDownStateMajor

Table 6-191 OcnssfNrfInstancesInDownStateMajor

Field	Details
Description	'When current operative status of any NRF Instance is unavailable/unhealthy'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Few of the NRF instances are in unavailable state'
Severity	Major
Condition	When the sum of the metric values across all NRF instances is greater than 0 but less than 3.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9032
Metric Used	nrfclient_nrf_operative_status
Recommended Actions	This alert is cleared when the operative status of all the NRF instances is available/healthy. Steps: Check the nrfclient_nrf_operative_status metric value of each NRF instance. The instances for which the metric value is '0' are down. Bring up the NRF instances that are down. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use the Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

6.3.3.11 OcnssfAllNrfInstancesInDownStateCritical

Table 6-192 OcnssfAllNrfInstancesInDownStateCritical

Field	Details
Description	'When current operative status of all the NRF Instances is unavailable/unhealthy'
Summary	'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : All the NRF instances are in unavailable state'
Severity	Critical
Condition	When the sum of the metric values across all NRF instances is equal to 0.
OID	1.3.6.1.4.1.323.5.3.40.1.2.9031
Metric Used	nrfclient_nrf_operative_status
Recommended Actions	This alert is cleared when the current operative status of at least one NRF instance is available/healthy. Steps: Bring up at least one NRF instance. If the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use the Cloud Native Core Network Function Data Collector tool for capturing the logs. For more information, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.
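
The conditions of the two NRF alerts can be read together as one classification over the per-instance values of nrfclient_nrf_operative_status. The sketch below assumes that an available instance reports 1 (the guide only states that 0 means down) and reuses the upper bound of 3 from the Major condition as written. The function name nrf_alert_severity is hypothetical.

```python
from typing import Iterable, Optional


def nrf_alert_severity(operative_status: Iterable[int]) -> Optional[str]:
    """Classify NRF availability from per-instance operative status values.

    Assumption: each value is 1 (available) or 0 (down).
    """
    total = sum(operative_status)
    if total == 0:
        return "Critical"  # OcnssfAllNrfInstancesInDownStateCritical
    if 0 < total < 3:
        return "Major"     # OcnssfNrfInstancesInDownStateMajor
    return None            # neither condition is met
```

With three instances, the values [1, 0, 1] fall under the Major condition, while [0, 0, 0] fall under the Critical condition.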

6.3.4 Configuring SNMP Notifier

This section describes the procedure to configure SNMP Notifier.

The SNMP MIB files define the MIB objects. When they are loaded into Wireshark or MIB tools such as MIB Browser and Trap Receiver, users can see the detailed MIB definitions instead of only the OIDs. All of these tools require syntactically valid MIB files, and even minor errors can cause the upload to fail.

Procedure to Validate MIB Files

This procedure explains how to validate the MIB files and how to fix some common errors.

Download MIB Files: Download the MIB files onto your PC. In the NSSF environment, the files are named as follows:
- NSSF-MIB.mib
- NSSF-TC.mib
- TEKELEC-TOPLEVEL-REG.mib
  These files are located in the path /ocnssf/observability/mib.
Open Simpleweb MIB Validator: Open the Simpleweb MIB validator page.
Upload MIB Files:
1. Under "Enter the local file name of your MIB module," click the "Choose File" button.
2. Select the MIB file you want to validate, click "Open," and the file will be added to the web page.
Inspect MIB Definitions:
1. Open the MIB file on your PC using any suitable application.
2. In the "IMPORTS" section, identify the MIB definitions listed after the word "FROM."
3. Skip standard MIBs such as "SNMPv2-SMI," "SNMPv2-TC," etc., as they are already included in the Simpleweb MIB validator by default.
Handling Private MIBs:
1. For other MIBs, especially private MIB files, locate the corresponding MIB file for each definition.
2. If the MIB file names differ from the MIB DEFINITIONS, rename the file as <MIB definition>.mib. For example:
  - Original MIB file name: Private-MIB-File.mib
  - MIB definition in "IMPORTS" section: FROM PRIVATE-MIB
  - Rename the file to: PRIVATE-MIB.mib
Upload Corrected File Names:
1. Upload the files with the corrected names into the Simpleweb MIB validator. In NSSF, the corrected file names will be NSSF-MIB.mib, NSSF-TC.mib, and TEKELEC-TOPLEVEL-REG.mib.
2. Click "Choose Files" and then click "Submit" to complete the process.

By following these steps, you can ensure the proper validation of MIB files, including the handling of standard and private MIBs.
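
The inspection and renaming steps can also be scripted. The following sketch reads the IMPORTS section of a MIB file, skips the standard modules named in the procedure, and lists the file names expected for the remaining imports. The function name required_mib_files is hypothetical, the set of standard modules contains only the two named in the procedure and can be extended, and the parsing is deliberately simple: it does not handle comments or semicolons inside the IMPORTS section.

```python
import re
from typing import List

# Standard modules named in the procedure as already included in the validator.
STANDARD_MIBS = {"SNMPv2-SMI", "SNMPv2-TC"}


def required_mib_files(mib_text: str) -> List[str]:
    """List the <MIB definition>.mib file names needed for non-standard imports."""
    # The IMPORTS section runs from the keyword to the first semicolon.
    match = re.search(r"\bIMPORTS\b(.*?);", mib_text, re.DOTALL)
    if match is None:
        return []
    modules = re.findall(r"\bFROM\s+([A-Za-z][A-Za-z0-9-]*)", match.group(1))
    needed = []
    for module in modules:
        if module not in STANDARD_MIBS and module not in needed:
            needed.append(module)
    return [f"{module}.mib" for module in needed]
```

Comparing the returned names with the files in the download directory shows which files need to be renamed before they are uploaded.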

Fixes for Common MIB Compliance Issues

Import SNMPv2-SMI:

Message: "Invalid status 'current' in SMIv1 MIB"

Fix Method: Add "MODULE-IDENTITY FROM SNMPv2-SMI"

Example format:


IMPORTS
     TEXTUAL-CONVENTION FROM SNMPv2-TC
     MODULE-IDENTITY FROM SNMPv2-SMI
     oracleCNE FROM TEKELEC-TOPLEVEL-REG;

Last Update and Revision:

Messages:

"Revision date after last update"
"Revision not in reverse chronological order"
"Revision for the last update is missing"

Fix Method:

Ensure LAST-UPDATED is exactly the same as the most recent REVISION.
Create a separate "REVISION HISTORY" section.
Put all REVISIONs in this section in reverse chronological order.

Example format:


oracleNssfMIB MODULE-IDENTITY
     LAST-UPDATED "202302091734Z"
     ...
     REVISION    "202302091734Z"
     DESCRIPTION "Updated."
     ::= { oracleNSSF 1 }
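
These three rules can be checked before uploading a file. The sketch below extracts the LAST-UPDATED and REVISION timestamps and reports which of the three validator messages would be expected. It assumes that all timestamps use the same YYYYMMDDHHMMZ form as the example, so that string comparison matches chronological order. The function name check_revisions is hypothetical.

```python
import re
from typing import List


def check_revisions(mib_text: str) -> List[str]:
    """Return the expected validator messages (an empty list if all rules hold).

    Assumption: every timestamp uses the YYYYMMDDHHMMZ form.
    """
    last_updated = re.search(r'\bLAST-UPDATED\s+"(\d+Z)"', mib_text)
    revisions = re.findall(r'\bREVISION\s+"(\d+Z)"', mib_text)
    if last_updated is None or not revisions:
        raise ValueError("LAST-UPDATED or REVISION clause not found")

    problems = []
    newest = last_updated.group(1)
    if any(revision > newest for revision in revisions):
        problems.append("Revision date after last update")
    if revisions != sorted(revisions, reverse=True):
        problems.append("Revision not in reverse chronological order")
    if newest not in revisions:
        problems.append("Revision for the last update is missing")
    return problems
```

For the example format shown above, where LAST-UPDATED and the only REVISION carry the same timestamp, the function returns an empty list.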

Case-Sensitive Names:

Message: "<name> should start with a lowercase letter"

Fix Method: Change the first letter to lowercase.

Example format:


OCNSSFConfigurationServiceDown NOTIFICATION-TYPE
     ...

Fix:


ocnssfConfigurationServiceDown NOTIFICATION-TYPE
     ...
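
To find such names before running the validator, a short scan can help. The sketch below lists the names of definitions that start with an uppercase letter. It only recognizes definitions written as a name followed by one of four macro keywords at the end of the same line, which is an assumption about the file layout, and the function name names_needing_lowercase is hypothetical.

```python
import re
from typing import List

# A definition line: a name followed by a macro keyword at the end of the line.
DEFINITION = re.compile(
    r"^[ \t]*([A-Za-z][A-Za-z0-9-]*)[ \t]+"
    r"(?:NOTIFICATION-TYPE|OBJECT-TYPE|OBJECT-IDENTITY|MODULE-IDENTITY)[ \t\r]*$",
    re.MULTILINE,
)


def names_needing_lowercase(mib_text: str) -> List[str]:
    """Return defined names whose first letter is uppercase."""
    return [name for name in DEFINITION.findall(mib_text) if name[0].isupper()]
```

The scan only reports the names. The replacement name is left to the editor, because the fix shown above lowercases the whole OCNSSF prefix rather than only the first letter.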

Duplicate OID:

Message: "Identifier 'ocnssfIngressGatewayServiceDown' registers object identifier already registered by 'ocnssfConfigurationServiceDown'"

Fix Method: Change the OID to an unused number.

Example format:


ocnssfIngressGatewayServiceDown NOTIFICATION-TYPE
     ...
     ::= { oracleNssfMIBNotifications  9006 }

Fix:


ocnssfIngressGatewayServiceDown NOTIFICATION-TYPE
     ...
     ::= { oracleNssfMIBNotifications  9007 }
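
Duplicate registrations can be located with a simple scan of the NOTIFICATION-TYPE definitions. The sketch below groups notification names by their parent and number and returns every pair that is used more than once. It assumes that each definition ends with an assignment in the form shown above, and the function name find_duplicate_oids is hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List

# A NOTIFICATION-TYPE definition up to its "::= { parent number }" assignment.
ASSIGNMENT = re.compile(
    r"^\s*(\w[\w-]*)\s+NOTIFICATION-TYPE\b.*?::=\s*\{\s*([\w-]+)\s+(\d+)\s*\}",
    re.DOTALL | re.MULTILINE,
)


def find_duplicate_oids(mib_text: str) -> Dict[str, List[str]]:
    """Map each 'parent number' pair used more than once to the names using it."""
    by_oid = defaultdict(list)
    for name, parent, number in ASSIGNMENT.findall(mib_text):
        by_oid[f"{parent} {number}"].append(name)
    return {oid: names for oid, names in by_oid.items() if len(names) > 1}
```

An empty result means that no two notifications in the file share the same parent and number. Any reported pair shows which names conflict, so that one of them can be moved to an unused number.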

MIB File Missing "END":

Message: "Syntax error, unexpected $end"

Fix Method: Add the word "END" at the end of the MIB file.

Example format:


ocnssfPerfInfoServiceDown NOTIFICATION-TYPE
     ...
     ::= { oracleNssfMIBNotifications  9036 }
END
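
The check for a missing END can be sketched as follows. The code ignores blank lines and treats everything after "--" on a line as a comment, which is a simplification of the SMI comment rules. The function names ends_with_end and ensure_end are hypothetical.

```python
def ends_with_end(mib_text: str) -> bool:
    """Return True if the last non-blank, non-comment token is END."""
    # Simplification: everything after "--" on a line is treated as a comment.
    lines = [line.split("--", 1)[0].strip() for line in mib_text.splitlines()]
    lines = [line for line in lines if line]
    return bool(lines) and lines[-1].split()[-1] == "END"


def ensure_end(mib_text: str) -> str:
    """Append END on its own line if the file does not already end with it."""
    if ends_with_end(mib_text):
        return mib_text
    return mib_text.rstrip() + "\nEND\n"
```

Calling ensure_end on a file that already ends with END returns the text unchanged.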