2.3.4 Maintenance

The DSR provides the following maintenance capabilities:

  • Alarms and Events
  • Measurements
  • Key Performance Indicators
  • Bulk Import/Export

Alarms and Events

The platform and DSR software raise alarms (minor, major, or critical) and events for a wide variety of conditions. These are sent immediately to the OAM system and are also sent to the operator’s network management system using SNMP. Alarm and event logs are stored at the OAM for up to seven days. The OAM provides a dashboard view of all alarms on the downstream MPs. This information is maintained locally for up to three days.

Figure 2-3 Flow of Alarms



The following are some of the alarms and events supported by DSR:

  • Connection to peer failed/restored
  • Peer unavailable/available
  • Connection to peer congested/not-congested
  • Route list available/unavailable
  • OAM server failed/restored
  • MP failed/restored
  • MP entered/exited/changed local congestion
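
The paired conditions in the list above (failed/restored, unavailable/available) suggest that an alarm is raised by one event and cleared by its counterpart. The following is a minimal Python sketch of that idea, not the DSR alarm API: the class `AlarmLog`, its methods, and the server and condition strings are all hypothetical. The seven-day window mirrors the OAM log retention stated in this section.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical model for illustration only; not the DSR alarm API.
SEVERITIES = ("minor", "major", "critical")
RETENTION = timedelta(days=7)  # OAM log retention stated in this section


@dataclass
class AlarmLog:
    active: dict = field(default_factory=dict)   # (source, condition) -> severity
    history: list = field(default_factory=list)  # (timestamp, action, source, condition)

    def raise_alarm(self, when, source, condition, severity):
        if severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {severity}")
        self.active[(source, condition)] = severity
        self.history.append((when, "raise", source, condition))

    def clear_alarm(self, when, source, condition):
        # Clearing an alarm that is not active leaves the active table unchanged.
        self.active.pop((source, condition), None)
        self.history.append((when, "clear", source, condition))

    def prune(self, now):
        # Drop log entries older than the retention window.
        self.history = [e for e in self.history if now - e[0] <= RETENTION]


log = AlarmLog()
t0 = datetime(2024, 1, 1)
log.raise_alarm(t0, "MP-1", "connection to peer failed", "major")
log.clear_alarm(t0 + timedelta(minutes=5), "MP-1", "connection to peer failed")
log.raise_alarm(t0 + timedelta(days=6), "MP-2", "local congestion", "minor")
log.prune(t0 + timedelta(days=8))
print(log.active)        # only the MP-2 alarm is still active
print(len(log.history))  # only the entry inside the seven-day window remains
```

Keeping the active table separate from the history means a dashboard can show current alarms even after older log entries have aged out.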

A detailed list of all alarms supported in DSR can be found in the Platform Feature Guide.

Key Performance Indicators

Key Performance Indicators (KPIs) allow the user to monitor system performance data, including CPU, memory, swap space, and uptime per server. This performance data is collected from all servers within the defined topology. The KPIs supported by the platform and DSR software are summarized in the following tables.

Table 2-1 DSR KPI Summary

| KPI Category | KPI Examples |
| --- | --- |
| Server Element KPIs | A group of KPIs that appear regardless of server role such as CPU and Network Element. |
| CAPM KPIs | Counters related to computer-aided policy making such as active templates and test templates. |
| Charging Proxy Application KPIs | KPIs related to the CPA feature such as CPA Answer Message Rate, CPA Ingress Message Rate, and cSBR Query Error Rate. |
| Communications Agent KPIs | KPIs related to the communication agent such as user data ingress message rate. |
| Connection Maintenance KPIs | KPIs pertaining to connection maintenance such as RxConnAvgMPS. |
| DIAM KPIs | Basic Diameter KPIs such as Avg Rsp time and ingress trans success rate. |
| IPFE KPIs | KPIs associated with IPFE such as CPU % and IPFE Mbytes/Sec. |
| MP KPIs | KPIs relating to the message processor such as Avg Diameter Process CPU Util and average routing message rate. |
| FABR KPIs | KPIs related to the full address based resolution feature such as Ingress Message Rate and DP Response Time Average. |
| RBAR KPIs | KPIs related to the Range Based Address Resolution feature such as Average Resolved Message Rate and Ingress Message Rate. |
| SBR KPIs | KPIs related to Session Binding Repository such as Current Session Bindings and Request Rate. |

Table 2-2 Platform KPI Summary

| KPI Name | KPI Description |
| --- | --- |
| System.CPU_UtilPct | Reflects current CPU usage, from 0 to 100%. A value of 100% means all CPU cores are completely busy. |
| System.RAM_UtilPct | Reflects the current committed RAM usage as a percentage of total physical RAM. Based on the Committed_AS measurement from Linux /proc/meminfo. This metric can exceed 100% if the kernel has committed more resources than provided by physical RAM, in which case swapping will occur. |
| System.Swap_UtilPct | Reflects the current usage of swap space as a percentage of total configured swap space. This metric ranges from 0 to 100%. |
| System.Uptime_Srv | Length of time since the last server reboot. |
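
To illustrate how the RAM and swap percentages in Table 2-2 could be derived, the following Python sketch parses /proc/meminfo-style text. It is an assumption-laden illustration, not the platform's implementation: the source states only that System.RAM_UtilPct is based on Committed_AS, so using MemTotal as the denominator and computing swap usage from SwapTotal and SwapFree are assumptions, and the function names and sample values are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value (kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def ram_util_pct(meminfo):
    # Committed address space as a percentage of physical RAM (assumed formula).
    return 100.0 * meminfo["Committed_AS"] / meminfo["MemTotal"]


def swap_util_pct(meminfo):
    total = meminfo["SwapTotal"]
    if total == 0:
        return 0.0  # assumption: report 0% when no swap is configured
    return 100.0 * (total - meminfo["SwapFree"]) / total


# Made-up sample values; on a Linux host the text would come from /proc/meminfo.
SAMPLE = """\
MemTotal:       16000000 kB
MemFree:         2000000 kB
SwapTotal:       4000000 kB
SwapFree:        3000000 kB
Committed_AS:   20000000 kB
"""

info = parse_meminfo(SAMPLE)
print(ram_util_pct(info))   # 125.0 (committed memory exceeds physical RAM)
print(swap_util_pct(info))  # 25.0
```

The sample shows why the RAM metric can exceed 100%: committed address space is a promise made by the kernel, not memory actually resident, so it is not bounded by physical RAM.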

A detailed list of all KPIs supported in DSR can be found in the Platform Feature Guide, available on the Oracle Help Center (OHC).

Measurements

All components of the DSR solution measure the amount and type of messages sent and received. Measurement data collected from all components of the solution can be used for multiple purposes, including discerning traffic patterns and user behavior, modeling traffic, sizing traffic-sensitive resources, and troubleshooting.

The measurements framework allows applications to define, update, and produce reports for various measurements:

  • Measurements are ordinary counters that count occurrences of different events within the system, for example, the number of messages received. Measurement counters are also called pegs.
  • Applications simply peg (increment) measurements upon the occurrence of the event that needs to be measured.
  • Measurements are collected and merged at the OAM servers.
  • The GUI allows reports to be generated from measurements.
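
The peg-and-merge flow described in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the DSR measurements framework: `MeasurementRegistry`, `merge_at_oam`, the server names, and the measurement names `RxMessages` and `TxMessages` are all hypothetical.

```python
from collections import Counter


class MeasurementRegistry:
    """Hypothetical per-server peg store; names are illustrative only."""

    def __init__(self, server):
        self.server = server
        self.pegs = Counter()

    def peg(self, name, count=1):
        # Pegging a measurement is just incrementing its counter.
        self.pegs[name] += count


def merge_at_oam(registries):
    """Sum the counters reported by each server into one merged view."""
    merged = Counter()
    for reg in registries:
        merged.update(reg.pegs)  # Counter.update adds counts together
    return merged


mp1 = MeasurementRegistry("MP-1")
mp2 = MeasurementRegistry("MP-2")
for _ in range(3):
    mp1.peg("RxMessages")
mp2.peg("RxMessages", 2)
mp2.peg("TxMessages")

report = merge_at_oam([mp1, mp2])
for name in sorted(report):
    print(f"{name}: {report[name]}")
# RxMessages: 5
# TxMessages: 1
```

Because pegs are plain additive counters, merging is a simple sum, which is what makes collection from many servers inexpensive.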

A subset of the measurements supported in DSR is listed in the following table. A detailed list of all measurements supported in DSR can be found in the Platform Feature Guide, available on the Oracle Help Center (OHC).

Table 2-3 DSR Measurements

| Measurement Category | Description |
| --- | --- |
| Application Routing Rules | A set of measurements associated with the usage of application routing rules. These allow the user to determine which application routing rules are most commonly used and the percentage of times that messages were successfully or unsuccessfully routed. |
| Charging Proxy Application (CPA) Performance | This group contains measurements that provide performance information that is specific to the CPA application. |
| Charging Proxy Application Exception | These measurements provide information about exceptions and unexpected messages and events that are specific to the CPA application. |
| Charging Proxy Application Session DB | These measurements provide information about events that occur when the CPA queries the SBR. |
| Computer Aided Policy Making (CAPM) | A set of usage-based measurements related to the Diameter Mediation feature. |
| Communication Agent Performance | This group is a set of measurements that provide performance information that is specific to the ComAgent protocol. They allow the user to determine how many messages are successfully forwarded and received to and from each DSR application. |
| Communication Agent Exception | This group is a set of measurements that provide information about exceptions and unexpected messages and events that are specific to the ComAgent protocol. |
| Connection Congestion | These are per-connection measurements related to Diameter connection congestion states. |
| Connection Exception | These measurements provide information about exceptions and unexpected messages and events for individual SCTP/TCP connections that are not specific to the Diameter protocol. |
| Connection Performance | This group contains measurements that provide performance information for individual SCTP/TCP connections that are not specific to the Diameter protocol. |
| DSR Application Exception | A set of measurements that provide information about exceptions and unexpected messages and events that are specific to the DSR protocol. |
| DSR Application Performance | A set of measurements that provide performance information that is specific to the DSR protocol. These allow the user to determine how many messages are successfully forwarded and received to and from each DSR application. |
| Diameter Egress Transaction | These are measurements providing information about Diameter peer-to-peer transactions forwarded to upstream peers. |
| Diameter Exception | A set of measurements that provide information about exceptions and unexpected messages and events that are specific to the Diameter protocol. |
| Diameter Ingress Transaction Exception | These measurements provide information about exceptions associated with the routing of Diameter transactions received from downstream peers. |
| Diameter Ingress Transaction Performance | A set of measurements providing information about the outcome of Diameter transactions received from downstream peers. |
| Diameter Performance | Measurements that provide performance information that is specific to the Diameter protocol. |
| Diameter Rerouting | These measurements allow the user to evaluate the number of message rerouting attempts, the reasons why rerouting occurs, and the success rate of rerouting attempts. |
| Full Address Based Resolution (FABR) Application Performance | A set of measurements that provide performance information that is specific to the FABR feature. They allow the user to determine how many messages are successfully forwarded and received to and from the FABR application. |
| Full Address Based Resolution (FABR) Application Exception | A set of measurements that provide information about exceptions and unexpected messages and events that are specific to the FABR feature. |
| IP Front End (IPFE) Exception | This group is a set of measurements that provide information about exceptions and unexpected messages and events specific to the IPFE application. |
| IP Front End (IPFE) Performance | This group contains measurements that provide performance information that is specific to the IPFE application. Counts for various expected/normal messages and events are included in this group. |
| Message Copy | These measurements from the Diameter Application Server reflect the message copy performance. They allow the user to monitor the amount of traffic being copied and the percentage of times that messages were successfully or unsuccessfully copied. |
| Message Priority | This group contains measurements that provide information on message priority assigned to ingress Diameter messages. |
| Message Processor (MP) Performance | These measurements provide performance information for an MP server. |
| OAM Alarm | General measurements about the alarm system such as number of critical, major, and minor alarms. |
| OAM System | General measurements about the overall OAM system. |
| Peer Node Performance | Measurements that provide performance information that is specific to a Peer Node. These measurements allow users to determine how many messages are successfully forwarded and received to and from each peer node. |
| Peer Routing Rules | These are measurements associated with the usage of peer routing rules. They allow the user to determine which peer routing rules are most commonly used and the percentage of times that messages were successfully or unsuccessfully routed using the route list. |
| Range Based Address Resolution (RBAR) Application Performance | A set of measurements that provide performance information that is specific to the RBAR application. They allow the user to determine how many messages are successfully forwarded and received to and from each RBAR application. |
| Range Based Address Resolution (RBAR) Exception | A set of measurements that provide information about exceptions and unexpected messages and events that are specific to the RBAR feature. |
| Route List | A set of measurements associated with the usage of route lists. They allow the user to determine which route lists are most commonly used and the percentage of times that messages were successfully or unsuccessfully routed using the route list. |
| Routing Usage | This report allows the user to evaluate how ingress request messages are being routed internally within the relay agent. |
| Session Binding Repository (SBR) Exception | A set of measurements that provide information about exceptions and unexpected messages and events specific to the SBR application. |
| Session Binding Repository (SBR) Performance | This group contains measurements that provide performance information that is specific to the SBR application. Counts for various expected/normal messages and events are included in this group. |

Bulk Import/Export

DSR supports bulk import and export of provisioning and configuration data in comma-separated values (CSV) file format. The import and export operations can be initiated from the DSR GUI. The import operation supports insertion, updating, and deletion of provisioned data. Both the import and export operations generate log files.
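
As a rough illustration of an import that inserts, updates, and deletes records while producing a log, consider the Python sketch below. The CSV layout (the `operation`, `name`, and `value` columns), the function names, and the sample records are all hypothetical; the actual DSR CSV format is not described in this section and is likely to differ.

```python
import csv
import io


def bulk_import(csv_text, store):
    """Apply insert/update/delete rows to `store`; return a list of log lines."""
    log = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # header is line 1
        op, key, value = row["operation"], row["name"], row["value"]
        if op == "insert" and key not in store:
            store[key] = value
            log.append(f"line {line_no}: inserted {key}")
        elif op == "update" and key in store:
            store[key] = value
            log.append(f"line {line_no}: updated {key}")
        elif op == "delete" and key in store:
            del store[key]
            log.append(f"line {line_no}: deleted {key}")
        else:
            # Unknown operation, duplicate insert, or missing record.
            log.append(f"line {line_no}: FAILED {op} {key}")
    return log


def bulk_export(store):
    """Write the current records back out as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "value"])
    for key in sorted(store):
        writer.writerow([key, store[key]])
    return out.getvalue()


store = {"peerA": "10.0.0.1"}
data = """operation,name,value
insert,peerB,10.0.0.2
update,peerA,10.0.0.9
delete,peerC,
"""
import_log = bulk_import(data, store)
print("\n".join(import_log))
print(bulk_export(store))
```

Recording a log line per input row, including failures, lets an operator correct and resubmit only the rows that did not apply.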