SNMP Monitoring

The SBC supports Simple Network Management Protocol (SNMP) polling: GET requests retrieve operational data and SET requests modify configuration. The SBC supports SNMPv1 and SNMPv2c for GET and SET requests. Oracle recommends using SNMPv3 as of the ECz8.0 release. Release-specific Oracle SNMP MIBs can be found on docs.oracle.com for the release in use.

Below is a recommended list of SNMP OIDs to GET every 5 minutes from the Oracle System Management MIB (ap-smgmt.mib). These will provide useful data on overall system performance and security issues.

apSysMgmtGeneralObjects (1.3.6.1.4.1.9148.3.2.1.1)

  • apSysCPUUtil (1.3.6.1.4.1.9148.3.2.1.1.1) - Percentage of CPU utilization
  • apSysMemoryUtil (1.3.6.1.4.1.9148.3.2.1.1.2) - Percentage of memory utilization
  • apSysHealthScore (1.3.6.1.4.1.9148.3.2.1.1.3) - System health percentage
  • apSysRedundancy (1.3.6.1.4.1.9148.3.2.1.1.4) - Active or Standby SD
  • apSysGlobalConSess (1.3.6.1.4.1.9148.3.2.1.1.5) - Total instant number of system concurrent sessions
  • apSysGlobalCPS (1.3.6.1.4.1.9148.3.2.1.1.6) - Instant number of system calls per second
  • apSysNATCapacity (1.3.6.1.4.1.9148.3.2.1.1.7) - Percentage of NAT table in CAM utilization
  • apSysARPCapacity (1.3.6.1.4.1.9148.3.2.1.1.8) - Percentage of ARP table in CAM utilization
  • apSysLicenseCapacity (1.3.6.1.4.1.9148.3.2.1.1.10) - Percentage of licensed sessions in use
  • apSysSipStatsActiveLocalContacts (1.3.6.1.4.1.9148.3.2.1.1.11) - Current number of cached SIP registered contacts
  • apSysApplicationCPULoadRate (1.3.6.1.4.1.9148.3.2.1.1.16) - Average load rate of applications over past 10 seconds
  • apSysSipEndptDemTrustToUntrust (1.3.6.1.4.1.9148.3.2.1.1.19) - Number of SIP endpoints demoted from trusted to untrusted queue
  • apSysSipEndptDemUntrustToDeny (1.3.6.1.4.1.9148.3.2.1.1.20) - Number of SIP endpoints demoted from untrusted queue to denied
  • apSysRejectedMessages (1.3.6.1.4.1.9148.3.2.1.1.18.0) - Number of messages rejected by the SBC due to matching criteria

apSysStorageSpaceTable (1.3.6.1.4.1.9148.3.2.1.1.23), apSysStorageSpaceEntry (1.3.6.1.4.1.9148.3.2.1.1.23.1)

  • apSysVolumeAvailSpace (1.3.6.1.4.1.9148.3.2.1.1.23.1.4) - Space remaining on the Storage Expansion Module (in MB)

apSysMgmtInterfaceObjects (1.3.6.1.4.1.9148.3.2.1.8), apSysMgmtPhyUtilTable (1.3.6.1.4.1.9148.3.2.1.8.1)

  • apPhyUtilTableRxUtil (1.3.6.1.4.1.9148.3.2.1.8.1.1.1) - Received Network Interface utilization over one second period
  • apPhyUtilTableTxUtil (1.3.6.1.4.1.9148.3.2.1.8.1.1.2) - Transmitted Network Interface utilization over one second period
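
As an illustration of how the scalar objects above might be polled, the following minimal sketch builds argument lists for the net-snmp snmpget command. The helper names, the host address, and the community string are assumptions made for this example and do not come from the SBC documentation. SNMPv2c is shown for brevity; an SNMPv3 deployment would use the corresponding net-snmp user and security-level options instead.

```python
# Hypothetical helper: builds net-snmp `snmpget` argument lists for some of
# the scalar objects listed above. Host and community are placeholders.
BASE = "1.3.6.1.4.1.9148.3.2.1.1"  # apSysMgmtGeneralObjects

# Last sub-identifier of each object, as given in the list above.
SCALARS = {
    "apSysCPUUtil": 1,
    "apSysMemoryUtil": 2,
    "apSysHealthScore": 3,
    "apSysRedundancy": 4,
    "apSysGlobalConSess": 5,
    "apSysGlobalCPS": 6,
    "apSysNATCapacity": 7,
    "apSysARPCapacity": 8,
    "apSysLicenseCapacity": 10,
}

POLL_INTERVAL_SECONDS = 5 * 60  # the recommended 5 minute interval


def scalar_oid(name):
    """Return the instance OID (object OID plus the .0 scalar suffix)."""
    return f"{BASE}.{SCALARS[name]}.0"


def snmpget_argv(host, community, names):
    """Build one SNMPv2c snmpget command covering all requested objects."""
    oids = [scalar_oid(name) for name in names]
    return ["snmpget", "-v", "2c", "-c", community, "-On", host] + oids


if __name__ == "__main__":
    # Print the command a scheduler could run every POLL_INTERVAL_SECONDS.
    print(" ".join(snmpget_argv("192.0.2.10", "public", list(SCALARS))))
```

The returned list can be passed to a process runner or scheduler of choice; sending all objects in one request keeps the polling load on the SBC to a single PDU per cycle.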

Realm Statistics

Below is a recommended list of SNMP OIDs to GET every 5 minutes from the Oracle System Management MIB (ap-smgmt.mib). These will provide useful SIP performance data on a per realm basis.

apSigRealmStatsTable (1.3.6.1.4.1.9148.3.2.1.2.4), apSigRealmStatsEntry (1.3.6.1.4.1.9148.3.2.1.2.4.1)

  • apSigRealmStatsRealmName (1.3.6.1.4.1.9148.3.2.1.2.4.1.2) - Realm name with corresponding stats
  • apSigRealmStatsCurrentActiveSessionsInbound (1.3.6.1.4.1.9148.3.2.1.2.4.1.3) - Number of active inbound sessions for this realm
  • apSigRealmStatsCurrentSessionRateInbound (1.3.6.1.4.1.9148.3.2.1.2.4.1.4) - CPS rate for active inbound sessions for this realm
  • apSigRealmStatsCurrentActiveSessionsOutbound (1.3.6.1.4.1.9148.3.2.1.2.4.1.5) - Number of active outbound sessions for this realm
  • apSigRealmStatsCurrentSessionRateOutbound (1.3.6.1.4.1.9148.3.2.1.2.4.1.6) - CPS rate for active outbound sessions for this realm
  • apSigRealmStatsTotalSessionsInbound (1.3.6.1.4.1.9148.3.2.1.2.4.1.7) - Total number of inbound sessions during the last 100 second sliding window period for this realm
  • apSigRealmStatsPeriodHighInbound (1.3.6.1.4.1.9148.3.2.1.2.4.1.9) - Highest number of concurrent inbound sessions during the last 100 second sliding window period for this realm
  • apSigRealmStatsTotalSessionsOutbound (1.3.6.1.4.1.9148.3.2.1.2.4.1.11) - Total number of outbound sessions during the last 100 second sliding window period for this realm
  • apSigRealmStatsPeriodHighOutbound (1.3.6.1.4.1.9148.3.2.1.2.4.1.13) - Highest number of concurrent outbound sessions during the last 100 second sliding window period for this realm
  • apSigRealmStatsMaxBurstRate (1.3.6.1.4.1.9148.3.2.1.2.4.1.15) - Maximum burst rate of traffic measured during the last 100 second sliding window period (combined inbound and outbound) for this realm
  • apSigRealmStatsPeriodASR (1.3.6.1.4.1.9148.3.2.1.2.4.1.18) - The answer-to-seizure ratio, expressed as a percentage, during the last 100 second sliding window period for this realm. For example, a value of 90 represents 90% or 0.90
  • apSigRealmStatsRealmStatus (1.3.6.1.4.1.9148.3.2.1.2.4.1.30) - State of the specified realm (INS, constraintviolation, or callLoadReduction)

The same list of statistics is also available per Session Agent.
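
Because the realm statistics are table columns, a poller usually walks the table and regroups the results by row. The sketch below shows one way to do that for numeric output in the style printed by net-snmp's snmpwalk with the -On option. It assumes each row is identified by a single integer index after the column number; the function name, the chosen columns, and the sample realm names are illustrative only.

```python
import re

ENTRY = "1.3.6.1.4.1.9148.3.2.1.2.4.1"  # apSigRealmStatsEntry

# Column number -> field name, for a subset of the columns listed above.
COLUMNS = {
    2: "realm_name",                # apSigRealmStatsRealmName
    3: "active_sessions_inbound",   # apSigRealmStatsCurrentActiveSessionsInbound
    5: "active_sessions_outbound",  # apSigRealmStatsCurrentActiveSessionsOutbound
    18: "period_asr",               # apSigRealmStatsPeriodASR
}

# Matches lines such as: .<ENTRY>.<column>.<index> = Gauge32: 12
LINE = re.compile(r"^\.?" + re.escape(ENTRY) + r"\.(\d+)\.(\d+) = \w+: (.*)$")


def group_by_realm(walk_lines):
    """Group walked values into one dict per table row, keyed by row index."""
    rows = {}
    for line in walk_lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # not part of this table, or an unexpected format
        column, index = int(match.group(1)), int(match.group(2))
        if column not in COLUMNS:
            continue
        value = match.group(3).strip('"')
        # Only the realm name is kept as text; the other columns are numeric.
        rows.setdefault(index, {})[COLUMNS[column]] = (
            value if column == 2 else int(value)
        )
    return rows


if __name__ == "__main__":
    sample = [
        '.1.3.6.1.4.1.9148.3.2.1.2.4.1.2.1 = STRING: "access"',
        ".1.3.6.1.4.1.9148.3.2.1.2.4.1.3.1 = Gauge32: 12",
        ".1.3.6.1.4.1.9148.3.2.1.2.4.1.18.1 = Gauge32: 90",
    ]
    print(group_by_realm(sample))
```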

Environmental Statistics

Below is a recommended list of SNMP OIDs to GET every 5 minutes from the Oracle System Environment Monitor MIB (ap-env-monitor.mib). These will provide useful system environmental data.

apEnvMonObjects (1.3.6.1.4.1.9148.3.3.1)

  • apEnvMonI2CState (1.3.6.1.4.1.9148.3.3.1.1) - State of environmental sensor on system chassis. A value of 2 is normal; all other values should be investigated further [4].

apEnvMonTemperatureStatusEntry (1.3.6.1.4.1.9148.3.3.1.3.1.1)

  • apEnvMonTemperatureStatusValue (1.3.6.1.4.1.9148.3.3.1.3.1.1.4) - Current temperature of mainboard PROM (in Celsius).
  • apEnvMonTemperatureState (1.3.6.1.4.1.9148.3.3.1.3.1.1.5) - State of system temperature. A value of 2 is normal; all other values should be investigated further [4].
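
The "2 is normal" rule for the state objects above can be applied mechanically once the values have been polled. The following is a minimal sketch; the function name and the dictionary of readings are hypothetical, and the meaning of each non-normal value should be taken from the MIB for the release in use.

```python
NORMAL_STATE = 2  # per the descriptions above, a value of 2 is normal


def abnormal_states(readings):
    """Return the names of state objects whose polled value is not normal."""
    return sorted(
        name for name, state in readings.items() if state != NORMAL_STATE
    )


if __name__ == "__main__":
    polled = {"apEnvMonI2CState": 2, "apEnvMonTemperatureState": 4}
    print(abnormal_states(polled))
```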

Enterprise SNMP OIDs in a High Availability (HA) environment

SNMP polling is recommended for both Active and Standby SDs. Data from both units is useful when troubleshooting, and some values, such as CPU, memory, interface statistics, and health score, are independent between the Active and Standby SDs.

Of particular interest is the mib-system-name field in the system-config object, which is used as part of the MIB-II sysName identifier. When polled for MIB-II sysName, an SBC returns a concatenation of its assigned target name (as specified in the boot params), a dot, and the mib-system-name value from the system-config, which is common to both units. Thus, systems in an HA pair named “acme1” and “acme2”, when assigned a mib-system-name of “sbc.bedford”, would return acme1.sbc.bedford and acme2.sbc.bedford when polled, respectively.
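
The sysName rule described above is a simple concatenation, which a monitoring script can use to tell the two units of an HA pair apart. The sketch below is illustrative; both function names are hypothetical.

```python
def expected_sys_name(target_name, mib_system_name):
    """Join the boot-parameter target name and mib-system-name with a dot."""
    return f"{target_name}.{mib_system_name}"


def target_from_sys_name(sys_name, mib_system_name):
    """Recover the target name from a polled sysName, or None if no match."""
    suffix = "." + mib_system_name
    if sys_name.endswith(suffix) and len(sys_name) > len(suffix):
        return sys_name[: -len(suffix)]
    return None


if __name__ == "__main__":
    for unit in ("acme1", "acme2"):
        print(expected_sys_name(unit, "sbc.bedford"))
```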

Enterprise SNMP Traps

SNMP traps enable an SNMP agent to notify the Network Management System (NMS) of significant events using an unsolicited SNMP message. The SBC can be configured to send unsolicited SNMP traps to a configured SNMP trap receiver [4] in SNMPv1, SNMPv2c and SNMPv3 formats. The snmp-agent-mode determines the trap format, and is set under system-config.

SNMP Configuration recommendations

Under the system-config element the following settings should be enabled to provide additional visibility to system events:

  • enable-snmp-auth-traps – sends a trap for a failed authentication as part of an SNMP request; used to detect abuse
  • enable-snmp-syslog-notify – enable syslog conversion to SNMP
  • enable-snmp-monitor-traps – enable unique trap-IDs for each syslog event

Enabling the environmental monitor traps may seem advantageous, but it is not recommended; the setting should remain at its default, disabled. The same traps are already sent as part of the ap-smgmt MIB.

  • enable-env-monitor-traps – sends traps for environmental issues like temperature, voltage, fan speeds, etc.
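
The recommendations above can be checked against a device's settings during an audit. The following sketch assumes the system-config values have already been collected into a dictionary of parameter names and "enabled"/"disabled" strings; that dictionary format and the function name are assumptions for this example, not an SBC export format.

```python
# Recommended values taken from the text above. The dict representation of
# system-config is hypothetical.
RECOMMENDED = {
    "enable-snmp-auth-traps": "enabled",
    "enable-snmp-syslog-notify": "enabled",
    "enable-snmp-monitor-traps": "enabled",
    "enable-env-monitor-traps": "disabled",
}


def deviations(system_config):
    """Return {parameter: (actual, recommended)} for settings that differ."""
    return {
        name: (system_config.get(name), wanted)
        for name, wanted in RECOMMENDED.items()
        if system_config.get(name) != wanted
    }


if __name__ == "__main__":
    current = dict(RECOMMENDED, **{"enable-env-monitor-traps": "enabled"})
    print(deviations(current))
```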

SNMP Traps

The following is a selection of the most common or important traps sent by the SD. The full list of traps can be found in the MIB Reference Guide and MIB files for the release implemented:

  • apSwCfgActivateNotification (1.3.6.1.4.1.9148.3.4.3.0.1) - Generated when the activate-config command is issued at the ACLI and the configuration has been changed at run time. This trap may be seen often but is only informational and does not necessarily mean there is an issue (unless the config changes are service affecting or the change was not authorized).
  • apEnvMonStatusChangeNotification (1.3.6.1.4.1.9148.3.3.2.1.0) - Generated when the environmental state of the SBC changes. Environment traps include main board PROM temperature, CPU voltage, state of power supplies, fan speeds, etc. To receive this trap, the system-config parameter enable-env-monitor-traps needs to be enabled. An example of this trap for voltage state change is found in [4].
  • apSysMgmtGroupTrap (1.3.6.1.4.1.9148.3.2.3.0.1) - Generated when an SBC resource threshold or health score threshold is crossed, for example when NAT table usage, ARP table usage, memory usage, or CPU usage reaches 90% or greater, or when the health score for an HA pair falls below 60.
  • apSysMgmtExpDOSTrap (1.3.6.1.4.1.9148.3.2.8.0.2) - Generated when an endpoint exceeds configured thresholds and is denied access by the SD.
  • apSysMgmtInetAddrWithReasonDOSTrap (1.3.6.1.4.1.9148.3.2.8.0.4) - Available when the IDS Reporting Feature Group license (available in S-CX6.3 and above) is installed. This trap is generated when thresholds are exceeded, and contains further data on the endpoint and the reason why the trap was generated. When IDS Reporting is installed, the apSysMgmtExpDOSTrap is disabled.
  • apSysMgmtInetAddrTrustedToUntrustedDOSTrap (1.3.6.1.4.1.9148.3.2.8.0.5) - This trap is available in S-C[xz]6.4.0 and above. It is generated when the number of rejected messages exceeds the configured threshold and the endpoint is demoted from the trusted to untrusted list. The trap-on-demote-to-untrusted setting under media-manager must be enabled for this trap to be sent.
  • apSysMgmtRejectedMesagesThresholdExeededTrap (1.3.6.1.4.1.9148.3.2.6.0.57) - This trap is available in S-C[xz]6.4.0 and above. It is generated when the number of rejected messages exceeds the configured threshold and the endpoint is put into the untrusted queue.
  • apSysMgmtSipRejectionTrap (1.3.6.1.4.1.9148.3.2.10.0.1) - Generated when a SIP INVITE or REGISTRATION request fails.
  • apSysMgmtPowerTrap (1.3.6.1.4.1.9148.3.2.6.0.1) - Generated if a power supply is powered down, powered up, inserted (present) or removed (not present).
  • apSysMgmtTempTrap (1.3.6.1.4.1.9148.3.2.6.0.2) - Generated if the system temperature falls below the monitoring level.
  • apSysMgmtFanTrap (1.3.6.1.4.1.9148.3.2.6.0.3) - Generated if a fan unit speed falls below the monitoring level.
  • apSysMgmtTaskSuspendTrap (1.3.6.1.4.1.9148.3.2.6.0.4) - Generated if a critical task running on the system enters a suspended state.
  • apSysMgmtRedundancyTrap (1.3.6.1.4.1.9148.3.2.6.0.5) - Generated if either the primary or secondary SBC in an HA pair changes state.
  • apSysMgmtMediaPortsTrap (1.3.6.1.4.1.9148.3.2.6.0.6) - Generated if port allocation fails at a percentage higher than or equal to the system’s default threshold rate. Port allocation failure rates are checked every 30 seconds. The trap is sent when the failure rate is at 50% or higher. After that time, the trap is sent every 30 seconds until the failure rate drops below 35%. The clear trap is sent once the failure rate drops below 5%.
  • apSysMgmtMediaBandwidthTrap (1.3.6.1.4.1.9148.3.2.6.0.7) - Generated if bandwidth allocation fails at a percentage higher than or equal to the system’s default threshold rate. Bandwidth allocation failure rates are checked every 30 seconds. The trap is sent when the failure rate is at 50% or higher. After that time, the trap is sent every 30 seconds until the failure rate drops below 35%. The clear trap is sent once the failure rate drops below 5%.
  • apSysMgmtPhyUtilThresholdTrap (1.3.6.1.4.1.9148.3.2.6.0.66) - Generated when the media port’s utilization crosses a configured threshold. If overload protection is enabled, new requests will be refused when utilization reaches the critical threshold. Minor, major, and critical thresholds can be configured.
  • apSysMgmtGatewayUnreachableTrap (1.3.6.1.4.1.9148.3.2.6.0.10) - Generated if the SBC cannot reach a configured gateway. Only applicable when the gateway heartbeat feature is configured [7].
  • apSysMgmtRadiusDownTrap (1.3.6.1.4.1.9148.3.2.6.0.11) - Generated if any configured RADIUS accounting server becomes unreachable.
  • apSysMgmtSAStatusChangeTrap (1.3.6.1.4.1.9148.3.2.6.0.15) - Generated when a session agent is declared unreachable or unresponsive for the following reasons:
    • signaling timeout (H.323 and SIP)
    • session agent does not respond to SIP pings (SIP only) - This causes the session agent to be placed out-of-service for a configurable period of time.
  • apSysMgmtInterfaceStatusChangeTrap (1.3.6.1.4.1.9148.3.2.6.0.26) - Generated when the SIP interface status changes from in-service or when constraints have been exceeded
    • apSysMgmtSipInterfaceRealmName - Realm identifier for the SIP interface (OID 1.3.6.1.4.1.9148.3.2.5.24)
    • apSysMgmtSipInterfaceIP - IP address of the first SIP port in the SIP interface (OID 1.3.6.1.4.1.9148.3.2.5.25)
    • apSysMgmtSipInterfaceStatus - Code is 0 (OID 1.3.6.1.4.1.9148.3.2.5.26)
    • apSysMgmtSipInterfaceStatusReason - Status reasons are in-service (3) and constraintExceeded (4) (OID 1.3.6.1.4.1.9148.3.2.5.27)
  • apSysMgmtNTPServerUnreachableTrap (1.3.6.1.4.1.9148.3.2.6.0.30) - Generated if the NTP server becomes unreachable.
    • apSysMgmtNTPServer - Server that is unreachable (OID 1.3.6.1.4.1.9148.3.2.5.31)
  • apLicenseApproachingCapacityNotification (1.3.6.1.4.1.9148.3.5.3.0.1) - Generated when the total number of active sessions on the system (across all protocols) is within 98 - 100% of the licensed capacity.
  • apSysMgmtAuthenticationFailedTrap (1.3.6.1.4.1.9148.3.2.6.0.16) - Generated when an attempt to log in to the SBC through telnet, SSH, or by using the console fails for any reason.
  • apSysMgmtAdminAuthLockoutTrap (1.3.6.1.4.1.9148.3.2.6.0.64) - Generated upon system lockout after multiple authentication failures.
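
The 50% / 35% / 5% behavior described for apSysMgmtMediaPortsTrap and apSysMgmtMediaBandwidthTrap in the list above is a form of hysteresis, which can help when interpreting a sequence of received traps. The sketch below models one reading of that description: a trap is sent on each 30 second check once the failure rate reaches 50%, repeats stop when the rate drops below 35%, and a clear is sent when it drops below 5%. The function name is hypothetical, and the handling of cases the text does not spell out (for example, a rate that climbs back to 50% before a clear) is an assumption; actual SBC behavior may differ.

```python
RAISE_AT = 50.0      # trap first sent at or above this failure rate (%)
REPEAT_UNTIL = 35.0  # repeated traps stop once the rate drops below this
CLEAR_BELOW = 5.0    # clear trap sent once the rate drops below this


def trap_actions(failure_rates):
    """Map one failure-rate sample per 30 second check to an action.

    Returns a list with "trap", "clear", or None for each sample.
    """
    actions = []
    alarmed = False    # a trap has been sent and not yet cleared
    repeating = False  # traps are currently re-sent on every check
    for rate in failure_rates:
        if rate >= RAISE_AT:
            alarmed = repeating = True
        elif rate < REPEAT_UNTIL:
            repeating = False
        if alarmed and rate < CLEAR_BELOW:
            alarmed = False
            actions.append("clear")
        elif repeating:
            actions.append("trap")
        else:
            actions.append(None)
    return actions


if __name__ == "__main__":
    print(trap_actions([10, 55, 40, 30, 20, 3, 2]))
```

In this model a rate between 5% and 35% produces no traps at all, so the absence of repeated traps does not by itself mean the condition has cleared; only the clear trap does.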

SNMP Traps in HA environment

Once the trap-receiver has been configured, the SBC will monitor and send a trap according to the configured filter-level value. The same trap receiver is used by both units in an HA pair since the configuration is synchronized between the two. Furthermore, the Active or Standby SBC will send a trap independently if it is related to the hardware, interface status, gateway reachability, temperature, etc.