Diameter Signaling Router Policy and Charging Application
Release 8.2
E89000

PCA Data Auditing

P-DRA Binding/Session Database

In most cases, Binding and Session database records are successfully removed as a result of signaling to terminate Diameter sessions. There are, however, instances in which signaling incorrectly removes a session and does not remove a database record that should have been removed. Various cases can result in stale Binding or Session records:
  • No Diameter session termination message is received when the UE no longer wants the session.
  • IP signaling network issues prevent communication between MPs that would have resulted in one or more records being deleted.
  • SBR congestion could cause stack events to be discarded that would have resulted in removal of a Binding or Session record.

To limit the effects of stale Binding and Session records, all SBRs that own an active part of the database continually audit each table to detect and remove stale records. The audit is constrained by both minimum and maximum audit rates. The actual rate varies based on how busy the SBR server is. Audit has no impact on the engineered rate of signaling.

Generally, SBR servers are engineered to run at 80% of maximum capacity. The audit is pre-configured to run within the 20% of remaining capacity. Audit will yield to signaling. Audit can use the upper 20% only if signaling does not need it.
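The way audit yields to signaling can be sketched as a simple headroom calculation. This is an illustrative model only: the function name, the use of percentages as units, and the clamping logic are assumptions, and the minimum audit rate mentioned earlier is not modeled.

```python
def audit_budget_pct(signaling_load_pct: float, audit_max_pct: float = 20.0) -> float:
    """Return the share of SBR capacity (in percent) available to the audit.

    Hypothetical model: signaling is served first, and the audit takes
    whatever capacity is left, capped at audit_max_pct.
    """
    headroom = max(0.0, 100.0 - signaling_load_pct)
    return min(audit_max_pct, headroom)


# Signaling at its engineered 80% leaves the full 20% for the audit;
# signaling at 90% leaves only 10%.
budget_at_engineered_load = audit_budget_pct(80.0)
budget_under_heavy_load = audit_budget_pct(90.0)
```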

Binding table audits are confined to confirming with the Session SBR that the session still exists. If the session exists, the record is considered valid and the audit makes no changes. If the session does not exist, however, the record is considered to be an orphan and is removed by the audit. The Binding Audit Session Query Rate is the maximum rate at which a Binding SBR can send query messages to session servers to verify that sessions are still valid. This audit rate is configurable from the Policy and Charging > Configuration > General Options page so that the audit maximum rate can be tuned according to network traffic levels.
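The orphan check described above might look like the following minimal sketch. The dictionary layout, the key names, and the `session_exists` callback (standing in for the query sent to the Session SBR) are all hypothetical, and throttling by the Binding Audit Session Query Rate is left out.

```python
from typing import Callable, Dict, List


def audit_binding_table(
    bindings: Dict[str, str],
    session_exists: Callable[[str], bool],
) -> List[str]:
    """Remove bindings whose session no longer exists; return the removed keys.

    bindings maps a hypothetical binding key to a session reference.
    session_exists stands in for the query sent to the Session SBR.
    """
    orphans = [
        key for key, session_ref in bindings.items()
        if not session_exists(session_ref)
    ]
    for key in orphans:
        del bindings[key]  # orphaned record: the session is gone
    return orphans


# Example: session "s2" no longer exists, so its binding is removed.
live_sessions = {"s1", "s3"}
bindings = {"imsi-1": "s1", "imsi-2": "s2", "imsi-3": "s3"}
removed = audit_binding_table(bindings, lambda ref: ref in live_sessions)
```

Records whose session still exists are left untouched, which matches the statement that the audit makes no changes to valid records.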

Session table audits work entirely based on valid session lifetime. When a session is created, it is given a lifetime for which the session will be considered to be valid regardless of any signaling activity. Each time an RAA is processed, the lifetime is renewed for a session. The duration of the lifetime defaults to 7 days, but a session lifetime can be configured per Access Point Name using the Policy and Charging > Configuration > Access Point Names page.

The Audit Operation Rate is configurable (with a default of 50,000) from the Policy and Charging > Configuration > General Options page. Its meaning depends on whether Session or Binding SBRs are being audited:
  • For Session SBRs, it is the maximum rate at which Diameter sessions are checked for staleness.
  • For Binding SBRs, it is the maximum rate at which binding session references are examined, if not already throttled by the Binding Audit Session Query Rate.
If the SBR signaling load plus the audit load causes an SBR server to exceed 100% capacity, that SBR server reports congestion, which automatically suspends auditing. Any SBR on which audit is suspended raises minor alarm 22715 to report the suspension. The alarm is cleared only when congestion abates.
  • Local congestion refers to congestion at the SBR server that is walking through Binding or Session table records. Suspension of audit due to Local Congestion applies to both the binding audit and session audit.
  • Remote congestion refers to congestion at one of the Session SBR servers that a Binding SBR server is querying for the existence of session data (using sessionRef). Suspension of audit due to Remote Congestion only applies to binding SBR servers because only binding SBRs send stack events to session SBR servers, while session SBR servers do not.
  • Enhanced Suspect Binding Rate Control can also cause congestion. Suspension of audit due to Enhanced Suspect Binding Rate Control applies only to the binding audit.
When an SBR server starts up (for example, SBR process starts), or when an SBR's audit resumes from being suspended, the audit rate ramps up using an exponential slow-start algorithm. The audit rate starts at 1500 records per second and is doubled every 10 seconds until the configured maximum audit rate is reached.

Note:

If the binding audit resumes after a recovery from remote congestion, the slow-start algorithm is not applied.
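The slow-start ramp can be illustrated with a short calculation. This sketch assumes the rate doubles in discrete steps at each 10-second boundary; the function and parameter names are hypothetical, and the caller is assumed to skip the ramp in the remote-congestion case covered by the note.

```python
def slow_start_rate(
    elapsed_seconds: float,
    max_rate: int,
    start_rate: int = 1500,
    step_seconds: int = 10,
) -> int:
    """Audit rate (records per second) at elapsed_seconds after start or resume."""
    doublings = int(elapsed_seconds // step_seconds)
    rate = start_rate
    for _ in range(doublings):
        if rate >= max_rate:
            break  # already at or above the cap, no need to keep doubling
        rate *= 2
    return min(rate, max_rate)


# Ramp toward a maximum of 50,000 records per second (used here as an example cap).
ramp = [slow_start_rate(t, 50_000) for t in range(0, 70, 10)]
```

Under these assumptions, a cap of 50,000 is reached 60 seconds after the audit starts or resumes: the rate at 50 seconds is 48,000, and the next doubling is clipped to the cap.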

In addition to the overall rate of record auditing, the frequency at which a given table audit can be started is also controlled. This is necessary to avoid needless frequent auditing of the same records when tables are small and can be audited quickly. A given table on an SBR server will be audited no more frequently than once every 10 minutes.
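A minimal sketch of this pacing rule follows. It assumes the 10-minute interval is measured from the start of the previous pass, which the text does not state explicitly; the names are hypothetical.

```python
MIN_PASS_INTERVAL_S = 10 * 60  # 10 minutes between passes of the same table


def next_pass_start(last_start_s: float, last_end_s: float) -> float:
    """Earliest time (in seconds) at which the next pass of a table may start.

    Assumption: the interval is counted from the start of the previous pass,
    and a new pass never begins before the previous one has finished.
    """
    return max(last_end_s, last_start_s + MIN_PASS_INTERVAL_S)


# A small table audited in 45 seconds still waits until the 10-minute mark.
small_table_next = next_pass_start(last_start_s=0.0, last_end_s=45.0)
# A large table whose pass took 15 minutes may restart as soon as it finishes.
large_table_next = next_pass_start(last_start_s=0.0, last_end_s=900.0)
```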

To provide visibility into audit activity, the audit generates Event 22716 SBR Audit Statistics Report with audit statistics at the end of each pass of a table. The format of the report varies depending on which table the audit statistics are being reported for.

PCA Configuration Database

A number of Policy and Charging configuration database tables, for example, PCRFs, Policy Clients, OCSs and CTFs are configured at the SOAM but contain data that are required network-wide. The site-wide portions of the data are stored at the SOAM servers. The network-wide portions of the data are stored globally at the NOAM. Due to the distributed nature of this data (the split between SOAM and NOAM), there is a PCA Configuration Database Audit which executes in the background to verify that all the related configuration tables for this data are in sync between SOAMs and the NOAM.

The PCA Configuration Database Audit executes on the SOAM every 30 seconds in the background and audits all the related configuration tables between SOAM and NOAM for PCRFs, Policy Clients, OCSs and CTFs. If the audit detects any discrepancies among these tables, it automatically attempts to resolve the discrepancies and validates that the tables are back in sync.

The configuration database can get out of sync due to a database transaction failure or due to operator actions. If an operator performs a database restore at the NOAM using a database backup that does not have all the network-wide data corresponding to the current SOAM configuration, then the database will not be in sync between SOAM and NOAM. Similarly, if an operator performs a database restore at an SOAM using a database backup that does not have the configuration records corresponding to network-wide data stored at the NOAM, then the database again will not be in sync. The audit is designed to execute without operator intervention and correct these scenarios where configuration data is not in sync between SOAM and NOAM.

If the audit fails to correct the database tables, the audit asserts Alarm 22737 (Configuration Database Not Synced). The audit continues to execute every 30 seconds to attempt to correct the database tables. If the audit successfully corrects and validates the tables during an audit pass, it clears Alarm 22737.
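The alarm behavior of a single audit pass can be sketched as follows. The `in_sync` and `repair` callbacks are hypothetical stand-ins for the table comparison and the automatic correction; only the alarm number comes from the text.

```python
from typing import Callable, Set

ALARM_CONFIG_DB_NOT_SYNCED = 22737  # alarm number from the text


def run_audit_pass(
    in_sync: Callable[[], bool],
    repair: Callable[[], None],
    active_alarms: Set[int],
) -> bool:
    """Run one audit pass and return True if the tables end up in sync."""
    if not in_sync():
        repair()  # attempt automatic correction
    synced = in_sync()  # validate the outcome of this pass
    if synced:
        active_alarms.discard(ALARM_CONFIG_DB_NOT_SYNCED)
    else:
        active_alarms.add(ALARM_CONFIG_DB_NOT_SYNCED)
    return synced


# First pass: the repair has no effect, so the alarm is asserted.
state = {"synced": False}
alarms: Set[int] = set()
first = run_audit_pass(lambda: state["synced"], lambda: None, alarms)
alarms_after_first = set(alarms)
# Later pass: the repair succeeds, so the alarm is cleared.
second = run_audit_pass(
    lambda: state["synced"], lambda: state.update(synced=True), alarms
)
```

In the product this pass would repeat every 30 seconds; the scheduling loop is omitted here.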

Note:

All statements about database tables in this section only apply to configuration tables related to PCRFs, Policy Clients, OCSs and CTFs because the PCA Configuration Database Audit executes only on the database tables where it is necessary for the data to be split across SOAM and NOAM.

OC-DRA Session Database

The Session Database Audit is enhanced to detect and remove stale binding-independent session (for example, Gy/Ro session) data stored in the Session SBR. Session state maintained in the Session SBR for Gy/Ro session-based credit-control is considered stale when a CCR/CCA-U or RAR/RAA has not been exchanged for the session for a length of time greater than or equal to the Stale Session Timeout value (in hours) configured in the NOAM GUI. If the binding-independent session is associated with an APN configured on the NOAM Main Menu > Policy and Charging > Configuration > Access Point Names page, then the Stale Session Timeout value associated with the APN is used. Otherwise, the default Stale Session Timeout value configured on the NOAM Main Menu > Policy and Charging > Configuration > General Options page is used.
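The timeout selection and staleness test can be sketched as below. The function name, the dictionary of per-APN timeouts, and the example values are hypothetical; only the fallback order and the "greater than or equal to" comparison come from the text.

```python
from typing import Dict, Optional


def is_stale_oc_session(
    idle_hours: float,
    apn: Optional[str],
    apn_timeouts_h: Dict[str, float],
    default_timeout_h: float,
) -> bool:
    """Return True if a Gy/Ro session has been idle for at least its timeout.

    idle_hours is the time since the last CCR/CCA-U or RAR/RAA exchange.
    The APN-specific timeout is used when the APN is configured; otherwise
    the default from General Options applies.
    """
    timeout_h = apn_timeouts_h.get(apn, default_timeout_h)
    return idle_hours >= timeout_h


# Example values only: a 24-hour timeout for one APN and a 168-hour default.
apn_timeouts = {"internet": 24.0}
stale_on_apn = is_stale_oc_session(24.0, "internet", apn_timeouts, 168.0)
fresh_on_default = is_stale_oc_session(100.0, "other", apn_timeouts, 168.0)
```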

Stale Gy/Ro sessions can occur for various reasons:

  • Due to IP signaling network issues, OC-DRA did not receive the Diameter Credit-Control Session Termination Request (CCR-T) message from the CTF when the Gy/Ro session was to be terminated.
  • Due to IP signaling network issues, Session SBR did not receive the findAndRemoveOcSession stack event that OC-DRA sent to find and remove the Gy/Ro session.
  • Session SBR received the findAndRemoveOcSession stack event from OC-DRA, but discarded it due to congestion.
  • Session SBR database access errors
  • Internal software errors