3 Benchmarking Cloud Deployable DSR

This chapter is divided into the following sections:
  • Infrastructure Environment

    This section provides details of the infrastructures used for the benchmark testing, including the hardware and software. It also describes key settings and attributes, and some recommendations on configuration.

  • Benchmark section for each DSR server type

    Each DSR server type is treated independently for benchmarking. Each section describes the traffic setup, and the observed results. It also provides metrics and guidelines for assessing performance on any infrastructure.

Data Usage

This data is intended to provide guidance. Recommendations may need to be adapted to the conditions in a given operator's network. Each of the following sections includes metrics that provide feedback on the running performance of the application.

When planning to deploy a DSR into any cloud environment, a few steps are recommended:
  • Understand the initial deployment scenario for the DSR.
    • Which features are planned?
    • How much of what type of traffic?

      This may change once deployed, and the DSR can be grown or shrunk to meet the changing needs.

  • Use the DSR Cloud Dimensioning tool to get an estimate of the types of DSR virtual servers needed and an initial estimate of the quantity of the virtual machines and resources. An Oracle Sales Consultant can run this tool based on the DSR requirements:
    • The tool allows for a very detailed model to be built of your DSR requirements, including:
      • Required MPS by Diameter Application ID (S6a, Sd, Gx, Rx, and so on).
      • Required DSR applications such as Full Address Based Resolution (FABR), Range Based Address Resolution (RBAR), Policy DRA (PDRA), and any required sizing information such as the number of subscribers supported for each application.
      • Any required DSR features such as Topology Hiding, Message Copy, IPSEC, or Mediation that can affect performance.
      • Network-level redundancy requirements, such as mated pair DSR deployments, where one DSR needs to support the full traffic load when the other DSR is unavailable.
      • Infrastructure information, such as OpenStack or KVM, and Server parameters.
    • The tool then generates a recommended number of VMs for each of the required VM types.

    Note:

    These recommendations are guidelines only, since the actual performance of the DSR can vary significantly based on the details of the infrastructure.
  • Based on the initial deployment scenario, determine if additional benchmarking is warranted:
    • For labs and trials, there is no need to benchmark performance and capacity if the goal of the lab is to test DSR functionality.
    • If the server hardware is different from the hardware used in this document then the performance differences can likely be estimated using industry standard metrics. This is done by comparing single-threaded processor performance of the CPUs used in this document with respect to the CPUs used in the customer’s infrastructure. This approach is most accurate for small differences in hardware (for instance, different clock speeds for the same generation of Intel processors) and least accurate across processor generations where other architectural differences such as networking interfaces could also affect the comparison.
    • It is the operator's decision to determine if additional benchmarking in the operator's infrastructure is desired. Here are a few things to consider when deciding:
      • Benchmark infrastructure is similar to the operator’s infrastructure, and the operator is satisfied with the benchmark data provided by Oracle.
      • Initial turn-up of the DSR is handling a relatively small amount of traffic and the operator prefers to measure and adjust once deployed.
      • Operator is satisfied with the high-availability and geo-diversity of the DSR, is willing to risk initial overload conditions, and adjusts once the DSR is in production.
  • If required, perform benchmark testing on the target cloud infrastructure. Benchmark only the DSR server types required for the deployment.

    For example, if full address resolution is not planned, do not waste time benchmarking the SDS, SDS SOAM, or DPs.

    • When the benchmark testing is complete, observe the data for each server type, and compare it with the baseline used for the estimate from the DSR Cloud Dimensioning tool.
      • If the performance estimate for a given DSR function is X and the observed performance is Y, then adjust the performance for that DSR function to Y.
      • Re-calculate the resources needed for deployment based on the updated values.
  • Deploy the DSR.
  • Monitor the DSR performance and capacity as described later in the document. As the network changes additional resources may be required. If needed, increase the DSR resources as described later in this document.

3.1 Infrastructure Environment

This section describes the infrastructure that was used for benchmarking. In general, the defaults or recommendations for hypervisor settings are available from the infrastructure vendors.

Whenever possible, the DSR recommendations align with vendor defaults and recommendations. Benchmarking was performed with the settings described in this section. Operators may choose different values; better or worse performance compared to the benchmarks might be observed. When recommendations other than vendor defaults or recommendations are made, additional explanations are included in the applicable section.

There is a sub-section included for each infrastructure environment used in benchmarking.

3.1.1 General Rules for All Infrastructures

3.1.1.1 Hyper-Threading and CPU Over-Subscription
All of the tests were conducted with Hyper-Threading enabled, and with a 1:1 subscription ratio for vCPUs in the hypervisor. The hardware used for the testing was dual-processor servers (Oracle X9-2) with 32 physical cores per processor. Thus, each server had:
(2 CPUs) x (32 cores per CPU) x (2 threads per core) = 128 vCPUs

It is not recommended to use over-subscribed vCPUs (for instance 4:1) in the hypervisor. Not only is the performance lower, but it makes the performance more dependent on the other loads running on each physical server.

Turning off Hyper-Threading is also not recommended. There is a small increase in the performance of a given VM without Hyper-Threading for a given number of vCPUs. But since the number of vCPUs for each processor drops by half without Hyper-Threading, the overall throughput for each server also drops by almost half.

The vCPU sizing for each VM is provided in the DSR VM Configurations section.

Note:

The recommended configuration is: Hyper-Threading is enabled with 1:1 CPU subscription ratio.
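
As a sanity check on a compute host, the thread topology and total vCPU count can be confirmed with standard Linux tools; this is a minimal sketch, and the expected values assume the Oracle X9-2 servers used for this benchmarking.

  # Confirm Hyper-Threading (2 threads per core) and the total vCPU count on the host.
  lscpu | grep -E 'Thread\(s\) per core|Core\(s\) per socket|Socket\(s\)|^CPU\(s\)'
  # Expected on the benchmarked X9-2: 2 threads/core, 32 cores/socket, 2 sockets -> CPU(s): 128
  # In OpenStack, a 1:1 vCPU subscription is typically enforced by setting cpu_allocation_ratio = 1.0
  # (deployment-specific; consult your distribution's configuration guidance).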

CPU Technology

The servers used for the benchmarking were Oracle X9-2 systems (Intel Xeon Platinum 8358 CPUs). Servers with different processors give different results. In general, consider the following issues when mapping the benchmarking data in this document to other CPUs:
  • The per-thread performance of a CPU is the main attribute that determines VM performance. The number of threads is fixed in the VM sizing as shown in DSR VM Configurations section. A good metric for comparing the per-thread performance of different CPUs is the integer performance measured by the SPECint2006 (CINT2006) defined by SPEC.ORG.

    The mapping of SPECint2006 ratios to DSR VM performance ratios isn't exact, but it's a good measure to determine whether a different CPU is likely to run the VMs faster or slower than the benchmark results in this document.

    Conversely, CPU clock speeds are a relatively poor indicator of relative CPU performance. Within a given Intel CPU generation (v2, v3, v4, and so on) there are other factors that affect per-thread performance, such as the potential turbo speeds of the CPU in combination with the cooling solution in a given server.

    Comparing across Intel CPU generations, there is a generation-over-generation improvement in CPU throughput relative to clock speed. This means that even a newer-generation chip with a slower clock speed may run a DSR VM faster.

  • The processors must have enough cores that a given VM can fit entirely into a NUMA node. Splitting a VM across NUMA nodes greatly reduces the performance of that VM. The largest VM size (refer to the DSR VM Configurations section) is 18 vCPUs. Thus, the smallest processor that should be used is a 9-core processor (9 cores x 2 threads per core = 18 vCPUs with Hyper-Threading). Using processors with more cores typically makes it easier to pack VMs efficiently into NUMA nodes, but should not otherwise affect individual VM CPU-related performance.
  • One caveat about CPUs with very high core counts is that the user must be aware of potential bottlenecks caused by many VMs contending for shared resources such as network interfaces and ephemeral storage on the server. These tests were run on relatively large CPUs (32 physical cores per chip), and no such bottlenecks were encountered while running strictly DSR VMs. In clouds where VMs from other applications may run on the same physical server as DSR VMs, or in future processor generations with much higher core counts, this potential contention for shared server resources has to be watched closely.

Note:

The selected VM sizes should fit within a single NUMA node, for instance 9 physical cores for the VMs that require 18 vCPUs. Check the performance of the target CPU type against the benchmarked CPU using per-thread integer performance metrics.
3.1.1.2 VM Packing Rate

The DSR doesn't require or use CPU pinning. Thus, the packing of the DSR VMs onto the physical servers is under the control of OpenStack, using the affinity or anti-affinity rules given in DSR VM Configurations. Typically, the VMs do not fit exactly into the number of vCPUs available in each NUMA node, leaving some unallocated vCPUs. The ratio of allocated vCPUs to the total vCPUs on a server is the VM packing ratio. For instance, if 102 out of 128 vCPUs on a server were allocated by OpenStack, that server would have a packing ratio of ~80%. The packing ratio achieved in a deployment depends on many factors, including the mix of large VMs (DA-MPs, SBRs) with smaller VMs, and whether the DSR is sharing the servers with other applications that have their own mix of large and small VMs.

When planning the number of physical servers required for a DSR, a target packing ratio of 80% is a good planning number. A packing ratio of 100% is hard to achieve and may affect the performance numbers shown in the benchmarks, since some amount of server capacity is necessary for the host OS to perform functions such as interrupt handling. A packing ratio of 95% or lower is desirable.

Note:

When planning for physical server capacity, a packing ratio of 80% is a good guideline. Packing ratios greater than 95% might affect the benchmark numbers since there are not sufficient server resources to handle the overhead of the host OS.
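
As an illustration, the current packing ratio of a compute host can be read from OpenStack; the hypervisor name compute-01 is a placeholder.

  # Packing ratio = allocated vCPUs / total vCPUs on the host (plan for ~80%, keep at or below 95%).
  openstack hypervisor show compute-01 -c vcpus -c vcpus_used
  # Example: vcpus_used = 102, vcpus = 128  ->  102 / 128 ~= 80% packing ratio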
3.1.1.3 Infrastructure Tuning
The following parameters should be set in the infrastructure to improve DSR VM performance. The instructions for setting them for a given infrastructure are included in the DSR Cloud Installation Guide. A sketch of the equivalent generic host commands follows the note below.
  • txqueuelen: Tuned on the compute hosts. The default value of 500 is too small; set it to 120000 to increase the network throughput of a VM.
  • Ring buffer increase on the physical Ethernet interfaces: The default is too small. The recommendation is to set both the receive and transmit values to 4096.
  • Multiqueue: Multiqueue should be enabled on any IPFE VMs to improve performance.

Note:

Refer to instructions in the DSR Cloud Installation Guide.
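
The DSR Cloud Installation Guide remains the authoritative procedure. The following is only a sketch of the equivalent generic Linux host commands; the interface name eth0 and the queue count are placeholders.

  # Increase the transmit queue length on the compute host interface (default 500 is too small).
  ip link set dev eth0 txqueuelen 120000

  # Increase the NIC ring buffers; 4096 assumes the NIC supports that maximum (check with: ethtool -g eth0).
  ethtool -G eth0 rx 4096 tx 4096

  # Multiqueue for an IPFE guest is typically enabled in the libvirt interface definition,
  # for example <driver name='vhost' queues='4'/>, and then inside the guest with:
  ethtool -L eth0 combined 4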

3.1.2 KVM (QEMU)/Oracle X9-2 – Infrastructure Environment

A number of hypervisor settings affect the performance of the hosted virtual machines. Tests were performed to maximize the performance of the underlying virtual machines for the DSR application.

Host Hardware
  • Oracle Server X9-2
    • CPU Model: Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz
    • 2 CPUs
    • 32 physical cores per CPU
    • RAM: 768 GB
    • HDD: 3.8 TB of NVMe storage (with Software RAID-1 configured)
    • NIC: Oracle Quad Port 10G Base-T Adapter
Hypervisor
  • QEMU-KVM Version: QEMU 6.2.0, libvirt 8.0.0, API QEMU 8.0.0
3.1.2.1 Device Drivers

VirtIO is a virtualizing standard for network and disk device drivers where just the guest’s device driver knows it is running in a virtual environment and cooperates with the hypervisor. This enables guests to get high performance network and disk operations and gives most of the performance benefits of para-virtualization.

Vhost-net provides improved network performance over Virtio-net by completely bypassing QEMU as a fast path for interrupt handling. Vhost-net runs as a kernel thread and handles interrupts with less overhead, providing near-native performance. The advantages of using the vhost-net approach are reduced copy operations, lower latency, and lower CPU usage.

Note:

The VirtIO driver was used for Test Bed setting.
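
A quick way to confirm that this driver model is in effect, assuming a Linux guest on a KVM host; the interface name is a placeholder.

  # On the KVM host: confirm the vhost-net kernel module is loaded.
  lsmod | grep vhost_net

  # Inside the guest: confirm the interface uses the virtio_net driver.
  ethtool -i eth0 | grep driver    # expected: driver: virtio_net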
3.1.2.2 BIOS Power Settings
Typical BIOS power settings (hardware-vendor dependent; see the relevant infrastructure hardware vendor documentation for details) provide three options:
  • Power Supply Maximum: The maximum power the available PSUs can draw.
  • Allocated Power: The power is allocated for installed and hot pluggable components.
  • Peak Permitted: The maximum power the system is permitted to consume.

Note:

Set to Allocated Power or equivalent for your Hardware vendor.

Disk Image Formats

The preferred disk image file format when deploying a KVM virtual machine is:
  • QCOW2: Disk format supported by the QEMU emulator that can expand dynamically and supports Copy-on-write.
QCOW2 provides a number of benefits, such as:
  • Smaller file size, even on file systems that don't support holes (sparse files)
  • Copy-on-write support, where the image only represents changes made to an underlying disk image
  • Snapshot support, where the image can contain multiple snapshots of the image's history

Test Bed Setting: QCOW2
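
For illustration, a QCOW2 guest disk can be created and inspected with qemu-img; the path and size below are placeholders.

  # Create a thin-provisioned QCOW2 image (grows on demand, supports copy-on-write and snapshots).
  qemu-img create -f qcow2 /var/lib/libvirt/images/guest-disk.qcow2 60G

  # Verify the format, virtual size, and actual allocated size on disk.
  qemu-img info /var/lib/libvirt/images/guest-disk.qcow2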

3.1.2.3 Guest Caching Modes

The operating system maintains a page cache to improve the storage I/O performance. With the page cache, write operations to the storage system are considered completed after the data has been copied to the page cache. Read operations can be satisfied from the page cache if the data requested is in the cache. The page cache is copied to permanent storage using fsync. Direct I/O requests bypass the page cache. In the KVM environment, both the host and guest operating systems can maintain their own page caches, resulting in two copies of data in memory.

The following caching modes are supported for KVM guests:
  • Writethrough: I/O from the guest is cached on the host but written through to the physical medium. This mode is slower and prone to scaling problems. Best used for a small number of guests with lower I/O requirements. Suggested for guests that do not support a writeback cache (such as Red Hat Enterprise Linux 5.5 and earlier), where migration is not needed.
  • Writeback (Selected): With caching set to writeback mode, both the host page cache and the disk write cache are enabled for the guest. Due to this, the I/O performance for applications running in the guest is good, but the data is not protected in a power failure. As a result, this caching mode is recommended only for temporary data where potential data loss is not a concern.
  • None: With caching mode set to none, the host page cache is disabled, but the disk write cache is enabled for the guest. In this mode, the write performance in the guest is optimal because write operations bypass the host page cache and go directly to the disk write cache. If the disk write cache is battery-backed, or if the applications or storage stack in the guest transfer data properly (either through fsync operations or file system barriers), then data integrity can be ensured. However, because the host page cache is disabled, the read performance in the guest would not be as good as in the modes where the host page cache is enabled, such as write through mode.
  • Unsafe: The host may cache all disk I/O, and sync requests from the guest are ignored.
Caching mode None is recommended for remote NFS storage, because direct I/O operations (O_DIRECT) perform better than synchronous I/O operations (with O_SYNC). Caching mode None effectively turns all guest I/O operations into direct I/O operations on the host, which is the NFS client in this environment. Moreover, it is the only option to support migration.

Note:

For Test Bed Setting, set Caching Mode to Writeback.
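
A minimal sketch of how the writeback setting appears in a libvirt guest definition and how it can be verified; the domain name guest01 is a placeholder.

  # Check the caching mode configured for a guest's disks.
  virsh dumpxml guest01 | grep "cache="
  # The disk driver element in the domain XML looks like:
  #   <driver name='qemu' type='qcow2' cache='writeback'/>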
3.1.2.4 Memory Tuning Parameters

Swappiness

The swappiness parameter controls the tendency of the kernel to move processes out of physical memory and onto the swap disk. Since disks are much slower than RAM, this can lead to slower response times for the system and applications if processes are too aggressively moved out of memory.
  • vm.swappiness = 0: The kernel swaps only to avoid an out of memory condition.
  • vm.swappiness = 1: Kernel version 3.5 and over, as well as kernel version 2.6.32-303 and over; Minimum amount of swapping without disabling it entirely.
  • vm.swappiness = 10: This value is recommended to improve performance when sufficient memory exists in a system.
  • vm.swappiness = 60: Default
  • vm.swappiness = 100: The kernel swaps aggressively

Note:

For Test Bed Setting, set vm.swappiness to 10.
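
A sketch of applying the test bed swappiness value on a KVM host; the sysctl drop-in file name is illustrative.

  # Check the current value, apply the test bed value, and persist it across reboots.
  sysctl vm.swappiness
  sysctl -w vm.swappiness=10
  echo 'vm.swappiness = 10' >> /etc/sysctl.d/99-dsr-tuning.conf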

Kernel Same Page Merging

Kernel Same-page Merging (KSM), used by the KVM hypervisor, allows KVM guests to share identical memory pages. These shared pages are usually common libraries or other identical, high-use data. KSM allows for greater guest density of identical or similar guest operating systems by avoiding memory duplication. KSM enables the kernel to examine two or more already running programs and compare their memory. If any memory regions or pages are identical, KSM reduces multiple identical memory pages to a single page. This page is then marked copy-on-write. If the contents of the page are modified by a guest virtual machine, a new page is created for that guest.

This is useful for virtualization with KVM. When a guest virtual machine is started, it only inherits the memory from the host qemu-kvm process. Once the guest is running, the contents of the guest operating system image can be shared when guests are running the same operating system or applications. KSM allows KVM to request that these identical guest memory regions be shared.

KSM provides enhanced memory speed and utilization. With KSM, common process data is stored in cache or in main memory. This reduces cache misses for the KVM guests, which can improve performance for some applications and operating systems. Secondly, sharing memory reduces the overall memory usage of guests, which allows for higher densities and greater utilization of resources.

The following two services control KSM:
  • KSM Service: When the KSM service is started, KSM shares up to half of the host system's main memory. Start the KSM service to enable KSM to share more memory.
  • KSM Tuning Service: The ksmtuned service loops and adjusts KSM. The ksmtuned service is notified by libvirt when a guest virtual machine is created or destroyed.

Note:

For the Test Bed Setting, set the KSM service to active and ensure the ksmtuned service is running on the KVM hosts.
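
A sketch of enabling the two KSM services on a RHEL-family or Oracle Linux KVM host.

  # Enable and start KSM and its tuning daemon, then confirm both are active.
  systemctl enable --now ksm.service ksmtuned.service
  systemctl is-active ksm.service ksmtuned.service

  # The number of pages currently being shared by KSM can be observed in sysfs.
  cat /sys/kernel/mm/ksm/pages_sharing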

Zone Reclaim Mode

When an operating system allocates memory to a NUMA node, but the NUMA node is full, the operating system reclaims memory for the local NUMA node rather than immediately allocating the memory to a remote NUMA node. The performance benefit of allocating memory to the local node outweighs the performance drawback of reclaiming the memory. However, in some situations reclaiming memory decreases performance to the extent that the opposite is true. In other words, in these situations, allocating memory to a remote NUMA node generates better performance than reclaiming memory for the local node.

A guest operating system causes zone reclaim in the following situations:
  • When you configure the guest operating system to use huge pages.
  • When you use KSM to share memory pages between guest operating systems.

Configuring huge pages and running KSM are both best practices for KVM environments. Therefore, to optimize performance in KVM environments, it is recommended to disable zone reclaim.

Note:

For Test Bed Setting, disable zone reclaim.
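
Zone reclaim is controlled by a kernel parameter; a sketch of disabling it as in the test bed (the drop-in file name is illustrative).

  # 0 disables zone reclaim, so the kernel allocates from a remote NUMA node rather than reclaiming locally.
  sysctl vm.zone_reclaim_mode
  sysctl -w vm.zone_reclaim_mode=0
  echo 'vm.zone_reclaim_mode = 0' >> /etc/sysctl.d/99-dsr-tuning.conf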

Transparent Huge Pages

Transparent huge pages (THP) automatically optimize system settings for performance. By allowing all free memory to be used as cache, performance is increased.

Note:

For Test Bed Setting, enable THP.
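
A sketch of checking and enabling THP on the host; the echo takes effect immediately but does not persist across reboots (persist through the kernel command line or a tuned profile).

  # The active policy is shown in brackets; 'always' enables THP as in the test bed.
  cat /sys/kernel/mm/transparent_hugepage/enabled
  echo always > /sys/kernel/mm/transparent_hugepage/enabled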

3.2 Benchmark Testing

The testing methodology and the benchmark test set-up are the same for each benchmark infrastructure. Each section describes the common set-up and procedures used for benchmarking, followed by the specific results for each benchmark infrastructure.

3.2.1 DA-MP Relay Benchmark

This benchmarking case illustrates conditions for an overload of a DSR DA-MP.

3.2.1.1 Topology
The following figure illustrates the logical topology used for this testing. Diameter traffic is generated by an MME simulator and sent to an HSS simulator.

Figure 3-1 DA-MP Relay Testing Topology



The DSR CPU utilization can be increased to higher levels by means of configuration changes, with the DOC/CL1/CL2 discards set to 0 and multi-queuing enabled on all hosts. With this configuration, note that all discards occur at a single step, CL3, for all incoming and outgoing messages.

3.2.1.2 Message Flow

The following figure illustrates the Message sequence for this benchmark case.

Figure 3-2 DA-MP Relay Message Sequence



Table 3-1 Relay Performance Benchmarking on 16 DA-MPs DSR Setup

Scenario | Call Flow Model | DSR MPS Achieved | DA-MP Flavor | DA-MP Profile | Avg Msg Size | CPU Peak | RAM Utilization Peak
Relay | 100% Relay | 288K | 12 vCPU (Regular) | 30K_MPS | 2.0 K | 19% | 35%
Relay (with multiqueue enabled on all hosts and DOC/CL1/CL2 discards set to 0) | 100% Relay | 576K | 12 vCPU (Regular) | 40K_MPS_FABR | 2.0 K | 36% | 32%
Relay | 100% Relay | 560K | 18 vCPU (Large) | 35K_MPS | 2.0 K | 28% | 26%
3.2.1.3 Indicative Alarms or Events

During benchmark testing, the following alarms or events were observed when the DA-MP reaches congestion.

Table 3-2 DA-MP Relay Alarms or Events

Number Severity Server Name Description
5007 Minor IPFE Out of Balance: Low Traffic statistics reveal that an application server is processing lower than average load.
5008 Minor IPFE Out of Balance: High Traffic statistics reveal that an application server is processing higher than average load and will not receive new connections.
22008 Info DA-MP Orphan Answer Response Received An answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
22201 Minor DA-MP MpRxAllRate DA-MP ingress message rate threshold crossed.
22221 Minor DA-MP Routing MPS Rate Message processing rate for this DA-MP is approaching or exceeding its engineered traffic handling capacity.
22225 Minor DA-MP MpRxDiamAllLen DA-MP diameter average ingress message length threshold crossed.

3.2.2 RBAR Benchmark

Range Based Address Resolution (RBAR) is a DSR-enhanced routing application that allows the routing of Diameter end-to-end transactions based on Diameter Application ID, Command Code, Routing Entity Type, and Routing Entity Addresses (range and individual) as a Diameter Proxy Agent.

A Routing Entity can be:
  • A User Identity:
    • International Mobile Subscriber Identity (IMSI)
    • Mobile Subscriber Integrated Services Digital Network (Number) (MSISDN)
    • IP Multimedia Private Identity (IMPI)
    • IP Multimedia Public Identity (IMPU)
  • An IP Address associated with the User Equipment:
    • IPv4 (based upon the full 32-bit value in the range of 0x00000000 to 0xFFFFFFFF)
    • IPv6-prefix (1 to 128 bits)
  • A general-purpose data type: UNSIGNED16 (16-bit unsigned value)
Routing resolves to a Destination that can be configured with any combination of a Realm and Fully Qualified Domain Name (FQDN); Realm-only, FQDN-only, or Realm and FQDN.

When a message successfully resolves to a destination, RBAR replaces the destination information (Destination-Host and/or Destination-Realm) in the ingress message with the corresponding values assigned to the resolved destination and forwards the message to the (integrated) Diameter Relay Agent for egress routing into the network.

3.2.2.1 Topology
The following figure illustrates the logical topology used for this testing. Diameter traffic is generated by an MME simulator and sent to an HSS simulator.

Figure 3-3 RBAR Testing Topology



3.2.2.2 Message Flow
The following figure illustrates the Message sequence for this benchmark case.

Figure 3-4 DA-MP RBAR Message Sequence



Table 3-3 RBAR Performance Benchmarking on 16 DA-MPs DSR Setup

Scenario | Call Flow Model | DSR MPS Achieved | DA-MP Flavor | DA-MP Profile | Avg Msg Size | CPU Peak | RAM Utilization Peak
RBAR | 100% RBAR | 256K | 12 vCPU (Regular) | 30K_MPS | 2.0 K | 29% | 30%
3.2.2.3 Indicative Alarms or Events
During benchmark testing, the following alarms or events were observed when the DA-MP reaches congestion.

Table 3-4 DA-MP RBAR Alarms or Events

Number Severity Server Name Description
22008 Info DA-MP Orphan Answer Response Received An answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
22225 Minor DA-MP MpRxDiamAllLen DA-MP diameter average ingress message length threshold crossed.

3.2.3 Full Address Based Resolution (FABR - SDS) Capacity

The FABR application adds a Database Processor (DP) server to perform database lookups with a user defined key (IMSI, MSISDN, or Account ID and MSISDN or IMSI). If the key is contained in the database, the DP returns the realm and FQDN associated with that key. The returned realm and FQDN can be used by the DSR Routing layer to route the connection to the desired endpoint. Since there is additional work done on the DA-MP to query the DP, running the FABR application has an impact on the DA-MP performance. This section contains the performance of the DA-MP while running FABR as well as benchmark measurements on the DP itself.

3.2.3.1 Topology

Figure 3-5 SDS DP Testing Topology



SDS DB Details

The SDS database was first populated with subscribers. This population simulates real-world scenarios likely encountered in a production environment and ensures the database is of substantial size to be queried against.
  • SDS DB Size: 300 million routing entities (150 million MSISDNs or 150 million IMSIs)
  • AVP Decoded: User-Name for IMSI
The SDS profile (Large) enhances the capacity of the SDS FABR database to 780 million routing entities. The Large profile is defined in DSR VM Configurations based on the following 780 million entry configuration:
  • 260 million subscribers, each with 2 IMSIs and 1 MSISDN = 780 million routing entities
  • IMSI = 15 bytes
  • MSISDN = 10 bytes
3.2.3.2 Message Flow

Figure 3-6 SDS DP Message Sequence



Table 3-5 SDS DP Performance Benchmarking using one SDS DP

Scenario | Call Flow Model | DP MPS Achieved | DA-MP MPS Achieved | DA-MP Flavor | DA-MP Profile | Avg Msg Size | CPU Peak | RAM Utilization Peak
FABR | 100% FABR | 80K | 160K | 12 vCPU (Regular) | 30K_MPS | 400 | 23% | 28%
3.2.3.3 Indicative Alarms or Events

Table 3-6 SDS DP Alarms or Events

Number Severity Server Name Description
19814 Info DA-MP Communication Agent Peer has not responded to heartbeat Communication Agent Peer has not responded to heartbeat.
19825 Major, Critical DA-MP Communication Agent Transaction Failure Rate The number of failed transactions during the sampling period has exceeded configured thresholds.
19832 Info DA-MP Communication Agent Reliable Transaction Failed Communication Agent Reliable Transaction Failed.
22004 Info DA-MP Maximum pending transactions allowed exceeded Routing attempted to select an egress Diameter connection to forward a message but the maximum number of allowed pending transactions queued on the Diameter connection is reached.
22008 Info DA-MP Orphan Answer Response Received An answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
22225 Minor DA-MP MpRxDiamAllLen DA-MP diameter average ingress message length threshold crossed.
22606 Info DA-MP Database or DB connection error FABR application received service notification indicating Database (DP) or DB connection (COM Agent) Errors (DP timeout, errors or COM Agent internal errors) for the sent database query.
31000 Critical DA-MP S/W Fault Program impaired by s/w fault

3.2.4 Full Address Based Resolution (FABR-UDR) Capacity

FABR is a DSR application that provides an enhanced DSR routing capability to enable network operators to resolve the designated Diameter server (IMS HSS, LTE HSS, PCRF, OCS, OFCS, and AAA) addresses based on individual user identity addresses in the incoming Diameter request messages. It offers enhanced functionalities with the User Data Repository (UDR), which is used to store subscriber data. FABR routes the message as a Diameter Proxy Agent based on request message parameter content.

FABR uses the services of the Diameter Plug-In for sending and receiving Diameter messages to and from the network. It uses the Communication Agent to interact with an off-board data repository (UDR) for address resolution. This section contains the performance of the DA-MP while running FABR.

3.2.4.1 Topology

Figure 3-7 FABR with UDR Testing Topology



UDR DB Details

The UDR database was first populated with subscribers. This population simulates real-world scenarios likely encountered in a production environment and ensures the database is of substantial size to be queried against.
  • UDR DB Size: Tested with 40 Million records
  • AVP Decoded: User-Name for IMSI
The following UDR profile is used for benchmarking.

Table 3-7 UDR Profile

vCPU RAM (GB) HDD (GB)
18 70 400
3.2.4.2 Message Flow

Figure 3-8 FABR with UDR Message Sequence



Table 3-8 FABR with UDR Performance Benchmarking on 16 DA-MPs DSR Setup

Scenario | Call Flow Model | DSR MPS Achieved | DA-MP Flavor | DA-MP Profile | Avg Msg Size | CPU Peak | RAM Utilization Peak
FABR + Relay | 70% FABR + 30% Relay | 288K (18K/MP) | 12 vCPU (Regular) | 30K_MPS | 2.0 K | 38% | 36%
3.2.4.3 Indicative Alarms or Events

Table 3-9 FABR with UDR Alarms or Events

Number Severity Server Name Description
19825 Minor/Major/Critical DA-MP Communication Agent Transaction Failure Rate The number of failed transactions during the sampling period has exceeded the configured thresholds.
19832 Info DA-MP Communication Agent Reliable Transaction Failed Failed transactions between servers result from normal maintenance actions, overload conditions, software failures, or equipment failures.
22008 Info DA-MP Orphan Answer Response Received An answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
22201 Minor DA-MP MpRxAllRate DA-MP ingress message rate threshold crossed.
22221 Minor DA-MP Routing MPS Rate Message processing rate for this DA-MP is approaching or exceeding its engineered traffic handling capacity.
22225 Minor DA-MP MpRxDiamAllLen DA-MP diameter average ingress message length threshold crossed.
22606 Info DA-MP Database or DB connection error FABR application received service notification indicating Database (DP) or DB connection (COM Agent) Errors (DP timeout, errors or COM Agent internal errors) for the sent database query.

3.2.5 vSTP MP

The vSTP-MP server type is a virtualized STP that supports M2PA, M3UA, and TDM. It can be deployed either with other DSR functionality as a combined DSR and vSTP, or as a standalone virtualized STP without any DSR functionality.

3.2.5.1 vSTP MP Benchmarking

The following table describes the feature-wise vSTP MP benchmarking.

Table 3-10 Feature-wise vSTP MP Benchmarking

Scenario | Call Flow Model | vSTP MPS Achieved | SS7-MP Flavor | CPU Peak | RAM Peak
SFAPP + MNP + GTT | 2K MNP + 2K SFAPP + 6K GTT | 18K/MP | 8 vCPU | 22 | 55
SFAPP + MNP + GFLEX + GTT | 2K MNP + 2K SFAPP + 1K GFLEX + 4K GTT | 18K/MP | 8 vCPU | 25 | 49
TIF + GTT | 5K MNP + 10K GTT | 20K/MP | 8 vCPU | 28 | 39
vMNP + GTT | 5K MNP + 10K GTT | 20K/MP | 8 vCPU | 28 | 43
GFLEX + GTT | 5K MNP + 10K GTT | 20K/MP | 8 vCPU | 27 | 61
INPQ + GTT | 5K MNP + 10K GTT | 20K/MP | 8 vCPU | 40 | 52
GTT + MTP Routing with MTP screening (M2PA & M3UA) | 16K MPS | 16K/MP | 8 vCPU | 56 | 50
GTT + MTP Routing | 20K MPS | 20K/MP | 8 vCPU | 56 | 42
vEIR | 5K | 5K/MP | 8 vCPU | 17 | 49
Elynx (E1/T1 Card) – GTT Relay | 10K TDM + 10K GTT | 20K/MP | 8 vCPU | 29 | 52
ENUM | 5K | 5K/MP | 8 vCPU | 9 | 26
DNS | 10K | 10K/MP | 8 vCPU | 2 | 25
vSTP – Home SMS | MO-FSM AllowList + BlockList Traffic (10K + 10K) | 10K/SS7 (2 MPs), 20K/Proxy MP | 8 vCPU | 46 | 45
Tracing (GTT) | 9K (Tracing) | 18K/MP | 8 vCPU | 45 | 48
Tracing (GTT) + Non-Tracing (GTT) | 6K (Tracing) + 6K (GTT) | 18K/MP | 8 vCPU | 40 | 44

Note:

  • For ENUM, a new vENUM-MP is introduced. vENUM sends messages to UDR over the ComAgent interface.
  • Default timer values are supported when vSTP is configured to operate at 10K MPS for each MP.
  • When vSTP is configured to operate at 20K MPS, the t1Timer to t5Timer values have to be updated. For information about the updated timer values, see MMI API Specification.
  • MNP processing is equivalent to two messages and SFAPP processing is equivalent to four messages.
  • For the tracing feature, tracing is applied to all GTT traffic.

3.2.6 Policy DRA (PDRA) Benchmarking

The Policy DRA (PDRA) application adds two additional database components, the Session SBR (SBR-s) and the Binding SBR (SBR-b). The DA-MP performance was also measured since the PDRA application puts a different load on the DA-MP than either Relay or FABR traffic. There are two sizing metrics when determining how many SBR-s or SBR-b server groups (that is, horizontal scaling units) are required. The first is the MPS traffic rate seen at the DA-MPs; this is the metric benchmarked in this document. The second factor is the number of bindings (SBR-b) or sessions (SBR-s) that can be supported. This session or binding capacity is set primarily by the memory sizing of the VM and is fixed at a maximum of 16 million per SBR from the DSR 8.3 release. The number of bindings and sessions required for a given network is customer dependent, but a good starting place for engineering is to assume the following (a hedged sizing sketch follows this list):
  • The number of bindings is equal to the number of subscribers supported by the PCRFs.
  • The number of sessions is equal to the number of subscribers times the number of IPCAN sessions required on average for each subscriber. For instance, a subscriber might have one IPCAN session for LTE and one for VoLTE.

    Note:

    The number of sessions is equal to or greater than the number of bindings.
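
The following is a hedged sizing sketch of the capacity-based part of this calculation only (the MPS rate at the DA-MPs must still be checked against the benchmark data). The subscriber counts are invented for illustration; only the 16 million per-SBR ceiling comes from this document.

  # Illustrative only: 20M subscribers, 2 IPCAN sessions each, 16M records per SBR (DSR 8.3+).
  subscribers=20000000
  sessions_per_subscriber=2
  max_per_sbr=16000000

  bindings=$subscribers                                   # bindings ~= subscribers
  sessions=$((subscribers * sessions_per_subscriber))     # sessions ~= subscribers x IPCAN sessions

  echo "SBR(b) server groups (by capacity): $(( (bindings + max_per_sbr - 1) / max_per_sbr ))"   # -> 2
  echo "SBR(s) server groups (by capacity): $(( (sessions + max_per_sbr - 1) / max_per_sbr ))"   # -> 3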
3.2.6.1 Topology

Figure 3-9 SBR Testing Topology



3.2.6.2 Message Flow

Figure 3-10 PDRA Message Sequence



The following table shows the call model used for the testing. The message distribution is Oracle's baseline for benchmarking and may differ significantly from customer distributions based on factors such as the penetration of LTE support in comparison with VoLTE support. The Traffic Details column shows the configured PDRA options. For more details on these options, see Oracle Communications Diameter Signaling Router Policy and Charging Application User Guide.

Table 3-11 PDRA Test Call Model

Message | Count | Distribution | Traffic Details | Traffic Details Distribution
CCR-I, CCA-I | 1 | 7.14% | Gx with MSISDN Alternative Key, Gx Topology Hiding | 100%
CCR-U, CCA-U | 3 | 21.42% | Gx Topology Hiding | 100%
CCR-T, CCA-T | 1 | 7.14% | Gx Topology Hiding | 100%
Gx RAR, RAA | 3 | 21.42% | Gx Topology Hiding | 100%
AAR, AAA Initial | 2 | 14.29% | Rx Topology Hiding | 100%
STR, STA | 2 | 14.29% | Rx Topology Hiding | 100%
Rx RAR, RAA | 2 | 14.29% | Rx Topology Hiding | 100%

Table 3-12 PDRA Performance Benchmarking on 16 DA-MPs DSR Setup

Scenario | Call Flow Model | SBR MPS Achieved | DA-MP Flavor | DA-MP Profile | Avg Msg Size | DA-MP CPU Peak | RAM Utilization Peak
Single Server Group (1 SBR(s), 1 SBR(b)) | 100% PDRA | 50K | 12 vCPU (Regular) | 30K_MPS | 600 | 21% | 23%
Single Server Group (4 SBR(s), 4 SBR(b)) | 100% PDRA | 200K | 12 vCPU (Regular) | 30K_MPS | 648 | 25% | 23%
3.2.6.3 Indicative Alarms or Events

Table 3-13 PDRA Alarms or Events

Number Severity Server Name Description
19814 Info DA-MP Communication Agent Peer has not responded to heartbeat Communication Agent Peer has not responded to heartbeat.
19825 Major, Critical DA-MP Communication Agent Transaction Failure Rate The number of failed transactions during the sampling period has exceeded configured thresholds.
19832 Info DA-MP Communication Agent Reliable Transaction Failed Communication Agent Reliable Transaction Failed
22008 Info DA-MP Orphan Answer Response Received An answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
22328 Minor DA-MP IcRate Connection ingress message rate threshold crossed
22704 Info DA-MP Communication Agent Error Policy and Charging server to SBR server communication failure
22705 Info DA-MP SBR Error Response Received Policy and Charging server received response from SBR server indicating SBR errors
22714 Info SBR SBR RAR Initiation Error SBR encountered an error while processing PCA initiated RAR requests
22716 Info SBR SBR Audit Statistics Report SBR Audit Statistics Report
22718 Info DA-MP Binding Not Found for Binding Dependent Session Initiate Request Binding record is not found for the configured binding keys in the binding dependent session-initiation request message
22741 Info DA-MP Failed to route PCA generated RAR RAA with Unable To Deliver (3002) error is received at PCA for the locally generated RAR
31232 Critical, Minor DA-MP HA Late Heartbeat Warning High availability server has not received a message on specified path within the configured interval
31236 Major DA-MP HA Link Down High availability TCP link is down

3.2.7 Diameter Security Application (DSA) Benchmarking

Counter measures are applied for benchmarking, both for ingress messages received from an external foreign network and for egress messages sent to an external foreign network. Different counter measure profiles can be created for different IPX or roaming partners by enabling or disabling counter measures individually for different IPX provider or roaming partner Diameter Peers.

Enabling the DSA Application

The DSA application is enabled on the DA-MP and uses vUDR to store context information.

3.2.7.1 Topology

Figure 3-11 DSA Testing Topology



The following table shows the stateful and stateless counter measure application configuration and the modes of operation used in the benchmarking tests.

Table 3-14 Stateful and Stateless Counter Measures

Application Configuration Data General Options Settings
Table Name Count of Configured Entries Options Values
AppCmdCst_Config 2 Opcodes Accounting Disabled
AppIdWL_Config 1 Max. UDR Queries per Message 5
AVPInstChk_Config 48 Max. Size of Application State 4800
Foreign_WL_Peers_Cfg_Sets 14 Logging of Vulnerable Messages Enabled
MCC_MNC_List 11    
MsgRateMon_Config 1    
Realm_List 6    
Security_Countermeasure_Config 19    
SpecAVPScr_Config 1 Application Threads
System_Config_Options 1 Request 6
TimeDistChk_Config 2000 Answer 4
TTL_Config 5 SbrEvent 4
VplmnORCst_Config 1 AsyncEvent 2
TimeDistChk_Country_Config 2    
TimeDistChk_Exception_List 0    
TimeDistChk_Continent_Config 15    
VplmnORCst_Config 1    
RealmIMSICst_Config 210    
Exception_Rule_Config 0    
All Exception Types Table
  • IMSI_Exception_Config
  • MCC_MNC_Exception_Config
  • Origin_Host_Exception_Config
  • Realm_Exception_Config
  • VPLMN_ID_Exception_Config
0    

Note:

The following error is received during a performance run if the call rate is more than 1.7K on each DSA MP:
UDR Internal Error: Create record failed. Error Code = SendError
This is caused by the ComAgent connection timing out because the TTL expired:
Communication Agent Reliable Transaction Failed}  .. GN_INFO/INF Failure reason = Time to live limit exceeded
To avoid this, run the following commands from the Active DSR NOAM before running performance traffic:
iset -fvalue=400 ComAgtConfigParams where "name='IntraNe Maximum Timeout Value'"
iset -fvalue=3 ComAgtConfigParams where "name='Maximum Number Of Retries'"

Table 3-15 DSA Performance Benchmarking on 2 DA-MPs DSR Setup

Counter Measure (CM) | Call Flow Model | DSR MPS Achieved (Per DA-MP) | DA-MP Flavor | UDR Flavor | DA-MP Profile | Avg Msg Size | CPU Peak | RAM Utilization Peak
Previous_Location_Check | 15% Vulnerable and 85% Non Vulnerable traffic | 9.2K | 12 vCPU (Regular) | 18 vCPU | 30K_MPS | 2.0 K | 29% | 63%
Time_Distance_Check | 15% Vulnerable and 85% Non Vulnerable traffic | 9.8K | 12 vCPU (Regular) | 18 vCPU | 30K_MPS | 2.0 K | 34% | 63%
Source_Host_Validation_Hss | 15% Vulnerable and 85% Non Vulnerable traffic | 9.6K | 12 vCPU (Regular) | 18 vCPU | 30K_MPS | 2.0 K | 34% | 62%
Source_Host_Validation_Mme | 15% Vulnerable and 85% Non Vulnerable traffic | 10K | 12 vCPU (Regular) | 18 vCPU | 30K_MPS | 2.0 K | 33% | 63%
Message_Monitoring_Rate | 15% Vulnerable and 85% Non Vulnerable traffic | 10K | 12 vCPU (Regular) | 18 vCPU | 30K_MPS | 2.0 K | 30% | 63%
Session_Integrity_Validation_Chk | 15% Vulnerable and 85% Non Vulnerable traffic | 10K | 12 vCPU (Regular) | 18 vCPU | 30K_MPS | 2.0 K | 39% | 63%
All Stateful | 15% Vulnerable and 85% Non Vulnerable traffic | 7.02K | 12 vCPU (Regular) | 18 vCPU | 30K_MPS | 2.0 K | 15% | 30%
All Stateful + All Stateless | 15% Vulnerable and 85% Non Vulnerable traffic | 5.25K | 12 vCPU (Regular) | 18 vCPU | 30K_MPS | 2.0 K | 14% | 31%

Indicative Alarms/Events

Table 3-16 Indicative Alarms/Events

Number Severity Server Name Description
19825 Minor/Major/Critical DA-MP Communication Agent Transaction Failure Rate The number of failed transactions during the sampling period has exceeded configured thresholds.
19832 Info DA-MP Communication Agent Reliable Transaction Failed Failed transaction between servers result from normal maintenance actions, overload conditions, software failures, or equipment failures.
33308 Info DA-MP DCA to UDR Comagent Error DCA failed to send query to UDR due to ComAgent Error.
33446 Major DA-MP SrcHostValHssExecFailedAlrm Failed executing Source-Host-Validation-HSS business logic. Disable the CounterMeasure until the problem is resolved.
33449 Major DA-MP TimeDistChkExecFailedAlrm Failed executing Time-Distance Check business logic. Disable the CounterMeasure until the problem is resolved.

3.2.8 Rx-ShUDR-Application (RSA) Benchmarking

The Rx-ShUDR Application (RSA) is implemented using the DCA framework and is deployed at the Oracle Roaming DRA, which is a virtual DSR. The Oracle Roaming DRA (R-DRA) virtual DSR establishes a Diameter Rx connection with Oracle PCRF segments.

When deciding on whether a request should be processed, the Rx-ShUDR Application takes a number of aspects into consideration:

  • The Sh Lookup parameter should be enabled for the received Rx Client FQDN. This can be checked in the Rx_Client table.
  • An Sh UDR message should be created based on the content received in the Rx AAR message from the MCPTT client and sent to the dedicated HSS via the Core DRA.

Note:

If the Sh Lookup parameter is disabled, the AAR message should be forwarded to the RBAR dip for PCRF segment address resolution using the IPv6 prefixes from the received Rx AAR message.
  • RBAR (Range Based Address Resolution) is activated on the Roaming DRA (R-DRA) and populated with IPv6 prefixes mapped to PCRF segment FQDNs.

Topology

Figure 3-12 Topology



Message Flow

Figure 3-13 Message Flow



Figure 3-14 Sh lookup



Configuration

The following table describes the application configurations and the modes of operations used in benchmarking tests.

RSA Test Call Model

Table 3-17 DSR common configuration

DSR Common Configuration Parameters | No of Records | Rx Message Originator | Supported Diameter Message
No of DA-MPs | 2 | MCPTT Client | AAR-I/U, STR
No of Local Nodes | 2 | PCRF | ASR, RAR
No of Peer Nodes | 40 | - | -
No of MCPTT Client Connections | 16 | Rx Message Originator | Supported Message Type
No of PCRF Connections | 16 | Rx ShUDR application | Sh-UDR
No of HSS Connections | 8 | - | -
Total No of Connections | 40 | - | -

Table 3-18 Application configuration

Table Name | Counts | Messages | Distribution
Error_Action_Config | 41 | Rx AAR, AAA Initial | 40%
PDNGWHost_PCRF_Mapping | 7 | Rx AAR, AAA Update | 4%
Rx_Client | 10 | Rx STR, STA | 13%
System_Config_Options | 1 | Rx ASR, ASA | 13%
Topology_Hiding | 3 | Rx RAR, RAA | 3%
- | - | Sh-UDR, Sh-UDA | 27%

RSA Performance Benchmarking on 2 DA-MPs DSR Setup

Table 3-19 RSA Performance Benchmarking

Scenario | Call Flow Model | DSR MPS Achieved | DA-MP Flavor | DA-MP Profile | Avg Msg Size | CPU Peak | RAM Utilization Peak
Use Case (A+B) scenario with 27% of the traffic having Sh_lookup | Use Case (A + B) | 30K (15K/MP) | 12 vCPU (Regular) | 30K_MPS | 400 | 53% | 24%

Indicative Alarms/Events

Table 3-20 Indicative Alarms/Events

Number Severity Server Name Description
22008 Info DA-MP Orphan Answer Response Received An Answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
33318 Major DA-MP DCA CreateAndSend Request Message Send Failed Failed While Sending a CreateAndSend Request Message.
33430 Major DA-MP RxShUDRAppExecFailedAlrm Failed executing Rx-ShUDR application business logic.
5002 Critical IPFE IPFE address configuration error An address pertaining to inter-IPFE state synchronization is configured incorrectly.
5003 Critical IPFE IPFE state sync run error Error syncing state data between IPFEs.
5011 Critical IPFE System or platform error prohibiting operation Error related to system misbehavior or platform misconfiguration, check traces for more information.
5012 Critical IPFE Signaling interface heartbeat timeout Heartbeats to monitor the liveness of a signaling interface have timed out.

3.2.9 Radius Benchmarking

Radius (Remote Authentication Dial In User Service) is an Authentication, Authorization, and Accounting (AAA) protocol that is a predecessor to Diameter. Radius is frequently used, especially in WLAN networks and 3G mobile data applications. The DSR can be deployed in networks requiring support for both Diameter and Radius nodes, as well as in Radius-only networks.

Radius has similarities to Diameter but is significantly different in many ways. Radius is primarily supported on DSR by a new connection layer called the Radius Connection Layer, while using the existing routing services of the Diameter Routing Layer (DRL) and the existing Diameter based message interface to or from the DRL.

Ingress radius request or response messages are encapsulated in Diameter Request or Answer messages respectively. Diameter Request message content is created by Radius Connection Layer based on a set of predefined rules using both configuration data and radius message content. Diameter answer message content is created by Radius Connection Layer based on a set of predefined rules using mostly the Diameter Request message content associated with the transaction.

Because Radius request message routing is based on the associated Diameter Request message that encapsulates the Radius message, the user must be familiar with how the Diameter Request capsule is created in order to configure the DRL to route Radius request messages.

Diameter Routing Layer provides required information to Radius Connection Layer to enable forwarding of Radius messages to the peer.

The Radius Connection Layer prevents accidental routing of non Radius messages to a Radius connection due to misconfiguration.

Figure 3-15 Topology



Configuration

The following tables describe the configurations used in the benchmarking tests:

Table 3-21 Radius Configuration

Parameters Counts
Number of DA-MPs 2
Number of Radius Client Connections for Access-Accept 16
Number of Radius Server Connections Access-Request 45
Number of Radius Client Connections Accounting-Accept 8
Number of Radius Server Connections Accounting-Request 45
Total Number of Connections 114
Number of Local Nodes 9
Number of Peer Nodes 114

Table 3-22 Diameter Configuration

Parameters Counts
Number of DA-MPs 2
Number of DA-MPs with fixed or floating connections 2
Number of IPFE 2
Number of TSA Defined 1
Number of DAMP in TSA Groups 2
Number of Initiator Connections 120
Number of Responder Connections 40
Total Number of Diameter Connections 160
Number of Local Nodes 2
Number of Peer Nodes 10

Table 3-23 Traffic Call Model

Traffic Type Distribution
Radius Access Traffic 5%
Radius Accounting Traffic 25%
Diameter Relay Traffic 70%

Table 3-24 Radius + Diameter Performance Benchmarking on 2 DA-MPs DSR Setup

Scenario | Call Flow Model | DSR MPS Achieved | DA-MP Flavor | DA-MP Profile | Avg Msg Size | CPU Peak | RAM Utilization Peak
Radius + Diameter | 30% Radius + 70% Diameter | 36K (18K/MP) | 12 vCPU (Regular) | 30K_MPS | 400 | 42% | 28%

Table 3-25 Indicative Alarms/Events

Number Severity Server Name Description
22008 Info DA-MP Orphan Answer Response Received An Answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
22201 Minor DA-MP MpRxAllRate DA-MP ingress message rate threshold crossed.
22221 Minor DA-MP Routing MPS Rate Message processing rate for this DA-MP is approaching or exceeding its engineered traffic handling capacity