Overview

To produce seamless switchovers from one Oracle Communications Session Border Controller to the other, the HA node uses shared virtual MAC and virtual IPv4 or IPv6 addresses for the media interfaces in a way that is similar to VRRP (Virtual Router Redundancy Protocol). When there is a switchover, the standby Oracle Communications Session Border Controller sends out gratuitous Address Resolution Protocol (ARP) or Neighbor Discovery Protocol (NDP) messages using the virtual MAC address, establishing that MAC on another physical port within the Ethernet switch. To the upstream router, the MAC and IP are still alive, meaning that existing sessions continue uninterrupted.

Note:

NDP is the IPv6 equivalent of ARP.
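
As an illustration of the gratuitous ARP described above, the following Python sketch builds the bytes of such a frame for a shared virtual MAC and IPv4 address. The function name and the example addresses are hypothetical. The layout follows the standard ARP packet format; nothing here reflects how the Oracle Communications Session Border Controller itself constructs the message.

```python
import socket
import struct


def build_gratuitous_arp(virtual_mac: str, virtual_ip: str) -> bytes:
    """Return an Ethernet frame carrying a gratuitous ARP request.

    Hypothetical helper for illustration only; it builds bytes and sends nothing.
    """
    mac = bytes.fromhex(virtual_mac.replace(":", ""))
    ip = socket.inet_aton(virtual_ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    ethernet = broadcast + mac + struct.pack("!H", 0x0806)

    # ARP header: hardware type Ethernet (1), protocol type IPv4 (0x0800),
    # address lengths 6 and 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += mac + ip          # sender hardware and protocol address
    arp += b"\x00" * 6 + ip  # target hardware (unset) and protocol address
    return ethernet + arp


# Example addresses are made up (locally administered MAC, documentation IP).
frame = build_gratuitous_arp("02:00:00:aa:bb:cc", "192.0.2.10")
```

Because the sender and target protocol addresses are the same, a switch that sees this frame relearns which port the virtual MAC is reachable on.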

Within the HA node, the Oracle Communications Session Border Controllers advertise their current state and health to one another in checkpointing messages; each system is apprised of the other’s status. Using Oracle’s HA protocol, the Oracle Communications Session Border Controllers communicate with UDP messages sent out and received on the rear interfaces.

The standby Oracle Communications Session Border Controller shares virtual MAC and IP addresses for the media interfaces (similar to VRRP) with the active Oracle Communications Session Border Controller. Sharing addresses eliminates the possibility that the MAC and IP address set on one Oracle Communications Session Border Controller in an HA node will be a single point of failure. The standby Oracle Communications Session Border Controller sends ARP or NDP requests using a utility IP address and its hard-coded MAC addresses to obtain Layer 2 bindings.

The standby Oracle Communications Session Border Controller assumes the active role when:

  • It has not received a checkpoint message from the active Oracle Communications Session Border Controller for a certain period of time.
  • It determines that the active Oracle Communications Session Border Controller’s health score has decreased to an unacceptable level.
  • The active Oracle Communications Session Border Controller relinquishes the active role.
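
The three conditions above can be sketched as a single decision function. This is a minimal Python illustration, not the product's logic: the PeerView class, the function name, and the timeout and threshold parameters are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class PeerView:
    """What the standby currently knows about the active peer (hypothetical)."""
    seconds_since_checkpoint: float
    health_score: int
    relinquished: bool = False


def standby_should_take_over(peer: PeerView,
                             checkpoint_timeout_s: float,
                             health_threshold: int) -> bool:
    """Return True if any of the three listed conditions holds."""
    timed_out = peer.seconds_since_checkpoint > checkpoint_timeout_s
    unhealthy = peer.health_score < health_threshold
    return timed_out or unhealthy or peer.relinquished
```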

Establishing Active and Standby Roles

Oracle Communications Session Border Controllers establish active and standby roles in the following ways.

  • If an Oracle Communications Session Border Controller boots up and is alone in the network, it is automatically the active system. If you then pair a second Oracle Communications Session Border Controller with the first to form an HA node, then the second system to boot up will establish itself as the standby automatically.
  • If both Oracle Communications Session Border Controllers in the HA node boot up at the same time, they negotiate with each other for the active role. If both systems have perfect health, then the Oracle Communications Session Border Controller with the lower HA rear interface IPv4 address will become the active Oracle Communications Session Border Controller. The Oracle Communications Session Border Controller with the higher HA rear interface IPv4 address will become the standby Oracle Communications Session Border Controller.
  • If the rear physical link between the two Oracle Communications Session Border Controllers fails during boot up or operation, both will attempt to become the active Oracle Communications Session Border Controller. In this case, processing will not work properly.
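
The address tie-break for two systems that boot together with perfect health can be sketched as follows. The function name and example addresses are hypothetical, and the sketch assumes the addresses are compared numerically rather than as text; the source does not say how the comparison is made.

```python
import ipaddress


def pick_active_by_address(rear_ip_a: str, rear_ip_b: str) -> str:
    """Return the rear-interface address of the peer that becomes active.

    Assumes both peers report perfect health, so only the address decides.
    """
    a = ipaddress.IPv4Address(rear_ip_a)
    b = ipaddress.IPv4Address(rear_ip_b)
    return str(min(a, b))


# Example addresses are made up for illustration.
active = pick_active_by_address("169.254.1.10", "169.254.1.9")
```

Under that assumption, 169.254.1.9 is the lower address here, even though a plain string comparison would order the two the other way.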

Health Score

HA nodes use health scores to determine their active and standby status. Health scores are based on a 100-point system. When an Oracle Communications Session Border Controller is functioning properly, its health score is 100.

Generally, the Oracle Communications Session Border Controller with the higher health score is active, and the Oracle Communications Session Border Controller with the lower health score is standby. However, the fact that you can configure health score thresholds builds some flexibility into using health scores to determine active and standby roles. This could mean, for example, that the active Oracle Communications Session Border Controller might have a health score lower than that of the standby Oracle Communications Session Border Controller, but a switchover will not take place because the active Oracle Communications Session Border Controller’s health score is still above the threshold you configured.

Alarms are key in determining the health score. Some alarms have specific health score values that are subtracted from the Oracle Communications Session Border Controller’s health score when they occur. When alarms are cleared, the value is added back to the Oracle Communications Session Border Controller’s health score.

You can look at an Oracle Communications Session Border Controller’s health score using the ACLI show health command.
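
The interaction between alarms, the 100-point score, and the configured threshold can be modeled with a small Python sketch. The class, the alarm name, the penalty of 25, and the threshold of 60 are all hypothetical values chosen for illustration; they are not product defaults.

```python
class HealthScore:
    """Toy model of a 100-point health score (not the product's implementation)."""

    def __init__(self):
        self.score = 100
        self._active_alarms = {}

    def raise_alarm(self, name, penalty):
        # Subtract the alarm's value once, when the alarm first occurs.
        if name not in self._active_alarms:
            self._active_alarms[name] = penalty
            self.score -= penalty

    def clear_alarm(self, name):
        # Add the value back when the alarm is cleared.
        self.score += self._active_alarms.pop(name, 0)


def switchover_needed(active_score, threshold):
    """A switchover is triggered only when the active score drops below the threshold."""
    return active_score < threshold


active = HealthScore()
active.raise_alarm("gateway-unreachable", 25)  # hypothetical alarm and penalty
```

In this example the active system's score is 75 while a healthy standby would be at 100, yet with a threshold of 60 no switchover is needed.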

Switchovers

A switchover occurs when the active Oracle Communications Session Border Controller stops being the active system, and the standby system takes over that function. There are two kinds of switchovers: automatic and manual.

Automatic Switchovers

Automatic switchovers are triggered without immediate intervention on your part. Oracle Communications Session Border Controllers switch over automatically in the following circumstances:

  • When the active Oracle Communications Session Border Controller’s health score drops below the threshold you configure.
  • When a time-out occurs, meaning that the active Oracle Communications Session Border Controller has not sent checkpointing messages to the standby Oracle Communications Session Border Controller within the allotted time.

    The active Oracle Communications Session Border Controller might not send checkpointing messages for various reasons such as link failure, communication loss, or advertisement loss. Even if the active Oracle Communications Session Border Controller has a perfect health score, it will give up the active role if it does not send a checkpoint message or otherwise advertise its status within the time-out window. Then the standby Oracle Communications Session Border Controller takes over as the active system.

When an automatic switchover happens, the Oracle Communications Session Border Controller that has just become active sends an ARP message to the switch. This message informs the switch to send future messages to its MAC address. The Oracle Communications Session Border Controller that has just become standby ignores any messages sent to it.

Manual Switchovers

You can trigger a manual switchover in the HA node by using the ACLI notify berpd force command. This command forces the two Oracle Communications Session Border Controllers in the HA node to trade roles. The active system becomes standby, and the standby becomes active.

In order to perform a successful manual switchover, the following conditions must be met.

  • The Oracle Communications Session Border Controller from which you trigger the switchover must be in one of the following states: active, standby, or becoming standby.
  • A manual switchover to the active state is only allowed on an Oracle Communications Session Border Controller in the standby or becoming standby state if it has achieved full media, signaling, and configuration synchronization.
  • A manual switchover to the active state is only allowed on an Oracle Communications Session Border Controller in the standby or becoming standby state if it has a health score above the value you configure for the threshold.
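
The conditions above can be read as a pre-check that runs before a manual switchover is accepted. The following Python sketch is one possible reading, with hypothetical function and parameter names. It assumes that a system in the active state needs no further checks of its own, which the conditions above do not state explicitly.

```python
def manual_switchover_allowed(state, fully_synchronized, health_score, threshold):
    """Return True if a manual switchover may be triggered from this system.

    state is one of the state names used in this section, for example
    "Active", "Standby", or "BecomingStandby".
    """
    if state == "Active":
        # Assumption: only the state requirement applies to the active system.
        return True
    if state in ("Standby", "BecomingStandby"):
        # Moving to active requires full synchronization and a score above the threshold.
        return fully_synchronized and health_score > threshold
    return False
```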

State Transitions

Oracle Communications Session Border Controllers can experience a series of states as they become active or become standby.

Note:

Packet processing only occurs on an active Oracle Communications Session Border Controller.

State | Description
Initial | When the Oracle Communications Session Border Controller is booting.
Becoming Active | When the Oracle Communications Session Border Controller has negotiated to become the active system, but is waiting for the period of time that you set before it becomes fully active. Packets cannot be processed in this state.
Active | When the Oracle Communications Session Border Controller is handling all media, signaling, and configuration processing.
Relinquishing Active | When the Oracle Communications Session Border Controller is giving up its Active status, but before it has become standby. This state is very brief.
Becoming Standby | When the Oracle Communications Session Border Controller is becoming the standby system but is waiting to become fully synchronized. It remains in this state for the period of time you set in the becoming-standby-time parameter, or until it is fully synchronized.
Standby | When the Oracle Communications Session Border Controller is fully synchronized with its active system in the HA node.
OutOfService | When the Oracle Communications Session Border Controller cannot become synchronized in the period of time you set in the becoming-standby-time parameter.

State Transition Sequences

When the active Oracle Communications Session Border Controller assumes its role as the active system, but then changes roles with the standby Oracle Communications Session Border Controller to become standby, it goes through the following sequence of state transitions:

  • Active
  • RelinquishingActive
  • BecomingStandby
  • Standby

When the standby Oracle Communications Session Border Controller assumes its role as the standby system, but then changes roles with the active Oracle Communications Session Border Controller to become active, it goes through the following sequence of state transitions:

  • Standby
  • BecomingActive
  • Active
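
The two sequences above can be captured in a small transition table. This Python sketch covers only the transitions listed in those sequences; it does not model the Initial or OutOfService states, and the names ALLOWED_TRANSITIONS and is_valid_sequence are hypothetical.

```python
# Transitions taken from the two role-change sequences listed above.
ALLOWED_TRANSITIONS = {
    "Active": {"RelinquishingActive"},
    "RelinquishingActive": {"BecomingStandby"},
    "BecomingStandby": {"Standby"},
    "Standby": {"BecomingActive"},
    "BecomingActive": {"Active"},
}


def is_valid_sequence(states):
    """Return True if every consecutive pair of states is an allowed transition."""
    return all(
        nxt in ALLOWED_TRANSITIONS.get(cur, set())
        for cur, nxt in zip(states, states[1:])
    )
```

A check like this can be useful when reading logs: a system that appears to jump straight from Active to Standby skipped a step that the sequences above require.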

HA Features

HA nodes support configuration checkpointing, which you are required to set up so that the configurations across the HA node are synchronized. In addition, you can set up the following optional HA node features:

  • Multiple rear interface support
  • Gateway link failure detection and polling

Multiple Rear Interfaces

Configuring your HA node to support multiple rear interfaces eliminates the possibility that either of the rear interfaces you configure for HA support will become a single point of failure. Using this feature, you can configure individual Oracle Communications Session Border Controllers with multiple destinations on the two rear interfaces, creating an added layer of failover support.

When you configure your HA node for multiple rear interface support, you can use the last two rear interfaces (wancom1 and wancom2) for HA; the first (wancom0) is used for Oracle Communications Session Border Controller management. You can connect your Oracle Communications Session Border Controllers using any combination of wancom1 and wancom2 on both systems. Over these rear interfaces, the Oracle Communications Session Border Controllers in the HA node share the following information:

  • Health
  • Media flow
  • Signaling
  • Configuration

For example, if one of the rear interface cables is disconnected or if the interface connection fails for some other reason, all health, media flow, signaling, and configuration information can be checkpointed over the other interface.

Health information is checkpointed across all configured interfaces. However, media flow, signaling, and configuration information is checkpointed across one interface at a time, as determined by the Oracle Communications Session Border Controller’s system HA processes.
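
The difference between health checkpointing and the other data types can be sketched in Python as follows. The function name and the data-kind labels are hypothetical. In particular, choosing the first working interface for media flow, signaling, and configuration data is an assumption made for this example; the actual choice is made by the system HA processes.

```python
def interfaces_carrying(data_kind, link_up):
    """Return the rear interfaces that carry the given kind of checkpoint data.

    link_up maps an interface name to True if that link is working.
    """
    available = [name for name, up in link_up.items() if up]
    if data_kind == "health":
        # Health is checkpointed across every configured, working interface.
        return available
    # Assumption: the first working interface carries everything else.
    return available[:1]


both_up = {"wancom1": True, "wancom2": True}
one_down = {"wancom1": False, "wancom2": True}
```

With one_down, every kind of data moves to wancom2, which matches the cable-disconnect example given above.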

Configuration Checkpointing

During configuration checkpointing, all configuration activity and changes on one Oracle Communications Session Border Controller are automatically mirrored on the other. Checkpointed transactions include adding, deleting, or modifying a configuration on the active Oracle Communications Session Border Controller. This means that you only need to perform configuration tasks on the active Oracle Communications Session Border Controller because the standby system will go through the checkpointing process and synchronize its configuration to reflect activity and changes.

Because of the way configuration checkpointing works, the ACLI save-config and activate-config commands can only be used on the active Oracle Communications Session Border Controller.

  • When you use the ACLI save-config command on the active Oracle Communications Session Border Controller, the standby Oracle Communications Session Border Controller learns of the action and updates its own configuration. Then the standby Oracle Communications Session Border Controller saves the configuration automatically.
  • When you use the ACLI activate-config command on the active Oracle Communications Session Border Controller, the standby Oracle Communications Session Border Controller learns of the action and activates its own, updated configuration.

The ACLI acquire-config command is used to copy configuration information from one Oracle Communications Session Border Controller to another.

Gateway Link Failure Detection and Polling

In an HA node, the Oracle Communications Session Border Controllers can poll for and detect media interface links to the gateways as they monitor ARP connectivity. The front gateway is assigned in the network interface configuration, and is where packets are forwarded out of the originator’s LAN.

The Oracle Communications Session Border Controller monitors connectivity using ARP messages that it exchanges with the gateway. The Oracle Communications Session Border Controller sends regular ARP messages to the gateway in order to show that it is still in service; this is referred to as a heartbeat message. If the Oracle Communications Session Border Controller deems the gateway unreachable for any of the reasons discussed in this section, a network-level alarm is generated and an amount you configure for this fault is subtracted from the system’s health score.

The Oracle Communications Session Border Controller generates a gateway unreachable network-level alarm if the Oracle Communications Session Border Controller has not received a message from the media interface gateway within the time you configure for a heartbeat timeout. In this case, the Oracle Communications Session Border Controller will send out ARP requests and wait for a reply. If no reply is received after resending the set number of ARP requests, the alarm remains until you clear it. The health score also stays at its reduced amount until you clear the alarm.

When valid ARP requests are once again received, the alarm is cleared and system health scores are increased by the appropriate amount.

You can configure media interface detection and polling either on a global basis in the SD HA nodes/redundancy configuration or on an individual basis for each network interface in the network interface configuration.
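
The heartbeat timeout and retry behavior described above can be sketched in Python. All names are hypothetical, and send_arp_request stands in for whatever actually probes the gateway; it is expected to return True when a reply arrives. The penalty value passed to health_after_check represents the amount you configure for this fault.

```python
from typing import Callable


def gateway_alarm_remains(seconds_since_gateway_message: float,
                          heartbeat_timeout_s: float,
                          retry_count: int,
                          send_arp_request: Callable[[], bool]) -> bool:
    """Return True if the gateway is still unreachable after all retries."""
    if seconds_since_gateway_message <= heartbeat_timeout_s:
        return False  # heartbeat seen in time, no alarm
    for _ in range(retry_count):
        if send_arp_request():
            return False  # a reply arrived, gateway is reachable again
    return True


def health_after_check(score: int, alarm_remains: bool, penalty: int) -> int:
    """Subtract the configured amount while the alarm is present."""
    return score - penalty if alarm_remains else score
```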

Note:

To improve the detection of link failures, the switchport connected to the NIU should have Spanning Tree disabled. Enabling Spanning Tree stops the switchport from forwarding frames for several seconds after a reset. This prevents the NIU from reaching the gateway and generates a "gateway unreachable" network-level alarm.

Georedundant High Availability (HA)

You can locate the two nodes that make up an HA pair in different locations from one another. This is known as georedundancy, which increases fault tolerance. A georedundant pair must adhere to rigid network operating conditions to ensure that all state and call data is shared between the systems, and that failovers happen quickly without losing calls.

The following network constraints are required for georedundant operation:

  • A pair of dedicated fiber routes between sites is required. Each route must have non-blocking bandwidth sufficient to connect wancom1 and wancom2 ports (i.e., 1 Gbps per port).
  • Inter-site round-trip time (RTT) must be less than 10 ms. 5 ms or less is ideal. Georedundant operation must be built upon a properly engineered layer-2 WAN (e.g., MPLS or Metro Ethernet) that connects active and standby HA pair members.
  • Simultaneous packet loss across the inter-site link pair must be 0%. Loss of consecutive heartbeats could potentially result in split-brain behaviors.
  • Security (privacy and data-integrity) must be provided by the network itself.
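
The measurable constraints in this list can be turned into a simple pre-deployment check. The Python sketch below uses hypothetical function and parameter names and assumes you have already measured the values by your own means. It does not cover the requirements for dedicated fiber routes, a layer-2 WAN, or network-provided security, which cannot be reduced to a single number.

```python
from typing import List


def georedundancy_violations(rtt_ms: float,
                             simultaneous_loss_pct: float,
                             per_port_bandwidth_gbps: float) -> List[str]:
    """Return a list of constraint violations; an empty list means none were found."""
    problems = []
    if rtt_ms >= 10.0:
        problems.append("inter-site RTT must be less than 10 ms")
    if simultaneous_loss_pct > 0.0:
        problems.append("simultaneous packet loss across the link pair must be 0%")
    if per_port_bandwidth_gbps < 1.0:
        problems.append("each route needs 1 Gbps per port")
    return problems
```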

As with local HA nodes, management traffic (e.g., SSH, SFTP, SNMP) must be confined to the wancom0 management interface. HA node peers must have their wancom0 IP addresses on the same subnet. All Oracle Communications Session Border Controller configuration, including host routes and the system-config's default-gateway, is shared between the HA pair, so it is not possible to have two different management interface default-gateways. This implies the requirement of an L2-switched connection between the two wancom0 management interfaces.

Troubleshooting Georedundant Deployments

The Oracle Communications Session Border Controller provides rich statistics and status information on HA operation, documented in the ACLI Reference and Maintenance and Troubleshooting Guides. Some of this information is especially suited for troubleshooting the latency and packet-loss requirements for georedundant deployments, including:

  • Details within the show redundancy command output, including:
    • Request-response round-trip time measurements (show redundancy <task-name>)
    • Request-response loss measurements (show redundancy <task-name>)
    • journal statistics (show redundancy <task-name> journals)
    • journal latency (show redundancy <task-name> journals)
    • protocol-specific redundancy actions (show redundancy <task-name> actions)
    • protocol-specific redundancy objects (show redundancy <task-name> objects)
  • Details within the show queues command output, including:
    • sipd command queue statistics (show queue <task-name> commands)
  • Protocol-specific log messaging