
Sonus supports a new N:1 redundancy mechanism for the SBC SWe platform, where up to N active SBC instances are backed up by a single standby SBC instance. In a failover scenario, the standby instance takes over and becomes active; the failed instance, once it is back up and running, becomes the new standby.

Note: The maximum value for N is one for the Signaling SBC (S-SBC) and four for the Media SBC (M-SBC).
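The per-personality limit on N can be sketched as a simple validation check. This is an illustrative sketch only; the function and constant names are hypothetical and not part of the SBC product.

```python
# Hypothetical sketch: enforce the per-personality limit on N, the number
# of active instances backed up by one standby in a redundancy group.

MAX_ACTIVE = {"S-SBC": 1, "M-SBC": 4}  # Signaling vs. Media SBC limits

def validate_group_size(sbc_type: str, n_active: int) -> None:
    """Raise if n_active exceeds the limit for this SBC personality."""
    limit = MAX_ACTIVE[sbc_type]
    if not 1 <= n_active <= limit:
        raise ValueError(
            f"{sbc_type} supports at most {limit} active instance(s), got {n_active}"
        )

validate_group_size("M-SBC", 4)  # a 4:1 Media redundancy group is allowed
validate_group_size("S-SBC", 1)  # Signaling is limited to 1:1
```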


 

SBC Redundancy Group

The SBC Redundancy Group (RG) consists of one or more SBC SWe instances. All instances in an RG must have homogeneous resource allocation, configuration, and personality. Each SBC SWe instance, including the standby, maintains its own configuration database, logs, events, and alarms.

A cluster is a group of one or more SBC RGs. All RGs in a cluster must use the same SBC SWe type (Signaling SBC or Media SBC), which dictates the cluster type: a Media cluster or a Signaling cluster.
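The RG and cluster constraints above can be modeled as a small data structure. The class and field names below are hypothetical, used only to make the homogeneity rules concrete.

```python
# Hypothetical data model for the constraints described above: all
# instances in an RG share resource allocation and personality, and all
# RGs in a cluster share the same SBC SWe type.
from dataclasses import dataclass

@dataclass(frozen=True)
class SbcInstance:
    name: str
    personality: str   # e.g. "S-SBC" or "M-SBC"
    vcpus: int
    memory_gb: int

class RedundancyGroup:
    def __init__(self, rgid, instances):
        # all instances in an RG must be homogeneous
        profiles = {(i.personality, i.vcpus, i.memory_gb) for i in instances}
        if len(profiles) != 1:
            raise ValueError("all instances in an RG must be homogeneous")
        self.rgid = rgid
        self.instances = list(instances)
        self.sbc_type = self.instances[0].personality

class Cluster:
    def __init__(self, groups):
        # all RGs in a cluster must use the same SBC SWe type
        if len({g.sbc_type for g in groups}) != 1:
            raise ValueError("all RGs in a cluster must share one SBC SWe type")
        self.groups = list(groups)
        self.cluster_type = self.groups[0].sbc_type  # Media vs. Signaling cluster
```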

The following diagrams depict the N:1 architecture with a single SBC Redundancy group (RG) and multiple redundancy groups within an SBC cluster.

Figure 1: N:1 Architecture

Figure 2: Clusters with different Redundancy Groups

Launching N:1 SBC Instances


Feature Overview

The N:1 architecture applies N:1 redundancy to SBC platforms. The SBC instances are grouped into redundancy groups of N+1 elements: N active SBC instances backed up by a single standby SBC instance.

The N:1 redundancy is grouped into:

  • The SBC instances, which implement the N:1 redundancy for service availability and for call and registration resiliency.
  • The VNF lifecycle management function, which defines the SBC redundancy groups and directs both the initial instantiation and subsequent healing after faults.
  • The EMS, which manages the SBC instances.
  • The SBC redundancy group is initiated by the VNF lifecycle orchestrator. The VNFM, either through the VNFD template or through SBC heat templates, implements a Redundancy Group (RG) with up to four active SBC instances and a single standby instance. The VNFM dynamically assigns each RG a unique Redundancy Group ID (RGID). You can create multiple Redundancy Groups, which together form the SBC cluster.
  • The EMS, which manages the SBC SWe instances. All SBC instances register with the EMS and download their configuration from it. The EMS adds the SBC SWe instances to the registered list and to the active SBC database. Once the SBC application is up and running, the EMS connects to it, starts monitoring, and collects statistics.


In addition to the VIPs used for signaling and media, the VNFM also manages the HA addresses associated with the SBC instances. When either an active or standby instance is created, the VNFM must add the HA address for that instance to an HA address set. When creating a new SBC instance, the current contents of this HA set must be passed to the newly created instance.
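The HA-address bookkeeping described above can be sketched as follows. This is a minimal, hypothetical VNFM helper, not the product's implementation: each new instance's HA address is recorded in a per-RG set, and the set's contents as of creation time are handed to the new instance.

```python
# Hypothetical sketch of the VNFM's HA address set: registering an
# instance returns the peers known so far, then records the new address.

class HaAddressSet:
    def __init__(self):
        self._addresses = []

    def register(self, ha_address: str) -> list:
        """Return the HA peers known before this instance, then record it."""
        peers = list(self._addresses)
        self._addresses.append(ha_address)
        return peers

ha_set = HaAddressSet()
first_peers = ha_set.register("10.0.0.1")  # first instance: HA set is empty
for addr in ("10.0.0.2", "10.0.0.3", "10.0.0.4"):
    ha_set.register(addr)
fifth_peers = ha_set.register("10.0.0.5")  # fifth instance: four peers
```

This mirrors the note below: the first instance receives no peer information, while a later instance receives the addresses of every peer created before it.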

Note: HA information provided to instances is not static; the first SBC instance gets no HA peer information (since the HA set is empty at that point), whereas the fifth SBC instance gets the addressing for all four peers. This heterogeneity of HA address distribution is a natural consequence of the arbitrary ordering of instantiation possible in a real cloud deployment.

The SBC instances together implement the redundancy; the local management entity here is the Redundancy Group Manager (RGM) framework. While the existing AMF framework is still in place, it no longer implements the redundancy function. When an instance is instantiated as active, the RGM uses the AMF framework to bring up the SBC application as standalone. For an instance created as standby, the RGM uses the AMF framework to bring up the SBC application as standalone, but with its tasks in standby mode. The RGM then establishes the peering relationship between the active and standby instances.

The RGM instances learn the HA address of the standby instance for the redundancy group through a membership protocol. Each active RGM then attempts to connect to that standby RGM and handshake as a redundancy client. For each active-standby pairing, if the handshake is successful, the RGMs on the active and standby interact with the local RTMs to begin the synchronization process and, once synced, the ongoing mirroring of necessary call and registration state changes. The RGM on the standby also sets up a health-check schedule to ensure that the standby quickly becomes aware of any failure of an active node.
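The handshake-then-sync sequence above can be sketched in simplified form. The class and method names are illustrative, not the product's API, and the sketch collapses the RTM-driven sync and mirroring into a single pairing step.

```python
# Highly simplified sketch of the RGM pairing described above: an active
# RGM connects to the standby RGM as a redundancy client; on a successful
# handshake, both sides begin sync and ongoing state mirroring.

class Rgm:
    def __init__(self, role: str):
        self.role = role          # "active" or "standby"
        self.synced_peers = set() # peers with which mirroring is running

    def handshake(self, standby: "Rgm") -> bool:
        # only an active RGM may handshake with a standby RGM
        return self.role == "active" and standby.role == "standby"

    def pair_with(self, standby: "Rgm") -> bool:
        if not self.handshake(standby):
            return False
        # stand-in for the RTM-driven initial sync and state mirroring
        self.synced_peers.add(standby)
        standby.synced_peers.add(self)
        return True
```

In a real deployment each active RGM repeats this pairing against the single standby, so the standby ends up mirroring state for up to N actives.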

The lifecycle agent (LCA) of each SBC instance dynamically registers with the EMS once instantiation is complete. Both the active and the standby instances download the configuration. The EMS manages all SBC instances, including the standby instances, which send traps and other alarms to the EMS. The EMS enters the active instance in an active system table and performs all the existing operations associated with active systems, such as health checking and status collection. For the standby SBC instance, the EMS keeps the instance in the registered group and also performs health checking. However, because the standby instance is not added to the active system table, statistics collection and other activities associated with active systems are not performed for it.
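The EMS bookkeeping above amounts to two containers with different obligations. The sketch below is hypothetical, not the EMS's API: every registered instance is health-checked, but only instances in the active system table get statistics collection.

```python
# Hypothetical sketch of the EMS's registered list vs. active system
# table: standbys are registered and health-checked, but statistics are
# collected only for instances in the active system table.

class Ems:
    def __init__(self):
        self.registered = set()   # every instance, active or standby
        self.active_table = {}    # active instances and their parameters

    def register(self, name: str, role: str, mgmt_ip: str = None):
        self.registered.add(name)
        if role == "active":
            self.active_table[name] = {"mgmt_ip": mgmt_ip}

    def should_health_check(self, name: str) -> bool:
        return name in self.registered

    def should_collect_stats(self, name: str) -> bool:
        return name in self.active_table
```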

Fault detection and switchover are handled first by the SBC instances. As previously mentioned, the standby SBC instance sets up a health-check schedule with each backed-up active instance. When a health check fails, the RGM converts the standby instance to active in place of the failed instance. The standby tasks destroy all contexts other than the context to be activated, and the remaining context is activated. To optimize the user experience, this sequence is performed first for media-associated tasks, then signaling tasks, and finally OA&M tasks. When activation is complete, the activated standby generates an alarm to indicate that it has taken over from a failed active instance. Additionally, the LCA on the activated standby updates its registration with the EMS and indicates that it has taken over for a failed instance. When this occurs, the EMS updates the appropriate entry in the active system table to reflect the updated parameters (such as management IP address and credentials). System activities such as statistics collection are then directed at the former standby instance.
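The switchover sequence above can be sketched as an ordered activation. The function and task-group names are illustrative only; the ordering (media, then signaling, then OA&M) is the point being shown.

```python
# Hypothetical sketch of the failover step: destroy all contexts except
# the failed instance's, then activate that context media-first, then
# signaling, then OA&M, matching the ordering described above.

ACTIVATION_ORDER = ("media", "signaling", "oam")

def failover(contexts: dict, failed: str) -> list:
    """Return the ordered activation steps for the failed instance's context."""
    surviving = contexts[failed]
    # contexts for the other backed-up actives are destroyed
    for name in list(contexts):
        if name != failed:
            del contexts[name]
    # activate task groups in the user-experience-optimized order
    return [(group, surviving) for group in ACTIVATION_ORDER]
```

Activating media tasks first shortens the window in which established calls lose their media path, at the cost of delaying management-plane recovery.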