The N:1 architecture applies to N:1 redundancy for SBC platforms. The SBC instances are grouped into redundancy groups comprising N+1 elements, in which N active SBC instances are backed up by a single standby SBC instance.
N:1 redundancy spans the following functional areas, described in turn below: orchestration of the redundancy group by the VNF lifecycle orchestrator, redundancy management within the SBC instances, management of the instances by the EMS, and fault detection and switchover.
The SBC redundancy group is initiated by the VNF lifecycle orchestrator, which is either the Sonus VNFM or a third-party orchestrator. The VNFM, either through the VNFD specification or through knowledge of the SBC redundancy facilities, implements a redundancy group as a combination of up to 4 active SBC instances and 1 standby instance. The VNFM dynamically assigns each redundancy group a unique redundancy group ID (RGID). When instantiating an active SBC instance, the VNFM provides a role of “active” and the RGID through the appropriate user data. Virtual IP (VIP) addresses are attached as Allowed-Address-Pairs to the active instances, and are also added to a VNFM-maintained VIP set for that redundancy group. For a standby SBC instance, the VNFM provides a role of “standby” and the RGID, and all VIP addresses in the VIP set are assigned to the standby as Allowed-Address-Pairs. Any VIP address added to the set after the standby SBC instance is running must also be added to the standby instance as an Allowed-Address-Pair.
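The following is a minimal sketch of the VNFM-side bookkeeping described above. All names (RedundancyGroup, SbcInstance, instantiate_active, and so on) are illustrative assumptions, not the actual Sonus VNFM API; Allowed-Address-Pairs are modeled as a simple set per instance.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)

@dataclass
class SbcInstance:
    role: str                      # "active" or "standby", passed via user data
    rgid: str                      # redundancy group ID, passed via user data
    name: str = field(default_factory=lambda: f"sbc-{next(_ids)}")
    allowed_address_pairs: set = field(default_factory=set)

class RedundancyGroup:
    MAX_ACTIVE = 4                 # up to 4 active instances per group

    def __init__(self, rgid: str):
        self.rgid = rgid                       # unique RGID assigned by the VNFM
        self.actives: list[SbcInstance] = []
        self.standby: SbcInstance | None = None
        self.vip_set: set[str] = set()         # VNFM-maintained VIP set

    def instantiate_active(self, vips: set[str]) -> SbcInstance:
        assert len(self.actives) < self.MAX_ACTIVE
        inst = SbcInstance(role="active", rgid=self.rgid)
        for vip in vips:
            inst.allowed_address_pairs.add(vip)  # attach as Allowed-Address-Pair
            self.vip_set.add(vip)                # record in the group's VIP set
            if self.standby is not None:
                # A VIP added after the standby is running must also be
                # added to the standby as an Allowed-Address-Pair.
                self.standby.allowed_address_pairs.add(vip)
        self.actives.append(inst)
        return inst

    def instantiate_standby(self) -> SbcInstance:
        inst = SbcInstance(role="standby", rgid=self.rgid)
        # The standby carries every VIP currently in the group's VIP set.
        inst.allowed_address_pairs |= self.vip_set
        self.standby = inst
        return inst
```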
In addition to the VIPs used for signaling and media, the VNFM must also manage the HA addresses associated with the SBC instances. When either an active or standby instance is created, the VNFM must add the HA address of that instance to an HA address set. When creating a new SBC instance, the current contents of this HA set must be passed to the created instance.
The HA information provided to instances is therefore not static; the first SBC instance receives no HA peer information (since the HA set is empty at that point), whereas the fifth SBC instance receives the addresses of all four peers. This heterogeneity of HA address distribution is a natural consequence of the arbitrary ordering of instantiation possible in a real cloud deployment.
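The short, self-contained sketch below illustrates how the HA set grows with each instantiation; the user-data field names (ha_peers and so on) are assumptions for illustration only.

```python
def boot_sbc(rgid: str, role: str, ha_address: str, ha_set: set[str]) -> dict:
    # The new instance is handed the HA addresses of all peers created so
    # far; the first instance of a group therefore receives an empty set.
    user_data = {"rgid": rgid, "role": role, "ha_peers": sorted(ha_set)}
    ha_set.add(ha_address)     # instances created later will see this peer
    return user_data

ha_set: set[str] = set()
first = boot_sbc("rg-1", "active", "10.0.0.1", ha_set)
for i, addr in enumerate(["10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"], 2):
    fifth = boot_sbc("rg-1", "standby" if i == 5 else "active", addr, ha_set)

assert first["ha_peers"] == []        # first instance: no HA peers
assert len(fifth["ha_peers"]) == 4    # fifth instance: all four peers
```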
The SBC instances together implement the redundancy; the local management entity is the Redundancy Group Manager (RGM) framework. While the existing AMF framework is still in place, it no longer implements the redundancy function. When an instance is instantiated as active, the RGM uses the AMF framework to bring up the SBC application as standalone. For an instance created as standby, the RGM uses the AMF framework to bring up the SBC application as standalone, but with its tasks in standby mode. The RGM then establishes the peering relationship between the active and standby instances. The active RGM instances learn the HA address of the standby instance for the redundancy group through a membership protocol. Each active RGM then attempts to connect to that standby RGM and handshake as a redundancy client. For each active-standby pairing, if the handshake is successful, the RGMs on the active and standby interact with the local RTMs to begin the synchronization process and, once synced, the ongoing mirroring of necessary call and registration state changes. The RGM on the standby also sets up a health check schedule to ensure that the standby is quickly aware of any failure in an active node.
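The following is a much-simplified sketch of that peering flow. The real RGM and RTM interfaces are internal to the SBC, so every class and method name below is an assumption made for illustration.

```python
class StandbyRgm:
    def __init__(self, ha_address: str):
        self.ha_address = ha_address
        self.health_checked_peers: list[str] = []

    def accept_handshake(self, rgid: str, client_ha: str) -> bool:
        # Accept the active as a redundancy client and record it for
        # periodic health checks, so a failed active is detected quickly.
        self.health_checked_peers.append(client_ha)
        return True

class ActiveRgm:
    def __init__(self, rgid: str, ha_address: str):
        self.rgid = rgid
        self.ha_address = ha_address
        self.mirroring = False

    def on_standby_discovered(self, standby: StandbyRgm) -> None:
        # The standby's HA address is learned through the membership
        # protocol; connect and handshake as a redundancy client.
        if standby.accept_handshake(self.rgid, self.ha_address):
            self.start_sync_and_mirroring()

    def start_sync_and_mirroring(self) -> None:
        # On a successful handshake, both RGMs drive their local RTMs
        # through an initial bulk sync, then mirror call and registration
        # state changes as they occur.
        self.mirroring = True
```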
The lifecycle agent (LCA) of each SBC instance dynamically registers with the EMS once instantiation is complete. Both the active and the standby instances download the configuration. The EMS manages all SBC instances, including the standby instances, which send traps and other alarms to the EMS. The EMS enters each active instance into an active system table and performs all the existing operations associated with active systems, such as health-checking and status collection. For the standby SBC instance, the EMS maintains the instance within the registered group and also performs health-checking. However, because the standby instance is not added to the active system table, statistics collection and other activities associated with active systems are not performed against it.
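The sketch below illustrates this role-dependent handling on the EMS side; the active system table representation and method names are assumptions based on the description above, not the EMS API.

```python
class Ems:
    def __init__(self):
        self.registered_groups: dict[str, set[str]] = {}
        self.active_system_table: set[str] = set()
        self.health_checked: set[str] = set()
        self.stats_collected: set[str] = set()

    def on_lca_registration(self, rgid: str, instance: str, role: str) -> None:
        # Every instance joins its registered group and is health-checked.
        self.registered_groups.setdefault(rgid, set()).add(instance)
        self.health_checked.add(instance)
        if role == "active":
            # Only active instances enter the active system table; status
            # and statistics collection apply to them alone.
            self.active_system_table.add(instance)
            self.stats_collected.add(instance)
```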
Fault detection and switchover are handled first by the SBC instances. As previously mentioned, the RGM on the standby SBC instance sets up a health check schedule with each backed-up active instance. When a health check fails, the RGM converts the standby instance to active in place of the failed instance. The standby tasks destroy all contexts other than the context to be activated, and the remaining context is activated. To optimize the user experience, this sequence is performed first for media-associated tasks, then signaling tasks, and finally OA&M tasks. When activation is complete, the activated standby generates an alarm to indicate that it has taken over from a failed active. Additionally, the LCA on the activated standby updates its registration with the EMS and indicates that it has taken over for the failed instance. When this occurs, the EMS updates the appropriate entry in the active system table to reflect the updated parameters (such as management IP address and credentials). System activities such as statistics collection are then directed at the former standby instance.
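The ordering of the switchover can be sketched as follows; the task-group names and the per-peer context representation are assumptions drawn from the description above.

```python
# Activation order chosen to minimize user impact: media first, then
# signaling, and finally OA&M.
ACTIVATION_ORDER = ("media", "signaling", "oam")

class StandbyInstance:
    def __init__(self, contexts: dict[str, dict]):
        # One mirrored context per backed-up active, keyed by its HA address.
        self.contexts = contexts
        self.alarms: list[str] = []

    def on_health_check_failure(self, failed_active: str) -> None:
        # Destroy every context except the one mirroring the failed active.
        for peer in list(self.contexts):
            if peer != failed_active:
                del self.contexts[peer]
        # Activate the remaining context, task group by task group.
        for task_group in ACTIVATION_ORDER:
            self.contexts[failed_active][task_group] = "active"
        # Raise the takeover alarm; the LCA then updates its EMS
        # registration, and the EMS rewrites the active system table entry
        # (management IP, credentials) and redirects statistics collection.
        self.alarms.append(f"took over for failed active {failed_active}")
```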