Modified: for 12.1.1



Ribbon Cloud Native Overview

The Ribbon cloud-native Session Border Control solution for the SBC, PSX, and RAMP is a fully containerized cloud-native edition of the respective Ribbon functions. The solution offers all the benefits of cloud-native functions (CNF), including faster time to market, higher scalability, simpler management, reduction in overall cost, and enablement of automation and DevOps practices. The CNF products are the cloud-native decomposition and re-architecture of existing functionality in terms of various loosely coupled microservices that interact to provide the required functionality. This provides the benefits of a cloud-native architecture while meeting the functional objectives and the performance KPIs. You can deploy the CNF products across various container orchestration platforms, including Kubernetes, OpenShift Container Platform (OCP) from Red Hat, and the AWS EKS service. They are also deployed in several customer-specific container platforms.

Cloud Native Architecture Principles

Cloud-native architecture is a software development and deployment approach that emphasizes scalability, resilience, and automation in cloud environments.

Some of the basic principles of cloud-native architecture include:

  • Microservices: Cloud-native architecture emphasizes using small, loosely coupled services that are developed, deployed, and scaled independently. This approach allows for greater agility and scalability in the application development and deployment process.
  • Containers: Cloud-native architecture typically relies on containers to package and deploy software applications. Containers provide a lightweight and portable runtime environment that can be deployed across multiple cloud platforms.
  • Resiliency: Cloud-native applications are designed to remain resilient to failures and disruptions. This includes strategies such as fault tolerance, self-healing, and disaster recovery.
  • Scalability: Cloud-native architecture emphasizes horizontal scalability, meaning applications can be easily scaled by adding additional instances of the same service.
  • Observability: Cloud-native applications are designed with observability in mind, providing comprehensive, real-time visibility into the application's performance and behavior.
  • DevOps: Cloud-native architecture takes a collaborative approach to development and operations, emphasizing automation, Continuous Integration/Continuous Deployment (CI/CD), and rapid iteration.

Ribbon CNF Deployment Model

Ribbon recommends the following model to deploy the Ribbon cloud-native Session Border Control solution in your network:

  • RAMP CNFs – Always deploy two RAMP CNFs running in geo-redundancy mode (one RAMP CNF is deployed in a cluster in geographic region/site-1, while the other RAMP CNF is deployed in a cluster in another geographic region/site-2). Control the state of the RAMP CNFs (i.e., ACTIVE and STANDBY) dynamically using etcd CNFs. Three etcd instances run in different clusters; Ribbon recommends running the etcd CNF instances in three distinct geographic regions/sites.
  • PSX-Primary CNFs – Deploy a minimum of three PSX-Primary CNFs (spread across multiple clusters and geographic regions). When adding additional PSX-Primary CNFs, keep the count odd (primarily to avoid split brain/loss of quorum conditions during network isolation events).
  • PSX-Replica CNFs – The number of PSX-Replica CNFs to deploy is primarily decided based on the call capacity (i.e., Diameter+ queries, ENUM dips, etc.). Each PSX-Replica CNF has multiple PSX-Replica pods.
  • SBC CNFs – The number of SBC CNFs (to achieve cluster level and geographic region/site level redundancy) is decided based on the call pattern (i.e., call/session capacity, call type [passthrough, transcode, direct media], SIPREC, Lawful Intercept, etc.). The SC pods in the SBC CNF can dynamically scale up/down based on the call traffic.
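
The recommendation to keep the PSX-Primary count odd can be illustrated with simple majority arithmetic. The sketch below assumes a majority-based quorum (more than half of the members must be reachable); the function names are hypothetical and are not part of any Ribbon product.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of a cluster with `members` voting nodes."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Number of members that can be lost while a majority remains."""
    return members - quorum_size(members)


def has_quorum(reachable: int, members: int) -> bool:
    """True if a partition with `reachable` members still forms a majority."""
    return reachable >= quorum_size(members)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

Under this assumption, a fourth member adds no extra failure tolerance over three, and an even cluster split into two equal halves leaves neither half with a majority.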

Ribbon CNF Products Architecture

The Ribbon CNF architecture is focused on the following five base objectives to achieve cloud-native core capabilities:

  1. Auto-Scaling
  2. Security
  3. Redundancy
  4. Observability
  5. Application Life Cycle Automation

Auto-Scaling

Auto-scaling is supported for all major call signaling and media processing pods (these pods use the N:K redundancy model). Auto-scaling is achieved with the Ribbon Horizontal Pod Autoscaler (RHPA); it does not use the K8S auto-scaling functionality. The base deployment uses a minimum number of running active and standby pods. When the load increases, the RHPA scales out the pods as necessary until reaching the maximum configured number of pods (N+K). Similarly, when the load decreases, the RHPA reduces the number of pods. Currently, only SBC CNe pods have auto-scaling support.

Security

All Ribbon CNF products are deployed in separate namespaces. The SBC CNe is deployed as a separate CNF cluster. The PSX Primary and Replica nodes are deployed as two separate CNF clusters. The SBC CNe and PSX use secure connections to the RAMP. 

An inter-pod secure connection is planned in a future release.  

Redundancy

All Ribbon CNe products provide Redundancy for their functions (pods). Product-specific (SBC CNe, PSX CNe and RAMP) redundancy mechanisms are explained in their respective product documentation.

Observability

All CNF applications produce logs and metrics that are essential to understanding their state and health. Ribbon CNFs support EFK and Kafka as observability backends for centralized logging, and Prometheus as the metrics logging backend. The interaction with the backends is typically through an integrated telemetry agent in the service pod. Further details are provided later.

Application Life Cycle Automation

The Ribbon CNF Products follow general GitOps principles for life cycle automation. All images are stored in a container repository, and the product manifests (Helm charts) are stored in a Git repository. Any change in the manifests triggers fluxCD to pull the latest manifests and modify the running system parameters to match those in the repository.

Ribbon customers with appropriate agreements who are eligible to receive new CNF software can obtain the software either by pulling from a Ribbon repository or having it pushed into their repository.

This deployment model assumes that the customer is entitled to receive software and artifacts, and that the interface between the Ribbon and customer repositories is agreed to and established.


SBC CNe Architecture Overview

The SBC CNe Microservices model comprises the following major groups of functions (pods).

  • Load balancing (SLB)
  • Operation and Management (OAM)
  • Call Processing & Media Transcoding (SC, RS)
  • Auto-Scaling (RHPA)
  • Call State Storage (Cache-Proxy, Redis-DB)
  • Other Common Services (CS, SG, EPU, RAC, NS)

In the SBC CNe deployment model, all incoming signaling packets are front-ended by the SLB Pod and media packets are front-ended by the SC Pod. The SLB (for signaling), SC (for media) and SG (for signaling) expose external (public) IPs. This model supports separate IPs (or group of IPs) for signaling and media traffic.


SBC CNe Microservices

The SBC CNe is the Cloud Native decomposition of core SBC functionality in terms of various microservices which interact with each other to provide SBC functionality. The components which make up the SBC CNe are described below.


SBC CNe Microservices Components

Each entry below gives, in order: Pod Name, Containers, Details, and Pod Resource Type.

SIP Load Balancer (SLB)


  • slb
  • oamproxy
  • pvclogger
  • rbbn-telemetry-agent

The SLB acts as the single entry/exit point for all SIP Signaling to the SBC CNe.

The SLB routes calls (INVITEs) to the SC Pods and out-of-dialog requests (REGISTER, OPTIONS, SUBSCRIBE, etc.) to the RS Pod. The SLB load-balances received SIP requests across the SC and RS Pods based on their reported metrics.

Another critical function provided by the SLB is the seamless support for complex SIP signaling flows, e.g. INVITE with Replaces.

The SLB is deployed in a 1:1 redundancy model.

Deployment / Replicaset

Session Control (SC)


  • isbc
  • oamproxy
  • pvclogger
  • rbbn-telemetry-agent

The SC is the main Call and media processing engine for SIP sessions.

The SC Pods auto-scale based on the current load of existing instances. When the load increases, new instances are created; and when the load reduces, some of the existing instances terminate. You can define the thresholds for scaling in and scaling out in the SC Pod manifest files. The call state information is stored in an external Redis-DB Cache. The SC Pod that assumes the responsibility of a failed Pod retrieves the relevant call state from the Redis-DB Cache.

The SC is deployed in an N:K redundancy model. 

Deployment / Replicaset

Network Services (NS)


  • ns
  • rbbn-cache-proxy
  • oamproxy
  • rbbn-telemetry-agent

The NS Pod manages the floating public IP Address pool of an SC Pod.

Floating IP addresses are used by the SC Pod for external communication. The SC Pod uses floating IP addresses for media/Rx/Rf/DNS interface related communication.

The administrator must ensure the Signaling and Media Port Ranges do not overlap each other.

The NS is deployed in a 1:1 redundancy model. 

Deployment / Replicaset

Role Assignment Controller (RAC)



  • rac
  • oamproxy
  • rbbn-telemetry-agent

The RAC determines and provides the roles (Active or Inactive) to each of the SC Pods and performs takeover actions when it detects that an SC Pod failed. 

The RAC also assigns the role to the NS, HPA, EPU and CS Pods (similar to the role assignment of the SC Pods).  The choices are "Active" or "Inactive".

The RAC is deployed in a 1:1 redundancy model. 

Deployment / Replicaset

Ribbon Horizontal Pod Autoscaler (RHPA)


  • hpa
  • oamproxy
  • rbbn-telemetry-agent

The RHPA handles scale-out and scale-in of SC service instances based on the metrics they report. The RHPA aggregates all received metrics and instantiates or terminates SC instances based on the configured thresholds.

The two objectives of RHPA are:

  1. The solution must dynamically scale in and scale out the elements based on traffic. 
  2. The solution must be highly available and resilient to failures. This means maintaining the N:K ratio (N Active and K Standby) of SC Pods during scale-out and scale-in operations.

The RHPA is deployed in a 1:1 redundancy model. 

Deployment / Replicaset

Common Services  (CS)


  • cs
  • oamproxy
  • pvclogger
  • rbbn-telemetry-agent

The CS Pod handles certain functions that require centralized processing. Examples include address reachability tracking, path checking, and call admission control.

The CS handles any necessary aggregation and shares the aggregated view with all other relevant pods in the cluster. It also implements Call Admission Control on an overall CNe basis.  Each SC Pod queries the CAC service (part of the CS Container) for call admission purposes. This ensures correct overall admission control both in terms of call counts and call rates.

The CS is deployed in a 1:1 redundancy model.



Deployment / Replicaset

End Point Updater  (EPU) 


  • epu
  • rbbn-telemetry-agent
  • oamproxy

The EPU is required only in cases where the customer does not want to use the default (eth0) interface for inter-pod communication.

The EPU service monitors the IP address of the inter-pod communication interface (eth1 or ha0) of all pods launched as part of the SBC CNe and updates the associated Kubernetes endpoints. (The default Kubernetes endpoint element does not support IP discovery for non-eth0 interfaces.)

The EPU is deployed in a 1:1 redundancy model.

Deployment / Replicaset

Operations and Management  (OAM)


  • oam
  • rbbn-telemetry-agent

The OAM is the single point of contact for all SBC CNe configurations.

The OAM exposes a RESTCONF API and CLI through which the configuration is provided and queried. The OAM is also accessible via the SBC Manager in RAMP. The OAM distributes configuration information to relevant pod instances. The OAM also interfaces with RAMP for statistics, alarms, traps, licensing and CDRs.

The OAM is deployed in a 1:1 redundancy model.

Deployment / Replicaset

Redis-DB Cache  (DB)


  • rbbn-cache
  • rbbn-cache-init
  • rbbn-telemetry-agent

The DB Cache stores session data and any other state information. All pods that need to store state or data use the Redis-DB. This includes call state (INVITE sessions), registration (REGISTER) data, and any other state information.

The SC instance that assumes responsibility for a failed pod retrieves the relevant state from the DB Cache.

The Redis-DB Cache Pod is deployed in a 3:3 redundancy model.

Statefulset

Signaling Gateway (SG)


  • sg
  • oamproxy
  • pvclogger
  • rbbn-telemetry-agent

The SG acts as a gateway for all pods that need to communicate with external entities. The DNS queries and the PSX queries from the SC Pod go through the SG Pod.

The communication protocols supported by the SG are:

  • DNS
  • Diameter
  • Diameter+ (SBC to PSX)
  • X2 (for LI)

The SG Pod is deployed in a 1:1 redundancy model. 

Deployment / Replicaset

Register/Relay (RS)


  • rs
  • oamproxy
  • pvclogger
  • rbbn-telemetry-agent

The RS Pod handles SIP registration (the REGISTER method) and other SIP out-of-dialog requests (SUBSCRIBE, OPTIONS, PUBLISH, MESSAGE).

The RS Pod receives REGISTER and other out-of-dialog requests from the SLB. For completed registration sessions, the RS Pod stores the registration data in the Redis-DB Cache.

The RS Pod is deployed in an N:K redundancy model. 

Deployment / Replicaset

DB Cache-Proxy

  • rbbn-cache-proxy
  • rbbn-telemetry-agent

The Cache-Proxy Pod acts like a proxy for all other pods that use the Redis-DB Cache. The Cache-Proxy Pod contains the rbbn-cache-proxy and rbbn-telemetry-agent containers. The Pod uses the K8S eth0 interface to communicate with other pods.

The Cache-Proxy Pod is deployed in an 'N Active pods' model. The recommended value for N is 3.

Deployment / Replicaset


SBC CNe Internal Message Flow

All external signaling packets initially arrive at the SLB Pod, and external media packets arrive at a specific SC Pod (based on the configuration). The SLB forwards call-related SIP messages to the SC Pod, and registration and out-of-dialog messages to the RS Pod. The SLB also applies load balancing logic before forwarding requests to the RS and SC Pods. For INVITE processing, the SC first checks with the CS Pod, which invokes the CAC function, whether the call can be admitted.

To determine the destination peer for routing the call, the SC sends a Routelookup request to the PSX CNe. This request, like all external (non-SIP) communication, travels through the SG Pod. Once the call is answered, the SBC CNe stores the call state information in the Redis-DB Cache and sends the CDRs to the OAM.
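
The admission check that the SC makes against the CS Pod can be pictured as one shared controller that tracks both call counts and call rates. The following is a minimal sketch under stated assumptions: the class, its limits, and the one-second rate window are hypothetical illustrations, not the actual CAC implementation or its configuration parameters.

```python
from collections import deque


class CallAdmissionControl:
    """Hypothetical central admission check on call count and call rate."""

    def __init__(self, max_calls: int, max_calls_per_second: int):
        self.max_calls = max_calls
        self.max_cps = max_calls_per_second
        self.active_calls = 0
        self._recent = deque()  # admission timestamps within the last second

    def admit(self, now: float) -> bool:
        # Drop admissions that are older than one second from the rate window.
        while self._recent and now - self._recent[0] >= 1.0:
            self._recent.popleft()
        if self.active_calls >= self.max_calls:
            return False  # concurrent call limit reached
        if len(self._recent) >= self.max_cps:
            return False  # call rate limit reached
        self.active_calls += 1
        self._recent.append(now)
        return True

    def release(self) -> None:
        # Called when an admitted call ends.
        if self.active_calls > 0:
            self.active_calls -= 1
```

Because every SC Pod would consult the same controller, the limits apply to the deployment as a whole rather than to each pod separately.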

Note
  • The SLB forwards the REGISTER and all other non-INVITE methods to the RS Pod. The only exception is out-of-dialog (OOD) OPTIONS coming from non-registered endpoints (Keep-Alive OPTIONS), which are forwarded to the SC Pod.
  • The SLB forwards the INVITE method to the SC Pod. All in-dialog methods within an INVITE dialog are also forwarded to the SC Pod.
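
The forwarding rules in the note above can be sketched as a small routing function. This is an illustration only: the `SipRequest` fields and function names are hypothetical, and choosing the least-loaded pod is an assumption about how the reported metrics might be used, not the documented SLB algorithm.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SipRequest:
    method: str                            # e.g. "INVITE", "REGISTER", "OPTIONS"
    in_invite_dialog: bool = False         # request belongs to an existing INVITE dialog
    from_registered_endpoint: bool = True  # sender has an active registration


def target_pod_type(req: SipRequest) -> str:
    """Return "SC" or "RS" according to the forwarding rules in the note."""
    if req.method == "INVITE" or req.in_invite_dialog:
        return "SC"
    # Keep-Alive OPTIONS from non-registered endpoints are the one exception.
    if req.method == "OPTIONS" and not req.from_registered_endpoint:
        return "SC"
    return "RS"


def pick_least_loaded(loads: Dict[str, float]) -> str:
    """Assumed policy: choose the pod that reports the lowest load."""
    return min(loads, key=loads.get)
```

For example, `target_pod_type(SipRequest("BYE", in_invite_dialog=True))` selects the SC group, after which `pick_least_loaded` would choose one pod within that group.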

SBC CNe Communication Interfaces

The SBC CNe pods use multiple interfaces for internal and external communication by various microservices:


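For each interface, the entries below give the interface name, bandwidth requirement, pods exposing the interface, IP version supported, CNI, application protocols (accepted by/sent from the interface), transport protocol, and purpose.

Management
  • Interface name: mgt0
  • Bandwidth requirement: 1 Gbps
  • Pods exposing this interface: OAM
  • IP version supported: IPv4 or IPv6
  • CNI: OVS / MacVLAN
  • Application protocol (transport protocol) and purpose:
    • SSH (TCP): Allows the Admin user to log into the OAM directly
    • REST (TLS): OAM communication to RAMP
    • SNMP (UDP): OAM communication to the SNMP server
    • SFTP (TCP): To send CDRs to the CDR server

Packet
  • Interface names: pkt0, pkt1
  • CNI: SR-IOV
  • Pods exposing this interface:
    • SLB (Dual-stack): SIP over UDP, TCP, TLS. All SIP signaling travels through the SLB.
    • SC (Dual-stack): RTP, X3. All types of media (RTP and X3) land on the SC directly, not through the SLB.
    • SG (Dual-stack): DNS, Diameter, Diameter+, X2 over UDP, TCP, TLS. All external messages (except SIP and RTP) go through the SG Pod.

Default interface
  • Interface name: eth0
  • Bandwidth requirement: 10 Gbps
  • Pods exposing this interface: ALL
  • IP version supported: IPv4 or IPv6
  • CNI: OVS / OVN
  • Application protocols: ZMQ, gRPC
  • Transport protocol: TCP
  • Purpose: Default Kubernetes control plane interface. Inter-pod communication and observability.
    • ZMQ: SIP PDU transfer
    • gRPC: Application control related messaging

Inter-pod
  • Interface name: ha0
  • Bandwidth requirement: 10 Gbps
  • Pods exposing this interface: ALL
  • IP version supported: IPv4 or IPv6
  • CNI: OVS / MacVLAN
  • Application protocols: ZMQ, gRPC
  • Transport protocol: TCP
  • Purpose: Inter-pod communication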


SBC CNe Resiliency

Redundancy Support

Pod Name: Redundancy Model

  • CS: 1 (Active) : 1 (Standby)
  • DB Cache-Proxy: N (Active)
  • EPU: 1 (Active) : 1 (Standby)
  • NS: 1 (Active) : 1 (Standby)
  • OAM: 1 (Active) : 1 (Standby)
  • RAC: 1 (Active) : 1 (Standby)
  • Redis-DB Cache: 3 (Active) : 3 (Standby)
  • Ribbon HPA: 1 (Active) : 1 (Standby)
  • RS: N (Active) : K (Standby)
  • SC: N (Active) : K (Standby)
  • SG: 1 (Active) : 1 (Standby)
  • SLB: 1 (Active) : 1 (Standby)


All SBC CNe pods provide at least 1:1 (Active: Standby) redundancy. The functional pods that handle Call & Media processing (SC Pod) support N:K (N Actives and K Standbys, where K is much less than N) redundancy. For pods that use 1:1 redundancy, the state information is typically stored directly within the application context. In contrast, the pods that support the N:K redundancy model store their state information (mostly call data) in a high-performance, centralized Redis-DB database. The Redis-DB itself is HA supported by a 3:3 redundancy model. 

For pods that support the 1:1 model, the Standby pod takes over upon failure of the Active pod, becomes Active, and starts providing the function. For pods in the N:K redundancy model, when an Active pod fails, one of the Standby pods is triggered to switch to Active. The selected pod then fetches the call state information stored in the Redis-DB by the now-failed pod and finally publishes itself as Active.
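
The N:K takeover sequence described above can be sketched as follows. This is a simplified illustration under stated assumptions: plain dictionaries stand in for the role table and the Redis-DB Cache, and the function and pod names are hypothetical.

```python
from typing import Dict, List, Optional


def fail_over(failed_pod: str,
              roles: Dict[str, str],
              call_store: Dict[str, List[str]]) -> Optional[str]:
    """Promote one Standby pod to replace `failed_pod`.

    Returns the promoted pod name, or None if no Standby pod is available.
    """
    standby = next(
        (pod for pod, role in roles.items()
         if role == "standby" and pod != failed_pod),
        None,
    )
    roles[failed_pod] = "failed"
    if standby is None:
        return None
    # Step 1: fetch the call state that the failed pod stored centrally.
    recovered = call_store.pop(failed_pod, [])
    call_store.setdefault(standby, []).extend(recovered)
    # Step 2: only then publish the selected pod as Active.
    roles[standby] = "active"
    return standby
```

The ordering matters in this sketch: the selected pod becomes Active only after it holds the recovered state, which mirrors the sequence in the paragraph above.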

  • DB Cache Pod: This microservice supports a 3:3 redundancy model (i.e., 3 active and 3 standby). The DB Cache is a sharded database with 3 shards and 1:1 redundancy for each shard.
  • SC Pod: The SC microservice supports an N:K redundancy model, where 'N' is the number of Active pods and 'K' is the number of Standby pods. The SBC CNe uses the following CNF Helm Chart parameters for SC Pod instantiation according to the N:K model. Refer to the installation procedure section for more information about Helm parameters. If you need to change any of these parameters, perform a Helm update.
    • Minimum number of Active SC Pods (default = 1)
    • Maximum number of Active SC Pods (default = 1, max = 47)
    • Maximum number of Standby SC Pods (default = 1, max = 3)
    • Scaling factor for Standby SC Pods (range: 1 to 100, default = 1)
    • Threshold for auto-scaling (range: 20% to 90%, default = 70%)
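
Before performing a Helm update, it can be useful to check planned values against the limits listed above. The sketch below is illustrative only: the field names are hypothetical and are not the actual Helm chart keys, and the two checks marked as assumptions are not stated in this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScScalingParams:
    """Hypothetical holder for the SC Pod N:K parameters and their defaults."""
    min_active: int = 1
    max_active: int = 1
    max_standby: int = 1
    standby_scaling_factor: int = 1
    autoscale_threshold_pct: int = 70

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the values are in range."""
        errors = []
        if self.min_active < 1:  # assumption: at least one Active pod
            errors.append("min_active must be at least 1")
        if self.max_active > 47:
            errors.append("max_active must not exceed 47")
        if self.min_active > self.max_active:  # assumption: min <= max
            errors.append("min_active must not exceed max_active")
        if self.max_standby > 3:
            errors.append("max_standby must not exceed 3")
        if not 1 <= self.standby_scaling_factor <= 100:
            errors.append("standby_scaling_factor must be between 1 and 100")
        if not 20 <= self.autoscale_threshold_pct <= 90:
            errors.append("autoscale_threshold_pct must be between 20 and 90")
        return errors
```

For example, `ScScalingParams(max_active=48).validate()` reports that the maximum number of Active SC Pods is exceeded, while the defaults validate cleanly.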

Auto-Scaling 

For the SBC CNe, the Session Control (SC) Pods support the auto-scale feature. Auto-scaling does not use the K8S auto-scaling functionality; instead, it uses the Ribbon Horizontal Pod Autoscaler (RHPA), because deciding the scale-out or scale-in points requires considering several aspects together: CPU usage, bandwidth usage, and the number of sessions (calls and registrations) in the system. All SC Pods periodically report their usage to the RHPA, which calculates a utilization metric from these values. The metrics reported by each SC Pod are aggregated and compared with the configured threshold values to make an auto-scaling decision. The default threshold value is 70%.


In every period, the RHPA calculates the average load on the SBC CNF. If the average load is greater than the configured scale-out threshold, the RHPA triggers a scale-out. The RAC (Role Assignment Controller) assigns the correct role (Active or Standby) to the newly spawned SC Pod, and the RHPA ensures the correct ratio of Active to Standby SC Pods. The RHPA continues to monitor usage and calculate the average load. When the average load falls below the configured scale-in threshold, the RHPA starts a scale-in by terminating SC Pods.
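
The periodic decision described above can be sketched as follows. This is a minimal illustration, not the RHPA implementation: the class and function names are hypothetical, and taking the busiest of the three resources as a pod's utilization is an assumption, since this document does not state how CPU, session, and bandwidth usage are combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScUsage:
    """Usage report from one Active SC Pod, each value in percent."""
    cpu_pct: float
    session_pct: float
    bandwidth_pct: float

    def utilization(self) -> float:
        # Assumption: the busiest resource drives the pod's utilization.
        return max(self.cpu_pct, self.session_pct, self.bandwidth_pct)


def scaling_decision(reports: List[ScUsage],
                     scale_out_pct: float,
                     scale_in_pct: float,
                     min_active: int,
                     max_active: int) -> str:
    """Return "scale-out", "scale-in", or "hold" for one evaluation period."""
    if not reports:
        return "hold"
    active = len(reports)
    average = sum(r.utilization() for r in reports) / active
    if average > scale_out_pct and active < max_active:
        return "scale-out"
    if average < scale_in_pct and active > min_active:
        return "scale-in"
    return "hold"
```

Keeping the scale-in threshold well below the scale-out threshold leaves a band in which the sketch returns "hold", which avoids repeatedly adding and removing pods when the load hovers near a single value.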