This document captures the prerequisites for the Ribbon CNF Products (SBC, PSX, RAMP), including hardware, software, and related add-ons.
The explicit versions captured here are the ones qualified as part of our CNF solution testing. In general, the Ribbon CNF solution should also work with later releases.

Item | Version/Type | Additional Information
---|---|---
Argo Rollout | 1.1 or above | Only if Canary upgrade is performed using Argo Rollout
FluxCD | | Only if GitOps-based installation is performed
Helm | 3 or above |
Kubernetes (K8S) | 1.23 or above |
Linux Kernel | 4.18 or 5.14 | The Linux kernel installed on the worker nodes configured as part of the Kubernetes cluster
OpenShift Container Platform | 4.8 or above |
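To sanity-check installed tool versions against these minimums, a small version-compare helper can be used. This is a sketch only; the version strings shown are illustrative, not taken from any particular deployment:

```shell
# ver_ge ACTUAL MINIMUM -> succeeds if ACTUAL >= MINIMUM (GNU sort -V ordering).
ver_ge() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Illustrative values; in practice feed in the output of e.g.
# `kubectl version`, `helm version --short`, or `uname -r`.
ver_ge "1.26.3" "1.23" && echo "Kubernetes 1.26.3 meets the 1.23 minimum"
ver_ge "3.12.0" "3"    && echo "Helm 3.12.0 meets the Helm 3 minimum"
```

The same helper works for the kernel version reported by `uname -r` after stripping any distribution suffix.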
Item | Details |
---|---|
NIC | Intel® Ethernet Network Adapter X710 |
Processor | If an AMD processor is used, AMD processor-specific tuning must be done; the parameters and recommended settings are as follows. For Intel processors, the BIOS settings are as follows. |
Storage | Storage Classes: |
Interface | Network Types | Minimum Bandwidth | Additional Information |
---|---|---|---|
mgt0 | macvlan, ovs, sriov | 1 Gbps | Management communication |
ha0 | macvlan, ovs with whereabouts | 10 Gbps | Inter-Pod communication |
pkt0 & pkt1 | sriov | 10 Gbps | Signaling and Media packets |
Interface | Network Types | Minimum Bandwidth | Additional Information |
---|---|---|---|
mgt0 | macvlan with whereabouts | 1 Gbps | Management communication with RAMP |
eth1 | macvlan with whereabouts | 10 Gbps | D+ traffic, DBaaS communication |
Interface | Network Types | Minimum Bandwidth | Additional Information |
---|---|---|---|
eth1 | macvlan | 1 Gbps | Northbound and southbound interface |
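A quick host-side way to confirm that a named interface exists and is up is to read its state from sysfs. This is a sketch; the interface names in the tables above are deployment-specific, so `mgt0` here is a hypothetical example and `lo` is used as a stand-in that exists on any Linux host:

```shell
# check_if NAME -> report whether interface NAME exists and its operational state.
check_if() {
  if [ -d "/sys/class/net/$1" ]; then
    echo "$1: $(cat "/sys/class/net/$1/operstate")"
  else
    echo "$1: not present"
  fi
}

check_if lo      # loopback, present on any Linux host
check_if mgt0    # hypothetical pod interface name from the tables above
```

Link speed (for the minimum-bandwidth requirements) can be checked separately with `ethtool <interface>` on the worker node.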
Configuration Item | Requirement/Usecase | How to check if configured/set/enabled |
---|---|---|
Hugepages | Real-time processing of media (RTP) packets requires faster memory access. A huge page size of 1Gi is required. | `kubectl describe node <node name>`; `cat /proc/cmdline` (expect `default_hugepagesz=1G hugepagesz=1G hugepages=128`); `hugeadm --pool-list`; `cat /sys/fs/cgroup/hugetlb/kubepods.slice/hugetlb.1GB.limit_in_bytes` |
CPU Manager | Real-time processing of signaling and media (RTP) packets with low latency and better performance requires dedicated, isolated CPU cores with fewer context switches. | The CPU Manager policy should be set to static: `cpuManagerPolicy: static` |
SR-IOV | For high throughput of signaling and media packets, which require dedicated bandwidth. Driver requirements: | |
Multus CNI | In Kubernetes, each pod has only one network interface (apart from loopback). Voice traffic handling requires dedicated network interfaces for signaling and media traffic. The Multus CNI plugin enables attaching multiple network interfaces to Pods; Multus acts as a meta-plugin (a CNI plugin that can call multiple other CNI plugins). | Check whether the Multus conf file exists. Check that the Multus binary is present under: |
NUMA | If there are multiple NUMA nodes, all resources for a given pod (CPU, memory, and NICs) should come from the same NUMA node. | In the Performance Profile: `numa: topologyPolicy: single-numa-node` |
Real-Time Scheduling | To allow assigning real-time scheduling (SCHED_FIFO) to SWe_NP threads inside the pod, the following host setting is required. Note: this setting is not persistent across reboots; the bash command must be included in one of the host initialization scripts. | `cat /proc/sys/kernel/sched_rt_runtime_us` |
Kernel Same-Page Merging | Kernel Same-page Merging (KSM) finds common memory pages on a Linux system and merges them to save memory; if one of the copies is updated, a new copy is created, so the function is transparent to the processes on the system. For hypervisors, KSM is highly beneficial when multiple guests run the same level of the operating system, but the scanning process adds overhead that can slow the application, which is not desirable here. To turn off KSM: | |
Role and Role Binding | Privileges are required to edit/patch/view the resources like endpoint/service (epu for updating the ha0 IP), deployment (hpa, to scale the deployment count), pod (to fetch ha0 IP from annotations), and so forth. | |
PVC | The SBC CNF requires creating PVCs in both RWX (ReadWriteMany) and RWO (ReadWriteOnce) modes. It must be possible to create a minimum of 15 PVCs. (Storage size depends on the type, between 100 MB and 20 GB.) | |
CPU Frequency | The CPU frequency setting determines the operating clock speed of the processor and, in turn, system performance. Red Hat offers a set of built-in tuning profiles and a tool, tuned-adm, to configure the required profile. Applying the "throughput-performance" tuning profile is recommended, allowing the processor to operate at maximum frequency. This configuration is persistent across reboots and takes effect immediately; there is no need to reboot the host after configuring the profile. | Determine the active tuning profile (expect: Current active profile: throughput-performance) |
Container Privileges | Some SBC CNF Pods/Containers need root privileges (e.g., SC container). All of the containers run in privileged mode. | |
automountServiceAccountToken | Ribbon SBC CNF containers must use Kubernetes API resources from the container application to support Horizontal Pod Autoscaling and Inter-Pod communication using the eth1 interface. This is needed for most of the Pods and requires "automountServiceAccountToken" to be enabled. | |
Argo Operator | If progressive update using Argo is required, the Argo Operator must be instantiated. | |
Coredump Handler Requirements | This set of requirements is not mandatory; it is needed only if the customer deploys the Ribbon Coredump Handler tool, which collects core dump files from crashed containers. | |
Shielding CPU Cores from Interrupts | Isolating interrupts (IRQs) from real-time workloads such as SC and SLB onto dedicated CPUs (host-reserved cores) can minimize or eliminate latency in real-time environments. Two approaches are described below. | |

Approach 1

On the worker nodes, set globallyDisableIrqLoadBalancing in the PerformanceProfile:

    apiVersion: performance.openshift.io/v2
    kind: PerformanceProfile
    metadata:
      name: manual
    spec:
      globallyDisableIrqLoadBalancing: true

Approach 2

This approach applies to the OCP K8S environment. For certain workloads, the host-reserved CPUs are not always sufficient for dealing with device interrupts, and for this reason, device interrupts are not globally disabled on the isolated CPUs. Device interrupts are load-balanced between all isolated and reserved CPUs to avoid overloading CPUs, except for CPUs with a guaranteed pod running. Guaranteed pod CPUs are prevented from processing device interrupts when the pod annotation irq-load-balancing.crio.io is set to "disabled". When configured, CRI-O disables device interrupts only while the pod is running; the corresponding update becomes visible after the latency-sensitive workload pod has been scheduled. From the SBC CNF Helm chart perspective, the following setting needs to be configured:

Additional worker node configuration for Approach 2

Because the SC pods are dynamically scalable entities based on the traffic subjected to the SBC CNF cluster, SC pods are frequently created and destroyed, resulting in frequent restarts of the irqbalance service on the worker node. By default, the system allows five restarts (StartLimitBurst) in 10 seconds (StartLimitIntervalSec), which is not sufficient on certain scaling occasions, especially during the initial scale-out of the SC deployment to the minimum active SC pods immediately after Helm installation. Therefore, the irqbalance service configuration file must be updated to raise the restart limit. A sample configuration:

    [Unit]
    Description=irqbalance daemon
    ConditionVirtualization=!container

    [Service]
    EnvironmentFile=/etc/sysconfig/irqbalance
    ExecStart=/usr/sbin/irqbalance --foreground $IRQBALANCE_ARGS
    StartLimitBurst=60    <-- New parameter

    [Install]
    WantedBy=multi-user.target

After modifying the irqbalance.service unit file, reload systemd and then restart the service for the changes to take effect.

For more information, refer to: https://docs.openshift.com/container-platform/4.15/scalability_and_performance/cnf-low-latency-tuning.html
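Several of the host-level requirements above (1G hugepages, static CPU Manager policy, unthrottled real-time runtime, KSM off) can be spot-checked from a worker node with a sketch like the following. The kubelet config path and sysfs locations are conventional defaults, assumed here, and may differ on your distribution:

```shell
#!/bin/sh
# Read-only spot checks for the host prerequisites above (assumed standard paths).

# 1G hugepages on the kernel command line
if grep -q 'hugepagesz=1G' /proc/cmdline; then
  echo "hugepages: 1G configured"
else
  echo "hugepages: 1G NOT configured"
fi

# CPU Manager policy (kubelet config path is an assumption)
KUBELET_CONFIG="${KUBELET_CONFIG:-/var/lib/kubelet/config.yaml}"
if grep -q 'cpuManagerPolicy: *static' "$KUBELET_CONFIG" 2>/dev/null; then
  echo "cpuManager: static"
else
  echo "cpuManager: not static (or kubelet config not found)"
fi

# Real-time throttling: -1 allows unlimited SCHED_FIFO runtime
if [ "$(cat /proc/sys/kernel/sched_rt_runtime_us)" = "-1" ]; then
  echo "rt: throttling disabled"
else
  echo "rt: throttling still enabled"
fi

# KSM scanner state (0 = stopped)
if [ -r /sys/kernel/mm/ksm/run ] && [ "$(cat /sys/kernel/mm/ksm/run)" = "0" ]; then
  echo "ksm: off"
else
  echo "ksm: running or not present"
fi
```

Each check prints a one-line verdict, so the script can be run ad hoc or wired into node validation tooling.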
Some SBC CNF Pods require Linux capabilities for specific functions.

Capability/securityContext | Additional Information
---|---
See SBC Kernel Parameters for the complete list of SBC Kernel Parameters.
Configuration Item | Requirement/Usecase | How to check if configured/set/enabled | Additional Details |
---|---|---|---|
Hugepages | A huge page size of 1Gi is required for faster memory access. | `kubectl describe node <node name>`; `cat /proc/cmdline` (expect `default_hugepagesz=1G hugepagesz=1G hugepages=128`); `hugeadm --pool-list`; `cat /sys/fs/cgroup/hugetlb/kubepods.slice/hugetlb.1GB.limit_in_bytes` | |
Multus CNI | PSXs use non-eth0 interfaces to communicate among themselves for DB sync/replication and with RAMP for registration; Multus enables this. | Check whether the Multus conf file exists. Check that the Multus binary is present under: | |
Role and Role Binding | Privileges are required to edit/patch/view the resources like endpoint/service (epu for updating the ha0 IP), deployment (hpa, to scale the deployment count), pod (to fetch ha0 IP from annotations), and so forth. | ||
PVC | The PSX CNF requires the ability to create PVCs in both RWX (ReadWriteMany) and RWO (ReadWriteOnce) modes. | ||
Container Privileges | Some PSX CNF Pods/Containers need root privileges (Primary and replica Pods). | | |
Some of the PSX Pods require Linux capabilities for specific functions. The following is the complete list of Capabilities required:
Capability/securityContext | Additional Information |
---|---|
allowPrivilegeEscalation: true | |
There are no specific kernel parameters that need to be tuned for PSX.
Configuration Item | Requirement/Usecase | How to check if configured/set/enabled |
---|---|---|
Multus CNI | RAMP uses non-eth0 interfaces for its northbound and southbound communication; Multus enables this. The Multus CNI plugin enables attaching multiple network interfaces to Pods; Multus acts as a meta-plugin (a CNI plugin that can call multiple other CNI plugins). | Check whether the Multus conf file exists. Check that the Multus binary is present under: |
Role and Role Binding | Privileges are required to edit/patch/view the resources like endpoint/service (epu for updating the ha0 IP), deployment (hpa, to scale the deployment count), pod (to fetch ha0 IP from annotations), and so forth. | |
PVC | The RAMP CNF requires the ability to create PVCs in both RWX (ReadWriteMany) and RWO (ReadWriteOnce) modes. | |
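The Multus checks in the tables above leave the exact file locations unspecified; the conventional CNI paths are `/etc/cni/net.d` for configuration and `/opt/cni/bin` for plugin binaries. A hedged sketch using those assumed defaults (override the variables for your distribution):

```shell
# Conventional CNI locations (assumptions; the exact paths are distro-specific):
CNI_CONF_DIR="${CNI_CONF_DIR:-/etc/cni/net.d}"
CNI_BIN_DIR="${CNI_BIN_DIR:-/opt/cni/bin}"

if ls "$CNI_CONF_DIR" 2>/dev/null | grep -qi multus; then
  echo "Multus CNI config found in $CNI_CONF_DIR"
else
  echo "Multus CNI config not found in $CNI_CONF_DIR"
fi

if [ -x "$CNI_BIN_DIR/multus" ]; then
  echo "Multus binary present in $CNI_BIN_DIR"
else
  echo "Multus binary not present in $CNI_BIN_DIR"
fi
```

Run this on each worker node; both checks must pass for Multus-attached interfaces to come up.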
Some of the RAMP Pods require Linux capabilities for specific functions. The following is the complete list of capabilities required:

Capability/securityContext | Additional Information
---|---
Parameter | Values |
---|---|
kernel.pid_max | >= 4096 |
fs.inotify.max_user_instances | 8192 |
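These kernel parameters can be verified by reading `/proc` directly; a sketch (use `sysctl -w`, or a drop-in under `/etc/sysctl.d/`, to change values persistently):

```shell
# Read current values via /proc and compare against the RAMP minimums.
pid_max=$(cat /proc/sys/kernel/pid_max)
inotify=$(cat /proc/sys/fs/inotify/max_user_instances)

if [ "$pid_max" -ge 4096 ]; then
  echo "kernel.pid_max OK ($pid_max)"
else
  echo "kernel.pid_max too low ($pid_max); set: sysctl -w kernel.pid_max=4096"
fi

if [ "$inotify" -ge 8192 ]; then
  echo "fs.inotify.max_user_instances OK ($inotify)"
else
  echo "raise it: sysctl -w fs.inotify.max_user_instances=8192"
fi
```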