WARNING

Any SBC SWe configuration beyond 32 vCPUs, other than in a D-SBC deployment, is invalid and not supported.

This document captures the prerequisites for the Ribbon CNF Products (SBC, PSX, RAMP), including hardware, software, and related add-ons.


Software Requirements

Item | Version/Type | Additional Information
Argo Rollout | 1.1 or above | Only if a Canary upgrade is performed using Argo Rollout
FluxCD | v2 (release v0.31.4) | Only if a GitOps-based installation is performed
Helm | 3 or above |
Kubernetes (K8S) | 1.23 or above |
Linux Kernel | 4.18 or 5.14 | The Linux kernel installed on the worker nodes configured as part of the Kubernetes cluster
OpenShift Container Platform | 4.8 or above |
Note

The explicit versions captured here are the ones qualified as part of our CNF solution testing. Generally, the Ribbon CNF solution should also work with later releases.

Hardware Requirements

Item | Details
NIC

Intel® Ethernet Network Adapter X710
OR
Intel® Ethernet Network Adapter X550
OR
Intel® Ethernet Network Adapter E810
OR
Mellanox ConnectX-5 or ConnectX-6

Processor

If an AMD processor is used, AMD processor-specific tuning must be performed. The parameters and recommended settings are as follows:

  • AMD Core Performance Boost: Enable
    Enables the processor to transition to a higher frequency than its rated speed if it has available power and is within its temperature specifications.
  • AMD Fmax Boost Limit Control: Auto
    Sets the maximum processor boost frequency. Auto will allow the processor to run at the highest possible boost frequencies.
  • AMD I/O Virtualization Technology: Enable
    Enables capabilities provided by AMD I/O Virtualization (IOMMU) functionality.
  • AMD SMT Option: Enable
    Enables Multi-Threading. When enabled, each physical processor core operates as two logical processor cores.
  • Page Table Entry Speculative Lock Scheduling: Enable
    Disabling this feature impacts performance.
  • Processor x2APIC Support: Auto
    This parameter enables operating systems to run more efficiently on high core count configurations. It also optimizes interrupt distribution in virtualized environments. Setting this option to Auto configures the OS to enable this feature when the logical core count is equal to or greater than 255 and disables it if it is less than 255. 
  • SR-IOV: Enable
  • Power Regulator: Static High-Performance Mode
  • HW Prefetcher: Disable
  • Minimum Processor Idle Power Core C-State: No C-States
  • Data Fabric C-State Enable: Disable
  • NUMA Memory Domains per Socket: One memory domain per socket.
    This is equivalent to disabling Sub-NUMA Clustering.

For Intel processors, the BIOS settings are as follows:

  • CPU Power Management Power Regulator: Maximum Performance or Static High Performance
  • Intel Hyper-Threading: Enabled
  • Intel Turbo Boost: Enabled
  • Intel VT-x (Virtualization Technology): Enabled
  • Thermal Configuration: Optimal Cooling or Maximum Cooling
  • Minimum Processor Idle Power Core C-State: No C-states
  • Minimum Processor Idle Power Package C-State: No C-states
  • Energy Performance BIAS: Max Performance
  • Sub-NUMA Clustering: Disabled
  • HW Prefetcher: Disabled
  • SRIOV: Enabled
  • Intel® VT-d: Enabled
Storage

Storage Classes:

  • block
  • file

Network Interface Requirements

SBC

Interface | Network Types | Minimum Bandwidth | Additional Information
mgt0 | macvlan, ovs, sriov | 1 Gbps | Management communication
ha0 | macvlan, ovs with whereabouts | 10 Gbps | Inter-Pod communication
pkt0 & pkt1 | sriov | 10 Gbps | Signaling and media packets

PSX

Interface | Network Types | Minimum Bandwidth | Additional Information
mgt0 | macvlan with whereabouts | 1 Gbps | Management communication with RAMP
eth1 | macvlan with whereabouts | 10 Gbps | D+ traffic, dbaas communication

RAMP

Interface | Network Types | Minimum Bandwidth | Additional Information
eth1 | macvlan | 1 Gbps | Northbound and southbound interface

SBC Specific Requirements

Cluster and Node Level Settings

Configuration Item | Requirement/Use case | How to check if configured/set/enabled
Hugepages

Real-time processing of media (RTP) packets requires faster memory access.

A huge page size of 1Gi is required.


$kubectl describe node <node name>

$cat /proc/cmdline
default_hugepagesz=1G hugepagesz=1G hugepages=128 +

#hugeadm --pool-list

$cat /sys/fs/cgroup/hugetlb/kubepods.slice/hugetlb.1GB.limit_in_bytes
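
As an illustration, a workload container would request the pre-allocated 1Gi hugepages through its resource spec. The following is a minimal sketch only; the pod name, image, and sizes are placeholders, not values from the SBC charts:

apiVersion: v1
kind: Pod
metadata:
  name: hugepages-demo                        # hypothetical pod, for illustration only
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest    # placeholder image
    resources:
      requests:
        memory: "2Gi"
        hugepages-1Gi: "2Gi"
      limits:
        memory: "2Gi"
        hugepages-1Gi: "2Gi"
    volumeMounts:
    - name: hugepage
      mountPath: /hugepages
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages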
CPU Manager

Real-time processing of signaling and media (RTP) packets with low latency and better performance requires dedicated, isolated CPU cores with fewer context switches.

CPU Manager Policy should be set to 'static'

cpuManagerPolicy: static

$cat /var/lib/kubelet/cpu_manager_state

{"policyName":"static","defaultCpuSet":"0-3","checksum":611748604}

SR-IOV

Required for high throughput of signaling and media packets, which need dedicated bandwidth.

Driver Requirements

  • vfio_pci driver:
    Intel® Ethernet Network Adapter X710
    Intel® Ethernet Network Adapter X550
    Intel® Ethernet Network Adapter E810
  • mlx5_core driver (default driver):
    Mellanox ConnectX-5
    Mellanox ConnectX-6
    Note: No extra drivers are required for the Mellanox adapters.

An illustrative SR-IOV node policy sketch follows the verification steps below.
  1. Verify whether the Linux kernel running on the nodes supports SR-IOV.
    # grep -i sriov /boot/config-$(uname -r)

  2. Check the interfaces for SR-IOV support.
    $lspci | grep Eth
    $lspci -s <specific ethernet controller> -vnnn

  3. Check whether the SR-IOV CNI/operator/plugins are installed and running in the cluster.
    $kubectl get pods -A | grep sriov

  4. Check SR-IOV support at the node level.
    $kubectl describe node <node name>

  5. Check at the cluster level (for all nodes).
    $kubectl get nodes -o json | jq '.items[].status.allocatable'
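
As an illustration, on OCP the SR-IOV Network Operator exposes VFs through a node policy similar to the following sketch; the policy name, resource name, PF name, and VF count are placeholders:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: sbc-pkt0-policy                       # hypothetical policy name
  namespace: openshift-sriov-network-operator
spec:
  resourceName: pkt0_vfs                      # hypothetical resource name
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 8                                   # example VF count
  nicSelector:
    pfNames: ["ens2f0"]                       # placeholder; use the actual PF name
  deviceType: vfio-pci                        # Mellanox NICs keep the default mlx5_core driver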

Multus CNI

In Kubernetes, each pod has only one network interface (apart from loopback).
Voice traffic handling requires dedicated network interfaces to handle signaling and media traffic.
The Multus CNI plugin enables attaching multiple network interfaces to Pods. Multus acts as a meta-plugin (a CNI plugin that can call multiple other CNI plugins).

$kubectl get pods --all-namespaces | grep -i multus

Check whether the Multus conf file exists:
/etc/cni/net.d/00-multus.conf 

Check the Multus binary present under:
/opt/cni/bin
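
For illustration, a macvlan attachment with whereabouts IPAM (as used for ha0) could be defined with a NetworkAttachmentDefinition similar to the following sketch; the attachment name, master interface, and address range are placeholders:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ha0-net                               # hypothetical attachment name
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "ens3",
      "mode": "bridge",
      "ipam": {
        "type": "whereabouts",
        "range": "10.10.10.0/24"
      }
    }'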

NUMA

If there are multiple NUMA nodes, all resources for a given pod (CPU, memory, and NICs) should come from the same NUMA node.

In the Performance Profile:
numa:
  topologyPolicy: single-numa-node

#oc get performanceprofiles.performance.openshift.io

#oc describe performanceprofiles.performance.openshift.io <profile name>
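
A minimal sketch of the relevant portion of such a Performance Profile (the isolated and reserved CPU ranges are examples only):

apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: manual
spec:
  cpu:
    isolated: "4-47"                          # example isolated cores
    reserved: "0-3"                           # example host reserved cores
  numa:
    topologyPolicy: single-numa-node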

Real-Time Scheduling

To allow assigning real-time scheduling (SCHED_FIFO) to SWe_NP threads inside the pod, the following host setting is required:

echo -1 > /proc/sys/kernel/sched_rt_runtime_us

This setting removes the limit on the CPU bandwidth available to real-time threads.
The default value of /proc/sys/kernel/sched_rt_runtime_us is 950000.

Note

This setting is not persistent across reboots. The above command must be included in one of the host initialization scripts.

cat /proc/sys/kernel/sched_rt_runtime_us    
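
One possible way to persist the setting, assuming the worker node honours sysctl.d drop-in files (on OCP this is typically delivered through a MachineConfig or Tuned profile instead):

# /etc/sysctl.d/99-rt-runtime.conf   (hypothetical file name)
kernel.sched_rt_runtime_us = -1

Apply it immediately with sysctl --system.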
Kernel Same-Page Merging

Kernel Same-page Merging (KSM) is a technology that finds common memory pages inside a Linux system and merges them to save memory. If one of the copies is updated, a new copy is created, so the function is transparent to the processes on the system. KSM is highly beneficial for hypervisors when multiple guests run with the same operating system level. However, there is overhead due to the scanning process, which may cause applications to run slower.

To turn off KSM:

#systemctl disable ksm

#systemctl disable ksmtuned


Role and Role Binding

Privileges are required to edit/patch/view resources such as endpoints/services (epu, for updating the ha0 IP), deployments (hpa, to scale the deployment count), pods (to fetch the ha0 IP from annotations), and so forth.

Example
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: {{ .Values.global.namespace }}
  name: {{ .Release.Name }}-calculator-role

rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "patch", "delete"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get"]
- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["roles"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "list", "patch"]
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get", "list"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]
  verbs: ["get", "list"]
- apiGroups: ["apps"]
  resources: ["deployments/scale", "statefulsets/scale"]
  verbs: ["get", "patch", "update"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "patch"]
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods"]
  verbs: ["get", "list"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ .Release.Name }}-calculator-role-binding
  namespace: {{ .Values.global.namespace }}
subjects:
- kind: ServiceAccount
  name: {{ .Values.global.serviceAccount.name }}
  namespace: {{ .Values.global.namespace }}
roleRef:
  kind: Role
  name: {{ .Release.Name }}-calculator-role
  apiGroup: rbac.authorization.k8s.io


PVC

The SBC CNF requires creating PVCs in both RWX (ReadWriteMany) and RWO (ReadWriteOnce) modes.

A minimum of 15 PVCs can be created. (Storage size depends on the type, between 100 MB and 20 GB.)
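
For illustration, a minimal PVC request in RWX mode (the claim name, storage class, and size are placeholders):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sbc-shared-pvc                        # hypothetical claim name
spec:
  accessModes:
    - ReadWriteMany                           # use ReadWriteOnce for RWO volumes
  storageClassName: file                      # a file-backed class is typically needed for RWX
  resources:
    requests:
      storage: 1Gi                            # example size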


CPU Frequency

The CPU frequency setting determines the operating clock speed of the processor and, in turn, the system performance. Red Hat offers a set of built-in tuning profiles and a tool called tuned-adm that helps configure the required tuning profile.

Applying the "throughput-performance" profile is recommended, allowing the processor to operate at maximum frequency.


Apply the 'throughput-performance' tuning profile:

#tuned-adm profile throughput-performance

This configuration is persistent across reboots and takes effect immediately. There is no need to reboot the host after configuring the profile.

Determine the Active tuning profile:

#tuned-adm active

Current active profile: throughput-performance

Container Privileges

Some SBC CNF Pods/Containers need root privileges (e.g., SC container).  

All of the containers run in privileged mode.


autoMountServiceAccountToken

Ribbon SBC CNF containers must use Kubernetes API resources from the container application to support Horizontal Pod Autoscaling and inter-Pod communication using the eth1 interface. This is needed for most of the Pods.

This requires the "autoMountServiceAccountToken" to be enabled.
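
A minimal sketch of enabling the token mount on the service account (the underlying Kubernetes field is automountServiceAccountToken; the account name and namespace below are placeholders, and the setting can also be made per pod via spec.automountServiceAccountToken):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: sbc-serviceaccount                    # hypothetical account name
  namespace: sbc                              # hypothetical namespace
automountServiceAccountToken: true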


Argo Operator

If a progressive update using Argo is required, the Argo Operator must be instantiated.


Coredump Handler Requirements

This set of requirements is not mandatory. It is needed only if the customer deploys the Ribbon Coredump Handler tool, which collects core dump files from crashed containers.

  • Read/Write "Bind Mount" permissions are required for the host path where core dumps are stored (see the sketch after this list).
  • "SysCtl" is required to set the core pattern on the host.

Shielding CPU Cores from the Interrupts

Isolating interrupts (IRQs) from real-time workloads like SC and SLB onto separate dedicated CPUs (host reserved cores) can minimize or eliminate latency in real-time environments.

Approach 1

  • On the OpenShift (OCP) Kubernetes platform:

Set "globallyDisableIrqLoadBalancing" in the performance profile to "true" to shield the isolated cores from IRQs. 

apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: manual
spec:
  globallyDisableIrqLoadBalancing: true 
  • On non-OCP Kubernetes platforms:

On the worker nodes, the /etc/sysconfig/irqbalance file must be updated to set either the IRQBALANCE_BANNED_CPULIST or the IRQBALANCE_BANNED_CPUS parameter (depending on the version of the irqbalance service) with the CPUs that must be banned from IRQ servicing. A sample file is sketched after the list below.

  1. All CPUs that are part of the isolcpus configuration on the kernel (GRUB) command line must be excluded from IRQ servicing through this configuration, because the set of CPUs the container workload will use is not known in advance.
  2. After updating the above configuration, restart the irqbalance service using systemctl restart irqbalance.service.
  3. Take extra care when allocating host reserved cores; a sufficient number of CPUs should be allocated for the host processes (i.e., host reserved cores) in this scenario, as the IRQs will land only on the host reserved cores, leading to an increase in their CPU utilization.
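
A sketch of the resulting file, assuming for illustration that cores 2-15 form the isolcpus range to be banned:

# /etc/sysconfig/irqbalance  (example values only)
IRQBALANCE_BANNED_CPULIST=2-15
# Older irqbalance releases expect a hexadecimal CPU mask instead:
# IRQBALANCE_BANNED_CPUS=0000fffc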

Approach 2

This approach applies to the OCP K8S environment. For certain workloads, the host reserved CPUs are not always sufficient for dealing with device interrupts, and for this reason, device interrupts are not globally disabled on the isolated CPUs.

Device interrupts are load-balanced between all isolated and reserved CPUs to avoid overloading CPUs, except for CPUs with a guaranteed pod running.

Guaranteed pod CPUs are prevented from processing device interrupts when the pod annotation irq-load-balancing.crio.io is defined with the value "disable".

When configured, CRI-O disables device interrupts only when the pod is running.

The corresponding update will be visible (after the latency-sensitive workload pod has been scheduled) in the /etc/sysconfig/irqbalance file - which would contain container CPUs in the IRQ banned list.

From the SBC CNF helm chart perspective, the following setting needs to be configured:

  1. disableIrqBalance: a boolean value.
    To disable interrupt request processing on vCPUs allocated to the pod for enhanced performance, set this value to true.
    true: Results in enhanced performance by banning IRQs from landing on the cores of the latency-sensitive workload.
    false (default): Allows interrupt request handling on vCPUs allocated to the latency-sensitive workload.
  2. performanceProfileName: a string value.
    This parameter must be set to the name of the OCP performance profile configured on the worker nodes hosting latency-sensitive pods when IRQ handling needs to be disabled.

    When the first approach is used, the above two parameters should be left at their default values (i.e., false and ""). A sketch of the pod annotation and Helm values appears below.
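
For illustration, the pod-level and Helm-level pieces look roughly like the following sketch; the profile name "manual" is an example, and the runtime class shown is the one generated by that Performance Profile:

# Pod template excerpt (sketch)
metadata:
  annotations:
    irq-load-balancing.crio.io: "disable"
spec:
  runtimeClassName: performance-manual        # performance-<profile name>

# SBC CNF Helm values (sketch)
disableIrqBalance: true
performanceProfileName: "manual"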

Additional worker node configuration for Approach 2

The irqbalance service is restarted every time the /etc/sysconfig/irqbalance file is updated with the container's CPU details as part of container scheduling.

Since the SC pods are dynamically scalable entities based on the traffic offered to the SBC CNe cluster, SC pods are frequently created and destroyed, resulting in frequent restarts of the irqbalance service on the worker node.

By default, the system allows five restarts (StartLimitBurst) in 10 seconds (StartLimitIntervalSec), which is not sufficient in certain scaling scenarios, especially during the initial scale-out of the SC deployment to the minimum number of active SC pods immediately after Helm installation.

Therefore, the irqbalance service configuration file /usr/lib/systemd/system/irqbalance.service should be updated with StartLimitBurst set to 60 to account for the maximum number of irqbalance service restarts upon SC pod instantiation.

A sample configuration would be as follows:

[Unit]
Description=irqbalance daemon
ConditionVirtualization=!container

[Service]
EnvironmentFile=/etc/sysconfig/irqbalance
ExecStart=/usr/sbin/irqbalance --foreground $IRQBALANCE_ARGS
# StartLimitBurst is the new parameter
StartLimitBurst=60

[Install]
WantedBy=multi-user.target

After modifying the irqbalance.service unit file, you need to reload systemd and then restart the service for the changes to take effect:

  1. Reload systemd to pick up the changes to the unit files:

    systemctl daemon-reload
  2. Restart or reload the service:

    systemctl restart irqbalance.service

    or

    systemctl reload irqbalance.service

     

For more information, refer to: https://docs.openshift.com/container-platform/4.15/scalability_and_performance/cnf-low-latency-tuning.html


Linux Capabilities

Some SBC CNF Pods require Linux capabilities for specific functions. 

Capability/securityContext | Additional Information
  • NET_ADMIN
  • SYS_RAWIO
  • SYS_RESOURCE
  • FOWNER
  • IPC_LOCK
  • IPC_OWNER
  • KILL
  • LEASE
  • MKNOD
  • NET_BIND_SERVICE
  • NET_RAW
  • SYS_BOOT
  • SYS_MODULE
  • DAC_OVERRIDE
  • DAC_READ_SEARCH
  • SETFCAP
  • SETPCAP
  • SETUID
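
A minimal sketch of how such capabilities are granted through the container securityContext (only a few of the listed capabilities are shown):

securityContext:
  capabilities:
    add:
      - NET_ADMIN
      - SYS_RESOURCE
      - IPC_LOCK
      # add the remaining capabilities from the list above as required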

Kernel Parameters

See SBC Kernel Parameters for the complete list of SBC Kernel Parameters.

Centralized Policy Server (PSX) Specific Requirements

Cluster and Node Level Settings

Configuration Item | Requirement/Use case | How to check if configured/set/enabled | Additional Details
Hugepages

A huge page size of 1Gi is required for faster memory access.

$kubectl describe node <node name>

$cat /proc/cmdline
default_hugepagesz=1G hugepagesz=1G hugepages=128 +

#hugeadm --pool-list

$cat /sys/fs/cgroup/hugetlb/kubepods.slice/hugetlb.1GB.limit_in_bytes

Multus CNI

PSX instances use non-eth0 interfaces to communicate among themselves for DB sync/replication and with RAMP for registration. Multus enables this.

$kubectl get pods --all-namespaces | grep -i multus

Check whether the Multus conf file exists:
/etc/cni/net.d/00-multus.conf 

Check the Multus binary present under:
/opt/cni/bin


Role and Role Binding

Privileges are required to edit/patch/view resources such as endpoints/services (epu, for updating the ha0 IP), deployments (hpa, to scale the deployment count), pods (to fetch the ha0 IP from annotations), and so forth.

How to Create Roles/Role Bindings
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: {{ .Values.global.namespace }}
  name: {{ .Release.Name }}-calculator-role

rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "patch", "delete"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get"]
- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["roles"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "list", "patch"]
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get", "list"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]
  verbs: ["get", "list"]
- apiGroups: ["apps"]
  resources: ["deployments/scale", "statefulsets/scale"]
  verbs: ["get", "patch", "update"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "patch"]
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods"]
  verbs: ["get", "list"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ .Release.Name }}-calculator-role-binding
  namespace: {{ .Values.global.namespace }}
subjects:
- kind: ServiceAccount
  name: {{ .Values.global.serviceAccount.name }}
  namespace: {{ .Values.global.namespace }}
roleRef:
  kind: Role
  name: {{ .Release.Name }}-calculator-role
  apiGroup: rbac.authorization.k8s.io



PVC

The PSX CNF requires the ability to create PVCs in both RWX (ReadWriteMany) and RWO (ReadWriteOnce) modes.



Container Privileges

Some PSX CNF Pods/Containers need root privileges (Primary and Replica Pods).

Linux Capabilities

Some of the PSX Pods require Linux capabilities for specific functions. The following is the complete list of Capabilities required:

Capability/securityContext | Additional Information
  • NET_ADMIN
  • SYS_RAWIO
  • SYS_RESOURCE
  • FOWNER
  • IPC_LOCK
  • IPC_OWNER
  • KILL
  • LEASE
  • MKNOD
  • NET_BIND_SERVICE
  • NET_RAW
  • SYS_BOOT
  • SYS_MODULE
  • DAC_OVERRIDE
  • DAC_READ_SEARCH
  • SETFCAP
  • SETPCAP
  • SETUID

allowPrivilegeEscalation: true
privileged: true


Kernel Parameters

There are no specific kernel parameters that need to be tuned for PSX.

Ribbon Application Management Platform (RAMP) Specific Requirements

Cluster and Node Level Settings

Configuration Item | Requirement/Use case | How to check if configured/set/enabled | Additional Details
Multus CNI

RAMP uses non-eth0 interfaces for its northbound and southbound communication. Multus enables this.

Multus CNI plugin enables attaching multiple network interfaces to Pods. Multus acts as a meta-plugin (a CNI plugin that can call multiple other CNI plugins).

$kubectl get pods --all-namespaces | grep -i multus

Check whether the Multus conf file exists:
/etc/cni/net.d/00-multus.conf

Check that the Multus binary is present under:
/opt/cni/bin


Role and Role Binding

Privileges are required to edit/patch/view resources such as endpoints/services (epu, for updating the ha0 IP), deployments (hpa, to scale the deployment count), pods (to fetch the ha0 IP from annotations), and so forth.

Example
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: {{ .Values.global.namespace }}
  name: {{ .Release.Name }}-calculator-role

rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "patch", "delete"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get"]
- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["roles"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "list", "patch"]
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get", "list"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]
  verbs: ["get", "list"]
- apiGroups: ["apps"]
  resources: ["deployments/scale", "statefulsets/scale"]
  verbs: ["get", "patch", "update"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "patch"]
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods"]
  verbs: ["get", "list"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ .Release.Name }}-calculator-role-binding
  namespace: {{ .Values.global.namespace }}
subjects:
- kind: ServiceAccount
  name: {{ .Values.global.serviceAccount.name }}
  namespace: {{ .Values.global.namespace }}
roleRef:
  kind: Role
  name: {{ .Release.Name }}-calculator-role
  apiGroup: rbac.authorization.k8s.io



PVC

The RAMP CNF requires the ability to create PVCs in both RWX (ReadWriteMany) and RWO (ReadWriteOnce) modes.



Linux Capabilities

Some of the RAMP Pods require Linux capabilities for specific functions. The following is the complete list of required capabilities:

Capability/securityContext | Additional Information
  • NET_ADMIN
  • SYS_RAWIO
  • SYS_RESOURCE
  • FOWNER
  • IPC_LOCK
  • KILL
  • NET_RAW
  • AUDIT_WRITE
  • SYS_CHROOT

Kernel Parameters

Parameter | Value
sysctl kernel.pid_max | >= 4096
fs.inotify.max_user_instances | 8192
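
A sketch of how these values could be applied on the worker nodes via a sysctl drop-in, assuming direct host access (on OCP a MachineConfig or Tuned profile is typically used instead; the file name is hypothetical):

# /etc/sysctl.d/99-ramp.conf   (hypothetical file name)
kernel.pid_max = 4194304                     # any value >= 4096 satisfies the requirement
fs.inotify.max_user_instances = 8192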