This document provides VM performance tuning recommendations to improve system performance on the Red Hat Enterprise Virtualization (RHEV) platform.

General Recommendations

  • Ensure the number of vCPUs in an instance is always an even number (4, 6, 8, and so on), because hyper-threaded vCPUs are used.
  • For best performance, confine a single instance to a single NUMA node. Performance degrades if an instance spans multiple NUMA nodes.
  • Ensure the physical NICs associated with an instance are connected to the same NUMA node/socket where the instance is hosted. Doing so reduces remote node memory access, which in turn improves performance (see the check after this list).
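
To check which NUMA node a physical NIC belongs to, read its numa_node attribute from sysfs and compare it against the NUMA CPU ranges reported by lscpu. The interface name below is a placeholder; a value of -1 means the platform does not report NUMA locality for that device.

# cat /sys/class/net/<interface>/device/numa_node
# lscpu | grep NUMA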

Recommended BIOS Settings

Ribbon recommends the following BIOS settings in the host for optimum performance.

BIOS Settings for Optimal Performance of the SBC

BIOS Parameter                                   Setting
--------------                                   -------
CPU Power Management / Power Regulator           Maximum Performance or Static High Performance
Intel Hyper-Threading                            Enabled
Intel Turbo Boost                                Enabled
Intel VT-x (Virtualization Technology)           Enabled
Thermal Configuration                            Optimal Cooling or Maximum Cooling
Minimum Processor Idle Power Core C-state        No C-states
Minimum Processor Idle Power Package C-state     No Package State
Energy Performance BIAS                          Max Performance
Sub-NUMA Clustering                              Disabled
HW Prefetcher                                    Disabled
SR-IOV                                           Enabled
Intel VT-d                                       Enabled
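
After the host boots, a few of these settings can be sanity-checked from the operating system. The commands below are a quick, non-exhaustive check for hyper-threading, VT-d/IOMMU, and SR-IOV capability, respectively; run them as root, and expect the exact output to vary by platform and firmware.

# lscpu | grep "Thread(s) per core"
# dmesg | grep -i -e dmar -e iommu
# lspci -v | grep -i "sr-iov"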

Host Settings

High Level Guidance

  • Disable KSM.
  • Set appropriate CPU frequency setting on the host.
  • Isolate vCPUs for host processes, so that the remaining vCPUs can be used by guest VMs. Before doing this, determine the CPU layout of the host machine (see the example after this list).
  • The recommended guidance is to reserve the first two physical cores of each socket for host processes.
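
One way to view the host CPU layout (logical CPUs, cores, sockets, and NUMA nodes) before deciding which vCPUs to reserve is the extended output of lscpu:

# lscpu -e=CPU,CORE,SOCKET,NODE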

Kernel Same-page Merging (KSM) Settings

Kernel same-page merging (KSM) is a technology that finds common memory pages inside a Linux system and merges them to save memory resources. If one of the merged copies is updated, a new copy is created, so the feature is transparent to the processes on the system. For hypervisors, KSM is highly beneficial when multiple guests run the same level of the operating system. However, the page-scanning process introduces overhead that can slow down applications, which is not desirable.
Turn off KSM in the host.

Deactivate KSM by stopping the ksmtuned and the ksm services as shown below. This does not persist across reboots.

# systemctl stop ksm
# systemctl stop ksmtuned

Disable KSM persistently as shown below:

# systemctl disable ksm
# systemctl disable ksmtuned
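
To confirm that KSM is inactive and will stay disabled, check the KSM run flag in sysfs (a value of 0 means KSM is not merging pages) and the enablement state of both services:

# cat /sys/kernel/mm/ksm/run
# systemctl is-enabled ksm ksmtuned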

CPU Frequency Setting on the Host

The CPU frequency setting determines the operating clock speed of the processor and, in turn, the system performance. Red Hat offers a set of built-in tuning profiles and a tool called tuned-adm that helps configure the required tuning profile.

Ribbon recommends applying the throughput-performance tuning profile, which makes the processor operate at maximum frequency.

Find out the active tuning profile.

# tuned-adm active

Example output:

[root@typhoon3 ~]# tuned-adm active
Current active profile: virtual-host
[root@typhoon3 ~]#

In the above example, the active profile is the virtual-host.

Apply the throughput-performance tuning profile by running the following command:

# tuned-adm profile throughput-performance

Example output:

[root@typhoon3 ~]# tuned-adm profile throughput-performance
[root@typhoon3 ~]#
[root@typhoon3 ~]# tuned-adm active
Current active profile: throughput-performance
[root@typhoon3 ~]#

This configuration is persistent across reboots and takes effect immediately. There is no need to reboot the host after configuring this tuning profile.
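
To confirm the effect on the CPU frequency policy, you can inspect the scaling governor with the cpupower tool (provided by the kernel-tools package) or directly through sysfs. With the throughput-performance profile the governor is typically performance, although the exact name depends on the frequency driver in use.

# cpupower frequency-info --policy
# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor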

CPU Affinity

On the host, edit /etc/systemd/system.conf and update the CPUAffinity field with the vCPU numbers to which host processes are to be pinned.

For example, the HPE DL360 Gen10 server with Intel(R) Xeon(R) Gold 6226R processor has the following CPU layout:

============================================================
Core and Socket Information (as reported by '/proc/cpuinfo')
============================================================

cores =  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
sockets =  [0, 1]
cpu model =  Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz

        Socket 0        Socket 1
        --------        --------
Core 0  [0, 32]         [16, 48]
Core 1  [1, 33]         [17, 49]
Core 2  [2, 34]         [18, 50]
Core 3  [3, 35]         [19, 51]
Core 4  [4, 36]         [20, 52]
Core 5  [5, 37]         [21, 53]
Core 6  [6, 38]         [22, 54]
Core 7  [7, 39]         [23, 55]
Core 8  [8, 40]         [24, 56]
Core 9  [9, 41]         [25, 57]
Core 10 [10, 42]        [26, 58]
Core 11 [11, 43]        [27, 59]
Core 12 [12, 44]        [28, 60]
Core 13 [13, 45]        [29, 61]
Core 14 [14, 46]        [30, 62]
Core 15 [15, 47]        [31, 63]

Assign the first two physical cores of each CPU socket to host processes, which translates to vCPUs 0, 1, 16, 17, 32, 33, 48, 49 for the CPU layout shown above.

Then, add these vCPU numbers to the CPUAffinity field of /etc/systemd/system.conf.

The setting will look like:

CPUAffinity=0 1 16 17 32 33 48 49

Reboot the host to apply the CPU affinity changes.
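
After the reboot, you can confirm that host processes are confined to the reserved vCPUs by checking the CPU affinity of PID 1, which other host services inherit. The reported affinity list should match the CPUAffinity setting (0,1,16,17,32,33,48,49 in this example).

# taskset -cp 1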

SBC VM Settings

High Level Guidance

  • All vCPUs of the guest VM should belong to the same host NUMA node (CPU socket).
  • The VM should emulate a single virtual socket with a single virtual CPU per core.
  • Preferably, ensure memory used by this VM is from the same NUMA node.
  • Ensure guest vCPU to host vCPU pinning such that two consecutive guest vCPUs belong to the same host physical core (shown later).
  • CPU model/emulation should use host passthrough to leverage all capabilities of the host processor.
  • PCI devices (including PCI SR-IOV devices) passed through to the guest should belong to the same NUMA node as the guest's vCPUs (see the check after this list).
  • Each VM requires 4 NIC ports. The first two NIC ports should be Virtio based and map to mgt0 and ha0. The third and fourth NIC ports should be SR-IOV based and map to pkt0 and pkt1.
  • If the SBC VM is already created at this point, ensure it is powered off before applying or modifying the CPU pinning and other configurations.
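
To check which NUMA node a candidate passthrough device belongs to, read its numa_node attribute from sysfs. The PCI address below is only a placeholder; a value of -1 means the platform does not report NUMA locality for that device.

# cat /sys/bus/pci/devices/0000:3b:00.0/numa_node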

CPU Pinning

CPU pinning ensures that a VM only gets CPU time from a specific CPU or set of CPUs of the host. Pinning maps each logical CPU of the guest VM to a specific core ID on the host system. The official RHEV documentation for this setting is available in the Virtual Machine Management Guide.

CPU Pinning Topology

The syntax of CPU pinning is v#p[_v#p], for example:

  • 0#0 - Pins vCPU 0 to pCPU 0.
  • 0#0_1#3 - Pins vCPU 0 to pCPU 0, and pins vCPU 1 to pCPU 3.

To pin a virtual machine to a host, you must also select the following on the Host tab:

  • Start Running On: Specific
  • Migration Options: Do not allow migration
  • Pass-Through Host CPU

If CPU pinning is set and you change Start Running On: Specific or Migration Options: Do not allow migration, a "CPU pinning topology will be lost" window appears when you click OK.

In RHEV, perform the guest-to-host vCPU pinning through the CPU Pinning Topology field on the Resource Allocation tab of the Edit Virtual Machine window.

Edit Virtual Machine > Resource Allocation

Update the CPU pinning topology field with the pinning details. For example, the CPU pinning configuration for a 24vCPU VM is shown below:

0#2_1#34_2#3_3#35_4#4_5#36_6#5_7#37_8#6_9#38_10#7_11#39_12#8_13#40_14#9_15#41_16#10_17#42_18#11_19#43_20#12_21#44_22#13_23#45
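
In this example, consecutive guest vCPU pairs are pinned to the two hyper-thread siblings of host physical cores 2 through 13 on socket 0 (per the CPU layout shown earlier), leaving cores 0 and 1 free for host processes. Once the VM is running, the effective pinning can also be verified from the host with virsh; the VM name below is a placeholder, and on RHEV hosts virsh typically requires a read-only connection or authentication.

# virsh -r vcpupin <vm-name>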

Additional Mandatory Attributes

Set the following additional attributes for the SBC VM:

Under the Virtual Machine > System Settings tab, set the following options:

Attribute                    Value
---------                    -----
Total Virtual CPUs           24
Cores per virtual socket     24
Virtual sockets              1
Threads per core             1

Note

The total virtual CPUs value shown in the table is an example only.

Ensure that the following conditions are met:

  • The total virtual CPUs value is equal to the value estimated by the VNF estimator tool for the customer's traffic requirement.
  • The cores per virtual socket value is equal to the total virtual CPUs value.
  • The number of virtual sockets is set to 1.
  • The number of threads per core is set to 1.
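
After the VM boots, the resulting topology can be confirmed from inside the guest; the reported socket, core, and thread counts should match the values configured above.

# lscpu | grep -E "Socket\(s\)|Core\(s\) per socket|Thread\(s\) per core"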

Ensuring Host CPU Capability Passthrough on RHEV

In Edit Virtual Machine > Host tab enable the following:

  1. Select the appropriate host in the "Specific Host(s)" option.
  2. Enable the "Pass-through Host CPU" tick box.


Edit Virtual Machine Host Tab

Ensuring NUMA Locality of Memory

In the Edit Virtual Machine > Host tab, do the following:

  1. Configure the NUMA Node Count as 1, and set the Tune Mode to Preferred as shown below: 

    Configure NUMA

Configuring Virtual NUMA

  1. Click the Virtual Machines tab and select a virtual machine.
  2. Click Edit.
  3. Click the Host tab.
  4. Select the Specific Host(s) radio button and select a host from the list. The selected host must have at least two NUMA nodes.
  5. Select Do not allow migration from the Migration Options drop-down list.
  6. Enter a number into the NUMA Node Count field to assign virtual NUMA nodes to the virtual machine.
  7. Select Strict, Preferred, or Interleave from the Tune Mode drop-down list. If the selected mode is Preferred, set the NUMA Node Count to 1.
  8. Click NUMA Pinning.
  9. In the NUMA Topology window, click and drag virtual NUMA nodes from the box on the right to host NUMA nodes on the left as required, and click OK.
  10. Click OK.

NUMA Topology
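
If the numactl package is installed in the guest, the virtual NUMA topology presented to the SBC can also be verified from inside the VM:

# numactl --hardware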


Note
  • Do not pin any other VMs to the NUMA node allocated to the SBC SWe VM. Placing another VM on the same NUMA node leads to significant performance degradation.
  • Do not pin the vCPUs allocated to the SBC SWe to other VMs as it leads to performance degradation.
  • Do not pin the vCPUs allocated for the host processes (as configured in /etc/systemd/system.conf) to the SBC SWe VM.
  • Do not launch an unpinned VM on the same server, because its vCPUs can run on any of the underlying host vCPUs at random, degrading the performance of the SBC VM.