KVM Performance Tuning

You can set VM operating parameters to improve system throughput for one or more VMs installed on a KVM host. Some parameters are set on the KVM host and can be modified at any time while the VM instance is running; others are set on the VM and can be configured only when the VM instance is shut down.

The following sections contain VM performance tuning recommendations. These recommendations are general guidelines and are not exhaustive. Also refer to the documentation provided by your Linux OS and KVM host vendors. For example, Red Hat provides extensive documentation on using virt-manager and optimizing VM performance; refer to the Red Hat Virtualization Tuning and Optimization Guide.

Note:

For performance tuning procedures on a VM instance you must log on to the host system as the root user.

General Recommendations

The following general recommendations apply to any platform where SBC SWe is deployed:

The number of vCPUs deployed on a system should be an even number (4, 6, 8, etc.).
For best performance, deploy only a single instance on a single NUMA node. Performance degrades if you host more than one instance on a NUMA node or if a single instance spans multiple NUMA nodes.
Make sure that the physical NICs associated with an instance are connected to the same NUMA node/socket where the instance is hosted. On a dual-NUMA host, ideally host two instances, with each instance on a separate NUMA node and the NICs of each instance connected to its own NUMA node.
To optimize performance, distribute memory equally across both NUMA nodes. For example, if a dual-NUMA server has a total of 128 GiB of RAM, configure 64 GiB of RAM on each NUMA node.

Recommended BIOS Settings

Ribbon recommends applying the BIOS settings in the following table on all VMs for optimum performance:

Recommended BIOS Settings

BIOS Parameter	Setting	Comments
CPU power management	Balanced	Ribbon recommends Maximum Performance
Intel Hyper-Threading	Enabled
Intel Turbo Boost	Enabled
Intel VT-x (Virtualization Technology)	Enabled	For hardware virtualization

All server BIOS settings are different, but in general, the following guidelines apply:

Set power profiles to maximum performance
Set thermal configurations to optimal cooling
Disable HW prefetcher

Note

For GPU transcoding, ensure that all power supplies are plugged in to the server.

CPU Frequency Setting on the Host

Check the current configuration of the CPU frequency setting using the following command on the host system.

# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

The CPU frequency setting must be set to performance to improve VNF performance. Use the following command on the host system:

# echo "performance" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

Note

You must ensure that the settings persist across reboot.
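
The following is a minimal sketch for confirming the governor setting on every CPU, for example after a reboot. The function name check_governors and its optional directory argument are illustrative assumptions, not part of any product tooling; the default path is the sysfs location used in the commands above.

```shell
# Print each CPU whose governor is not "performance".
# Returns 0 when every readable governor file says "performance", 1 otherwise.
# The optional first argument (an assumption for illustration) overrides the
# sysfs root so the check can be tried against a copy of the tree.
check_governors() {
  cpu_root="${1:-/sys/devices/system/cpu}"
  bad=0
  for f in "$cpu_root"/cpu*/cpufreq/scaling_governor; do
    [ -r "$f" ] || continue          # skip if the glob matched nothing
    gov=$(cat "$f")
    if [ "$gov" != "performance" ]; then
      echo "$f: $gov"
      bad=1
    fi
  done
  return "$bad"
}

# Usage on the host (as root):
#   check_governors && echo "all CPUs use the performance governor"
```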

Processor and CPU Details

To determine the host system's processor and CPU details, perform the following steps:

Execute the following command to determine how many vCPUs are assigned to host CPUs:
```
lscpu -p
```
The command provides the following output:

CPU Architecture

The first column lists the logical CPU number of a CPU as used by the Linux kernel. The second column lists the logical core number. Use this information for vCPU pinning.
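
To make the sibling relationships easier to read, the two columns can be grouped per core. This is a sketch only: the function name siblings_by_core is hypothetical, and it assumes the input is restricted to the CPU and Core columns (lscpu --parse=CPU,CORE).

```shell
# Group logical CPUs by core from "lscpu --parse=CPU,CORE" style input.
# Lines starting with '#' are the header comments that lscpu prints.
siblings_by_core() {
  awk -F, '
    !/^#/ { sib[$2] = (sib[$2] == "" ? $1 : sib[$2] " " $1) }
    END   { for (c in sib) print c ": " sib[c] }
  ' | sort -n
}

# Usage on the host:
#   lscpu --parse=CPU,CORE | siblings_by_core
# Each output line has the form "<core>: <cpu> <cpu>", listing sibling threads.
```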

Persistent CPU Pinning

CPU pinning ensures that a VM only gets CPU time from a specific CPU or set of CPUs. Pinning is performed on each logical CPU of the guest VM against each core ID in the host system. The CPU pinning information is lost every time the VM instance is shut down or restarted. To avoid entering the pinning information again, update the KVM configuration XML file on the host system.

Note:

Ensure that no two VM instances are allocated the same physical cores on the host system.
Ensure that all the VMs hosted on the physical server are pinned.
To create vCPU to hyper-thread pinning, pin consecutive vCPUs to sibling threads (logical cores) of the same physical core. The logical core/sibling threads can be identified from the output returned by the command lscpu on the host.
Do not include the 0th physical core of the host in pinning. This is recommended because most host management/kernel threads are spawned on the 0th core by default.

To update the pinning information in the KVM configuration XML file:

Shut down the VM instance.
Enter the following command.
```
virsh
```
The command provides the following output:

virsh Prompt
Enter the following command to edit the VM instance:
```
virsh # edit <KVM_instance_name>
```
Search for the vcpu placement attribute.

vCPU Placement Attribute
Enter CPU pinning information as shown below:

CPU Pinning Information

Tip

Ensure that no two VM instances have the same physical core affinity. For example, if VM1 is assigned affinity 0,1,2,3, then no other VM should be pinned to 0,1,2,3,8,9,10 or 11, because these CPUs belong to the physical cores assigned to VM1. Also, assign affinity to all other VM instances running on the same host; otherwise the VMs without affinity might degrade the performance of the VMs that have it.
Enter the following command to save and exit the XML file.
```
:wq
```
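
As an illustration of the pinning rules above, the sketch below prints a libvirt cputune block that pins consecutive vCPUs to the host CPUs given as arguments. The function name gen_cputune is hypothetical, and the example CPU numbers assume a host where logical CPUs n and n+8 are sibling threads (as in the Tip above); take the real sibling pairs from lscpu on your host.

```shell
# Print a <cputune> block that pins vCPU 0, 1, 2, ... to the host CPUs
# passed as arguments, in order. List sibling threads next to each other
# so that consecutive vCPUs share a physical core.
gen_cputune() {
  vcpu=0
  echo "<cputune>"
  for cpu in "$@"; do
    printf "  <vcpupin vcpu='%d' cpuset='%d'/>\n" "$vcpu" "$cpu"
    vcpu=$((vcpu + 1))
  done
  echo "</cputune>"
}

# Example for a 4-vCPU guest on physical cores 1 and 2 (core 0 is left
# to the host):
#   gen_cputune 1 9 2 10
```

The printed block is intended to be pasted into the domain XML below the vcpu element during virsh edit.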

CPU Mode Configuration

Even if the Copy host CPU configuration was selected while creating a VM instance, the host configuration may not be copied on the VM instance. To resolve this issue, you must edit the CPU mode to host-passthrough using a virsh command in the host system.

To edit the VM CPU mode:

Shut down the VM instance.
Enter the following command.
```
virsh
```
The command provides the following output:

virsh Prompt
Enter the following command to edit the VM instance:
```
edit <KVM_instance_name>
```
Search for the cpu mode attribute.

cpu mode
Edit the cpu mode attribute with the following:

Editing CPU Mode

Tip

The topology details entered must be the same as the topology details that were set while creating the VM instance.
For example, if the topology was set to 1 socket, 4 cores, and 1 thread, enter the same values in this XML file.
Enter the following command to save and exit the XML file.
```
:wq
```
Enter the following command to start the VM instance.
```
start <KVM_instance_name>
```
Enter the following command to verify the host CPU configuration on the VM instance:
```
cat /proc/cpuinfo
```
The command provides the following output.

Verifying CPU Configuration
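
The configured CPU mode can also be checked from the host without opening the editor. The sketch below extracts the mode attribute of the cpu element from domain XML supplied on standard input; the function name cpu_mode_of is an assumption made for this example.

```shell
# Print the value of the mode attribute of the first <cpu ...> element
# found in domain XML read from standard input.
cpu_mode_of() {
  sed -n "s/.*<cpu[^>]*mode='\([^']*\)'.*/\1/p" | head -n 1
}

# Usage on the host:
#   virsh dumpxml <KVM_instance_name> | cpu_mode_of
# The expected value after the change described above is host-passthrough.
```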

Increasing the Transmit Queue Length

By default, the transmit queue length is set to 500. To increase the transmit queue length to 4096:

Execute the following command to identify the available interfaces:
```
virsh
```
The virsh prompt is displayed.
Execute the following command.
```
domiflist <VM_instance_name>
```
The list of active interfaces is displayed.

Active Interfaces List
Execute the following command to increase the transmit queue lengths for the tap interfaces.
```
ifconfig <interface_name> txqueuelen <length>
```
where interface_name is the name of the interface you want to change, and length is the new queue length. For example, ifconfig macvtap4 txqueuelen 4096.
Execute the following command to verify the value of the interface length.
```
ifconfig <interface_name>
```
The command provides the following output.

Interface Information
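
When a VM has several interfaces, the steps above can be scripted. The sketch below only extracts the interface names from domiflist output; the function name vm_interfaces is hypothetical, and the parsing assumes the usual two header lines followed by one row per interface.

```shell
# Print the host-side interface names from "virsh domiflist" output read
# on standard input. Skips the two header lines and rows whose interface
# column is "-" (no host-side device).
vm_interfaces() {
  awk 'NR > 2 && NF >= 5 && $1 != "-" { print $1 }'
}

# Usage on the host (as root), using the ip command in place of ifconfig:
#   virsh domiflist <VM_instance_name> | vm_interfaces |
#     while read -r ifname; do
#       ip link set dev "$ifname" txqueuelen 4096
#     done
```

As with the ifconfig command, a queue length set this way is not expected to survive recreation of the tap interface, so reapply it after the VM restarts.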

Kernel Same-page Merging (KSM) Settings

Apply the following settings to all VMs installed on the host.

Kernel same-page merging (KSM) is a technology that finds identical memory pages inside a Linux system and merges them to save memory. If one of the copies is updated, a new copy is created, so the function is transparent to the processes on the system. For hypervisors, KSM is highly beneficial when multiple guests run the same level of the operating system. However, the scanning process adds overhead that may cause applications to run slower. The SBC SWe requires that KSM is turned off.

The sample commands below are for Ubuntu 4.4; use the syntax that corresponds to your operating system.

# echo 0 >/sys/kernel/mm/ksm/run
# echo "KSM_ENABLED=0" > /etc/default/qemu-kvm

Once KSM is turned off, it is important to verify that there is still sufficient memory on the hypervisor. When the pages are not merged, it may increase memory usage and lead to swapping that negatively impacts performance.
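
A quick way to confirm the setting is to read the same sysfs file back. This is a minimal sketch; the function name ksm_is_off and its optional path argument are assumptions made for illustration.

```shell
# Succeed only if the KSM run flag is 0 (KSM stopped).
# The optional argument overrides the sysfs path, for trying the check
# against a test file.
ksm_is_off() {
  ksm_run="${1:-/sys/kernel/mm/ksm/run}"
  [ "$(cat "$ksm_run")" = "0" ]
}

# Usage on the host:
#   ksm_is_off && echo "KSM is off" || echo "KSM is still running"
#   free -m     # then check that enough memory remains and swap is idle
```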

Host Pinning

To avoid performance impact on VMs from host-level Linux services, host pinning isolates the physical cores that host a guest VM from the physical cores that run Linux host processes and services. In this example, physical core 0 (logical CPUs 0 and 36) and physical core 1 (logical CPUs 1 and 37) are reserved for Linux host processes.

The CPUAffinity option in /etc/systemd/system.conf sets the CPU affinity for systemd itself and for everything it launches, unless a .service file overrides the CPUAffinity setting with its own value. Configure the CPUAffinity option in /etc/systemd/system.conf.

Execute the following command:

lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                72
On-line CPU(s) list:   0-71
Thread(s) per core:    2
Core(s) per socket:    18
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz
Stepping:              1
CPU MHz:               2699.984
BogoMIPS:              4604.99
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71

To dedicate physical cores 0 and 1 to host processing, set CPUAffinity to 0 1 36 37 in the file /etc/systemd/system.conf, as shown below, and then restart the system.

CPUAffinity=0 1 36 37
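
The value above follows from the sibling layout of the example host: with 72 logical CPUs and 2 threads per core, logical CPU n and logical CPU n+36 share a physical core. The sketch below reproduces that arithmetic; the function name host_cpu_affinity is hypothetical, and the fixed sibling offset is an assumption that holds for this lscpu output but should be checked on other hosts.

```shell
# host_cpu_affinity RESERVED_CORES SIBLING_OFFSET
# Print a CPUAffinity line covering the first RESERVED_CORES physical
# cores and their sibling threads, assuming the sibling of CPU n is
# CPU n + SIBLING_OFFSET.
host_cpu_affinity() {
  reserved="$1"
  offset="$2"
  first=""
  second=""
  i=0
  while [ "$i" -lt "$reserved" ]; do
    first="$first $i"
    second="$second $((i + offset))"
    i=$((i + 1))
  done
  echo "CPUAffinity=${first# }$second"
}

# Example for the host above:
#   host_cpu_affinity 2 36
```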

Back VMs with Hugepages

Create a mount point for the HugeTLB filesystem on the host.
```
mkdir -p /hugepages
```

Add the following line in the /etc/fstab file.

hugetlbfs    /hugepages    hugetlbfs    defaults    0 0

Configure the number of 2M hugepages equal to the vRAM requirement for hosting a VM:

cat /etc/sysctl.conf
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
# The value below assumes a 24G VM.
vm.nr_hugepages = 25000
vm.hugetlb_shm_group = 36
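
The page count can be estimated from the VM memory size. The sketch below is an assumption-laden illustration, not a sizing rule from the product: it assumes 2 MiB pages and that the pool set through vm.nr_hugepages is spread evenly across NUMA nodes (the usual kernel default), so a VM confined to one node of a dual-node host needs roughly twice its own page count in the pool. That may be why the example value of 25000 is about double the 12288 pages that 24 GiB occupies. The function name hugepages_needed is hypothetical.

```shell
# hugepages_needed VM_GIB [NUMA_NODES]
# Print the number of 2 MiB hugepages needed so that each NUMA node
# holds enough pages for a VM of VM_GIB GiB, assuming the pool is
# divided evenly across NUMA_NODES nodes (default 1).
hugepages_needed() {
  vm_gib="$1"
  numa_nodes="${2:-1}"
  echo $(( vm_gib * 1024 / 2 * numa_nodes ))
}

# Example: a 24 GiB VM on a dual-NUMA host
#   hugepages_needed 24 2
```

Leave some headroom above the computed value, as the example configuration does.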

Add lines in your instance XML file using virsh edit <instanceName>:

<domain type='kvm' id='3'>
  <name>RENGALIVM01</name>
  <uuid>f1bae5a2-d26e-4fc0-b472-3638743def9a</uuid>
  <memory unit='KiB'>25165824</memory>
  <currentMemory unit='KiB'>25165824</currentMemory>
  <memoryBacking>
   <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>  
    </hugepages>
  </memoryBacking>

Tip

The previous example pins the VM on NUMA node 0. For hosting a second VM on NUMA node 1, use nodeset='1'.

Restart the host.
To verify, get the PID for the VM and execute the following command to check that VM memory is received from a single NUMA node:
```
numastat -p  <vmpid>
```

Disable Flow Control

Log into the system as the root user.
Execute the following command to disable flow control for interfaces attached to the SWe VM.
```
ethtool -A <interface name> rx off tx off autoneg off  
```
Tip

Use the <interface name> from the actual configuration.

Example:
ethtool -A p4p3 rx off tx off autoneg off
ethtool -A p4p4 rx off tx off autoneg off
ethtool -A em3 rx off tx off autoneg off
ethtool -A em4 rx off tx off autoneg off

Note:

Refer to the RHEL site for information on how to make NIC ethtool settings persistent (applied automatically at boot).
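
To confirm that the change took effect, the pause parameters can be read back with ethtool -a. The sketch below checks that both RX and TX are reported as off; the function name pause_is_off is hypothetical, and the parsing assumes the common "RX:" and "TX:" line labels in ethtool -a output.

```shell
# Succeed only if the "ethtool -a" output on standard input reports
# both RX and TX pause as off.
pause_is_off() {
  awk '
    $1 == "RX:" || $1 == "TX:" { seen++; if ($2 != "off") bad = 1 }
    END { if (seen == 2 && !bad) exit 0; exit 1 }
  '
}

# Usage on the host (as root), for the interfaces in the example above:
#   for ifname in p4p3 p4p4 em3 em4; do
#     ethtool -a "$ifname" | pause_is_off || echo "$ifname: flow control still on"
#   done
```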

OVS-DPDK Virtio Interfaces - Performance Tuning Recommendations

Follow the OpenStack recommended performance settings for host and guest.
Refer to VNF Performance Tuning for details.
Make sure that physical network adapters, Poll Mode Driver (PMD) threads, and pinned CPUs for the instance are all on the same NUMA node. This is required for optimal performance.

PMD threads are the threads that do the heavy lifting for userspace switching. They perform tasks such as continuous polling of input ports for packets, classifying packets once received, and executing actions on the packets once they are classified.
Set the queue size for virtio interfaces to 1024 by updating the Director template.
1. NovaComputeExtraConfig: - nova::compute::libvirt::tx_queue_size: '"1024"'
2. NovaComputeExtraConfig: - nova::compute::libvirt::rx_queue_size: '"1024"'
Configure the following dpdk parameters in host ovs-dpdk:
1. Make sure two pairs of Rx/Tx queues are configured for host dpdk interfaces.
  
  To validate, issue the following command during ovs-dpdk bring-up:
  ovs-vsctl get Interface dpdk0 options
  For background details, refer to http://docs.openvswitch.org/en/latest/howto/dpdk/
2. Enable per-port memory to allow each port to use separate mem-pools when receiving packets instead of using a default shared mem-pool:
  ovs-vsctl set Open_vSwitch . other_config:per-port-memory=true
3. Configure 4096 MB hugepage memory on each socket:
  ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=4096,4096
4. Spawn the appropriate number of PMD threads so that each port/queue is serviced by a particular PMD thread.
  Ensure that PMD threads are:
  1. pinned to dedicated cores/hyper-threads,
  2. in the same NUMA as network adapter and guest,
  3. isolated from the kernel, and
  4. not used by the guest for any other purpose.
5. Set the pmd-cpu-mask, accordingly:
  ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x40001004000100
  
  The example above sets PMD threads to run on four hyper-threads (8, 26, 36, and 54) of two physical cores. Hyper-threads 8 and 36 are siblings, as are 26 and 54.
6. Restart ovs-vswitchd after the changes:
  systemctl restart ovs-vswitchd
  systemctl status ovs-vswitchd
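
The pmd-cpu-mask value in step 5 is a bitmask with one bit set for each hyper-thread that runs a PMD thread. The sketch below shows how such a mask can be derived from a list of CPU numbers; the function name pmd_cpu_mask is hypothetical, and the shell is assumed to use 64-bit arithmetic, which limits it to CPU numbers below 63.

```shell
# Print a hexadecimal CPU mask with one bit set per CPU number given.
pmd_cpu_mask() {
  mask=0
  for core in "$@"; do
    mask=$(( mask | (1 << core) ))
  done
  printf '0x%x\n' "$mask"
}

# Example: the four hyper-threads used in step 5
#   pmd_cpu_mask 8 26 36 54
#   -> 0x40001004000100
```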
The port and Rx queue assignment to PMD threads is crucial for optimal performance. Refer to http://docs.openvswitch.org/en/latest/topics/dpdk/pmd/ for more details. The affinity is a comma-separated list of <queue_id>:<core_id> pairs that must be set for each port.

ovs-vsctl set interface dpdk0 other_config:pmd-rxq-affinity="0:8,1:26"
ovs-vsctl set interface vhub89b3d58-4f other_config:pmd-rxq-affinity="0:36"
ovs-vsctl set interface vhu6d3f050e-de other_config:pmd-rxq-affinity="1:54"

In the example above, the PMD thread on core 8 reads queue 0 and the PMD thread on core 26 reads queue 1 of the dpdk0 interface.

Alternatively, you can use the default assignment of port/Rx queues to PMD threads and enable the auto-load-balance option so that OVS redistributes the Rx queues across the PMD cores based on load.

ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
ovs-appctl dpif-netdev/pmd-rxq-rebalance

Troubleshooting

To check the port/Rx queue distribution among PMD threads, enter the command:
ovs-appctl dpif-netdev/pmd-rxq-show
To check the PMD thread stats (actual CPU usage), enter the following command, and look for "processing cycles" and "idle cycles":
ovs-appctl dpif-netdev/pmd-stats-clear && sleep 10 && ovs-appctl dpif-netdev/pmd-stats-show
To check packet drops (rx_dropped / tx_dropped counters) on host dpdk interfaces, enter the command:
watch -n 1 'ovs-vsctl get interface dpdk0 statistics|sed -e "s/,/\n/g" -e "s/[\",\{,\}, ]//g" -e "s/=/ =⇒ /g"'
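
For scripted checks, the drop counters alone can be filtered out of the statistics column. This is a sketch under the assumption that the statistics are printed as a single {key=value, ...} map, as the watch command above implies; the function name drop_counters is hypothetical.

```shell
# Print only the rx_dropped and tx_dropped entries from an OVS interface
# statistics map read on standard input.
drop_counters() {
  tr ',' '\n' | tr -d '{} "' | grep -E '^(rx|tx)_dropped='
}

# Usage on the host:
#   ovs-vsctl get interface dpdk0 statistics | drop_counters
```

Comparing two readings taken a few seconds apart shows whether drops are still occurring or are left over from an earlier event.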

Refer to the following page for troubleshooting performance issues/packet drops in an ovs-dpdk environment:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/ovs-dpdk_end_to_end_troubleshooting_guide/validating_an_ovs_dpdk_deployment#find_the_ovs_dpdk_port_physical_nic_mapping_configured_by_os_net_config

Benchmarking

Setup details:

Platform: RHOSP13
Host OS: RHEL7.5
Processor: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
1 Provider Network configured for Management Interface
1 Provider Network configured for HA Interface
OVS+DPDK enabled for packet interfaces (pkt0 and pkt1)
2 pairs of Rx/Tx queues in host dpdk interfaces
1 Rx/Tx queue in guest virtio interface
4 PMD threads pinned to 4 hyper threads (i.e. using up 2 physical cores)

Guest Details:

SSBC - 8vcpu/18GB RAM/100GB HDD
MSBC - 10vcpu/20GB RAM/100 GB HDD

Benchmarking was performed in a D-SBC setup with up to 30k pass-through sessions using the recommendations described in this document.

Higher session counts may require additional cores for PMD threads.

External References

https://docs.openvswitch.org/en/latest/howto/dpdk/

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/ovs-dpdk_end_to_end_troubleshooting_guide/index

https://docs.openvswitch.org/en/latest/topics/dpdk/pmd/

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/index
