The following sections contain VM performance tuning recommendations to improve system performance. These recommendations are general guidelines and are not intended to be all-inclusive.
Refer to the documentation provided by your Linux OS and KVM host vendors for complete details. For example, Red Hat provides extensive documentation on using virt-manager and optimizing VM performance; refer to the Red Hat Virtualization Tuning and Optimization Guide for details.
Note: For performance tuning procedures on a VM instance, log on to the host system as the root user.
General Recommendations
Note: For GPU transcoding, ensure all power supplies are plugged into the server.
The CPU frequency setting determines the operating clock speed of the processor and, in turn, the system performance. Red Hat offers a set of built-in tuning profiles and a tool called tuned-adm that helps configure the required tuning profile.
Ribbon recommends applying the throughput-performance tuning profile, which allows the processor to operate at its maximum frequency.
# tuned-adm active
Current active profile: powersave
# tuned-adm profile throughput-performance
This configuration is persistent across reboots and takes effect immediately. There is no need to reboot the host after configuring this tuning profile.
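To confirm the profile is applied, you can run tuned-adm active again; the output should now report the new profile:

# tuned-adm active
Current active profile: throughput-performance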
Use the procedure below to accomplish NUMA pinning for the VM.
Note: You can skip NUMA pinning for virtual pkt interfaces.
Determine the number of NUMA nodes on the host server.
[root@srvr3320 ~]# lscpu | grep NUMA
NUMA node(s):          2
NUMA node0 CPU(s):     0-7,16-23
NUMA node1 CPU(s):     8-15,24-31
[root@srvr3320 ~]#
In this example, there are two NUMA nodes on the server.
Obtain the bus-info of the PF interface using the command ethtool -i <PF interface name>.
[root@srvr3320 ~]# ethtool -i ens4f0
driver: igb
version: 5.6.0-k
firmware-version: 1.52.0
expansion-rom-version:
bus-info: 0000:81:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
[root@srvr3320 ~]#
Identify the NUMA node of the PCI device using cat /sys/bus/pci/devices/<PCI device>/numa_node.
[root@srvr3320 ~]# cat /sys/bus/pci/devices/0000\:81\:00.0/numa_node
1
Repeat the previous step for other SR-IOV interfaces from which you plan to connect VFs.
Note: Make sure that all PCI devices are connected to the same NUMA node.
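To check several devices at once, a small shell loop such as the following can be used; the PCI addresses shown are placeholders taken from the ethtool -i output of each PF:

for dev in 0000:81:00.0 0000:81:00.1; do
    echo -n "$dev -> NUMA node "
    cat /sys/bus/pci/devices/$dev/numa_node
done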
Once the NUMA node is identified, set the <numatune> element of the SBC VM in the VM XML file.
<numatune>
  <memory mode='preferred' nodeset="1"/>
</numatune>
To determine the host system's processor and CPU details and how vCPUs map to host CPUs, enter the following command:
lscpu -p
CPU Architecture Example:

[root@srvr3320 ~]# lscpu -p
# The following is the parsable format, which can be fed to other
# programs. Each different item in every column has an unique ID
# starting from zero.
# CPU,Core,Socket,Node,,L1d,L1i,L2,L3
0,0,0,0,,0,0,0,0
1,1,0,0,,1,1,1,0
2,2,0,0,,2,2,2,0
3,3,0,0,,3,3,3,0
4,4,0,0,,4,4,4,0
5,5,0,0,,5,5,5,0
6,6,0,0,,6,6,6,0
7,7,0,0,,7,7,7,0
8,8,1,1,,8,8,8,1
9,9,1,1,,9,9,9,1
10,10,1,1,,10,10,10,1
11,11,1,1,,11,11,11,1
12,12,1,1,,12,12,12,1
13,13,1,1,,13,13,13,1
14,14,1,1,,14,14,14,1
15,15,1,1,,15,15,15,1
16,0,0,0,,0,0,0,0
17,1,0,0,,1,1,1,0
18,2,0,0,,2,2,2,0
19,3,0,0,,3,3,3,0
20,4,0,0,,4,4,4,0
21,5,0,0,,5,5,5,0
22,6,0,0,,6,6,6,0
23,7,0,0,,7,7,7,0
24,8,1,1,,8,8,8,1
25,9,1,1,,9,9,9,1
26,10,1,1,,10,10,10,1
27,11,1,1,,11,11,11,1
28,12,1,1,,12,12,12,1
29,13,1,1,,13,13,13,1
30,14,1,1,,14,14,14,1
31,15,1,1,,15,15,15,1
[root@srvr3320 ~]#
The first column lists the logical CPU number as used by the Linux kernel. The second column lists the physical core number; use this information for vCPU pinning.
CPU pinning ensures that a VM only gets CPU time from a specific CPU or set of CPUs. Pinning is performed on each logical CPU of the guest VM against each core ID in the host system. The CPU pinning information is lost every time the VM instance is shut down or restarted. To avoid entering the pinning information again, update the KVM configuration XML file on the host system.
Use the following steps to update the pinning information in the KVM configuration XML file:
Start virsh.
[root@kujo ~]# virsh
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh #
Edit the VM instance:
virsh # edit <KVM_instance_name>
Search for the vcpu placement attribute.
Make sure the vCPUs are pinned to the correct NUMA node CPUs. Ribbon recommends reserving the first core's siblings on each NUMA node for host processes (do not use them for the VM). Since the PCI device is connected to NUMA node1 (as determined in step 2.b of the NUMA pinning procedure), you must pin the vCPUs of the VM to the CPU siblings in NUMA node1.
Skip the first physical core siblings, 8 and 24, and pin the rest.
<vcpu placement='static' cpuset='9,25,10,26'>4</vcpu>
<cputune>
  <vcpupin vcpu="0" cpuset="9"/>
  <vcpupin vcpu="1" cpuset="25"/>
  <vcpupin vcpu="2" cpuset="10"/>
  <vcpupin vcpu="3" cpuset="26"/>
</cputune>
As the CPU Architecture Example shows, you must pin the cores to their siblings (i.e., the two hyperthreads that come from the same physical core). The second column in the example shows the physical core number.
Note: Because Sub-NUMA Clustering is disabled in the BIOS, each socket represents one NUMA node: socket 0 is NUMA node0 and socket 1 is NUMA node1. Make sure that all the vCPUs are pinned to the same NUMA node and do not cross the NUMA boundary.
Tip: Ensure that no two VM instances have the same physical core affinity. For example, if VM1 is assigned the affinity 9,25,10,26, do not pin any other VM to those cores. To assign CPU pinning to other VMs, use the other available cores on the host, leaving the first two logical cores per NUMA node for the host (as described in Perform Host Pinning). Also assign affinity to all other VM instances running on the same host; otherwise, the VMs without affinity may impact the performance of the VMs that have affinity.
Save and exit the XML file.
:wq
Set the CPU mode of the VM to host-model using a virsh command on the host system. Use the following steps to edit the VM CPU mode:
Start virsh.
virsh
The virsh prompt displays.
Edit the VM instance:
edit <KVM_instance_name>
Search for the cpu mode attribute.
Edit the cpu mode attribute:
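A typical entry looks like the following sketch. The mode is set to host-model, and the topology values are placeholders that must match the topology used when the VM instance was created:

<cpu mode='host-model'>
  <topology sockets='1' cores='2' threads='2'/>
</cpu>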
Tip: Ensure the topology details entered are identical to the topology details set while creating the VM instance. For example, if the topology was set to 1 socket, 2 cores, and 2 threads, enter the same details in this XML file.
Save and exit the XML file.
:wq
Start the VM instance.
start <KVM_instance_name>
This section is applicable only for virt-io based interfaces.
To increase the Transmit Queue Length to 4096:
Start virsh:
Code Block | ||
---|---|---|
| ||
virsh |
The virsh prompt displays.
Identify the available interfaces.
domiflist <VM_instance_name>
The list of active interfaces displays.
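The output resembles the following; the interface names, sources, and MAC addresses shown here are placeholders:

 Interface  Type     Source   Model    MAC
--------------------------------------------------------
 vnet0      network  default  virtio   52:54:00:aa:bb:cc
 macvtap4   direct   ens4f0   virtio   52:54:00:dd:ee:ff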
Increase the Transmit Queue Lengths for the tap interfaces.
ifconfig <interface_name> txqueuelen <length>
The interface_name is the name of the interface you want to change, and length is the new queue length. For example, ifconfig macvtap4 txqueuelen 4096.
Verify the value of the interface length.
ifconfig <interface_name>
Example output:
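(The output below is illustrative; the detail to verify is the txqueuelen value, which should now read 4096.)

macvtap4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 52:54:00:dd:ee:ff  txqueuelen 4096  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        TX packets 0  bytes 0 (0.0 B)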
Modify or create the /etc/udev/rules.d/60-tap.rules file, add the KERNEL rule shown below, and reload the udev rules.
# vim /etc/udev/rules.d/60-tap.rules
KERNEL=="tap*", RUN+="/sbin/ip link set %k txqueuelen 4096"
# udevadm control --reload-rules
Apply the rules to already created interfaces.
# udevadm trigger --attr-match=subsystem=net
Reboot the host.
Kernel same-page merging (KSM) is a technology that finds common memory pages inside a Linux system and merges them to save memory resources. If one of the copies is updated, a new copy is created, so the function is transparent to the processes on the system. For hypervisors, KSM is highly beneficial when multiple guests run the same level of the operating system. However, the scanning process introduces overhead that may cause applications to run slower, which is not desirable.
To turn off KSM in the host:
Deactivate KSM by stopping the ksmtuned and ksm services as shown below. This does not persist across reboots.
# systemctl stop ksm
# systemctl stop ksmtuned
Disable KSM persistently as shown below:
# systemctl disable ksm
# systemctl disable ksmtuned
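To confirm that KSM is no longer merging pages, you can check the kernel's KSM run flag; a value of 0 indicates KSM is stopped:

# cat /sys/kernel/mm/ksm/run
0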
To avoid performance impact on VMs due to host-level Linux services, host pinning isolates physical cores where a guest VM is hosted from physical cores where the Linux host processes/services run.
In this example, physical core 0 (logical CPUs 0 and 16) and physical core 8 (logical CPUs 8 and 24) are reserved for Linux host processes.
Configure the CPUAffinity option in /etc/systemd/system.conf based on the host CPU topology. First, display the topology with the lscpu command:
[root@srvr3320 ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 45
Model name:            Intel(R) Xeon(R) CPU E5-2658 0 @ 2.10GHz
Stepping:              7
CPU MHz:               1782.128
CPU max MHz:           2100.0000
CPU min MHz:           1200.0000
BogoMIPS:              4190.19
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
NUMA node0 CPU(s):     0-7,16-23
NUMA node1 CPU(s):     8-15,24-31
To dedicate the physical CPUs 0 and 8 for host processing, specify CPUAffinity as 0 8 16 24 in the file /etc/systemd/system.conf.
CPUAffinity=0 8 16 24
Restart the system.
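After the restart, one way to spot-check the host pinning is to query the affinity of a host process such as systemd (PID 1); with the example CPUAffinity above, the output looks similar to:

# taskset -cp 1
pid 1's current affinity list: 0,8,16,24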
The <emulatorpin> tag specifies which host physical CPUs the emulator (a subset of a domain, not including vCPUs) is pinned to. The <emulatorpin> tag provides a method of setting a precise affinity for emulator thread processes. As a result, vhost threads run on the same subset of physical CPUs and memory, and thus benefit from cache locality.
<cputune>
  <emulatorpin cpuset="11,27"/>
</cputune>
The <emulatorpin> tag is required to isolate the virtio network traffic onto a different core than the VM vCPUs. This greatly reduces the steal percentage seen inside the VMs.
The number of hugepages is determined by the total memory available on the host. Ribbon recommends configuring 80-90% of total memory as hugepage memory and leaving the rest as normal Linux memory.
Configure the hugepage size as 1G and the number of hugepages by appending the hugepage options (default_hugepagesz, hugepagesz, and hugepages) to the kernel command-line options in /etc/default/grub. In the example below, the host has a total of 256G of memory, of which 200G is configured as hugepages.
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8 crashkernel=auto intel_iommu=on iommu=pt default_hugepagesz=1GB hugepagesz=1G hugepages=200 rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
Regenerate the GRUB2 configuration as shown below:
If your system uses BIOS firmware, issue the command:
# grub2-mkconfig -o /boot/grub2/grub.cfg
If your system uses UEFI firmware, issue the command:
# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
Add the following lines to your instance XML file using virsh edit <instanceName>.
Note: Make sure that the PCI device (SR-IOV), vCPUs, and VM memory come from the same NUMA node. For virtual pkt interfaces, ensure that the vCPUs and memory come from the same NUMA node.
<memory unit='KiB'>33554432</memory>
<currentMemory unit='KiB'>33554432</currentMemory>
<memoryBacking>
  <hugepages>
    <page size='1048576' unit='KiB' nodeset='1'/>
  </hugepages>
</memoryBacking>
Tip: This example pins the VM on NUMA node1. To host a second VM on the other NUMA node, set the appropriate value in nodeset='<NUMA node>'.
Restart the host.
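After the reboot, you can confirm that the hugepages were allocated; the exact lines vary by kernel, and the counts below correspond to the 200 x 1G example above:

# grep -i huge /proc/meminfo
AnonHugePages:         0 kB
HugePages_Total:     200
HugePages_Free:      200
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB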
Obtain the PID of the VM:
ps -eaf | grep qemu | grep -i <vm_name>
Verify VM memory is received from a single NUMA node:
numastat -p <vmpid>
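Output similar to the following indicates that the VM's memory, including its hugepages, is served entirely from NUMA node1; the PID and figures are illustrative:

Per-node process memory usage (in MBs) for PID 24437 (qemu-kvm)
                           Node 0          Node 1           Total
                  --------------- --------------- ---------------
Huge                         0.00        32768.00        32768.00
Heap                         0.00           42.11           42.11
Stack                        0.02            0.56            0.58
Private                      3.15          215.40          218.55
----------------  --------------- --------------- ---------------
Total                        3.17        33026.07        33029.24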
Perform the following steps to disable flow control.
Note: This setting is optional and depends on NIC capability; not all NICs allow you to modify the flow control parameters. If the NIC supports it, disable flow control as described below.
To disable flow control:
Log on to the host system as the root user.
Disable flow control for the interfaces attached to the SWe VM.
ethtool -A <interface name> rx off tx off autoneg off
For example:

ethtool -A p4p3 rx off tx off autoneg off
ethtool -A p4p4 rx off tx off autoneg off
ethtool -A em3 rx off tx off autoneg off
ethtool -A em4 rx off tx off autoneg off
To make the setting persistent:
The network service in CentOS/Red Hat can make the setting persistent. The script /etc/sysconfig/network-scripts/ifup-post checks for the existence of /sbin/ifup-local and, if it exists, runs it with the interface name as a parameter (for example, /sbin/ifup-local eth0).
Steps:
touch /sbin/ifup-local
chmod +x /sbin/ifup-local
chcon --reference /sbin/ifup /sbin/ifup-local
Here is an example of a simple script to apply the same settings to all interfaces (except lo):
#!/bin/bash
if [ -n "$1" ]; then
    if [ "$1" != "lo" ]; then
        /sbin/ethtool -A $1 rx off tx off autoneg off
    fi
fi
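After the interfaces come up, you can confirm the settings with ethtool -a; all three pause parameters should report off (the interface name below is an example):

# ethtool -a p4p3
Pause parameters for p4p3:
Autonegotiate:  off
RX:             off
TX:             off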
Below is an example of how the changes described above fit together in a KVM configuration XML file.
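This is a minimal sketch assembled from the snippets in the previous sections; the VM name is a placeholder, and the OS, disk, interface, and other device definitions that a complete configuration requires are omitted:

<domain type='kvm'>
  <name>SBC-SWe-1</name>   <!-- placeholder VM name -->
  <memory unit='KiB'>33554432</memory>
  <currentMemory unit='KiB'>33554432</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB' nodeset='1'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static' cpuset='9,25,10,26'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='9'/>
    <vcpupin vcpu='1' cpuset='25'/>
    <vcpupin vcpu='2' cpuset='10'/>
    <vcpupin vcpu='3' cpuset='26'/>
    <emulatorpin cpuset='11,27'/>
  </cputune>
  <numatune>
    <memory mode='preferred' nodeset='1'/>
  </numatune>
  <cpu mode='host-model'>
    <topology sockets='1' cores='2' threads='2'/>
  </cpu>
  <!-- os, devices (disks, interfaces, console, and so on) omitted -->
</domain>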
This section applies only to virt-io-based packet interfaces. Virt-IO networking works by sending interrupts on the host core. SBC VM performance can be impacted if frequent processing interruptions occur on any core of the VM. To avoid this, the affinity of the IRQs for a virtio-based packet interface should be different from the cores assigned to the SBC VM.
The /proc/interrupts file lists the number of interrupts per CPU, per I/O device. Each IRQ has an associated "affinity" property, smp_affinity, that defines which CPU cores are allowed to run the interrupt service routine (ISR) for that IRQ. Refer to the distribution guidelines of the host OS for the exact steps to locate and specify the IRQ affinity settings for a device.
External Reference: https://access.redhat.com/solutions/2144921
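As an illustration of the general approach on a Red Hat-based host (the interface name and IRQ number below are placeholders; take the actual values from /proc/interrupts on your system):

# identify the IRQs that service the physical interface backing the virtio/macvtap connection
grep ens4f0 /proc/interrupts

# restrict the ISR for IRQ 98 to host CPUs 11 and 27, away from the SBC VM vCPUs
echo 11,27 > /proc/irq/98/smp_affinity_list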