This section provides VMware performance tuning recommendations to improve system performance. These recommendations are general guidelines and are not all-inclusive.
BIOS Setting Recommendations
Ribbon recommends the following BIOS settings for optimum performance. The following table is intended as a reference; exact values may vary based on vendor and hardware.
Parameter | Recommended Setting |
---|---|
CPU power management/ Power Regulator | Maximum performance, or Static High Performance |
Intel Hyper-Threading | Enabled |
Intel Turbo Boost | Enabled |
Intel VT-x (Virtualization Technology) | Enabled |
Thermal Configuration | Optimal Cooling, or Maximum Cooling |
Minimum Processor Idle Power Core C-state | No C-states |
Minimum Processor Idle Power Package C-state | No Package state |
Energy Performance BIAS | Max Performance |
Sub-NUMA Clustering | Disabled |
HW Prefetcher | Disabled |
SR-IOV | Enabled |
Intel® VT-d | Enabled |
The BIOS settings shown in the example below are recommended for HP DL380p Gen8 servers. For BIOS settings of other servers, refer to the respective vendor's website.
Parameter | Ribbon Recommended Settings | Default Settings |
---|---|---|
HP Power Profile | Maximum Performance | Balanced Power and Performance |
Thermal Configuration | Maximum Cooling | Optimal Cooling |
HW Prefetchers | Disabled | Enabled |
Adjacent Sector Prefetcher | Disabled | Enabled |
Processor Power and Utilization Monitoring | Disabled | Enabled |
Memory Pre-Failure Notification | Disabled | Enabled |
Memory Refresh Rate | 1x Refresh | 2x Refresh |
Data Direct I/O | Enabled | Disabled |
SR-IOV | Enabled | Disabled |
Intel® VT-d | Enabled | Disabled |
General Recommendations
- Ensure the number of vCPUs in an instance is always an even number (4, 6, 8, and so on), as hyper-threaded vCPUs are used.
- For best performance, confine a single instance to a single NUMA node. Performance degrades if an instance spans multiple NUMA nodes.
- Ensure the physical NICs associated with an instance are connected to the same NUMA node/socket where the instance is hosted. Doing so reduces remote node memory access, which improves performance.
To determine the NUMA affinity of a NIC:
1. Log in to the ESXi host.
2. Identify the NIC in use with the esxcli network nic list command.
3. Find the NUMA affinity of the NIC with the command vsish -e get /net/pNics/<vmnicx>/properties | grep "NUMA".
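The two commands above can be combined into a quick check from the ESXi shell (vmnic0 is an example NIC name; substitute the NIC actually in use on your host):

```shell
# List the physical NICs on the host to identify the NICs in use
esxcli network nic list

# Show the NUMA node affinity of a given NIC (replace vmnic0 as needed)
vsish -e get /net/pNics/vmnic0/properties | grep "NUMA"
```

If the reported NUMA node differs from the node where the instance is hosted, move the instance (or use a NIC on the matching node) to avoid remote memory access.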
ESXi Host Configuration Parameters
Use the VMware vSphere client to configure the following ESXi host configuration parameters on the Advanced Settings page (see figure below) before installing the SBC SWe.
Path: Host > Manage > Advanced Settings
ESXi Parameter | ESXi 6.5 Recommended | ESXi 6.5 Default | ESXi 6.7 Recommended | ESXi 6.7 Default | ESXi 7.0 Recommended | ESXi 7.0 Default |
---|---|---|---|---|---|---|
Cpu.CoschedCrossCall | 0 | 1 | 0 | 1 | 0 | 1 |
Cpu.CreditAgePeriod | 1000 | 3000 | 1000 | 3000 | 1000 | 3000 |
DataMover.HardwareAcceleratedInit | 0 | 1 | 0 | 1 | 0 | 1 |
DataMover.HardwareAcceleratedMove | 0 | 1 | 0 | 1 | 0 | 1 |
Disk.SchedNumReqOutstanding | n/a | n/a | n/a | n/a | n/a | n/a |
Irq.BestVcpuRouting | 1 | 0 | 1 | 0 | 1 | 0 |
Mem.BalancePeriod | n/a | n/a | n/a | n/a | n/a | n/a |
Mem.SamplePeriod | n/a | n/a | n/a | n/a | n/a | n/a |
Mem.ShareScanGHz | 0 | 4 | 0 | 4 | 0 | 4 |
Mem.VMOverheadGrowthLimit | 0 | 4294967295 | 0 | 4294967295 | 0 | 4294967295 |
Misc.TimerMaxHardPeriod | 2000 | 500000 | 2000 | 500000 | 2000 | 500000 |
Misc.TimerMinHardPeriod | n/a | n/a | n/a | n/a | n/a | n/a |
Net.AllowPT | 1 | 1 | 1 | 1 | 1 | 1 |
Net.MaxNetifRxQueueLen | n/a | n/a | n/a | n/a | n/a | n/a |
Net.MaxNetifTxQueueLen | 1000 | 2000 | 2000 | 2000 | 2000 | 2000 |
Net.NetTxCompletionWorldlet | n/a | n/a | n/a | n/a | n/a | n/a |
Net.NetVMTxType | 1 | 2 | 1 | 2 | 1 | 2 |
Net.NetTxWordlet | n/a | n/a | n/a | n/a | n/a | n/a |
Numa.LTermFairnessInterval | 0 | 5 | 0 | 5 | 0 | 5 |
Numa.MonMigEnable | 0 | 1 | 0 | 1 | 0 | 1 |
Numa.PageMigEnable | 0 | 1 | 0 | 1 | 0 | 1 |
Numa.PreferHT | 1 | 0 | 1 | 0 | 1 | 0 |
Numa.RebalancePeriod | 60000 | 2000 | 60000 | 2000 | 60000 | 2000 |
Numa.SwapInterval | 1 | 3 | 1 | 3 | 1 | 3 |
Numa.SwapLoadEnable | 0 | 1 | 0 | 1 | 0 | 1 |
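Assuming shell access to the ESXi host, these advanced parameters can also be inspected and set with esxcli instead of the vSphere client. The option path is the parameter name with the dot replaced by a slash; the values below mirror the recommended column for one example parameter:

```shell
# Apply a recommended value (example: disable CPU co-scheduling cross-calls)
esxcli system settings advanced set -o /Cpu/CoschedCrossCall -i 0

# Review the current and default values of a parameter before/after changing it
esxcli system settings advanced list -o /Numa/PreferHT
```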
VM Settings
CPU Settings
To edit CPU, go to Edit instance settings > Virtual Hardware > CPU.
Recommended CPU Settings
Parameter | Recommended Settings |
---|---|
Cores Per Socket | 1 |
Reservation | Value = (number of vCPUs x CPU frequency) / 2. For example, with 32 vCPUs associated with the SBC and CPU model Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz, Value = 36800 MHz (that is, 32 x 2300 / 2). |
Limit | Unlimited |
Shares | Normal |
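The reservation formula in the table above can be sketched as simple shell arithmetic (the vCPU count and frequency are the example values from the table; substitute your own):

```shell
# CPU reservation (MHz) = (number of vCPUs * CPU base frequency in MHz) / 2
VCPUS=32           # vCPUs assigned to the SBC instance (example value)
FREQ_MHZ=2300      # base frequency of the Intel Xeon Gold 6140 in MHz
echo $(( VCPUS * FREQ_MHZ / 2 ))   # prints 36800
```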
Memory Settings
To edit Memory, go to Edit instance settings > Virtual Hardware > Memory.
Recommended Memory Settings
Parameter | Recommended Settings |
---|---|
RAM | As per requirement. |
Reservation | Same as RAM. Select the "Reserve all guest memory (All locked)" check box. |
Limit | Unlimited |
Shares | Normal |
Latency Sensitivity Settings
To edit Latency sensitivity, go to Edit instance settings > VM Options > Advanced > Latency sensitivity.
Configure the VM Latency Sensitivity to High, if the ESXi version allows it.
- ESXi 6.5 allows configuring latency sensitivity to High even with hyper-threaded CPU reservation.
- ESXi 6.7 and later do not allow configuring latency sensitivity to High without full CPU core reservation.
NUMA Settings
To edit NUMA settings, go to Edit instance settings > VM Options > Advanced > Configuration Parameters > Edit Configuration.
Configure numa.nodeAffinity based on the NUMA node to which Pkt NICs are attached (as mentioned in General Recommendations). Ensure the VM memory fits in a single NUMA node, so that remote memory access does not happen.
Configure numa.vcpu.preferHT=TRUE. This is required for better cache optimizations. Refer to http://www.staroceans.org/ESXi_VMkernel_NUMA_Constructs.htm for further details.
Configure numa.autosize.once = FALSE.
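Taken together, these settings correspond to entries like the following in the VM's configuration parameters (the node number 0 is illustrative; use the NUMA node to which the Pkt NICs are attached, as determined in General Recommendations):

```
numa.nodeAffinity = "0"
numa.vcpu.preferHT = "TRUE"
numa.autosize.once = "FALSE"
```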
Setting the Scheduling Affinity of a VMware Instance
By specifying a CPU affinity setting for each virtual machine, you can restrict that virtual machine to a subset of the available cores, assigning it only to cores in the specified affinity set.
Each virtual machine on the Host instance (not just the SBC instance) must have a dedicated "CPU affinity" set, so that one VM will not interfere with another VM.
Procedure
Both logical cores of a physical core need assignment in the "Scheduling affinity" set of the same virtual machine. Logical cores on the same physical core have consecutive CPU numbers, so that CPUs 0 and 1 are on the first core together, CPUs 2 and 3 are on the second core, and so on.
You can leave some CPU cores unassigned so that the host OS can run on them.
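As a sketch, the affinity for one VM appears in its .vmx configuration under the sched.cpu.affinity key (the CPU numbers below are an example that assigns both logical cores of physical cores 1 and 2, per the pairing rule above):

```
sched.cpu.affinity = "2,3,4,5"
```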
Validation
Use the following ESXi CLI command to validate the "Scheduling affinity" of the VM:
sched-stats -t cpu| grep vcpu
The output of this command contains the following columns:
Column Title | Description |
---|---|
Name | Displays the vCPU number: VM Name |
CPU | Displays the logical CPU number where the vCPU thread is presently running |
Affinity | Displays the "Scheduling Affinity" of the vCPU thread |
VMware does not provide a method to pin a vCPU thread to a particular logical core. You can only provide a "Scheduling affinity" set for the VM; the vCPU thread is then scheduled within the given CPU affinity set.