The following sections contain VMware performance tuning recommendations to improve system performance. These performance recommendations are general guidelines and are not all-inclusive.
BIOS Setting Recommendations
Ribbon recommends the following BIOS settings for optimum performance. The following table is intended as a reference. Exact values may vary based on the vendor and hardware.
Recommended BIOS Settings for Optimum Performance:
Parameter | Recommended Setting |
---|---|
CPU power management/ Power Regulator | Maximum performance, or Static High Performance |
Intel Hyper-Threading | Enabled |
Intel Turbo Boost | Enabled |
Intel VT-x (Virtualization Technology) | Enabled |
Thermal Configuration | Optimal Cooling, or Maximum Cooling |
Minimum Processor Idle Power Core C-state | No C-states |
Minimum Processor Idle Power Package C-state | No Package state |
Energy Performance BIAS | Max Performance |
Sub-NUMA Clustering | Disabled |
HW Prefetcher | Disabled |
SRIOV | Enabled |
Intel® VT-d | Enabled |
The BIOS settings shown in the example below are recommended for HP DL380p Gen8 servers. For BIOS settings of other servers, refer to the respective vendor's website.
BIOS Setting Recommendations for HP DL380p Gen8 Server
Parameter | Ribbon Recommended Settings | Default Settings |
---|---|---|
HP Power Profile | Maximum Performance | Balanced Power and Performance |
Thermal Configuration | Maximum Cooling | Optimal Cooling |
HW Prefetchers | Disabled | Enabled |
Adjacent Sector Prefetcher | Disabled | Enabled |
Processor Power and Utilization Monitoring | Disabled | Enabled |
Memory Pre-Failure Notification | Disabled | Enabled |
Memory Refresh Rate | 1x Refresh | 2x Refresh |
Data Direct I/O | Enabled | Disabled |
SR-IOV | Enabled | Disabled |
Intel® VT-d | Enabled | Disabled |
General Recommendations
- Ensure the number of vCPUs in an instance is always an even number (4, 6, 8, and so on), as hyper-threaded vCPUs are used.
- For best performance, confine a single instance to a single NUMA node. Performance degrades if an instance spans multiple NUMA nodes.
- Ensure the physical NICs associated with an instance are connected to the same NUMA node/socket where the instance is hosted. Doing so reduces remote node memory access, which in turn improves performance.
To verify the NUMA affinity of a NIC:
1. Log in to the ESXi host.
2. Check the NICs in use by running the esxcli network nic list command.
3. Find the NUMA affinity of the NIC using the command vsish -e get /net/pNics/<vmnicx>/properties | grep "NUMA".
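The session below is a hypothetical illustration: the vmnic name, driver, and NUMA node value are examples only, and the exact vsish output format varies across ESXi versions.

```
~ # esxcli network nic list
Name    PCI Device    Driver  Admin Status  Link Status  Speed  ...
vmnic0  0000:04:00.0  ixgben  Up            Up           10000  ...
~ # vsish -e get /net/pNics/vmnic0/properties | grep "NUMA"
   Device NUMA Node:0
```

In this example, vmnic0 is attached to NUMA node 0, so a VM using vmnic0 should be confined to node 0.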
ESXi Host Configuration Parameters
Use the VMware vSphere client to configure the following ESXi host configuration parameters on the Advanced Settings page before installing the SBC SWe.
Path: Host > Manage > Advanced Settings
ESXi Advanced Settings
ESXi Parameter | ESXi 6.5 Recommended | ESXi 6.5 Default | ESXi 6.7 Recommended | ESXi 6.7 Default | ESXi 7.0 Recommended | ESXi 7.0 Default
---|---|---|---|---|---|---
Cpu.CoschedCrossCall | 0 | 1 | 0 | 1 | 0 | 1 |
Cpu.CreditAgePeriod | 1000 | 3000 | 1000 | 3000 | 1000 | 3000 |
DataMover.HardwareAcceleratedInit | 0 | 1 | 0 | 1 | 0 | 1 |
DataMover.HardwareAcceleratedMove | 0 | 1 | 0 | 1 | 0 | 1 |
Disk.SchedNumReqOutstanding | n/a | n/a | n/a | n/a | n/a | n/a |
Irq.BestVcpuRouting | 1 | 0 | 1 | 0 | 1 | 0 |
Mem.BalancePeriod | n/a | n/a | n/a | n/a | n/a | n/a |
Mem.SamplePeriod | n/a | n/a | n/a | n/a | n/a | n/a |
Mem.ShareScanGHz | 0 | 4 | 0 | 4 | 0 | 4 |
Mem.VMOverheadGrowthLimit | 0 | 4294967295 | 0 | 4294967295 | 0 | 4294967295 |
Misc.TimerMaxHardPeriod | 2000 | 500000 | 2000 | 500000 | 2000 | 500000 |
Misc.TimerMinHardPeriod | n/a | n/a | n/a | n/a | n/a | n/a |
Net.AllowPT | 1 | 1 | 1 | 1 | 1 | 1 |
Net.MaxNetifRxQueueLen | n/a | n/a | n/a | n/a | n/a | n/a |
Net.MaxNetifTxQueueLen | 1000 | 2000 | 2000 | 2000 | 2000 | 2000 |
Net.NetTxCompletionWorldlet | n/a | n/a | n/a | n/a | n/a | n/a |
Net.NetVMTxType | 1 | 2 | 1 | 2 | 1 | 2 |
Net.NetTxWorldlet | n/a | n/a | n/a | n/a | n/a | n/a
Numa.LTermFairnessInterval | 0 | 5 | 0 | 5 | 0 | 5 |
Numa.MonMigEnable | 0 | 1 | 0 | 1 | 0 | 1 |
Numa.PageMigEnable | 0 | 1 | 0 | 1 | 0 | 1 |
Numa.PreferHT | 1 | 0 | 1 | 0 | 1 | 0 |
Numa.RebalancePeriod | 60000 | 2000 | 60000 | 2000 | 60000 | 2000 |
Numa.SwapInterval | 1 | 3 | 1 | 3 | 1 | 3 |
Numa.SwapLoadEnable | 0 | 1 | 0 | 1 | 0 | 1 |
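These parameters can also be set from the ESXi shell with esxcli. A minimal sketch for one parameter follows, using Numa.PreferHT as the example; the option path is the parameter name with the dot replaced by a slash.

```
# Set Numa.PreferHT to the recommended value of 1
esxcli system settings advanced set -o /Numa/PreferHT -i 1

# Confirm the configured and default values
esxcli system settings advanced list -o /Numa/PreferHT
```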
VM Settings
CPU Settings
To edit CPU, go to Edit instance settings > Virtual Hardware > CPU.
Edit CPU Settings Screen
Recommended CPU Settings
Parameter | Recommended Settings |
---|---|
Cores Per Socket | 1 |
Reservation | Value = (Number of vCPUs x CPU frequency) / 2. For example, if the number of vCPUs associated with the SBC is 32 and the CPU model is Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz, then Value = (32 x 2300) / 2 = 36800 MHz. |
Limit | Unlimited |
Shares | Normal |
Memory Settings
To edit Memory, go to Edit instance settings > Virtual Hardware > Memory.
Recommended Memory Settings
Parameter | Recommended Settings |
---|---|
RAM | As per requirement. |
Reservation | Same as RAM. Check "Reserve all guest memory (All locked)" |
Limit | Unlimited |
Shares | Normal |
Latency Sensitivity Settings
To edit Latency sensitivity, go to Edit instance settings > VM Options > Advanced > Latency sensitivity.
Configure the VM Latency Sensitivity to High, if the ESXi version allows it.
- ESXi 6.5 allows setting latency sensitivity to High even with a hyper-threaded CPU reservation.
- ESXi 6.7 and later do not allow setting latency sensitivity to High without a full CPU core reservation.
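For reference, this GUI setting is stored as a VM configuration parameter. A minimal sketch of the corresponding .vmx entry, assuming High is selected:

```
sched.cpu.latencySensitivity = "high"
```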
NUMA Settings
To edit NUMA settings, go to Edit instance settings > VM Options > Advanced > Configuration Parameters > Edit Configuration.
Configure numa.nodeAffinity based on the NUMA node to which Pkt NICs are attached (as mentioned in General Recommendations). Ensure the VM memory fits in a single NUMA node, so that remote memory access does not happen.
Configure numa.vcpu.preferHT=TRUE. This is required for better cache optimizations. Refer to http://www.staroceans.org/ESXi_VMkernel_NUMA_Constructs.htm for further details.
Configure numa.autosize.once = FALSE.
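A minimal sketch of the resulting entries as they appear under Configuration Parameters, assuming for illustration that the Pkt NICs are attached to NUMA node 0 (adjust the node index for your host):

```
numa.nodeAffinity = "0"
numa.vcpu.preferHT = "TRUE"
numa.autosize.once = "FALSE"
```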
Setting the Scheduling Affinity of a VMware Instance
By specifying a CPU affinity setting for each virtual machine, you can restrict that virtual machine to a subset of the available cores. The VM's vCPUs are then scheduled only on cores in the specified affinity set.
Every virtual machine on the host (not just the SBC instance) should have a dedicated "CPU affinity" set, so that one VM does not interfere with another.
Procedure
Assign both logical cores of a physical core to the "Scheduling affinity" set of the same virtual machine. Logical cores on the same physical core have consecutive CPU numbers: CPUs 0 and 1 are on the first core together, CPUs 2 and 3 are on the second core, and so on.
You can leave some CPU cores out of the affinity sets so that the host OS runs on them.
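For example (a hypothetical layout, assuming the consecutive numbering described above): to dedicate physical cores 2 through 5 to a VM, set its scheduling affinity to logical CPUs 4-11, leaving CPUs 0-3 (physical cores 0 and 1) for the host OS.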
Validation
Use the following ESXi CLI command to validate the "Scheduling affinity" of the VM:
sched-stats -t cpu | grep vcpu
Example output from the above command:
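The listing below is a hypothetical illustration: the VM name, CPU numbers, and affinity sets will differ on your host, and the exact columns and layout of sched-stats output vary by ESXi version (some columns are elided here).

```
   cpu  name               ...  affinity
   4    vcpu-0:SBC-VM01    ...  4-11
   7    vcpu-1:SBC-VM01    ...  4-11
   5    vcpu-2:SBC-VM01    ...  4-11
```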
In the above output:
- Column name: displays the vCPU number and VM name (vCPU number: VM Name).
- Column cpu: displays the logical CPU number where the vCPU thread is presently running.
- Column affinity: displays the "Scheduling affinity" set of the vCPU thread.
VMware does not provide any way for the user to pin a vCPU thread to a particular logical core. The user can only provide a "Scheduling affinity" set for the VM, and the vCPU threads are scheduled within the given CPU affinity set.