OVS-DPDK Virtio Interfaces - Performance Tuning Recommendations
Follow the OpenStack recommended performance settings for host and guest. Refer to VNF Performance Tuning for details.
Make sure that the physical network adapters, Poll Mode Driver (PMD) threads, and pinned CPUs for the instance are all on the same NUMA node. This is mandatory for optimal performance.
PMD threads are the threads that do the heavy lifting for userspace switching. They perform tasks such as continuous polling of input ports for packets, classifying packets once received, and executing actions on the packets once they are classified.
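One way to confirm the NUMA placement described above is to read it from sysfs on the host. The following is a minimal sketch, assuming a Linux host that exposes the standard sysfs layout. The helper names, the SYSFS_ROOT override, and the PCI address in the usage comment are illustrative assumptions, not values from this deployment.

```shell
# SYSFS_ROOT is a hypothetical override so the helpers can be pointed at a
# fake tree for testing; it defaults to the real /sys.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# Print the NUMA node of a PCI device (prints -1 when the platform
# does not report NUMA locality for the device).
pci_numa_node() {
    cat "$SYSFS_ROOT/bus/pci/devices/$1/numa_node"
}

# Print the NUMA node of a logical CPU by looking for its nodeN entry.
cpu_numa_node() {
    for link in "$SYSFS_ROOT/devices/system/cpu/cpu$1"/node*; do
        if [ -e "$link" ]; then
            basename "$link" | sed 's/^node//'
            return 0
        fi
    done
    return 1
}

# Example usage (the PCI address and CPU number are placeholders):
#   pci_numa_node 0000:05:00.0
#   cpu_numa_node 8
```

If the two helpers print different node numbers for the adapter and for a CPU intended for a PMD thread or the guest, the placement does not meet the same-node requirement.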
- Set the queue size for virtio interfaces to 1024 by updating the Director template.
NovaComputeExtraConfig: - nova::compute::libvirt::tx_queue_size: '"1024"'
NovaComputeExtraConfig: - nova::compute::libvirt::rx_queue_size: '"1024"'
- Configure the following dpdk parameters in host ovs-dpdk:
- Make sure two pairs of Rx/Tx queues are configured for host dpdk interfaces.
To validate, issue the following command during ovs-dpdk bring-up:
ovs-vsctl get Interface dpdk0 options
For background details, see http://docs.openvswitch.org/en/latest/howto/dpdk/
- Enable per-port memory, which means each port uses a separate mem-pool for receiving packets instead of the default shared mem-pool:
ovs-vsctl set Open_vSwitch . other_config:per-port-memory=true
- Configure 4096 MB of huge page memory on each socket:
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=4096,4096
- Spawn enough pmd threads that each port/queue can be serviced by a particular pmd thread. Pin the pmd threads to dedicated cores/hyper-threads that are on the same NUMA node as the network adapter and the guest, are isolated from the kernel, and are not used by the guest for any other purpose. Set pmd-cpu-mask accordingly:
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x40001004000100
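The example above sets pmd threads to run on four hyper-threads (logical cores 8, 26, 36, and 54) of two physical cores. Cores 8 and 36 are sibling hyper-threads, as are cores 26 and 54. Each set bit in the mask selects one logical core.
- Restart ovs-vswitchd after the changes: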
systemctl status ovs-vswitchd
systemctl restart ovs-vswitchd
- The assignment of ports and Rx queues to pmd threads is crucial for optimal performance. See http://docs.openvswitch.org/en/latest/topics/dpdk/pmd/ for more details. The affinity is a comma-separated list of <queue_id>:<core_id> pairs, which must be set for each port.
ovs-vsctl set interface dpdk0 other_config:pmd-rxq-affinity="0:8,1:26"
ovs-vsctl set interface vhub89b3d58-4f other_config:pmd-rxq-affinity="0:36"
ovs-vsctl set interface vhu6d3f050e-de other_config:pmd-rxq-affinity="1:54"
In the example above, the pmd thread on core 8 reads queue 0 and the pmd thread on core 26 reads queue 1 of the dpdk0 interface.
Alternatively, you can use the default assignment of ports/Rx queues to pmd threads and enable the auto-load-balance option so that OVS reassigns Rx queues among the pmd threads based on load.
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
ovs-appctl dpif-netdev/pmd-rxq-rebalance
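For the manual pinning steps above, the pmd-cpu-mask value and the pmd-rxq-affinity strings have to agree, because a pmd thread can only run on a core whose bit is set in the mask. The following is a minimal sketch of how the mask can be derived from a core list and how an affinity string can be checked against it. cores_to_mask and affinity_in_mask are hypothetical helper names, not OVS commands, and the sketch assumes logical core IDs below 63.

```shell
# Hypothetical helpers (not part of OVS) for checking manual pmd pinning.
# Assumes logical core IDs below 63, the range of 64-bit shell arithmetic.

# Print the pmd-cpu-mask that selects the given logical cores.
cores_to_mask() (
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))   # set the bit at position <core>
    done
    printf '0x%x\n' "$mask"
)

# Succeed only if every core in a pmd-rxq-affinity string
# ("<queue_id>:<core_id>,...") has its bit set in the given mask.
affinity_in_mask() (
    mask=$(( $1 ))
    status=0
    IFS=,
    for pair in $2; do
        core=${pair#*:}                  # keep the part after the colon
        if [ "$(( (mask >> core) & 1 ))" -eq 0 ]; then
            echo "core $core (entry $pair) is not in mask $1"
            status=1
        fi
    done
    exit "$status"
)

cores_to_mask 8 26 36 54
affinity_in_mask 0x40001004000100 "0:8,1:26" && echo "dpdk0 affinity is covered by the mask"
```

For cores 8, 26, 36, and 54 the first call prints 0x40001004000100, which matches the mask used in the example configuration.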
Troubleshooting
- To check the port/Rx queue distribution among pmd threads, enter the command:
ovs-appctl dpif-netdev/pmd-rxq-show
- To check the pmd thread stats (actual CPU usage), use the command below and check the "processing cycles" and "idle cycles" values:
ovs-appctl dpif-netdev/pmd-stats-clear && sleep 10 && ovs-appctl dpif-netdev/pmd-stats-show
- To check packet drops on host dpdk interfaces, use the command below and check the rx_dropped/tx_dropped counters:
watch -n 1 'ovs-vsctl get interface dpdk0 statistics|sed -e "s/,/\n/g" -e "s/[\",\{,\}, ]//g" -e "s/=/ =⇒ /g"'
For additional details on troubleshooting performance issues and packet drops in an ovs-dpdk environment, refer to the following page:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/ovs-dpdk_end_to_end_troubleshooting_guide/validating_an_ovs_dpdk_deployment#find_the_ovs_dpdk_port_physical_nic_mapping_configured_by_os_net_config
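When only the drop counters are of interest, the statistics output can be filtered further. The following is a minimal sketch, assuming the statistics map has the {key=value, ...} form printed by ovs-vsctl get. dropped_counters is a hypothetical helper name, and the sample values in the usage comment are made up for illustration.

```shell
# Read an OVS statistics map on stdin and print only the *_dropped
# counters whose value is greater than zero, one per line.
dropped_counters() {
    tr -d '{}" ' |                 # strip braces, quotes, and spaces
    tr ',' '\n' |                  # one key=value pair per line
    awk -F= '/_dropped=/ && $2 > 0 { print $1 "=" $2 }'
}

# Example usage on the host:
#   ovs-vsctl get interface dpdk0 statistics | dropped_counters
# Example with made-up sample data:
#   echo '{rx_bytes=1000, rx_dropped=0, tx_dropped=5}' | dropped_counters
```

An empty result means no drops were recorded on the interface at the time of the query.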
Benchmarking
Setup details:
- Platform: RHOSP13
- Host OS: RHEL7.5
- Processor: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
- 1 Provider Network configured for Management Interface
- 1 Provider Network configured for HA Interface
- OVS+DPDK enabled for packet interfaces (pkt0 and pkt1)
- 2 pairs of Rx/Tx queues in host dpdk interfaces
- 1 Rx/Tx queue in guest virtio interface
- 4 pmd threads pinned to 4 hyper-threads (i.e., using 2 physical cores)
Guest Details:
- SSBC - 8 vCPU/18 GB RAM/100 GB HDD
- MSBC - 10 vCPU/20 GB RAM/100 GB HDD
Benchmarking was performed on a D-SBC setup with up to 30k pass-through sessions using the recommendations described in this document.
Higher session counts may require additional cores for pmd threads.
External References
https://docs.openvswitch.org/en/latest/howto/dpdk/
https://docs.openvswitch.org/en/latest/topics/dpdk/pmd/