Difference between revisions of "Documentation/HypervisorTuningGuide"

Line 113: Line 113:
  
 
=== Symptoms of Being Memory Bound ===
 
=== Symptoms of Being Memory Bound ===
 +
** OOM Killer
 +
** Out of swap
 +
 
=== General Hardware Recommendations ===
 
=== General Hardware Recommendations ===
 +
* ensure numa distribution is balanced
 +
* memory speeds, vary by chip
 +
 
=== Operating System Configuration ===
 
=== Operating System Configuration ===
 
==== Linux ====
 
==== Linux ====
 +
* Applicable kernel tunables
 +
** Transparent Hugepages can go either way depending on workload
 +
* KSM
 +
** Might often cause performance (CPU) problems, better to turn it off
 
==== Windows ====
 
==== Windows ====
  
 
=== Hypervisor Configuration ===
 
=== Hypervisor Configuration ===
 
==== KVM / libvirt ====
 
==== KVM / libvirt ====
 +
* nova enables ballooning but doesn't actually use it
 +
** nova would need something doing the equivalent of MOM in oVirt to "exercise" the balloon:
 +
** http://www.ovirt.org/MoM
 +
* reserved_host_memory_mb (defaults: 512 mb which is too low for the real world)
 +
* Turn on/off EPT (see blog post)
 +
 
==== Xen ====
 
==== Xen ====
 
==== VMWare ====
 
==== VMWare ====
 
==== Hyper-V ====
 
==== Hyper-V ====
 +
 
=== OpenStack Configuration ===
 
=== OpenStack Configuration ===
 +
* Memory Overcommit & the cost of swapping
 
=== Instance and Image Configuration ===
 
=== Instance and Image Configuration ===
 +
* ensure ballooning is enabled / available
 +
* guests cannot see memory speed - not exposed like cpu flags are
 
=== Validation, Benchmarking, and Reporting ===
 
=== Validation, Benchmarking, and Reporting ===
 +
 +
==== General Tools ====
 +
* free
 +
 +
==== Benchmarking ====
 +
* stream
 +
 +
==== Metrics ====
 +
 +
* System
 +
** page in, page out, page scans per second, `free`
 +
* Per-Instance
 +
** nova diagnostics
 +
** virsh
  
 
== Network ==
 
== Network ==
  
 
=== Symptoms of Being Network Bound ===
 
=== Symptoms of Being Network Bound ===
 +
* from guest: soft irq will be high
 +
* high io wait for network-based instance disk
 +
* discards on switch
 +
 
=== General Hardware Recommendations ===
 
=== General Hardware Recommendations ===
 +
* Bonding
 +
** LACP vs balance-tlb vs balance-alb
 +
* VXLAN offload
 +
 
=== Operating System Configuration ===
 
=== Operating System Configuration ===
 
==== Linux ====
 
==== Linux ====
 +
* pin send/recv to specific cores
 +
* ip forwarding: disable GRO on kernel module (nic driver)
 +
 +
* Kernel Tunables
 +
** net.ipv4.tcp_keepalive_time, net.core.somaxconn, net.nf_conntrack_max
 +
** Different queue algos: FQ_CODEL, etc
 +
 +
* PCI Passthrough
 +
* SR-IOV?
 +
** NUMA locality of SR-IOV (and passthrough) devices (pretty much get this for free if you are using NUMATopologyFilter and have a chipset that has locality)
 +
* Jumbo frames? 9000 MTU https://paste.fedoraproject.org/284011/14459359/ - for VLANs - source https://access.redhat.com/solutions/1417133
 +
 
==== Windows ====
 
==== Windows ====
  
 
=== Hypervisor Configuration ===
 
=== Hypervisor Configuration ===
 
==== KVM / libvirt ====
 
==== KVM / libvirt ====
 +
* vhost-net (on by default on most modern distros?)
 +
* virtio
 +
** virtio multiqueue
 +
* ovs acceleration (dpdk)
 
==== Xen ====
 
==== Xen ====
 
==== VMWare ====
 
==== VMWare ====
Line 144: Line 202:
 
=== OpenStack Configuration ===
 
=== OpenStack Configuration ===
 
=== Instance and Image Configuration ===
 
=== Instance and Image Configuration ===
 +
* PCI pass-through
 +
* Network IO quotas and shares
 +
** not advanced enough
 +
** instead, using libvirt hooks
 +
* 1500 MTU
 +
* Make sure the instance is actually using vhost-net (load the kernel module)
 +
 
=== Validation, Benchmarking, and Reporting ===
 
=== Validation, Benchmarking, and Reporting ===
 +
 +
==== General Tools ====
 +
* iftop
 +
 +
==== Benchmarking ====
 +
* iperf
 +
 +
==== Metrics ====
 +
* System
 +
** bytes in/out, packets in/out, irqs, pps
 +
** /proc/net/protocols
 +
* Per-Instance
 +
** nova diagnostics
 +
** virsh
 +
** virtual nic stats
  
 
== Disk ==
 
== Disk ==
Line 163: Line 243:
 
=== Instance and Image Configuration ===
 
=== Instance and Image Configuration ===
 
=== Validation, Benchmarking, and Reporting ===
 
=== Validation, Benchmarking, and Reporting ===
 +
 +
== References ==
 +
 +
* RedHat guides from Steve Gordon
 +
** http://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/
 +
** http://redhatstackblog.redhat.com/2015/09/15/driving-in-the-fast-lane-huge-page-support-in-openstack-compute/
 +
 +
* Docs from distributions
 +
** KVM
 +
*** https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html-single/Virtualization_Tuning_and_Optimization_Guide/index.html
 +
** Xen
 +
*** http://wiki.xenproject.org/wiki/Tuning_Xen_for_Performance
 +
 +
* CERN Tuning for high throughput computing
 +
** http://openstack-in-production.blogspot.fr/2015/09/ept-huge-pages-and-benchmarking.html
 +
** http://openstack-in-production.blogspot.fr/2015/08/numa-and-cpu-pinning-in-high-throughput.html
 +
** http://openstack-in-production.blogspot.fr/2015/08/ept-and-ksm-for-high-throughput.html
 +
** http://openstack-in-production.blogspot.fr/2015/08/cpu-model-selection-for-high-throughput.html
 +
** http://openstack-in-production.blogspot.fr/2015/08/openstack-cpu-topology-for-high.html
 +
 +
* Previous Etherpads
 +
** https://etherpad.openstack.org/p/YVR-ops-hypervisor-tuning
 +
** https://etherpad.openstack.org/p/PAO-ops-hypervisor-tuning

Revision as of 03:59, 12 November 2015

About the Hypervisor Tuning Guide

The goal of the Hypervisor Tuning Guide (HTG) is to provide cloud operators with detailed instructions and settings to get the best performance out of their hypervisors.

This guide is broken into four major sections:

  • CPU
  • Memory
  • Network
  • Disk

Each section has tuning information for the following areas:

  • Symptoms of being (CPU, Memory, Network, Disk) bound
  • General hardware recommendations
  • Operating System configuration
  • Hypervisor configuration
  • OpenStack configuration
  • Instance and Image configuration
  • Validation, benchmarking, and reporting

How to Contribute

Simply add your knowledge to this wiki page! The HTG does not yet have a formal documentation repository; it is still very much in its initial stages.

Understanding Your Workload

I imagine this section to be the most theoretical / high level out of the entire guide.

References

CPU

Introduction about CPU.

Symptoms of Being CPU Bound

  • Raw CPU utilization is above 80%
  • Idle percentage is below 20%
  • When load average is very high, the cause is usually disk I/O rather than CPU
  • Load average can be tricky to interpret
  • Steal time: a high value inside the guest indicates that the hypervisor is busy
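
The idle and steal percentages mentioned above can be derived from two samples of the aggregate `cpu` line in `/proc/stat`. The following is a minimal sketch, assuming a Linux host or guest; the function names are illustrative and not part of any existing tool.

```python
# Field order of the aggregate "cpu" line in /proc/stat (first eight counters).
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Return the first eight tick counters from the aggregate 'cpu' line."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in parts[1:1 + len(FIELDS)]]


def cpu_percentages(before, after):
    """Percentage of time spent in each state between two samples."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {name: 100.0 * delta / total for name, delta in zip(FIELDS, deltas)}


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate 'cpu' line (the first line of /proc/stat)."""
    with open(path) as handle:
        return handle.readline()
```

Take one sample with `read_cpu_line()`, wait a few seconds, take another, and pass both to `cpu_percentages()`. An `idle` value under 20 or a noticeably high `steal` value matches the symptoms listed above.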

General Hardware Recommendations

  • host-passthrough is always faster than host-model or custom
    • Warning: migrations will be impossible if non-identical compute nodes are added later

Hyperthreading

Notable CPU flags

  • Nested virtualization (running VMs within a guest)
    • May have issues with older kernel versions: nested VMs have been seen to lock up

Operating System Configuration

Linux

  • Exclude cores from general scheduling and dedicate cores / CPUs to specific OS tasks
    • isolcpus kernel parameter
    • See the Red Hat blog posts listed under References
  • Compiling your own kernel can give a reasonable increase in performance
  • Turn off CPU frequency scaling so that cores run at full frequency
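
To confirm which cores are actually isolated, the kernel command line can be inspected. The sketch below assumes the `isolcpus=` boot parameter is the mechanism in use; the helper names are hypothetical. Entries that do not start with a digit are skipped on the assumption that they are option flags rather than CPU numbers.

```python
def parse_cpu_list(spec):
    """Expand a kernel CPU list such as '2-5,8' into a set of CPU numbers."""
    cpus = set()
    for part in spec.strip().split(","):
        # Skip empty entries and non-numeric flags (assumed not to be CPU ids).
        if not part or not part[0].isdigit():
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def isolated_cpus(cmdline):
    """Return the CPUs named by isolcpus= in a kernel command line string."""
    prefix = "isolcpus="
    for token in cmdline.split():
        if token.startswith(prefix):
            return parse_cpu_list(token[len(prefix):])
    return set()


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line."""
    with open(path) as handle:
        return handle.read()
```

On a running host, `isolated_cpus(read_cmdline())` returns an empty set when no cores are isolated.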

Windows

  • virtio drivers

Hypervisor Configuration

KVM / libvirt

Xen

VMWare

Hyper-V

  • NUMA spanning is enabled by default and should be disabled for performance; note that there is a caveat around restarting instances

OpenStack Configuration

CPU Overcommit

  • Generally, it's safe to overcommit CPUs. It has been reported that the main reason operators end up not overcommitting CPU is that they do not overcommit memory, so memory runs out first.
  • RAM overcommit, particularly with KSM, has a CPU cost as well
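
The interaction between CPU and memory overcommit can be illustrated with some simple arithmetic. This is a simplified model, not nova's actual scheduling logic; the ratios stand in for nova's `cpu_allocation_ratio` and `ram_allocation_ratio` options, and the host and flavor sizes are made-up example values.

```python
def max_instances(physical_cpus, host_ram_mb, cpu_ratio, ram_ratio,
                  flavor_vcpus, flavor_ram_mb):
    """Return (instance_count, limiting_resource) for one flavor on one host.

    Simplified model: capacity is the physical amount times the overcommit
    ratio, and whichever resource fits fewer instances is the limit.
    """
    by_cpu = int(physical_cpus * cpu_ratio) // flavor_vcpus
    by_ram = int(host_ram_mb * ram_ratio) // flavor_ram_mb
    limiting = "cpu" if by_cpu < by_ram else "ram"
    return min(by_cpu, by_ram), limiting


if __name__ == "__main__":
    # Example: 32 cores, 256 GB RAM, flavor with 2 vCPUs and 4 GB RAM.
    print(max_instances(32, 262144, 16.0, 1.0, 2, 4096))
    print(max_instances(32, 262144, 1.0, 1.0, 2, 4096))
```

With memory not overcommitted, the first example host fills up on RAM long before the CPU ratio is reached, which is the effect described in the first bullet above.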

Instance and Image Configuration

  • CPU quotas and shares
    • Reported use case: a default quota of 80% on all flavors. Do not do this if workloads are very CPU-heavy.
  • Hyper-V enlightenment features
  • Hyper-V generation 2 VMs have been observed to be faster than generation 1; the reason is not yet clear

Validation, Benchmarking, and Reporting

General Tools

  • top
  • vmstat
  • htop

Benchmarking Tools

  • Phoronix Test Suite

Metrics

  • System: user, system, iowait, irq, soft irq
  • Per-instance (nova diagnostics)
    • overlaying cputime vs allocated cpu

Memory

Symptoms of Being Memory Bound

  • OOM Killer
  • Out of swap

General Hardware Recommendations

  • Ensure memory is distributed evenly across NUMA nodes
  • Memory speeds vary by chip

Operating System Configuration

Linux

  • Applicable kernel tunables
    • Transparent Hugepages can go either way depending on workload
  • KSM
    • Often causes performance (CPU) problems; it is usually better to turn it off
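
Before changing either setting, it helps to check the current state. The sketch below reads the standard sysfs files for Transparent Hugepages and KSM; the helper names are illustrative, and the files may be absent on kernels built without these features.

```python
THP_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"
KSM_PATH = "/sys/kernel/mm/ksm/run"


def active_choice(text):
    """Extract the bracketed option, e.g. 'always [madvise] never' -> 'madvise'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no bracketed option found")


def read_text(path):
    """Return the stripped contents of a small sysfs file."""
    with open(path) as handle:
        return handle.read().strip()


def thp_and_ksm_state(thp_path=THP_PATH, ksm_path=KSM_PATH):
    """Report the active THP mode and whether KSM is currently running."""
    return {
        "thp": active_choice(read_text(thp_path)),
        "ksm_running": read_text(ksm_path) == "1",
    }
```

Because the right THP setting depends on the workload, recording the state alongside benchmark results makes comparisons between runs easier.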

Windows

Hypervisor Configuration

KVM / libvirt

  • nova enables ballooning but doesn't actually use it
  • reserved_host_memory_mb (default: 512 MB, which is too low for real-world use)
  • Turn EPT on or off (see the CERN blog posts listed under References)
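
To see why 512 MB is small, compare it with the memory left for instances on a typical host. This is a simplified model and not nova's exact accounting; the 128 GB host and the 8 GB reservation are illustrative values only.

```python
def schedulable_ram_mb(total_mb, reserved_host_memory_mb=512,
                       ram_allocation_ratio=1.0):
    """Memory available to instances after the host reservation (simplified)."""
    if reserved_host_memory_mb > total_mb:
        raise ValueError("reservation exceeds total memory")
    return int((total_mb - reserved_host_memory_mb) * ram_allocation_ratio)


def reserved_fraction(total_mb, reserved_host_memory_mb=512):
    """Share of host memory kept back for the host OS and hypervisor."""
    return reserved_host_memory_mb / total_mb


if __name__ == "__main__":
    total = 131072  # 128 GB host, example value
    print(schedulable_ram_mb(total), reserved_fraction(total))
    print(schedulable_ram_mb(total, reserved_host_memory_mb=8192))
```

With the default, less than half a percent of a 128 GB host is kept for the host OS, the hypervisor processes, and per-instance overhead.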

Xen

VMWare

Hyper-V

OpenStack Configuration

  • Memory Overcommit & the cost of swapping

Instance and Image Configuration

  • ensure ballooning is enabled / available
  • Guests cannot see memory speed; it is not exposed the way CPU flags are

Validation, Benchmarking, and Reporting

General Tools

  • free

Benchmarking

  • STREAM

Metrics

  • System
    • page in, page out, page scans per second, `free`
  • Per-Instance
    • nova diagnostics
    • virsh
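
The system-level paging metrics can be computed from two snapshots of `/proc/vmstat`. This is a rough sketch, assuming the kernel exposes the `pgpgin`, `pgpgout`, `pswpin`, `pswpout`, `pgscan_kswapd`, and `pgscan_direct` counters; some kernels name the scan counters differently, in which case they are simply counted as zero here.

```python
SCAN_COUNTERS = ("pgscan_kswapd", "pgscan_direct")


def parse_vmstat(text):
    """Parse 'name value' lines from /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def paging_rates(before, after, interval_s):
    """Per-second rates between two parsed snapshots."""
    def rate(name):
        return (after.get(name, 0) - before.get(name, 0)) / interval_s

    return {
        "page_in_per_s": rate("pgpgin"),
        "page_out_per_s": rate("pgpgout"),
        "swap_in_per_s": rate("pswpin"),
        "swap_out_per_s": rate("pswpout"),
        "page_scans_per_s": sum(rate(name) for name in SCAN_COUNTERS),
    }
```

Sustained non-zero swap-out and page-scan rates are an early sign of memory pressure, well before the OOM killer gets involved.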

Network

Symptoms of Being Network Bound

  • From the guest: softirq time will be high
  • High I/O wait when instance disks are network-based
  • Discards on the switch

General Hardware Recommendations

  • Bonding
    • LACP vs balance-tlb vs balance-alb
  • VXLAN offload

Operating System Configuration

Linux

  • pin send/recv to specific cores
  • IP forwarding: disable GRO in the NIC driver (kernel module)
  • Kernel Tunables
    • net.ipv4.tcp_keepalive_time, net.core.somaxconn, net.nf_conntrack_max
    • Different queueing disciplines: fq_codel, etc.
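
The current values of the tunables named above can be read from `/proc/sys`, where each dot in the sysctl name becomes a path separator. A small sketch with illustrative helper names; it does not handle sysctl names that contain dots inside a component, such as VLAN interface names.

```python
TUNABLES = (
    "net.ipv4.tcp_keepalive_time",
    "net.core.somaxconn",
    "net.nf_conntrack_max",
)


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return root + "/" + name.replace(".", "/")


def read_sysctl(name, root="/proc/sys"):
    """Return the value as a string, or None if the key is not present."""
    try:
        with open(sysctl_path(name, root)) as handle:
            return handle.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for tunable in TUNABLES:
        print(tunable, "=", read_sysctl(tunable))
```

A value of `None` for the conntrack key usually just means the connection tracking module is not loaded on that host.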

Windows

Hypervisor Configuration

KVM / libvirt

  • vhost-net (on by default on most modern distros?)
  • virtio
    • virtio multiqueue
  • ovs acceleration (dpdk)

Xen

VMWare

Hyper-V

OpenStack Configuration

Instance and Image Configuration

  • PCI pass-through
  • Network IO quotas and shares
    • The built-in quotas and shares are not advanced enough
    • Operators report using libvirt hooks instead
  • 1500 MTU
  • Make sure the instance is actually using vhost-net (load the kernel module)
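
One way to check whether the vhost-net module is loaded on the hypervisor is to look for `vhost_net` in `/proc/modules`. This is a sketch under the assumption that vhost-net is built as a loadable module; if it is compiled into the kernel it will not appear in that file, so a negative result is not conclusive on its own.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules content."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def vhost_net_loaded(proc_modules_text):
    """True if the vhost_net module appears in the given /proc/modules text."""
    return "vhost_net" in loaded_modules(proc_modules_text)


def read_proc_modules(path="/proc/modules"):
    """Read the list of currently loaded kernel modules."""
    with open(path) as handle:
        return handle.read()
```

On a compute node, `vhost_net_loaded(read_proc_modules())` gives a quick yes or no before digging into individual instance definitions.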

Validation, Benchmarking, and Reporting

General Tools

  • iftop

Benchmarking

  • iperf

Metrics

  • System
    • bytes in/out, packets in/out, irqs, pps
    • /proc/net/protocols
  • Per-Instance
    • nova diagnostics
    • virsh
    • virtual nic stats
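
Bytes, packets, and drops per second for a physical or virtual NIC can be derived from two snapshots of `/proc/net/dev`. A minimal sketch, assuming the usual layout of two header lines followed by sixteen counters per interface; the function names are illustrative.

```python
def parse_net_dev(text):
    """Parse /proc/net/dev content into {interface: {counter: value}}."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = [int(value) for value in data.split()]
        stats[name.strip()] = {
            "rx_bytes": fields[0],
            "rx_packets": fields[1],
            "rx_drop": fields[3],
            "tx_bytes": fields[8],
            "tx_packets": fields[9],
            "tx_drop": fields[11],
        }
    return stats


def interface_rates(before, after, interval_s):
    """Per-second rates for one interface between two snapshots."""
    return {key: (after[key] - before[key]) / interval_s for key in after}
```

Applied to an instance's tap device on the hypervisor, the same calculation gives per-instance packet rates without logging in to the guest.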

Disk

Symptoms of Being Disk Bound

General Hardware Recommendations

Operating System Configuration

Linux

Windows

Hypervisor Configuration

KVM / libvirt

Xen

VMWare

Hyper-V

OpenStack Configuration

Instance and Image Configuration

Validation, Benchmarking, and Reporting

References