Difference between revisions of "Documentation/HypervisorTuningGuide"
Revision as of 03:48, 12 November 2015
Contents
- 1 About the Hypervisor Tuning Guide
- 2 Understanding Your Workload
- 3 CPU
- 4 Memory
- 5 Network
- 6 Disk
About the Hypervisor Tuning Guide
The goal of the Hypervisor Tuning Guide (HTG) is to provide cloud operators with detailed instructions and settings to get the best performance out of their hypervisors.
This guide is broken into four major sections:
- CPU
- Memory
- Network
- Disk
Each section has tuning information for the following areas:
- Symptoms of being (CPU, Memory, Network, Disk) bound
- General hardware recommendations
- Operating System configuration
- Hypervisor configuration
- OpenStack configuration
- Instance and Image configuration
- Validation, benchmarking, and reporting
How to Contribute
Simply add your knowledge to this wiki page! The HTG does not yet have a formal documentation repository. It is still very much in its initial stages.
Understanding Your Workload
This section is expected to be the most theoretical, high-level part of the entire guide.
References
- https://docs.mirantis.com/openstack/fuel/fuel-6.1/planning-guide.html#hardware-calculation
CPU
Introduction about CPU.
Symptoms of Being CPU Bound
- Raw CPU utilization above 80%
- Idle percentage below 20%
- When load is very high, the cause is usually disk I/O rather than CPU
- Load can be very tricky to interpret
- Steal time: when it is high inside the guest, it indicates that the hypervisor is busy
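The thresholds above can be checked by hand from the counters Linux exposes in /proc/stat. The following is a minimal sketch, not a monitoring tool: the function names are made up for illustration, the two sample strings stand in for two readings of /proc/stat taken some time apart, and the 20% idle threshold is simply the rule of thumb from the list.

```python
"""Sketch: break down CPU time between two /proc/stat samples."""

# Order of the first eight counters on the aggregate "cpu" line.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Convert two counter snapshots into percentages of elapsed CPU time."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def looks_cpu_bound(pct, idle_threshold=20.0):
    """Rule of thumb from the list above: idle below the threshold."""
    return pct["idle"] < idle_threshold


# Hypothetical samples; on a real host, read /proc/stat twice instead.
SAMPLE_BEFORE = "cpu  100 0 50 800 40 5 5 0 0 0\ncpu0 100 0 50 800 40 5 5 0 0 0"
SAMPLE_AFTER = "cpu  180 0 60 810 40 5 5 0 0 0\ncpu0 180 0 60 810 40 5 5 0 0 0"

pct = cpu_percentages(parse_cpu_line(SAMPLE_BEFORE), parse_cpu_line(SAMPLE_AFTER))
for name in FIELDS:
    print(f"{name:8s} {pct[name]:6.1f}%")
print("CPU bound by the idle rule of thumb:", looks_cpu_bound(pct))
```

Run inside a guest, the same breakdown exposes the steal column; run on the hypervisor, a large iowait share with little user or system time hints that high load comes from disk I/O instead.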
General Hardware Recommendations
- The host-passthrough CPU mode is reported to be faster than host-model or custom
- Warning: with host-passthrough, migrations will be impossible if non-identical compute nodes are added later
Hyperthreading
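- Virtual router applications perform better with HT turned off (possibly true of network-specific workloads in general?)
- Thread policies (prefer/avoid) can also be important; hopefully a Mitaka enhancement
- NUMA? See http://docs.openstack.org/developer/nova/testing/libvirt-numa.html
- CPU pinning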
Notable CPU flags
- Nested virtualization support, for running virtualization within a guest
- May have issues with older kernel versions: nested VMs have been seen to lock up
Operating System Configuration
Linux
- Exclude cores from general scheduling and dedicate cores / CPUs to specific OS tasks
- Use the isolcpus kernel boot parameter for this
- See the Red Hat blog post referenced below
- Compiling your own kernel can give a reasonable performance increase
- Turn off CPU frequency scaling and run at full frequency
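Two of the Linux items above can be verified on a running host: which cores were isolated at boot, and whether frequency scaling is still active. The sketch below assumes the standard `isolcpus=` kernel parameter and the cpufreq sysfs files under /sys/devices/system/cpu; the function names are hypothetical, and treating the "performance" governor as "running at full frequency" is an assumption, since it is only one common way to disable scaling.

```python
"""Sketch: inspect isolated cores and CPU frequency governors."""
import glob
import os
import re


def parse_cpu_list(spec):
    """Expand a kernel CPU list such as '2-5,8' into a set of CPU numbers."""
    cpus = set()
    for token in spec.split(","):
        match = re.fullmatch(r"(\d+)(?:-(\d+))?", token.strip())
        if match is None:
            continue  # skip non-numeric tokens such as isolcpus flags
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) is not None else start
        cpus.update(range(start, end + 1))
    return cpus


def isolated_cpus(cmdline):
    """Return the CPUs named by isolcpus= in a kernel command line string."""
    for arg in cmdline.split():
        if arg.startswith("isolcpus="):
            return parse_cpu_list(arg[len("isolcpus="):])
    return set()


def governors(sysfs_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. 'cpu0') to its scaling governor."""
    result = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in glob.glob(pattern):
        cpu = os.path.basename(os.path.dirname(os.path.dirname(path)))
        with open(path) as handle:
            result[cpu] = handle.read().strip()
    return result


# Hypothetical command line; on a real host, read /proc/cmdline instead.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 isolcpus=2-5,8 quiet"
print("isolated:", sorted(isolated_cpus(example)))

# Returns an empty dict where cpufreq is unavailable (e.g. many VMs).
scaling = {cpu: gov for cpu, gov in governors().items() if gov != "performance"}
print("CPUs still using a scaling governor:", sorted(scaling))
```

On hosts without cpufreq support the second check prints an empty list, which says nothing about the actual frequency policy; BIOS-level power settings are outside what this sketch can see.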
Windows
- virtio drivers
Hypervisor Configuration
KVM / libvirt
Xen
VMware
Hyper-V
- NUMA spanning is enabled by default and should be disabled for performance; there is a caveat when restarting instances
OpenStack Configuration
CPU Overcommit
- Overcommitting CPUs is generally safe. The main reason reported for not overcommitting CPU is that memory is not being overcommitted either.
- RAM overcommit, particularly with KSM, has a CPU cost as well
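One way to read the first point is that when memory is not overcommitted, a host usually runs out of RAM long before a generous CPU ratio matters. The sketch below does that arithmetic using Nova's `cpu_allocation_ratio` and `ram_allocation_ratio` settings; the host size, flavor size, and function name are invented for illustration, and the calculation ignores reserved host memory and any scheduler filters.

```python
"""Sketch: which resource limits instance density under overcommit."""


def instances_that_fit(host_cpus, host_ram_mb, flavor_vcpus, flavor_ram_mb,
                       cpu_allocation_ratio, ram_allocation_ratio):
    """Count how many instances of one flavor fit by CPU and by RAM."""
    by_cpu = int(host_cpus * cpu_allocation_ratio) // flavor_vcpus
    by_ram = int(host_ram_mb * ram_allocation_ratio) // flavor_ram_mb
    limit = "cpu" if by_cpu < by_ram else "ram"
    return {"by_cpu": by_cpu, "by_ram": by_ram, "limit": limit}


# Hypothetical host: 32 logical CPUs, 256 GiB RAM; flavor: 2 vCPUs, 8 GiB.
overcommitted = instances_that_fit(32, 262144, 2, 8192,
                                   cpu_allocation_ratio=16.0,
                                   ram_allocation_ratio=1.0)
no_overcommit = instances_that_fit(32, 262144, 2, 8192,
                                   cpu_allocation_ratio=1.0,
                                   ram_allocation_ratio=1.0)
print("16:1 CPU, 1:1 RAM:", overcommitted)
print(" 1:1 CPU, 1:1 RAM:", no_overcommit)
```

In this made-up case the 16:1 CPU ratio never comes into play because RAM caps the host at 32 instances, which corresponds to only a 2:1 effective CPU overcommit.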
Instance and Image Configuration
- CPU quotas and shares
- Reported use case: a default CPU quota of 80% on all flavors. Avoid this if workloads are very CPU-heavy.
- Hyper-V enlightenment features
- Hyper-V generation 2 VMs are seen to be faster than generation 1 (reason?)
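For the CPU quota item above, the following sketch shows how an 80% cap could be expressed with the `quota:cpu_period` and `quota:cpu_quota` flavor extra specs that Nova's libvirt driver understands. It assumes that "80%" means each vCPU may use at most 80% of one host CPU, which the original report does not spell out; the helper name and the 100 ms period are illustrative choices.

```python
"""Sketch: derive flavor extra specs for a per-vCPU CPU cap."""


def cpu_quota_extra_specs(percent_per_vcpu, period_us=100000):
    """Return quota extra specs capping each vCPU at the given percentage."""
    if not 0 < percent_per_vcpu <= 100:
        raise ValueError("percent_per_vcpu must be in (0, 100]")
    if not 1000 <= period_us <= 1000000:
        raise ValueError("period_us must be between 1000 and 1000000")
    quota_us = int(period_us * percent_per_vcpu / 100)
    if quota_us < 1000:
        raise ValueError("resulting quota is below the 1000 microsecond minimum")
    return {
        "quota:cpu_period": str(period_us),
        "quota:cpu_quota": str(quota_us),
    }


# Assumption: an 80% cap on every flavor, as in the reported use case.
for key, value in cpu_quota_extra_specs(80).items():
    print(f"{key}={value}")
```

The resulting key/value pairs would be set on each flavor. For very CPU-heavy workloads the list above advises against applying such a cap at all.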
Validation, Benchmarking, and Reporting
General Tools
- top
- vmstat
- htop
Benchmarking Tools
- Phoronix Test Suite
Metrics
- System: user, system, iowait, irq, softirq
- Per-instance (nova diagnostics)
- Overlay consumed cputime against allocated CPU
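Overlaying cputime against allocated CPU comes down to comparing how much CPU time an instance consumed over an interval with how much it could have consumed given its vCPU count. The sketch below assumes cumulative CPU-time counters in nanoseconds, as the libvirt driver is understood to report them through nova diagnostics; the function name and sample numbers are made up.

```python
"""Sketch: instance CPU utilization relative to its allocated vCPUs."""


def instance_cpu_utilization(cputime_before_ns, cputime_after_ns,
                             interval_s, vcpus):
    """Percentage of allocated CPU time actually used during the interval."""
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    used_s = (cputime_after_ns - cputime_before_ns) / 1e9
    return 100.0 * used_s / (interval_s * vcpus)


# Hypothetical samples: 30 s of CPU time consumed in 60 s by a 2-vCPU instance.
before_ns = 120_000_000_000
after_ns = 150_000_000_000
print(f"{instance_cpu_utilization(before_ns, after_ns, 60, 2):.1f}% of allocation")
```

Plotting this percentage per instance over time shows which instances actually use their allocation, which helps when choosing overcommit ratios.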