Nova/ResourcePartitioningSpec

Revision as of 06:46, 19 November 2010

  • Launchpad Entry: NovaSpec:resource-partitioning
  • Created:
  • Contributors:

Summary

Partition system resources such as network bandwidth, CPU power, and possibly even storage I/O bandwidth so that each user receives a reserved set of resources, and all users can simultaneously make full use of their reservations without contention.

  • Limit resources used by virtual guests.
  • Make sure that resources promised to users are always available to them.
  • When working well, benchmark tests run at any time on instances with the same resource limits should give similar results.
  • Usage of this feature is optional.

Release Note

Not only is it possible to limit the resources used by virtual guests, but we can (optionally) guarantee that resources promised to users will always be available to them. Public cloud users will now know for sure whether a virtual guest will be powerful enough for their needs: they will know exactly what they are paying for.

Definition of Terms

  • Resource: a limited resource needed by virtual guests of the cluster (and cloud), for example:
    • RAM,
    • disk space,
    • disk bandwidth for reading/writing,
    • Internet bandwidth (in/out),
    • intranet bandwidth (in/out).
  • Reserved resources: resources that are always available to the instance on the hardware node, in amounts specified in the SLA.
  • Shared resources: resources that any user can use as much as they want.
  • Limits: maximum allowed resource usage; a user cannot use more of that resource.
  • Resource partitioning: users reserve a set of resources on a system, and these reserved resources, regardless of current usage, cannot be used by others.
  • Strict resource partitioning: users are restricted to their reserved set of resources.
  • Loose resource partitioning: users are not restricted to their reserved set of resources, but may also use free (unreserved) resources on the system.
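The difference between strict and loose partitioning can be sketched as a per-resource admission check (a hypothetical illustration only, not existing Nova code; all names are made up):

```python
# Hypothetical admission check for one resource on one host.
# `strict=True` corresponds to strict resource partitioning;
# `strict=False` corresponds to loose partitioning as defined above.

def can_allocate(requested, reserved, used, free_unreserved, strict=True):
    """Return True if `requested` units may be granted to a user who has
    `reserved` units reserved and has already consumed `used` of them."""
    remaining = reserved - used
    if requested <= remaining:
        return True          # fits inside the user's own reservation
    if strict:
        return False         # strict: nothing beyond the reservation
    # loose: the overflow may come from unreserved free capacity
    overflow = requested - max(remaining, 0)
    return overflow <= free_unreserved
```

Under strict partitioning, any request exceeding the remaining reservation is rejected; under loose partitioning the host's unreserved headroom is consulted as well.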

Rationale

Users of a public cloud want to be able to use what they paid for, not less. Some of them do not want to share critical resources with other users; they want their contracted resources available without waiting. Cloud providers want to give users what they paid for, not more.

User stories

Alice has a virtual server. Its Internet connection is shared with 100 other users, and she is not happy with that. She enters into another SLA and now has 50 Mbps reserved for upload and download. Internet traffic from other virtual servers will no longer slow down her connection.

M. Hatter is using virtual servers for mathematical simulations. The host his virtual machine is sitting on has 100 users; therefore, he has quite limited access to CPU. He is willing to pay more to have some CPU power reserved only for him.

Assumptions

  • Live migration needs to be working in order to redistribute load across hosts.
  • We need to be able to limit usage of cluster resources. The mechanism for this differs for every node operating system and even between hypervisors.

Implementation

  • We will need to update (or subclass) the scheduler to take into account not only actual load but also reserved amounts of resources.
  • On Linux nodes with KVM/QEMU/UML, the most natural tool for enforcing limits is cgroups.
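As a rough illustration of the cgroups approach, a guest's reserved CPU fraction could be mapped to a cpu.shares value (a minimal sketch assuming cgroups v1; the mount path and scaling policy here are assumptions, not an agreed implementation):

```python
# Sketch only: map a guest's reserved CPU fraction to a cgroup v1
# cpu.shares value. CGROUP_ROOT and the scaling factor are assumptions.
CGROUP_ROOT = "/sys/fs/cgroup/cpu"
DEFAULT_SHARES = 1024                 # kernel default value of cpu.shares

def shares_for_reservation(fraction, total_shares=10 * DEFAULT_SHARES):
    """Scale a reserved CPU fraction in (0.0, 1.0] into a cpu.shares value."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return max(2, int(total_shares * fraction))   # 2 is the kernel minimum

def apply_cpu_shares(instance_id, fraction):
    """Write cpu.shares for the instance's cgroup (requires root)."""
    path = "%s/%s/cpu.shares" % (CGROUP_ROOT, instance_id)
    with open(path, "w") as f:
        f.write(str(shares_for_reservation(fraction)))
```

Note that cpu.shares only gives proportional weighting under contention, not a hard guarantee; a true reservation additionally needs admission control in the scheduler so that the sum of reservations never exceeds host capacity.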

Test/Demo Plan

When working well, running benchmarks on the same class of virtual guests should give similar results at any time. Other virtual guests should not be able to affect the availability of contracted resources for a virtual guest.
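One way to make "similar results" testable is to require all benchmark samples to lie within a fixed tolerance of their mean (a hypothetical helper; the 10% default tolerance is an arbitrary assumption):

```python
def results_are_consistent(samples, tolerance=0.10):
    """True if every benchmark sample lies within `tolerance` (relative)
    of the mean of all samples."""
    mean = sum(samples) / len(samples)
    return all(abs(s - mean) <= tolerance * mean for s in samples)
```

Such a check could be run against benchmarks taken while other guests on the host are idle and while they are fully loaded; a reserved resource should pass in both cases.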

Unresolved issues

How do we limit disk I/O in a standard way? Work is underway to get this into the Linux kernel so that it can be configured using cgroups.

What units should we use for CPU power? How do we measure it?

BoF agenda and discussion

We are using an Etherpad page for discussion.