Nova/ResourcePartitioningSpec
== Summary ==
Partition system resources such as network bandwidth, CPU power, and possibly even storage I/O bandwidth so that each user receives a configured amount, and all users on a given system can use their full share simultaneously without interference.

* Limit the resources used by virtual guests.
* Make sure that resources promised to users are always available to them.
* When working correctly, benchmark tests run at any time on instances with the same resource limits should give similar results.
* Use of this feature is optional.
  
 
== Release Note ==
This section should include a paragraph describing the end-user impact of this change.  It is meant to be included in the release notes of the first release in which it is implemented.  (Not all of these will actually be included in the release notes, at the release manager's discretion; but writing them is a useful exercise.)
Not only is it possible to limit the resources used by virtual guests, but we can (optionally) make sure that resources promised to users are always available to them. Public cloud users will then know for sure whether a virtual guest will be powerful enough for their needs; they will know exactly what they are paying for.
 
== Definition of Terms ==
This section is mandatory.
* '''Resource:''' a limited resource needed by virtual guests, for example RAM, disk space, disk read/write bandwidth, Internet bandwidth (in/out), or intranet bandwidth (in/out).
* '''Reserved resources:''' resources that are always available to the instance on the hardware node, in the amounts specified in the SLA.
* '''Shared resources:''' resources any user may use as much as they want.
* '''Limits:''' the maximum allowed resource usage; a user cannot use more of that resource.
* '''Resource partitioning:''' users reserve a set of resources on a system, and these reserved resources, regardless of current usage, cannot be used by others.
* '''Strict resource partitioning:''' users are restricted to their reserved set of resources.
* '''Loose resource partitioning:''' users are not restricted to their reserved set of resources, but may also use free (unreserved) resources on the system.
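As an illustration, the admission and usage rules implied by these definitions can be sketched in Python. This is a hypothetical toy model, not Nova code; the names `Host`, `reserve`, and `allowed_usage` are assumptions made for the example.

```python
class Host:
    """Toy model of a hardware node with one partitionable resource."""

    def __init__(self, capacity):
        self.capacity = capacity      # total amount of the resource on the node
        self.reserved = {}            # user -> amount reserved for that user

    def reserve(self, user, amount):
        """Reserve 'amount' for 'user'; refuse if total reservations would exceed capacity."""
        if sum(self.reserved.values()) + amount > self.capacity:
            return False              # cannot promise what the node does not have
        self.reserved[user] = self.reserved.get(user, 0) + amount
        return True

    def allowed_usage(self, user, strict=True):
        """Strict: only the reserved share. Loose: reserved share plus unreserved capacity."""
        own = self.reserved.get(user, 0)
        if strict:
            return own
        free = self.capacity - sum(self.reserved.values())
        return own + free
```

For example, on a 100 Mbps link with 50 Mbps reserved for one user, strict partitioning caps that user at 50 Mbps, while loose partitioning lets them also use whatever remains unreserved.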
  
 
== Rationale ==
'''Users of a public cloud''' want to be able to use what they paid for, and no less. Some of them do not want to share critical resources with other users; they want their contracted resources available without waiting. '''Cloud providers''' want to give users what they paid for, and no more.
 
== User stories ==
Alice has a virtual server. Its Internet connection is shared with 100 other users, and she is not happy with that. She enters into another SLA, now having 50 Mbps reserved for upload to and download from the Internet. Internet traffic from other virtual servers will no longer lower the speed of her connection.

M. Hatter uses virtual servers for mathematical simulations. The host his virtual machine runs on has 100 users, so he has quite limited access to the CPU. He is willing to pay more to have some CPU power reserved only for him.
 
== Assumptions ==

* Live migration needs to be working in order to redistribute load across hosts.
* We need to be able to limit resource usage for cluster resources. This differs for every node operating system and even between hypervisors.

== Design ==

You can have subsections that better describe specific parts of the issue.
  
 
== Implementation ==
This section should describe a plan of action (the "how") to implement the changes discussed. Could include subsections like:
* We will need to update (or subclass) the scheduler to take into account not only the actual load, but also the reserved amounts of resources.

* On Linux nodes with KVM/QEMU/UML, the most natural tool for enforcing limits is cgroups.
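A minimal sketch of the first bullet — placing an instance by reservations rather than by current load alone. This is not the actual nova-scheduler API; `pick_host` and the dict layout are assumptions made for illustration.

```python
def pick_host(hosts, requested):
    """Pick the first host whose unreserved capacity can cover the request.

    hosts: list of dicts like {"name": ..., "capacity": {...}, "reserved": {...}}
    requested: dict of resource -> amount the new instance must have reserved.
    Scheduling on reservations (not just current load) keeps promised
    resources available even when every guest uses its full share.
    """
    for host in hosts:
        fits = all(
            host["reserved"].get(res, 0) + amount <= host["capacity"].get(res, 0)
            for res, amount in requested.items()
        )
        if fits:
            # Record the reservation so later requests see it as taken.
            for res, amount in requested.items():
                host["reserved"][res] = host["reserved"].get(res, 0) + amount
            return host["name"]
    return None  # no host can honour the reservation; caller may reject the request
```

Note that a host may look lightly loaded yet still be rejected, because its capacity is already promised to other guests.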
=== UI Changes ===
 
Should cover changes required to the UI, or specific UI that is required to implement this.
 
 
 
=== Code Changes ===
 
Code changes should include an overview of what needs to change, and in some cases even the specific details.
 
 
 
=== Migration ===
 
Include:
 
 
 
* data migration, if any
 
* redirects from old URLs to new ones, if any
 
* how users will be pointed to the new way of doing things, if necessary.
 
  
 
== Test/Demo Plan ==
This need not be added or completed until the specification is nearing beta.
When working correctly, running benchmarks on the same class of virtual guests should give similar results at any time. Other virtual guests should not be able to affect the availability of a guest's contracted resources.
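One way to make "similar results at any time" testable is to bound the relative spread of repeated benchmark scores. The function below is a possible acceptance check, not part of any existing test suite; the 5% tolerance is an arbitrary placeholder.

```python
def results_are_similar(scores, tolerance=0.05):
    """True if all benchmark scores lie within `tolerance` (e.g. 5%) of their mean.

    scores: results from repeated runs of the same benchmark on instances
    with identical resource limits, possibly while other guests generate load.
    """
    mean = sum(scores) / len(scores)
    return all(abs(score - mean) <= tolerance * mean for score in scores)
```

The demo would then run the benchmark with and without competing guests on the node, and assert that both sets of scores pass this check.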
  
 
== Unresolved issues ==
This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself; since any specification with problems cannot be approved.
'''How to limit disk I/O in a standard way?''' Work is underway to get this into the Linux kernel, so that it can be configured using cgroups.

'''What units to use for CPU power?''' How to measure it?
  
 
== BoF agenda and discussion ==
Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.
We are using an [http://etherpad.openstack.org/M9RemacWog etherpad page] for discussion.
  
 
----

[[Category:Spec]]

Revision as of 22:28, 18 November 2010

* '''Launchpad Entry:''' NovaSpec:resource-partitioning
* '''Created:'''
* '''Contributors:'''
