

Revision as of 14:17, 15 April 2013 by Ppetit
  • Created: Julien Danjou
  • Contributors: Julien Danjou, Patrick Petit, François Rossigneux, Julien Carpentier


This blueprint introduces the concept of a Capacity Leasing Service for OpenStack. With that service, a user with admin privileges can reserve hardware resources that are dedicated to the sole use of a tenant. A lease is a negotiated agreement between the provider and the consumer in which the former agrees to make a set of hardware resources (compute and possibly storage) available to the latter, based on a set of lease terms presented by the consumer. The lease terms include the description of the hosts' capacity and the availability period during which the hardware is reserved.

We are thinking of three kinds of lease terms:

  • Schedule lease: where resources must be provisioned at a specific date and time
  • Best-effort lease: where resources are provisioned as soon as possible
  • Immediate lease: where resources are provisioned immediately or not at all

Once a lease is created, nobody but the users of the tenant can use the reserved resources during the period of the lease. When a lease ends, nothing disruptive happens to the instances that have been scheduled on the reserved resources. The hardware resources simply return to the common pool and become available to other tenants for VM scheduling. A lease is potentially a billable item: customers can be charged a flat fee, or a premium price for each VM scheduled on reserved hardware. Usage of resource leases should therefore be accounted for through Ceilometer.

See Launchpad Blueprint: Planned resource reservation API


There are situations where the reservation of hardware resources is desired to satisfy peaks of load that are known in advance. This is especially true for small-scale cloud infrastructures and for HPC use cases where the co-scheduling of a fair number of compute instances is necessary. Reservation also helps to satisfy the needs of urgent computations, to avoid uncontrolled noise caused by the VM activity of other tenants, and to comply with regulations or security policies that proscribe the co-location of multiple tenants on the same physical host.

Detailed Description

The Capacity Leasing Service (CLS) behaves like the Filter Scheduler in that it matches the properties of the lease against the capabilities of the hosts. In fact, the CLS should accept all the standard filters consumed by the Filter Scheduler. A lease request should include the following properties:

  • the region
  • the availability zone
  • the host capabilities extra specs (scoped and non-scoped format should be accepted)
  • the number of CPU cores
  • the amount of free RAM
  • the amount of free disk space
  • the number of hosts
  • the type of lease (SCHEDULE, BEST-EFFORT, IMMEDIATE)
  • the starting date of the lease, if the lease is of type SCHEDULE
  • the duration in days and hours of the lease
  • a timeout
  • ...
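
The properties listed above could be grouped into a single request object. The following is a minimal sketch only: the class name LeaseRequest, the field names, and the validation rules are hypothetical and not part of any existing OpenStack API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
from typing import Dict, Optional


class LeaseType(Enum):
    SCHEDULE = "SCHEDULE"
    BEST_EFFORT = "BEST-EFFORT"
    IMMEDIATE = "IMMEDIATE"


@dataclass
class LeaseRequest:
    """Hypothetical container for the lease properties listed above."""
    region: str
    availability_zone: str
    lease_type: LeaseType
    host_count: int
    vcpus: int
    free_ram_mb: int
    free_disk_gb: int
    duration: timedelta              # "duration in days and hours"
    timeout: timedelta
    start: Optional[datetime] = None  # only meaningful for SCHEDULE leases
    extra_specs: Dict[str, str] = field(default_factory=dict)

    def validate(self) -> None:
        # Assumed rules; the blueprint only states that the starting
        # date applies to SCHEDULE leases.
        if self.lease_type is LeaseType.SCHEDULE and self.start is None:
            raise ValueError("a SCHEDULE lease requires a starting date")
        if self.host_count < 1:
            raise ValueError("at least one host must be requested")
        if self.duration <= timedelta(0):
            raise ValueError("the lease duration must be positive")
```

A request for two hosts for one day and six hours would then be built with duration=timedelta(days=1, hours=6).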

The CLS first checks that the capabilities provided by each host satisfy any extra specifications associated with the properties of the lease, and then applies all other enabled filters, if any.

The CLS then checks in its database that none of the eligible hosts is already reserved for the requested period. The action it performs next depends on the type of lease and on whether the lease request can be fulfilled or not.
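
The reservation check amounts to an interval-overlap test per host. The sketch below is illustrative only: the function names and the in-memory dictionary standing in for the CLS database are assumptions, and periods are assumed to be half-open, so a lease ending at noon does not conflict with one starting at noon.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

Period = Tuple[datetime, datetime]  # (start, end), end exclusive


def overlaps(a: Period, b: Period) -> bool:
    """Return True if the two half-open periods share any instant."""
    return a[0] < b[1] and b[0] < a[1]


def unreserved_hosts(candidates: Iterable[str],
                     reservations: Dict[str, List[Period]],
                     requested: Period) -> List[str]:
    """Keep the candidate hosts with no reservation overlapping `requested`.

    `reservations` maps a host name to its already reserved periods; it
    stands in for the CLS database in this sketch.
    """
    return [
        host for host in candidates
        if not any(overlaps(requested, p) for p in reservations.get(host, []))
    ]
```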

  1. If the lease is IMMEDIATE and the request can be fulfilled, the service creates an aggregate for the list of eligible hosts with a special metadata key filter_lease_id that contains the unique ID of the lease, marks the lease as ACTIVE, and returns SUCCESS with the ID of the lease. If the lease cannot be fulfilled, the CLS returns a FAILURE status.
  2. If the lease is BEST-EFFORT and the request cannot be fulfilled immediately, the CLS starts a kind of scavenger hunt whose objective is to move any instance that belongs to another tenant off the eligible hosts. This operation can be time-consuming and fairly complex, so different heuristic strategies may be applied depending on decision factors such as the number, type and state of the instances to be migrated. The CLS should verify that there are enough potential candidates for the migration before starting the actual migration. If the CLS decides to start the migration, it returns a SUCCESS status with the ID of the lease and marks the lease as IN-PROGRESS. If the CLS decides not to start the migration, it returns a FAILURE status. If the scavenger hunt succeeds in making the eligible hosts available for the lease before the timeout is triggered, the CLS marks the lease as ACTIVE. Conversely, if the scavenger hunt does not succeed, the CLS marks the lease as TIMEDOUT.
  3. Finally, if the lease is SCHEDULE and the request cannot be fulfilled for the requested period,
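
The decision flow described for the IMMEDIATE and BEST-EFFORT cases can be summarized as a small function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, a BEST-EFFORT lease that can be fulfilled at once is assumed to become ACTIVE directly, and the SCHEDULE case is left unimplemented because its behaviour is not specified here.

```python
from enum import Enum
from typing import Optional, Tuple


class LeaseState(Enum):
    ACTIVE = "ACTIVE"
    IN_PROGRESS = "IN-PROGRESS"
    TIMEDOUT = "TIMEDOUT"


def handle_request(lease_type: str,
                   can_fulfill_now: bool,
                   enough_migration_candidates: bool = False
                   ) -> Tuple[str, Optional[LeaseState]]:
    """Return (status, initial lease state) for a lease request."""
    if lease_type == "IMMEDIATE":
        if can_fulfill_now:
            return ("SUCCESS", LeaseState.ACTIVE)
        return ("FAILURE", None)
    if lease_type == "BEST-EFFORT":
        if can_fulfill_now:
            # Assumption: no migration is needed, so the lease is active.
            return ("SUCCESS", LeaseState.ACTIVE)
        if enough_migration_candidates:
            # The migration ("scavenger hunt") starts.
            return ("SUCCESS", LeaseState.IN_PROGRESS)
        return ("FAILURE", None)
    # SCHEDULE handling is not specified in the text above.
    raise NotImplementedError("unspecified lease type: %s" % lease_type)


def finish_migration(hosts_freed_before_timeout: bool) -> LeaseState:
    """Final state of an IN-PROGRESS best-effort lease."""
    if hosts_freed_before_timeout:
        return LeaseState.ACTIVE
    return LeaseState.TIMEDOUT
```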

Last but not least, the Capacity Leasing Service is also expected to address a class of performance needs and types of workload that are typical of the high performance computing world, where applications need to be executed on dedicated nodes of similar hardware specification and speed (CPU architecture, model, and clock frequency).

As a result, capacity leasing requests to the service should allow users to specify host capabilities parameters criteria that are compatible, or even the same, as those used by the ComputeCapabilitiesFilter which are known as the instance type extra specifications.

For example:

  • memory_mb == 22000
  • vcpus == 8
  • local_gb == 1690
  • cpu_arch == "x86_64"
  • cpu_info == '{"model":"Nehalem", "features":["tdtscp", "xtpr"]}'
  • xpu_arch = "fermi"
  • xpus = 2
  • xpu_info ='{"model":"Tesla 2050", "gcores":"448"}'
  • net_arch = "ethernet"
  • net_info = '{"encap":"Ethernet", "MTU":"8000"}'
  • net_mbps = 10000
  • hypervisor_type == QEMU

As with the ComputeCapabilitiesFilter, the extra specification parameters used by the Capacity Leasing Service should support an operator at the beginning of the value string of a key/value pair. If no operator is specified, a default operator of ‘s==’ is used.

As a recap, valid operators are:

  • = (equal to or greater than as a number; same as vcpus case)
  • == (equal to as a number)
  • != (not equal to as a number)
  • >= (greater than or equal to as a number)
  • <= (less than or equal to as a number)
  • s== (equal to as a string)
  • s!= (not equal to as a string)
  • s>= (greater than or equal to as a string)
  • s> (greater than as a string)
  • s<= (less than or equal to as a string)
  • s< (less than as a string)
  • <in> (substring)
  • <or> (find one of these)

Examples are: ">= 5", "s== 2.1.0", "<in> gcc", and "<or> fpu <or> gpu"
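
To make the operator semantics concrete, here is a minimal sketch of a matcher written from the list above. It is an illustration, not the ComputeCapabilitiesFilter implementation: the function name matches is hypothetical, and only the first word after the operator is used as the operand.

```python
import operator

# Numeric operators; note that "=" means "equal to or greater than".
_NUMERIC_OPS = {
    "=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
}

# String operators compare the raw strings.
_STRING_OPS = {
    "s==": operator.eq,
    "s!=": operator.ne,
    "s>=": operator.ge,
    "s>": operator.gt,
    "s<=": operator.le,
    "s<": operator.lt,
}


def matches(host_value: str, requirement: str) -> bool:
    """Check a host capability value against an extra-spec requirement."""
    words = requirement.split()
    known = set(_NUMERIC_OPS) | set(_STRING_OPS) | {"<in>", "<or>"}
    if not words or words[0] not in known:
        # No operator given: default to string equality (s==).
        return host_value == requirement
    op = words[0]
    if op == "<or>":
        # "<or> fpu <or> gpu": any word that is not the keyword is accepted.
        return host_value in [w for w in words if w != "<or>"]
    if len(words) < 2:
        return False  # operator without operand
    operand = words[1]
    if op == "<in>":
        return operand in host_value
    if op in _STRING_OPS:
        return _STRING_OPS[op](host_value, operand)
    try:
        return _NUMERIC_OPS[op](float(host_value), float(operand))
    except ValueError:
        return False  # non-numeric value under a numeric operator
```

With this sketch, a host reporting vcpus of "8" satisfies both ">= 5" and "= 4", while a requirement of "x86_64" with no operator is compared as a plain string.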

Open Issues

It is unclear how to guarantee the consistency of a lease when race conditions occur between the CLS and the Nova Scheduler.


A reservation, or lease, is tied to a project. It has a start and an end timestamp, between which the lease is valid. It also has a number of nodes and their flavors associated with it, so that it can be quantified. A lease has a set of scheduler hints that are immutable. An API call allows a user to retrieve the list and combination of applicable hints. When a user tries to create a lease, the list of scheduler hints is checked for validity: an operator can refuse a lease with invalid or overly strict hints.

When an instance is created, it is registered as being part of the lease if the user passes the lease information at creation time. It is removed from the lease when it is destroyed. If an instance is created as part of a lease, the scheduler has to launch the instance with the lease's requirements fulfilled.




Test Plan