Nova/CachingScheduler

This wiki page describes the experimental scheduler that is being suggested here: https://blueprints.launchpad.net/nova/+spec/caching-scheduler

How it works

Let's take a look at what we have now, and what this blueprint changes...

What we have today...

We can roughly describe scheduling as:

  • load the current state of every host (expensive, and very expensive with the current DB call)
  • run the filters and weights to pick a host (expensive with lots of hosts; see the sketch below)
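
A toy, self-contained sketch of those two steps (the host data and the filter and weight logic here are simple stand-ins, not the real host manager code):

# today's flow: load everything, filter everything, weigh what is left
def schedule_today(all_hosts, request):
    # 1. load the current state of every host (the expensive DB call)
    hosts = list(all_hosts)
    # 2. filter out hosts that cannot fit the request
    candidates = [h for h in hosts
                  if h["free_ram_mb"] >= request["ram_mb"]
                  and h["free_vcpus"] >= request["vcpus"]]
    if not candidates:
        raise Exception("NoValidHost")
    # 3. a trivial weigher: prefer the host with the most free RAM
    return max(candidates, key=lambda h: h["free_ram_mb"])

hosts = [{"name": "node1", "free_ram_mb": 2048, "free_vcpus": 4},
         {"name": "node2", "free_ram_mb": 8192, "free_vcpus": 8}]
print(schedule_today(hosts, {"ram_mb": 1024, "vcpus": 1})["name"])  # node2

Every user request pays for both steps, which is what the caching scheduler tries to avoid.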

With the Caching Scheduler we do this...

The caching scheduler splits this into:

  • populate cache in periodic task
  • when we get a user's request, pick the best host from the list of cached hosts (see the sketch below)
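
A minimal illustration of that split (the names and the dict-based cache here are made up for this sketch):

# the cache is filled by a background task, not by user requests
slot_cache = {}

def periodic_task():
    # runs in the background and pre-picks "slots" (hosts with space reserved)
    slot_cache["m1.tiny"] = ["node1", "node2", "node1"]

def pick_host(flavor_id):
    # a user request only has to choose from the small cached list
    slots = slot_cache.get(flavor_id, [])
    if not slots:
        raise Exception("NoValidHost")
    return slots.pop(0)

periodic_task()
print(pick_host("m1.tiny"))  # node1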

Some observations about this approach:

  • the user no longer needs to wait for the expensive get_all_hosts call, because it runs before their request happens
  • the user's request still runs through the weights, but on a much reduced list of hosts: only the hosts in the cache
  • maintaining the cache is tricky, and tuning the cache may be trickier, so let's cover this next

Let's look at what happens the first time the periodic background task runs:

  • the admin configures a list of flavors, and how many "slots" to reserve for each flavor
  • in a periodic task we populate the cache
  • we try to fill the requested number of slots for each flavor, iterating through the flavors in the order given in the cache list
    • if you want to reserve slots of larger instance types, put them early in the list
  • from the flavor, we generate a partial request_spec and run it through the existing filter and weights logic in the host manager
  • the best match is saved in the cache (a sketch of this first run follows the list)
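
A self-contained sketch of that first population run (the flavors, slot counts and the filter_and_weigh helper are all toy stand-ins for the admin configuration and the host manager):

flavors = {"m1.small": {"ram_mb": 2048},
           "m1.large": {"ram_mb": 8192}}
# admin-configured: flavor -> requested number of slots, larger flavors first
requested_slots = [("m1.large", 2), ("m1.small", 4)]

def filter_and_weigh(hosts, spec):
    # stand-in for running a partial request_spec through the existing
    # filters and weights in the host manager
    fits = [h for h in hosts if h["free_ram_mb"] >= spec["ram_mb"]]
    return max(fits, key=lambda h: h["free_ram_mb"]) if fits else None

def populate_cache(hosts):
    cache = {}
    for flavor_id, count in requested_slots:
        spec = flavors[flavor_id]          # the "partial request_spec"
        cache[flavor_id] = []
        for _ in range(count):
            best = filter_and_weigh(hosts, spec)
            if best is None:
                break                      # could not fill all requested slots
            # claim the space so the next slot does not pick it again
            best["free_ram_mb"] -= spec["ram_mb"]
            cache[flavor_id].append(best["name"])
    return cache

hosts = [{"name": "node1", "free_ram_mb": 16384},
         {"name": "node2", "free_ram_mb": 8192}]
print(populate_cache(hosts))
# {'m1.large': ['node1', 'node1'], 'm1.small': ['node2', 'node2', 'node2', 'node2']}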

On the second run of the periodic task:

  • loop through the slots for each flavor, in the order defined in the cache list
    • ensure we still have capacity for each slot on the given host
    • slots on hosts that are dead or now full are deleted from the cache
    • we do not simply replace all the slots with new ones; keeping the existing slots reduces races with in-flight user requests
    • during this loop we claim resources as if we were building an instance on that host, so the following steps don't pick space we have already claimed
  • finally, we loop over the flavors again, adding extra slots until we are back up to the requested number for each flavor (sketched below)
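
A sketch of that refresh pass, using the same kind of toy structures as the population sketch above (the capacity check and the claiming here are simplified stand-ins):

def refresh_cache(cache, hosts, flavors, requested_slots):
    live = {h["name"]: h for h in hosts}

    # 1. re-validate the existing slots instead of rebuilding from scratch
    for flavor_id, slot_hosts in cache.items():
        spec = flavors[flavor_id]
        kept = []
        for name in slot_hosts:
            host = live.get(name)
            # drop slots on hosts that are dead or now full
            if host is None or host["free_ram_mb"] < spec["ram_mb"]:
                continue
            # claim the space so later slots do not double-count it
            host["free_ram_mb"] -= spec["ram_mb"]
            kept.append(name)
        cache[flavor_id] = kept

    # 2. top each flavor back up to the requested number of slots
    for flavor_id, count in requested_slots:
        spec = flavors[flavor_id]
        while len(cache.setdefault(flavor_id, [])) < count:
            fits = [h for h in hosts if h["free_ram_mb"] >= spec["ram_mb"]]
            if not fits:
                break
            best = max(fits, key=lambda h: h["free_ram_mb"])
            best["free_ram_mb"] -= spec["ram_mb"]
            cache[flavor_id].append(best["name"])
    return cache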

When a user requests a server, this is how we pick a host:

  • Look up the list of slots for the given flavor
  • if the cache is empty, attempt to repopulate it and try again; if it is still empty, fail with NoValidHost
  • generate a list of host info to send through the weights
  • pick the best host (with an added bit of randomness, in the usual way)
  • remove the associated slot from the cache
  • if successful, return the picked host
  • if we raced on claiming the slot, retry the above process, but there is no need to repopulate the cache (see the sketch below)
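
A toy sketch of that request path (repopulate and claim_slot are hypothetical callables standing in for the cache refill and the resource claim on the chosen host; the weighing step is reduced to picking among the first few slots with a little randomness):

import random

def pick_host(cache, flavor_id, repopulate, claim_slot, retried=False):
    slots = cache.get(flavor_id, [])
    if not slots:
        if retried:
            raise Exception("NoValidHost")
        repopulate(cache)          # try to refill the cache once, then retry
        return pick_host(cache, flavor_id, repopulate, claim_slot, retried=True)

    # weigh only the cached hosts, with a bit of randomness as usual
    host = random.choice(slots[:3])
    slots.remove(host)             # consume the slot

    if claim_slot(host, flavor_id):
        return host
    # we raced with another request on this host; retry, but there is no need
    # to repopulate the cache first
    return pick_host(cache, flavor_id, repopulate, claim_slot, retried)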

This just gives you an idea of what is happening. Over time, we can try different strategies and hopefully evolve a better one. As part of this optimisation work, we will develop an automatic test harness to compare different schedulers.

Performance Comparison to existing Filter Scheduler

There is no attempt to provide all the features of the filter scheduler; the aim is to be very fast.

We need to compare this to the existing scheduler, but for now it can be assumed to be worse in every way.

Right now, the caching scheduler is just an interesting toy that may lead to something great...

Other Notes about the Caching Scheduler

Please note:

  • it is likely to be experimental in Icehouse
  • we are looking into testing this in pre-production at Rackspace, and hope to have more numbers soon
  • this re-uses the host manager, so we can use the existing filters and weights
  • the current way groups and affinity are handled appears to be lost

Configuration

Long term, it would be good if the cache could learn how it should be set up.

Right now, all the tuning is done through static configuration. Please note that, as it is experimental, this configuration may change at any time. Hopefully it will be stable before Icehouse is released.

Let's consider asking for a cache of 10 m1.tiny slots (call it flavor id 1) and 5 m1.small slots (call it flavor id 2), refreshing the cache every 60 seconds. The config would look like this (a worked example of the resulting slot counts follows the options):

[caching_scheduler]

#
# Options defined in nova.scheduler.caching_scheduler
#

# Specification of the cache lists, a list of flavor_id.
# (multi valued)
cache_list=1,2

# Proportion by which to weight the number of slots for each flavor.
# If not defined, all are evenly weighted. (multi valued)
weights=2,1

# Number of slots to keep for each flavor with a weight of
# one. (floating point value)
factor=5.0

# How often to refresh the cached slots in seconds. (integer value)
poll_period=60
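
Reading the options above, the number of slots reserved per flavor works out to weight × factor (this is our interpretation of the options, not output from the scheduler):

cache_list = [1, 2]    # flavor ids, in cache order (m1.tiny, m1.small)
weights = [2, 1]       # per-flavor weight, matched to cache_list by position
factor = 5.0           # slots per unit of weight

slots = {fid: int(w * factor) for fid, w in zip(cache_list, weights)}
print(slots)           # {1: 10, 2: 5} -> 10 m1.tiny slots and 5 m1.small slots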

Future work

We currently have blueprints to look at: