Multiple Active Scheduler Drivers/Policies

Summary

Support for multiple active scheduler policies and/or drivers associated with different host aggregates within a single Nova deployment.

Blueprint: https://blueprints.launchpad.net/nova/+spec/multiple-scheduler-drivers

Rationale

In heterogeneous environments, it is often required that different hardware pools are managed under different policies. In Grizzly, basic partitioning of hosts and enforcement of compatibility between flavors and hosts during instance scheduling can already be implemented using host aggregates and FilterScheduler with AggregateInstanceExtraSpecsFilter. However, it is not possible to define, for example, different sets of filters and weights, or even entirely different scheduler drivers, for different aggregates. For example, the admin may want to have one pool with a conservative CPU overcommit (e.g., for CPU-intensive workloads) and another pool with an aggressive CPU overcommit (for workloads which are less CPU-bound). This blueprint introduces a mechanism to overcome this limitation.
Note: while in large-scale geo-distributed environments this can be done with Cells, there is no existing solution within a single (potentially small) Nova deployment.
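
As a point of reference, here is a minimal sketch of the Grizzly-style partitioning mentioned above, in which a flavor extra spec is matched against an aggregate metadata key by AggregateInstanceExtraSpecsFilter. The key name "workload_class", the aggregate ID, and the flavor name are illustrative, not taken from this blueprint:

  # Label an aggregate and pin a flavor to it (illustrative key and names)
  $ nova aggregate-set-metadata 1 workload_class=cpu_intensive
  $ nova flavor-key m1.cpu set workload_class=cpu_intensive

With this setup, instances of the flavor are placed only on hosts of the matching aggregate, but the filters, weights, and allocation ratios applied to them are still the single global ones from nova.conf.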

User Stories

  1. An administrator partitions the managed environment into host aggregates, decides on specialized scheduler configurations (policies) for some or all of the aggregates, and configures the aggregates and flavors accordingly.
  2. On instance provisioning, the corresponding target aggregate and scheduling policy are determined based on the selected flavor.
Note: more options to determine the desired policy will be considered in the future.

Usage Details

Configuration (user story 1)

The administrator will:

  1. Specify the default scheduler driver and policy in nova.conf, as usual, e.g., FilterScheduler with CoreFilter and AggregateInstanceExtraSpecsFilter (see the configuration sketch after this list).
  2. Define one or more host aggregates comprising the desired partitioning of the managed environment, e.g., aggr1 and aggr2, so that each aggregate is designated for a certain class of workloads (e.g., CPU-intensive or CPU-balanced).
  3. Attach a new key-value pair to the metadata of each aggregate, specifying the label of the scheduling policy, e.g., "sched_policy=low_cpu_density" for aggr1 and "sched_policy=high_cpu_density" for aggr2 (see the Note in the next bullet for clarification regarding the "sched_policy" key-value pair).
  4. Decide which flavors should be used for each class of workloads, and specify the corresponding "sched_policy" key-value pair in the extra specs of each flavor (Note: this guarantees correct placement across aggregates via AggregateInstanceExtraSpecsFilter; if other key-value pairs already provide this guarantee, adding "sched_policy" to the aggregate and flavor is not necessary).
  5. Decide on the scheduler properties associated with each policy, and attach the corresponding properties with the "sched:" prefix to the extra specs (metadata) of the corresponding flavors, e.g., "sched:cpu_allocation_ratio=1.0" for flavor1 and "sched:cpu_allocation_ratio=8.0" for flavor2.
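
A minimal end-to-end sketch of steps 1-5 follows, assuming a two-host deployment. The host names (compute-01, compute-02) and the exact CLI invocations are illustrative (older clients may require the aggregate ID rather than its name); the aggregate, flavor, and key names are taken from the steps above:

  # nova.conf (step 1): default scheduler driver, filters, and overcommit
  [DEFAULT]
  scheduler_driver = nova.scheduler.filter_scheduler.FilterScheduler
  scheduler_default_filters = AggregateInstanceExtraSpecsFilter, CoreFilter
  cpu_allocation_ratio = 4.0

  # Steps 2-3: create and populate the aggregates, and label each with its policy
  $ nova aggregate-create aggr1
  $ nova aggregate-create aggr2
  $ nova aggregate-add-host aggr1 compute-01
  $ nova aggregate-add-host aggr2 compute-02
  $ nova aggregate-set-metadata aggr1 sched_policy=low_cpu_density
  $ nova aggregate-set-metadata aggr2 sched_policy=high_cpu_density

  # Steps 4-5: tie each flavor to a policy and override scheduler properties per flavor
  $ nova flavor-key flavor1 set sched_policy=low_cpu_density sched:cpu_allocation_ratio=1.0
  $ nova flavor-key flavor2 set sched_policy=high_cpu_density sched:cpu_allocation_ratio=8.0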

Invocation (user story 2)

The user will invoke an instance provisioning request specifying one of the flavors defined by the admin, as usual.
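
For example, a sketch of such a request using flavor1 from the configuration example above (the image name is a placeholder, and my-first-server is an arbitrary server name):

  $ nova boot --image my-image --flavor flavor1 my-first-server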

Note: when the flavor does not override any scheduler options, the default scheduler configuration (from nova.conf) will be used, as before.

Limitations and further enhancements

This implementation has a few limitations, which will be addressed in subsequent patches/blueprints:

  • The admin needs to ensure that if two flavors have conflicting scheduling policies (e.g., different CPU overcommit levels), the corresponding instances will not be created on the same host (e.g., by keeping the flavors restricted to disjoint sets of aggregates).
  • Currently it is not possible to dynamically change the scheduling policy for VM instances provisioned from a given flavor.
  • If the admin wants to manage workloads with the same virtual hardware under different scheduling policies, several flavors need to be created (one for each combination of virtual hardware and policy).

Ultimately, we plan to introduce scheduling policies as 'first-class citizens' in Nova (DB, CRUD, association with flavors/aggregates/tenants, etc.). This will enable resolving most or all of the above limitations.