
Difference between revisions of "VolumeTypeScheduler"

Revision as of 19:25, 18 October 2011

Volume Type aware Scheduler

At the Essex Design Summit it was agreed that a working group would be established to improve nova-volume and volume-related scheduler code.

This page covers the group's first effort: a basic volume-type aware scheduler. More advanced volume schedulers will follow, as will integration with Distributed/Zone-aware Schedulers.

Overview

Just as instances can be scheduled on nodes with particular HW properties, there is a need to create volumes on nodes connected to storage of a particular type. The storage might be directly connected to a single node (internal disks, DAS JBODs/arrays, etc.) or to multiple nodes (shared DAS, SAN, etc.).

The idea of this approach is to use the flexible mechanism of volume types, which makes it possible to define storage types with different extra-specs (key/value pairs).

Nova volume drivers will report properties/specs of connected storage on every node. They may add some additional properties like quantities, access details, etc. This information will be forwarded to schedulers through the same mechanism as used for compute nodes.

It will be up to schedulers to find the best node matching the volume creation criteria. In the simplest case the scheduler could just find a node supporting the properties of a particular volume type. More advanced schedulers will be able to load-balance volumes based on quantities or access patterns. It will also be possible to create both generic schedulers and schedulers dedicated to a single volume type.

Design

Volume types

Cloud administrators will need to create volume types that are compatible with the volume drivers they plan to use. All volume type properties (one or more) will be stored in the extra_specs table in the form of key/value pairs.

Example keys might include properties like the type of drive (SATA/SAS/SSD), RPM, etc. For some environments it will also be necessary to store things like a "storage class": iSCSI, FC SAN, local drives.

Nova-volume & drivers

The volume manager on every participating node will collect information about supported capabilities from all of its drivers. Such capabilities might be reported together with quantities and other additional information.

  • Note: we will need to reserve some keywords like 'storage_class', 'total', 'free'. We could either use total/free as abstract values affecting scheduling decisions or express them in particular units (like GBs). In the latter case the scheduler will be able to check the availability of the requested amount of storage.
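
The availability check mentioned in the note could look roughly like the sketch below. It assumes that 'free' is reported in GB; the function name has_capacity and the choice to reject entries without a 'free' key are illustrative assumptions, not part of the design.

```python
def has_capacity(capability, requested_gb):
    """Return True if one reported capability entry can hold the request.

    Assumption: 'free' is expressed in GB. Entries that do not report
    'free' at all are treated as unable to satisfy the request.
    """
    free = capability.get('free')
    return free is not None and free >= requested_gb


# Sample entry modelled on the capabilities shown in the Examples section.
sata = {'type': 'SATA', 'RPM': 7200, 'total': 4096, 'free': 1024}
print(has_capacity(sata, 500))   # True
print(has_capacity(sata, 2048))  # False
```

If total/free were kept as abstract values instead, a check like this would not be meaningful and the values could only be used for relative ranking of nodes.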

The current Diablo code supports reporting volume capabilities through the get_volume_stats() method. It has an optional parameter 'refresh' that might be used to trigger a rescan/discovery of the underlying H/W (not used currently).

All capabilities will be reported to schedulers using update_service_capabilities().
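
As an illustration of the reporting side, here is a minimal sketch of a driver that caches its stats and of a manager-side helper that merges the reports of several drivers. Only get_volume_stats() and its 'refresh' parameter come from the text above; the class FakeSataDriver, the helper collect_capabilities, the _discover method and the reported values are hypothetical, and supporting more than one driver per node is still an open question.

```python
class FakeSataDriver(object):
    """Hypothetical driver used only to illustrate capability reporting."""

    def __init__(self):
        self._stats = None

    def _discover(self):
        # Stand-in for a real rescan of the underlying hardware.
        # The values are made up for the example.
        return [{'type': 'SATA', 'RPM': 7200, 'total': 4096, 'free': 1024}]

    def get_volume_stats(self, refresh=False):
        # Rescan only on request or on first use; otherwise return the cache.
        if refresh or self._stats is None:
            self._stats = self._discover()
        return self._stats


def collect_capabilities(drivers, refresh=False):
    """Merge the capability lists of all drivers on one node."""
    capabilities = []
    for driver in drivers:
        capabilities.extend(driver.get_volume_stats(refresh=refresh))
    return capabilities


print(collect_capabilities([FakeSataDriver()]))
```

The merged list is what the volume manager would then pass on to the schedulers.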

  • Question: should we support multiple volume drivers per node?
  • Question: As part of the capability reporting functionality, drivers could check the DB and assign volume types to the reported storage classes. In this case matching at the scheduler level might be easier (but not as flexible). Do we want it?

Scheduler

At the scheduler level, capabilities from all nova-volume nodes are automatically stored in an in-memory repository, in the same way as capabilities from "compute" and other nodes (in zone_manager.service_states).
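
A rough sketch of such an in-memory repository follows. The class name CapabilityStore, the method signature, the nested {host: {service_name: capabilities}} layout and the 'volume' service name are assumptions made for illustration; they are not taken from the actual zone manager code.

```python
class CapabilityStore(object):
    """Hypothetical in-memory repository of reported capabilities."""

    def __init__(self):
        # Assumed layout: {host: {service_name: capabilities}}
        self.service_states = {}

    def update_service_capabilities(self, service_name, host, capabilities):
        # The latest report from a host replaces its previous one.
        self.service_states.setdefault(host, {})[service_name] = capabilities

    def get_volume_capabilities(self):
        """Return {host: capabilities} for hosts that reported as 'volume'."""
        return dict((host, services['volume'])
                    for host, services in self.service_states.items()
                    if 'volume' in services)


store = CapabilityStore()
store.update_service_capabilities('volume', 'node-a', [{'type': 'SATA'}])
store.update_service_capabilities('compute', 'node-b', {'vcpus': 8})
print(store.get_volume_capabilities())  # {'node-a': [{'type': 'SATA'}]}
```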

Create Volume requests will arrive with all necessary volume_type information. The generic volume-type aware scheduler will:

  • Retrieve volume type key/value pairs for requested volume type
  • Filter nodes reporting availability of these pairs
  • Select the most appropriate node (pluggable sub-classes):
    • any random node
    • the node with min number of scheduled volumes (based on DB data)
    • the node with min used capacity (same as above, based on DB data)
    • the node with max available capacity (based on data reported by volume drivers)
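
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the planned implementation: all function names are hypothetical, service_states is assumed to map each host to a list of capability dicts, and selection strategies are modelled as plain functions rather than sub-classes. The two DB-based strategies are left out because they depend on the volume tables.

```python
import random


def matches(extra_specs, capability):
    """True if the capability entry contains every requested key/value pair."""
    return all(capability.get(k) == v for k, v in extra_specs.items())


def filter_hosts(extra_specs, service_states):
    """Return (host, capability) pairs that satisfy the extra_specs."""
    candidates = []
    for host, capabilities in sorted(service_states.items()):
        for capability in capabilities:
            if matches(extra_specs, capability):
                candidates.append((host, capability))
    return candidates


def pick_random(candidates):
    """Strategy: any random node."""
    return random.choice(candidates)[0]


def pick_max_free(candidates):
    """Strategy: the node with max available capacity, as reported by drivers."""
    return max(candidates, key=lambda c: c[1].get('free', 0))[0]


def schedule_create_volume(volume_type, service_states, selector=pick_max_free):
    """Filter hosts by the volume type's extra_specs, then apply a strategy."""
    candidates = filter_hosts(volume_type['extra_specs'], service_states)
    if not candidates:
        raise ValueError('no host supports volume type %r' % volume_type['name'])
    return selector(candidates)
```

Passing the strategy as an argument keeps the filtering step shared, so a new selection policy only has to rank hosts that already match.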

We could allow registration of schedulers for particular volume types. In this case the generic volume-type scheduler will pass the request to the registered scheduler.

If no volume type was set for the volume, we could either:

  • pick any node
  • register some properties for "default" volume type
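
Both options can be expressed with one small helper, sketched below. The names DEFAULT_EXTRA_SPECS and resolve_extra_specs are hypothetical. The sketch relies on the observation that an empty set of extra specs places no constraint on a host, so it behaves like "pick any node", while a non-empty default corresponds to registering properties for a "default" volume type.

```python
# Empty specs constrain nothing, which is equivalent to "pick any node".
DEFAULT_EXTRA_SPECS = {}


def resolve_extra_specs(volume_type, default_specs=DEFAULT_EXTRA_SPECS):
    """Return the key/value pairs the scheduler should filter on."""
    if volume_type is None:
        return dict(default_specs)
    return dict(volume_type.get('extra_specs', {}))


print(resolve_extra_specs(None))                      # {}
print(resolve_extra_specs(None, {'type': 'SATA'}))    # {'type': 'SATA'}
```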

Examples

In the simplest case the list of all supported volume types might look like:

[ {'id': 1, 'name': 'SATA volume', 'extra_specs': {'type': 'SATA'}},

{'id': 2, 'name': 'SAS volume', 'extra_specs': {'type': 'SAS'}}, ... ]

Volume drivers could report capabilities like:

[ {'type': 'SATA', 'RPM': 7200, ..., 'total': 4096, 'free': 1024},

{'type': 'SAS', 'RPM': 15000, ..., 'total': 1500, 'free': 500}, ... ]

If the scheduler receives a request to create a volume of type 'SATA volume', it will select all hosts reporting 'type': 'SATA' in their capabilities and choose the most suitable one from among them.
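
The matching step of this example can be reproduced with the short sketch below. The host names (node-a, node-b, node-c) and the third host's values are invented for illustration, the fields elided with "..." above are simply left out, and hosts_for is a hypothetical helper.

```python
volume_types = [
    {'id': 1, 'name': 'SATA volume', 'extra_specs': {'type': 'SATA'}},
    {'id': 2, 'name': 'SAS volume', 'extra_specs': {'type': 'SAS'}},
]

# Capabilities as reported per host; host names are made up.
reported = {
    'node-a': [{'type': 'SATA', 'RPM': 7200, 'total': 4096, 'free': 1024}],
    'node-b': [{'type': 'SAS', 'RPM': 15000, 'total': 1500, 'free': 500}],
    'node-c': [{'type': 'SATA', 'RPM': 7200, 'total': 4096, 'free': 2048}],
}


def hosts_for(type_name):
    """Return the sorted hosts whose capabilities satisfy the named type."""
    specs = next(t['extra_specs'] for t in volume_types
                 if t['name'] == type_name)
    return sorted(
        host for host, capabilities in reported.items()
        if any(all(cap.get(k) == v for k, v in specs.items())
               for cap in capabilities)
    )


print(hosts_for('SATA volume'))  # ['node-a', 'node-c']
```

Choosing "the most suitable" host among the matches is then a matter of the selection strategy in use, for example the one with the most free capacity.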