Summary

When API clients request a compute node or a volume, they can currently specify only the availability zone. As we support more sophisticated deployment environments with more advanced capabilities, we should let clients request more fine-grained functionality. For example, a compute node request might specify that it needs a video card with GPU functionality or that its CPU requirements are bursty; a volume might be better placed on RAID-5 or RAID-1; or we might want a compute node as close as possible to a particular volume. If clients provide their preferences using metadata tags, the schedulers can then match these requests to available resources.

Issues

A problem with the current proposal is that this functionality is not (currently) available in the AWS API, so it would be OpenStack-API-only. AWS tags can only be specified after the item is created. As proposed, this functionality would simply not be available to AWS-API users, though they would continue to enjoy the limited functionality that the AWS API offers. We could work with Eucalyptus to extend the Amazon API if desired. We could also stuff this information into another supported field, for example the zone (e.g. euca-run-instances --availability-zone friend-zone;openstack:near=volume-000001). Packing this extra data into the zone instead of the instance type allows it to apply to both volumes and instances, as vish pointed out.
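As a rough illustration of the zone-packing workaround, the server side could split the zone field back into a zone name and metadata. This is only a sketch: the function name split_zone is hypothetical, and using ";" to separate more than one key=value pair is an assumption extrapolated from the single example above.

```python
def split_zone(zone_field):
    """Split 'zone;key=value;key=value' into (zone, metadata dict).

    Assumption: ';' separates the zone from each packed key=value pair.
    """
    zone, *extras = zone_field.split(";")
    metadata = {}
    for item in extras:
        if not item:
            continue  # tolerate a trailing ';'
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError("expected key=value, got %r" % item)
        metadata[key] = value
    return zone, metadata


zone, metadata = split_zone("friend-zone;openstack:near=volume-000001")
# zone is "friend-zone"; metadata is {"openstack:near": "volume-000001"}
```

A plain zone with no packed data passes through unchanged, so existing AWS-API clients would not be affected by this scheme.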

See Also

Release Note

todo

Rationale

OpenStack supports environments much more sophisticated than previous clouds. We therefore need a new way to expose those capabilities to API clients.

Assumptions

Design

Currently, the only selection metadata supported is the zone. However, OpenStack supports diverse hardware which offers a lot more functionality, and we need a way to avoid the lowest-common-denominator problem. For example, a SAN may offer thin provisioning, RAID and SSD. We may request that a compute node be placed as close as possible to a volume or to another compute node. We may request that they _not_ be placed in the same cabinet, for redundancy. We may request that instances or volumes be created on HIPAA- or PCI-compliant hardware (whatever that means in practice!). In general, we can't predict all the extensions that may be available in future, so we need a flexible way to do this.

Some of this may be done through other means (GPU through instance types, avoiding cabinets through zones). However, we run the risk of an explosion in the number of instance types if we try to multiplex all this information onto a single key. Some of this cannot be done through instance types - in particular, specifying proximity.
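To make the explosion concrete, here is a small count of what happens if every capability has to be encoded in the instance type name. The dimensions and their values below are purely hypothetical, chosen only to show how quickly the combinations multiply.

```python
from itertools import product

# Hypothetical capability dimensions, for illustration only.
sizes = ["small", "medium", "large"]
gpu = ["gpu", "no-gpu"]
cpu_pattern = ["steady", "bursty"]
storage = ["raid1", "raid5", "ssd"]

# One instance type per combination of values.
instance_types = ["-".join(combo)
                  for combo in product(sizes, gpu, cpu_pattern, storage)]
count = len(instance_types)  # 3 * 2 * 2 * 3 = 36
```

Each new dimension multiplies the total, whereas with metadata tags it adds only one more key.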

The Amazon and CloudServers APIs already support the idea of metadata on entities. Amazon currently doesn't let you create that metadata at instance creation time, and CloudServers has an arbitrary limit (5 key/value pairs?) which would need to be increased. But this seems the right extension point for this capability-request metadata. By providing information here, it will be persisted so that it is available to the scheduler going forward (for example, when looking at moving machines in response to problems, the scheduler needs to consider the original request). In addition, this information is then available to clients through the API. Amazon already reserves the "aws:" prefix on keys for its own purposes; I propose that we reserve the "openstack:" prefix for keys that go through the project code-approval process.

We are not going to define the entire set of available keys in this blueprint. However, I propose that we do define a few 'key' keys so that we don't disappear off into architecture-astronaut land:

* openstack:near specifies that we would like to create the volume/compute node as close as possible to the specified compute/volume node(s). Those can be provided as a comma-separated list in the value, e.g. openstack:near=vol-00000001,i-00000002. The meaning of "as close as possible" is not specified here, but would typically mean something like (1) try the same host node, then (2) try the same rack, etc. A key use of this is creating compute nodes near their volume nodes.

* openstack:location specifies that we would like to create the volume/compute node in a specified zone/sub-zone e.g. openstack:location=dc1.north.rackspace.com. See Eric Day's proposal for naming here: https://lists.launchpad.net/openstack/msg00513.html

Other keys we may want to define now: openstack:iopattern (to specify the I/O behavior we expect). Any more???
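One possible reading of openstack:near, sketched below, is to parse the comma-separated value and rank candidate hosts by "same host, then same rack, then anywhere else". Everything here is an assumption for illustration: the function names, the host/rack dictionaries, the locations lookup, and in particular the choice to sum distances when several targets are listed (the proposal deliberately leaves "as close as possible" undefined).

```python
def parse_near(value):
    """Split an openstack:near value into a list of resource IDs."""
    return [ref.strip() for ref in value.split(",") if ref.strip()]


def proximity(candidate, target):
    """Lower is closer: 0 = same host, 1 = same rack, 2 = elsewhere."""
    if candidate["host"] == target["host"]:
        return 0
    if candidate["rack"] == target["rack"]:
        return 1
    return 2


def rank_hosts(candidates, targets):
    """Sort candidates by total distance to all targets (an assumption)."""
    return sorted(candidates,
                  key=lambda c: sum(proximity(c, t) for t in targets))


# Hypothetical placement data: where the referenced resources live.
locations = {
    "vol-00000001": {"host": "h1", "rack": "r1"},
    "i-00000002": {"host": "h3", "rack": "r2"},
}
candidates = [
    {"host": "h4", "rack": "r3"},
    {"host": "h2", "rack": "r1"},
    {"host": "h1", "rack": "r1"},
]

targets = [locations[ref] for ref in parse_near("vol-00000001,i-00000002")]
ranked = [c["host"] for c in rank_hosts(candidates, targets)]
# ranked is ["h1", "h2", "h4"]
```

A real scheduler would also need to decide what to do when a referenced ID does not exist; this sketch would simply raise a KeyError.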

Because this is potentially an open-ended set of keys, a given OpenStack installation might not support all the capabilities that a client requests. Thus all capability requests should be handled on a best-effort basis. However, if a client requires that a particular value be respected, I propose that they prefix the key name with a plus (as in a Google search): openstack:+near=vol-000001 or openstack:+regulations=hipaa. If the OpenStack installation doesn't understand or support a required key, it should return an error (just as it does when resources are exhausted).
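The best-effort versus required distinction could be handled along these lines. This is a minimal sketch, not a proposed implementation: the classify function, the UnsupportedCapability exception and the supported set are hypothetical names, and the "sequential" value for iopattern is invented for the example.

```python
PREFIX = "openstack:"


class UnsupportedCapability(Exception):
    """Raised when a required (+) key is not understood (hypothetical)."""


def classify(metadata, supported):
    """Split capability requests into (required, preferred) dicts.

    Keys outside the openstack: namespace are ordinary user metadata
    and are skipped. Unknown best-effort keys are silently ignored;
    unknown required keys raise an error.
    """
    required, preferred = {}, {}
    for key, value in metadata.items():
        if not key.startswith(PREFIX):
            continue
        name = key[len(PREFIX):]
        if name.startswith("+"):
            name = name[1:]
            if name not in supported:
                raise UnsupportedCapability(name)
            required[name] = value
        elif name in supported:
            preferred[name] = value
    return required, preferred


request = {
    "openstack:+near": "vol-000001",
    "openstack:iopattern": "sequential",  # best-effort, not supported here
    "owner": "alice",                     # plain user metadata
}
required, preferred = classify(request, supported={"near"})
# required is {"near": "vol-000001"}; preferred is {}
```

With the same supported set, a request containing openstack:+regulations=hipaa would raise UnsupportedCapability, which the API layer could then turn into an error response.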

It is expected that this metadata would then be consumed by the scheduler service and by the volume/compute/network services. These would match the requests to the capabilities of the resources under their management. The only initial impact is that any required keys should be rejected if not understood (and initially, none need be understood, though it would make sense to implement at least a few initial keys!)
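To sketch how a service might match requests to the capabilities of the resources it manages: filter out anything that fails a required key, then prefer whichever remaining resource satisfies the most best-effort keys. The select_resource function, the resource dictionaries, and the "raid" and "ssd" key names are all hypothetical; only "regulations" appears earlier in this proposal.

```python
def select_resource(resources, required, preferred):
    """Return the name of the best match, or None if none is eligible."""

    def satisfies(caps, wanted):
        return all(caps.get(k) == v for k, v in wanted.items())

    # Required keys are hard constraints.
    eligible = [r for r in resources
                if satisfies(r["capabilities"], required)]
    if not eligible:
        return None

    # Best-effort keys only influence the ranking.
    def score(resource):
        caps = resource["capabilities"]
        return sum(1 for k, v in preferred.items() if caps.get(k) == v)

    return max(eligible, key=score)["name"]


# Hypothetical volume back ends and their advertised capabilities.
resources = [
    {"name": "san-a", "capabilities": {"regulations": "hipaa", "raid": "5"}},
    {"name": "san-b", "capabilities": {"regulations": "hipaa", "raid": "1",
                                       "ssd": "yes"}},
    {"name": "san-c", "capabilities": {"raid": "1", "ssd": "yes"}},
]

choice = select_resource(resources,
                         required={"regulations": "hipaa"},
                         preferred={"raid": "1", "ssd": "yes"})
# choice is "san-b"
```

Returning None when nothing is eligible corresponds to the error case for required keys; how that is surfaced to the client is left open here.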

User stories

todo

Implementation

This section should describe a plan of action (the "how") to implement the changes discussed.

todo

Wiki: RequestCapabilities (last edited 2011-02-11 02:45:49 by JustinSantaBarbara)