Blueprint-nova-compute-cells
- Launchpad Entry: NovaSpec:nova-compute-cells
- Created: Chris Behrens
- Contributors: Chris Behrens, Brian Elliott, Dragon, Alex Meade, Brian Lamar, Matt Sherborne, Sam Morrison
Summary
This blueprint introduces the new nova-cells service.
The aims of the service are:
- to allow additional scaling and (geographic) distribution without complicated database or message queue clustering
- to separate cell scheduling from host scheduling
Release Note
Rationale
Terminology and History
Readers familiar with Amazon EC2 will recognize the following model:
- A Geographical Region has multiple Availability zones
- Availability Zones are distinct locations that are engineered to be insulated from failures in other Availability Zones and provide inexpensive, low latency network connectivity to other Availability Zones in the same Region. (AWS term)
- For example, EU-West-1 contains EU-West-1a, EU-West-1b and EU-West-1c.
- A client connects to the Region (i.e. the EC2 endpoint)
- When asking for a VM, if an Availability Zone is not specified, the scheduler will choose which Availability Zone in the Region to use
- Alternatively, if the user specifies an Availability Zone, the VM will start in that Availability Zone
Support for "Zones" has been present since early versions of OpenStack. The Bexar release implemented `Availability Zones` for an instance, based on Amazon terminology.
Zone
Later, the concept of a Nova `Zone` came up:
- A stand-alone Nova deployment was called a Zone.
- A Zone allowed you to partition your deployments into logical groups for load balancing and instance distribution. At the very least a Zone required an API node, a Scheduler node, a database and RabbitMQ. Zones shared nothing. No database, queue, user or project definition is shared between Zones. (OpenStack term)
Inter-Zone communication was considered untrusted and communications between Zones would be done using only the public OpenStack API. In Diablo, with the addition of Keystone, Zones were broken beyond usability, and in Essex they were removed entirely.
Introducing Cells
At the Folsom Design Summit, following some discussion on the mailing list, Chris Behrens proposed 'Folsom Compute Cells' as a new design to replace the old 'Zone' concept. In contrast to Zones, cell-to-cell communication is trusted and goes over the AMQP bus.
Design
The service implementation is based on:
- A separate database and message broker per cell
- Inter-cell communication via pluggable driver (RPC is the only current driver available)
- A tree structure, with
  - nova-api server in the 'top cell' only, not in children
  - support for multiple parent cells
- A cell scheduling database built from information pushed from children,
  - based on periodic broadcasts of capabilities and capacities
  - on database updates (instance update/destroy/fault_create)
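The tree structure and the upward push of capacity information can be illustrated with a small in-memory sketch. This is not the nova-cells implementation: the `Cell` class, its methods and the single `free_ram_mb` metric are hypothetical names chosen for illustration, and plain method calls stand in for the inter-cell messaging driver.

```python
class Cell:
    """Hypothetical in-memory stand-in for one cell in the tree."""

    def __init__(self, name):
        self.name = name
        self.parents = []            # a cell may have more than one parent
        self.children = []
        self.child_free_ram_mb = {}  # scheduling data pushed up by children

    def add_child(self, child):
        """Link a child cell below this cell."""
        self.children.append(child)
        child.parents.append(self)

    def broadcast_capacity(self, free_ram_mb):
        """Push this cell's capacity to every parent (the periodic broadcast)."""
        for parent in self.parents:
            parent.child_free_ram_mb[self.name] = free_ram_mb

    def pick_child(self, ram_mb):
        """Cell scheduling only: choose a child cell, never a host."""
        fitting = {name: free for name, free in self.child_free_ram_mb.items()
                   if free >= ram_mb}
        if not fitting:
            return None
        # Assumption: prefer the child reporting the most free RAM.
        return max(fitting, key=fitting.get)


api = Cell("api")
cell1, cell2 = Cell("cell1"), Cell("cell2")
api.add_child(cell1)
api.add_child(cell2)
cell1.broadcast_capacity(2048)
cell2.broadcast_capacity(8192)
print(api.pick_child(4096))  # prints cell2
```

Host selection inside the chosen cell would remain the job of that cell's own nova-scheduler, which is the separation of cell scheduling from host scheduling described in the Summary.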
Services per cell
An API cell contains:
- AMQP Broker
- Database
- nova-cells
- nova-api
A child cell contains:
- AMQP Broker
- Database
- nova-cells
- nova-scheduler
- nova-network
- nova-compute
Global services:
- Glance
- Keystone
Cell routing
TBD
Configuration
New configuration options are added for Cells within their own config group called 'cells'. One should create a [cells] section in their nova.conf.
Options:
- `enable` # enables the cells code
- `name` # A short name for the current cell. Think of this like a non-fully-qualified hostname like 'api'
- `capabilities` # Arbitrary key/value pairs to advertise to neighbor cells. (Unused in the first implementation)
Additionally, you'll need to configure other options in the [DEFAULT] section, such as `compute_api_class` and `quota_driver`.
Example API cell config:
    [DEFAULT]
    # Swap out the compute_api class so actions are proxied to nova-cells service.
    compute_api_class=nova.compute.cells_api.ComputeCellsAPI

    [cells]
    name=api
    enable=true
Example Child cell config:
    [DEFAULT]
    # Disable quota checking in child cells. Let API cell do it exclusively.
    quota_driver=nova.quota.NoopQuotaDriver

    [cells]
    enable=true
    # name must be something unique per child cell
    name=cell1
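As a quick sanity check before starting services, the `[cells]` options can be read back from a config file. The sketch below uses Python's standard `configparser` purely for illustration; it is not how nova itself loads its configuration, and the helper name `read_cells_options` and the sample text are assumptions.

```python
import configparser

# Sample text modelled on the child cell example; contents are illustrative.
SAMPLE_CHILD_CONF = """
[DEFAULT]
# Disable quota checking in child cells. Let API cell do it exclusively.
quota_driver=nova.quota.NoopQuotaDriver

[cells]
enable=true
name=cell1
"""


def read_cells_options(conf_text):
    """Return the [cells] 'enable' and 'name' options as a dict."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(conf_text)
    if not parser.has_section("cells"):
        raise ValueError("missing [cells] section")
    return {
        # Treat a missing 'enable' as disabled (an assumption of this sketch).
        "enable": parser.getboolean("cells", "enable", fallback=False),
        "name": parser.get("cells", "name", fallback=None),
    }


print(read_cells_options(SAMPLE_CHILD_CONF))
```

A check like this can catch a missing `[cells]` section or a forgotten `name` before any cell is registered with nova-manage.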
Before bringing services online, you'll want to tell the cells about each other. The API cell needs to know about its immediate children. The child cells need to know about their immediate parents. The information needed is the RabbitMQ server credentials for the particular cell. These can be added via nova-manage in each cell.
nova-manage cell create usage:
    > bin/nova-manage cell create -h
    Usage: nova-manage cell create <args> [options]

    Options:
      -h, --help                     show this help message and exit
      --name=<name>                  Name for the new cell
      --cell_type=<parent|child>     Whether the cell is a parent or child
      --username=<username>          Username for the message broker in this cell
      --password=<password>          Password for the message broker in this cell
      --hostname=<hostname>          Address of the message broker in this cell
      --port=<number>                Port number of the message broker in this cell
      --virtual_host=<virtual_host>  The virtual host of the message broker in this cell
      --woffset=<float>
      --wscale=<float>
Let's assume we have an API cell named 'api' and 2 child cells 'cell1' and 'cell2'. Within the api cell, we have the following rabbit server info:
    rabbit_host=10.0.0.10
    rabbit_port=5672
    rabbit_username=api_user
    rabbit_password=api_passwd
    rabbit_virtual_host=api_vhost
And in the child cell named 'cell1' we have the following rabbit server info:
    rabbit_host=10.0.1.10
    rabbit_port=5673
    rabbit_username=cell1_user
    rabbit_password=cell1_passwd
    rabbit_virtual_host=cell1_vhost
And in the child cell named 'cell2' we have the following rabbit server info:
    rabbit_host=10.0.2.10
    rabbit_port=5673
    rabbit_username=cell2_user
    rabbit_password=cell2_passwd
    rabbit_virtual_host=cell2_vhost
We would run these in the API cell to tell it about its children:
    > nova-manage cell create --name=cell1 --cell_type=child --username=cell1_user --password=cell1_passwd --hostname=10.0.1.10 --port=5673 --virtual_host=cell1_vhost --woffset=1.0 --wscale=1.0
    > nova-manage cell create --name=cell2 --cell_type=child --username=cell2_user --password=cell2_passwd --hostname=10.0.2.10 --port=5673 --virtual_host=cell2_vhost --woffset=1.0 --wscale=1.0
In both child cells, we would run this to tell them about their parent:
    > nova-manage cell create --name=api --cell_type=parent --username=api_user --password=api_passwd --hostname=10.0.0.10 --port=5672 --virtual_host=api_vhost --woffset=1.0 --wscale=1.0
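With more than a couple of cells, typing these commands by hand invites mismatched credentials. A minimal sketch, assuming the broker settings are kept in a dictionary keyed by cell name, can generate the command lines from the same values used in nova.conf. The `BROKERS` mapping and the `cell_create_command` helper are hypothetical; only the nova-manage flags come from the usage text above.

```python
import shlex

# Broker settings per cell, copied from the example values in this section.
BROKERS = {
    "api": {"rabbit_host": "10.0.0.10", "rabbit_port": 5672,
            "rabbit_username": "api_user", "rabbit_password": "api_passwd",
            "rabbit_virtual_host": "api_vhost"},
    "cell1": {"rabbit_host": "10.0.1.10", "rabbit_port": 5673,
              "rabbit_username": "cell1_user", "rabbit_password": "cell1_passwd",
              "rabbit_virtual_host": "cell1_vhost"},
}


def cell_create_command(name, cell_type, broker, woffset=1.0, wscale=1.0):
    """Build a 'nova-manage cell create' command line for one neighbor cell."""
    if cell_type not in ("parent", "child"):
        raise ValueError("cell_type must be 'parent' or 'child'")
    args = [
        "nova-manage", "cell", "create",
        f"--name={name}",
        f"--cell_type={cell_type}",
        f"--username={broker['rabbit_username']}",
        f"--password={broker['rabbit_password']}",
        f"--hostname={broker['rabbit_host']}",
        f"--port={broker['rabbit_port']}",
        f"--virtual_host={broker['rabbit_virtual_host']}",
        f"--woffset={woffset}",
        f"--wscale={wscale}",
    ]
    # Quote each argument so unusual passwords survive a shell.
    return " ".join(shlex.quote(arg) for arg in args)


# Run in the API cell: register cell1 as a child.
print(cell_create_command("cell1", "child", BROKERS["cell1"]))
# Run in each child cell: register the API cell as the parent.
print(cell_create_command("api", "parent", BROKERS["api"]))
```

Generating both directions from one mapping keeps the parent and child registrations consistent with each cell's actual broker credentials.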
References
- Original Etherpad: http://etherpad.openstack.org/FolsomComputeCells
- Original presentation: http://comstud.com/FolsomCells.pdf
- Grizzly presentation: http://comstud.com/GrizzlyCells.pdf
- Core review: https://review.openstack.org/#/c/15228/