Difference between revisions of "Nova"
Revision as of 21:01, 10 June 2010
Nova Cloud Review
The purpose of this document is to capture the pluses and minuses of using Nova's code as a part of Cloud servers v2
Glossary
- Public API Servers - Known as "Nucleus" in cloud servers v1 or "Cloud Controller" in Eucalyptus.
- Pod - A group of physical host nodes. Known as "QB" in cloud servers v1 or "Cluster Controller" in Eucalyptus.
- Nodes - Individual physical hosts in a Pod.
What's Done
- Scalable and elastic architecture - fully message based and asynchronous
- Many months ahead of us
- Written in good Python
- Open source and it appears that they will be following an open development model
- All components are stubbed out for testing
- Actually writes SSH keys and authorized_keys properly
- OpenLDAP-based authentication and authorization
- All functionality is created via an adapter model, so implementations (for instance, storage backends, messaging backends, etc) can be swapped out as needed
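To illustrate what the adapter model in the last item buys us, here is a minimal sketch in Python. It is not Nova's actual code: the names StorageBackend, AoEBackend, GlusterBackend, BACKENDS and get_backend are all hypothetical, and the volume strings are placeholders for real backend calls.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical adapter interface; not a class taken from Nova."""

    @abstractmethod
    def create_volume(self, name: str, size_gb: int) -> str:
        """Create a volume and return an identifier for it."""


class AoEBackend(StorageBackend):
    def create_volume(self, name: str, size_gb: int) -> str:
        # Placeholder: a real adapter would export an AoE device here.
        return f"aoe:{name}:{size_gb}G"


class GlusterBackend(StorageBackend):
    def create_volume(self, name: str, size_gb: int) -> str:
        # Placeholder: a real adapter would talk to gluster here.
        return f"gluster:{name}:{size_gb}G"


# Hypothetical registry; the backend is picked by a configuration value.
BACKENDS = {"aoe": AoEBackend, "gluster": GlusterBackend}


def get_backend(kind: str) -> StorageBackend:
    """Return the adapter registered under `kind`."""
    try:
        return BACKENDS[kind]()
    except KeyError:
        raise ValueError(f"unknown storage backend: {kind}") from None


volume_id = get_backend("aoe").create_volume("vol1", 10)
```

Callers depend only on the interface, so swapping AoE for another backend is a configuration change rather than a code change.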
What Needs to be Done
- Create a layer inside Nova that would be able to distinguish between different pods
  - Currently, the CloudController class in /endpoints/cloud.py represents a mixture of a public API server and a pod controller
    - The CloudController class receives public API requests and sends messages to the nodes to perform actions
    - Separate out the receipt and translation of public API requests to a separate APIServer class
    - Separate out the transmission of private action messages to a PodController class
- Detach from Amazon/Eucalyptus specifics and make some things more generic
  - API: We need to add the Rackspace API, and a caching layer
    - It is not reasonable for us to use the Amazon API; we would be unable to innovate and would constantly be playing catch-up
    - We would also need to add a distinct API for each service we layer on top, so they can be used with either the EC2 or Rackspace APIs
- AOE
  - Definitely needs to be adapted for other services like CloudFiles, gluster, etc
- Defaults of VLANs could be changed
  - though you can manually allocate IPs or use DHCP (see /compute/network.py)
- Functionality needed by hosting providers
  - Metrics
    - CPU, memory, disk usage, network RX/TX
    - but, again, the backend storage is already taken care of...
- Billing
  - Need to define the billable events in a model
- Admin client is AWS-specific and needs an adapter interface
  - see /adminclient.py
- Only supports AMIs; we should add OVA support
- Requires use of euca2ools, which are tainted; we need a set of OVA tools and possibly a clean-room rewrite of the AMI tools, if we care
- Overarching documentation is sparse (though the code comments are pretty decent)
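- Twisted's reactor is single-threaded (and CPython's global interpreter lock keeps Python code in one process from using more than one core at a time), so it *may* be a bottleneck, but that remains to be demonstrated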
- No support for gluster or drbd, but there are adapters for plugging such functionality into the app domain
- Add a discovery endpoint so that different compute clusters can be found, especially when they are distributed geographically.
- Configuration management is almost non-existent
  - Need to plug in/adapt the configuration retrieval
  - Puppet, Chef, or even a DKVS
  - The "flavors" are hardcoded in /compute/node.py (grep for INSTANCE_TYPES)
- While there is decent unit test coverage, there is no real system testing, nor any documented plan for it
  - A good chunk of code would need to be written to automate the testing of pod deployments, the testing of network partitions, and more
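The first item in the list above (splitting CloudController into an APIServer and a PodController) could look roughly like the sketch below. This is only an illustration under assumptions: the class bodies, the ACTION_MAP table, the message layout and the request fields are all hypothetical, and the in-memory `sent` list stands in for the real message queue.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from public API action names to private node actions.
ACTION_MAP = {
    "RunInstances": "run_instance",
    "TerminateInstances": "terminate_instance",
}


@dataclass
class PodController:
    """Hypothetical: owns the private messaging to the nodes of one pod."""

    name: str
    sent: list = field(default_factory=list)

    def send(self, node, action, params):
        message = {"pod": self.name, "node": node,
                   "action": action, "params": params}
        # Stand-in for publishing the message to the pod's queue.
        self.sent.append(message)
        return message


class APIServer:
    """Hypothetical: receives public API requests, translates them, and
    hands the resulting private action to the right pod."""

    def __init__(self, pods):
        self.pods = {pod.name: pod for pod in pods}

    def handle(self, request):
        action = ACTION_MAP.get(request["action"])
        if action is None:
            raise ValueError(f"unsupported API action: {request['action']}")
        pod = self.pods.get(request["pod"])
        if pod is None:
            raise ValueError(f"unknown pod: {request['pod']}")
        return pod.send(request["node"], action, request.get("params", {}))


pod_a = PodController("pod-a")
api = APIServer([pod_a, PodController("pod-b")])
message = api.handle({
    "action": "RunInstances",
    "pod": "pod-a",
    "node": "node-1",
    "params": {"image": "ami-123"},
})
```

With this split, the public API layer (Amazon, Rackspace, or both) only needs to know which pods exist, and everything node-facing stays behind PodController.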
Unknowns
- Asked jm to take a look at any possible Windows issues with the code base (in using Windows as a host with Hyper-V? Not sure what this means)
  - We know that SSH keys will not work with Windows, so another method is necessary