Difference between revisions of "Nova"

Revision as of 21:01, 10 June 2010

Nova Cloud Review

This document captures the pros and cons of using Nova's code as part of Cloud Servers v2.

Glossary

Public API Servers - Known as "Nucleus" in Cloud Servers v1 or "Cloud Controller" in Eucalyptus.
Pod - A group of physical host nodes. Known as "QB" in Cloud Servers v1 or "Cluster Controller" in Eucalyptus.
Nodes - Individual physical hosts in a Pod.

What's Done

Scalable and elastic architecture - fully message based and asynchronous
Many months ahead of us
Written in good Python
Open source and it appears that they will be following an open development model
Have stubbed out all components for testing
Actually write SSH keys and authorized_keys properly
OpenLDAP-based authentication and authorization
All functionality is created via an adapter model, so implementations (for instance, storage backends, messaging backends, etc) can be swapped out as needed
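
As a rough illustration of the adapter model described in the last point, the sketch below shows a service that depends only on an interface so the backend can be swapped. Every name here (StorageBackend, InMemoryStorage, ImageService) is hypothetical and chosen for illustration; none of it is taken from Nova's code.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical adapter interface; not a class taken from Nova."""

    @abstractmethod
    def put(self, key, data):
        """Store a blob under the given key."""

    @abstractmethod
    def get(self, key):
        """Return the blob stored under the given key."""


class InMemoryStorage(StorageBackend):
    """Stand-in implementation, handy for tests and stubs."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]


class ImageService:
    """Depends only on the interface, so any backend can be plugged in."""

    def __init__(self, storage):
        self._storage = storage

    def register(self, image_id, payload):
        self._storage.put("images/" + image_id, payload)

    def fetch(self, image_id):
        return self._storage.get("images/" + image_id)
```

Swapping the backend then means passing a different StorageBackend implementation to ImageService, with no change to the service itself.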

What Needs to be Done

Create a layer inside Nova that would be able to distinguish between different pods
- Currently, the CloudController class in /endpoints/cloud.py represents a mixture of a public API server and a pod controller
  - The CloudController class receives public API requests and sends messages to the nodes to perform actions
  - Separate out the receipt and translation of public API requests to a separate APIServer class
  - Separate out the transmission of private action messages to a PodController class
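
One possible shape for the split described above is sketched here. It assumes each public request names its target pod, and it uses an in-process queue as a stand-in for the real message bus. The class bodies, method names, and message fields are all assumptions made for illustration, not Nova's actual design.

```python
import queue


class PodController:
    """Hypothetical: owns the private message channel to one pod's nodes."""

    def __init__(self, pod_id):
        self.pod_id = pod_id
        self.outbox = queue.Queue()  # stand-in for the real message bus

    def send_action(self, action, **params):
        # Build the private action message and hand it to the bus.
        message = {"pod": self.pod_id, "action": action, "params": params}
        self.outbox.put(message)
        return message


class APIServer:
    """Hypothetical: receives public requests and routes them to a pod."""

    def __init__(self, pods):
        self._pods = {pod.pod_id: pod for pod in pods}

    def handle(self, request):
        # Translate the public request into a private action message
        # and pass it to the controller for the requested pod.
        pod = self._pods[request["pod"]]
        return pod.send_action(request["operation"], **request.get("args", {}))
```

With this separation the public API layer never talks to nodes directly, so supporting another pod only requires registering another PodController.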
Detach from Amazon/Eucalyptus specifics and make some things more generic
- API: We need to add the Rackspace API, and a caching layer
  - It is not reasonable for us to use the Amazon API; we would be unable to innovate and would constantly be playing catch-up
  - We would also need to add a distinct API for each service we layer on top, so they can be used with either the EC2 or Rackspace APIs
AoE (ATA over Ethernet)
- Definitely needs to be adapted for other services like CloudFiles, gluster, etc
Defaults of VLANs could be changed
- though you can manually allocate IPs or use DHCP (see /compute/network.py)
Functionality needed by hosting providers
- Metrics
  - CPU, memory, disk usage, network RX/TX
  - but, again, the backend storage is already taken care of...
Billing
- Need to define the billable events in a model
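
A billable-event model could start as small as the sketch below. The event types, field names, and the running_seconds helper are illustrative assumptions only; the real set of billable events still has to be defined.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class EventType(Enum):
    # Illustrative event names only; the real list is still to be decided.
    INSTANCE_STARTED = "instance.started"
    INSTANCE_STOPPED = "instance.stopped"


@dataclass(frozen=True)
class BillableEvent:
    account_id: str
    resource_id: str
    event_type: EventType
    occurred_at: datetime


def running_seconds(events):
    """Sum the time between each start/stop pair for a single resource."""
    total = 0.0
    started = None
    for event in sorted(events, key=lambda e: e.occurred_at):
        if event.event_type is EventType.INSTANCE_STARTED:
            started = event.occurred_at
        elif event.event_type is EventType.INSTANCE_STOPPED and started is not None:
            total += (event.occurred_at - started).total_seconds()
            started = None
    return total
```

Keeping events immutable and timestamped lets usage be recomputed from the event log, rather than trusted from a running counter.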
Admin client is AWS-specific and needs an adapter interface
- see /adminclient.py
Only supports AMIs, we should add OVA support
Requires use of euca2ools, which are tainted; we need a set of OVA tools and possibly a clean-room rewrite of the AMI tools, if we care
Overarching documentation is sparse (though the code comments are pretty decent)
Twisted (and Python) is, by nature, single-core, so it *may* be a bottleneck, but that remains to be demonstrated
No support for Gluster or DRBD, but there are adapters for plugging such functionality into the app domain
Add an endpoint so that the different compute clusters can be discovered, especially when they are distributed geographically.
Configuration management is almost non-existent
- Need to plug in/adapt the configuration retrieval
- Puppet, Chef, or even a DKVS
- The "flavors" are hardcoded in /compute/node.py (grep for INSTANCE_TYPES)
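
To make the configuration-retrieval adapter concrete, here is a minimal sketch in which flavor definitions come from a pluggable source instead of a hardcoded table. FlavorSource, JsonFlavorSource, and the memory_mb and vcpus fields are hypothetical. The actual layout of INSTANCE_TYPES is not reproduced here.

```python
import json


class FlavorSource:
    """Hypothetical interface for wherever flavor definitions live."""

    def load(self):
        raise NotImplementedError


class JsonFlavorSource(FlavorSource):
    """Reads flavors from a JSON document; the field names are illustrative."""

    REQUIRED_FIELDS = {"memory_mb", "vcpus"}

    def __init__(self, text):
        self._text = text

    def load(self):
        flavors = json.loads(self._text)
        # Reject incomplete definitions early instead of failing at launch time.
        for name, spec in flavors.items():
            missing = self.REQUIRED_FIELDS - set(spec)
            if missing:
                raise ValueError(
                    "flavor %r is missing %s" % (name, sorted(missing))
                )
        return flavors
```

A source backed by Puppet, Chef, or a DKVS would implement the same load() method, so the compute code would not need to know where flavors are stored.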
While there is decent unit test coverage, there is no real system testing, nor any documented plan for it
- There would need to be a good chunk of code written to automate the testing of pod deployments, the testing of network partitions, and more

Unknowns

Asked jm to take a looksie into any possible Windows issues with the code base (in using Windows as a host with Hyper-V? Not sure what this means)
- We know that SSH keys will not work with Windows, so another method is necessary

@@ Line 2: / Line 2: @@
 == Nova Cloud Review ==
 The purpose of this document is to capture the pluses and minuses of using Nova's code as a part of Cloud servers v2
+=== Glossary ===
+* Public API Servers - Know as "Nucleus" in cloud servers v1 or "Cloud Controller" in Eucalyptus.
+* Pod - A group of physical host nodes. Known as "QB" in cloud server v1 or "Cluster Controller" in Eucalyptus.
+* Nodes - individual physical hosts in a Pod
 === What's Done ===
 * Scalable and elastic architecture - fully message based and asynchronous
 * Many months ahead of us
-* written in good Python
+* Written in good Python
-* open source and it appears that they will be following an open development model
+* Open source and it appears that they will be following an open development model
-* have stubbed out all components for testing
+* Have stubbed out all components for testing
 * Actually write SSH keys and authorized_keys properly
+* openldap based authentication and authorization
 * All functionality is created via an adapter model, so implementations (for instance, storage backends, messaging backends, etc) can be swapped out as needed
 === What Needs to be Done ===
+* Create a layer inside Nova that would be able to distinguish between different pods
+** Currently, the [[CloudController]] class in /endpoints/cloud.py represents a mixture of a public API server and a pod controller
+*** The [[CloudController]] class receives public API requests and sends messages to the nodes to perform actions
+*** Separate out the receipt and translation of public API requests to a separate APIServer class
+*** Separate out the transmission of private action messages to a [[PodController]] class
 * Detach from Amazon/Eucalyptus specifics and make some things more generic
 ** API:  We need to add the Rackspace API, and a caching layer
 *** It is not reasonable for us to use the Amazon API, we would be unable to innovate and would constantly be catch up
-*** We would also need to add a distinct API for each service we layer on top, so they can be used  with either the ec2 or racjkspace API's
+*** We would also need to add a distinct API for each service we layer on top, so they can be used  with either the ec2 or rackspace API's
 * AOE
 ** Definitely needs to be adapted for other services like [[CloudFiles]], gluster, etc
@@ Line 30: / Line 41: @@
 ** see /adminclient.py
 * Only supports AMIs, we should add OVA support
-* Requires use of euca2ools, which are tainted
+* Requires use of euca2ools, which are tainted, we need a set of ova tools and possibly a clean room rewrite of the AMI tools, if we care
 * Overarching documentation is sparse (though the code comments are pretty decent)
 * twisted (and Python) is, by nature, single-core, so it *may* be a bottleneck, but that remains to be demonstrated
@@ Line 39: / Line 50: @@
 ** Puppet, Chef, or even a DKVS
 ** The "flavors" are hardcoded in /compute/node.py (grep for INSTANCE_TYPES)
+* While there is decent unittest coverage, there is no real systems testing or documentation of plans for one
+** There would need to be a good chunk of code written to automate the testing of pod deployments, the testing of network partitions, and more
 === Unknowns ===
 * Asked jm to take a looksie into any possible Windows issues with the code base (in using Windows as a host with Hyper-V? Not sure what this means)
 ** We know that ssh keys will not work with windows, so another method is necessary