Difference between revisions of "Sahara/WhyNotHeat"

(Created page with "1. The first question is “Why doesn’t Savanna use Heat to provision VMs?” Generally using Heat underneath for infrastructure provisioning looks reasonable. In a tactic ...")
 
m (Sergey Lukjanov moved page Savanna/WhyNotHeat to Sahara/WhyNotHeat: Savanna project was renamed due to the trademark issues.)
 
(12 intermediate revisions by 3 users not shown)
Line 1: Line 1:
- 1. The first question is “Why doesn’t Savanna use Heat to provision VMs?”
- Generally using Heat underneath for infrastructure provisioning looks
- reasonable. In a tactic perspective there are few factors making Heat
- usage underneath Savanna problematic:
- * Heat stability for Grizzly release. Savanna currently maintains
- Grizzly+ compatibility.
- * Installation of large Hadoop clusters (100+ nodes). Will be
- addressed by proposed architecture changes.
- * Anti-affinity support for HDFS redundancy in cloud environment
- * Circular dependencies - we should generate ‘/etc/hosts’ for all
- instances in provisioned cluster. We can’t use cloud init for this
- directly. There are a couple possible solutions using Heat, but none
- of them looks like a straightforward solution.
- * Level of complexity. We try to keep things as simple as possible.
- Adding extra layer will increase overall complexity of the solution.
- In addition both Savanna and Heat under active development changing
- lots of internals and even APIs and will require extra effort to
- coordinate.
-
- Once Heat fulfills all the requirements we will be able and should use
- Heat for VM provisioning.
-
- 2. Let’s answer the second question - why we need Savanna? Can’t we
- use Heat to do what Savanna does?
-
- * Savanna provides bunch of Hadoop-specific features. It’ll be hard to
- provide them as Heat plugin
- * Savanna provides Hadoop-specific APIs and functionality. Heat use
- cases are mostly around provisioning/deployment.
- * Savanna provides integration with various Hadoop distributions
- through pluggable mechanism
-
+ ==1. Why doesn’t Savanna use Heat to provision VMs *now*?==
+ https://wiki.openstack.org/wiki/Savanna/HeatIntegration
+ ==2. Why we need Savanna? Can’t we use Heat to do what Savanna does?==
+ * Savanna provides bunch of Hadoop-specific features. It’ll be hard to provide them as Heat plugin
+ * Savanna provides Hadoop-specific APIs and functionality. Heat use cases are mostly around provisioning/deployment.
+ * Savanna provides integration with various Hadoop distributions through pluggable mechanism
  Now, more details on each item.
  Hadoop specific features:
- * Tight Swift integration. Hadoop can read and write from/to Swift
- object storage. Savanna provides required configs for the Hadoop
- cluster.
+ * Tight Swift integration. Hadoop can read and write from/to Swift object storage. Savanna provides required configs for the Hadoop cluster.
  * Usage of anti-affinity to preserve data-redundancy of HDFS nodes
+
  Hadoop-specific APIs and functionality:
  * Hadoop cluster scaling
  * Elastic Data Processing: https://wiki.openstack.org/wiki/Savanna/EDP
+
  Integration with Hadoop distributions through pluggable mechanism:
- - Usually Hadoop cluster deployment is a multi-step operation. First
- step is to install management console (for instance Apache Ambari).
- Second step is to communicate with management console through REST API
- to provision Hadoop on the cluster. Savanna wraps all this operations
- under well-defined API.
-
- I hope all the items above explain why we need Savanna as a separate
- OpenStack service.
-
- 3. Why can’t Savanna be used as a plugin for Heat?
+ Usually Hadoop cluster deployment is a multi-step operation. First step is to install management console (for instance Apache Ambari). Second step is to communicate with management console through REST API to provision Hadoop on the cluster. Savanna wraps all this operations under well-defined API.
+ ==3. Why can’t Savanna be used as a plugin for Heat?==
  It should be and it will be someday.

Latest revision as of 15:41, 7 March 2014

1. Why doesn’t Savanna use Heat to provision VMs *now*?

https://wiki.openstack.org/wiki/Savanna/HeatIntegration

2. Why do we need Savanna? Can’t we use Heat to do what Savanna does?

  • Savanna provides a number of Hadoop-specific features that would be hard to provide as a Heat plugin.
  • Savanna provides Hadoop-specific APIs and functionality, whereas Heat use cases mostly center on provisioning and deployment.
  • Savanna integrates with various Hadoop distributions through a pluggable mechanism.

Now, more details on each item. Hadoop-specific features:

  • Tight Swift integration. Hadoop can read from and write to Swift object storage, and Savanna provides the required configuration for the Hadoop cluster.
  • Use of anti-affinity to preserve the data redundancy of HDFS nodes.
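
The anti-affinity item can be illustrated with a minimal sketch. It assumes the compute scheduler honours a `different_host` scheduler hint (as Nova's DifferentHostFilter does); the function name and the `boot` callable are hypothetical stand-ins for the real instance-launch call and are not taken from Savanna's code.

```python
from typing import Callable, Dict, List

# Hypothetical signature: boot(name, scheduler_hints) -> instance id.
BootFn = Callable[[str, Dict[str, List[str]]], str]


def launch_with_anti_affinity(node_names: List[str], boot: BootFn) -> List[str]:
    """Launch nodes one by one, asking for a host that no earlier node uses."""
    launched_ids: List[str] = []
    for name in node_names:
        # The first node has no constraint; each later node lists every
        # instance launched so far in the "different_host" hint.
        hints = {"different_host": list(launched_ids)} if launched_ids else {}
        launched_ids.append(boot(name, hints))
    return launched_ids


# Usage with a fake boot function that only records what it was asked to do.
recorded = []


def fake_boot(name: str, hints: Dict[str, List[str]]) -> str:
    recorded.append((name, hints))
    return "id-" + name


datanode_ids = launch_with_anti_affinity(["dn-1", "dn-2", "dn-3"], fake_boot)
```

With this scheme each HDFS data node ends up on its own hypervisor, provided enough hosts exist, so losing one host does not take out several replicas of the same block.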


Hadoop-specific APIs and functionality:

  • Hadoop cluster scaling
  • Elastic Data Processing: https://wiki.openstack.org/wiki/Savanna/EDP


Integration with Hadoop distributions through a pluggable mechanism:

Hadoop cluster deployment is usually a multi-step operation. The first step is to install a management console (for instance, Apache Ambari). The second step is to communicate with that management console through its REST API to provision Hadoop on the cluster. Savanna wraps all of these operations under a well-defined API.
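
The two-step flow described above can be sketched as a single entry point that hides both steps from the caller. This is only an illustration: `ClusterSpec`, `ProvisioningPlugin`, and the two injected callables are hypothetical names, not Savanna's plugin interface, and no real management-console REST call is made.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ClusterSpec:
    """Hypothetical description of the cluster the user asked for."""
    name: str
    node_count: int


@dataclass
class ProvisioningPlugin:
    """Wraps the two deployment steps behind one create_cluster() call."""
    # Step 1 (assumed): installs the management console, returns its base URL.
    install_console: Callable[[ClusterSpec], str]
    # Step 2 (assumed): asks the console, over its REST API, to provision Hadoop.
    request_provisioning: Callable[[str, ClusterSpec], None]
    log: List[str] = field(default_factory=list)

    def create_cluster(self, spec: ClusterSpec) -> str:
        console_url = self.install_console(spec)
        self.log.append("console installed at " + console_url)
        self.request_provisioning(console_url, spec)
        self.log.append("provisioning requested for " + spec.name)
        return console_url


# Usage with fakes standing in for the real installation and REST calls.
requests_seen = []

plugin = ProvisioningPlugin(
    install_console=lambda spec: "http://console.example/" + spec.name,
    request_provisioning=lambda url, spec: requests_seen.append((url, spec.node_count)),
)
console = plugin.create_cluster(ClusterSpec(name="demo", node_count=4))
```

A different Hadoop distribution would supply its own pair of callables, while the caller keeps using the same `create_cluster` entry point.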


3. Why can’t Savanna be used as a plugin for Heat?

It should be, and someday it will be.