Savanna has two major parts responsible for interaction with VMs:
- Spawning VMs and performing common configuration, such as populating /etc/hosts and setting up passwordless access between instances;
- Performing Hadoop installation and/or configuration. This part is distribution-specific and is implemented by the corresponding plugin.
We should start integrating with Heat by converting the first part, then the second. Our goal is to add an extensible mechanism that lets us eventually switch from “direct” orchestration using Nova, Cinder and Neutron to Heat-based orchestration without breaking the provisioning part of Savanna. Once this process is complete, the “direct” plugin could be removed.
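One possible shape for such an extensible mechanism is sketched below. It is only an illustration under stated assumptions: the names ProvisioningEngine, DirectEngine, HeatEngine, ENGINES and get_engine are hypothetical and are not taken from the Savanna code base, and the engines only describe what they would do instead of calling any OpenStack service.

```python
from abc import ABC, abstractmethod
from typing import List


class ProvisioningEngine(ABC):
    """Hypothetical engine interface (not an existing Savanna class)."""

    @abstractmethod
    def describe_plan(self, cluster_name: str, node_count: int) -> List[str]:
        """Return the steps this engine would take to spawn the cluster."""


class DirectEngine(ProvisioningEngine):
    """Stands in for the current orchestration via Nova, Cinder and Neutron."""

    def describe_plan(self, cluster_name, node_count):
        # A real engine would call the OpenStack services; this only lists steps.
        return ["boot %s-%03d via Nova" % (cluster_name, i + 1)
                for i in range(node_count)]


class HeatEngine(ProvisioningEngine):
    """Stands in for Heat-based orchestration: one stack per cluster."""

    def describe_plan(self, cluster_name, node_count):
        return ["create Heat stack %s with %d instances"
                % (cluster_name, node_count)]


# Hypothetical registry: the engine is selected by name, for example from
# a configuration option, so the rest of provisioning stays unchanged.
ENGINES = {"direct": DirectEngine, "heat": HeatEngine}


def get_engine(name="direct"):
    """Instantiate the engine registered under the given name."""
    try:
        return ENGINES[name]()
    except KeyError:
        raise ValueError("unknown provisioning engine: %r" % name)
```

With this layout, removing the “direct” implementation later would amount to dropping one registry entry, provided both engines honor the same interface.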
Spawning VMs, Common Configuration
Spawning VMs should be fairly straightforward with Heat, except for a couple of things:
1. Savanna provides an option to spawn cluster nodes with an anti-affinity filter, which does not seem to be implemented in Heat yet.
The current plan is to wait until instance node groups (https://blueprints.launchpad.net/nova/+spec/instance-group-api-extension) are implemented in Nova, which is expected in early Icehouse, and then integrate the feature into Heat.
2. Performing common configuration after instances have started. We also need to pass to the instances information that becomes available only _after_ they have started. For example, generating /etc/hosts requires passing the IPs of all instances to each node.
Preliminary research shows that a combination of cloud-init, os-collect-config and os-apply-config should work here.
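To illustrate the /etc/hosts case from item 2, here is a minimal sketch of the rendering step that would run once all instance IPs are known. The function name render_hosts, the (hostname, ip) input format and the default domain "novalocal" are assumptions made for this example. How the data actually reaches each node (for instance through os-collect-config and os-apply-config) is not shown.

```python
def render_hosts(instances, domain="novalocal"):
    """Render /etc/hosts content from an iterable of (hostname, ip) pairs.

    Hypothetical helper: the input format and the default domain are
    assumptions for illustration, not part of Savanna.
    """
    lines = ["127.0.0.1 localhost"]
    # Sort by hostname so every node receives an identical file.
    for hostname, ip in sorted(instances):
        lines.append("%s %s.%s %s" % (ip, hostname, domain, hostname))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_hosts([("worker-1", "10.0.0.3"), ("master", "10.0.0.2")]))
```

The same content has to be delivered to every node, which is why this step can only happen after all instances have been assigned addresses.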
Hadoop installation and/or configuration
This part is more complex and less clear at the moment, since it involves changing several existing plugins. We need to redefine the plugin SPI in a way that:
- Suits the needs of all plugins;
- Could be partially moved into Heat template.
In general, the following approach looks suitable: let a plugin define the configuration for each node as a script, which could later be passed to cloud-init. If configuration requires knowledge that can be obtained only after the cluster instances have started, it could again be handled with os-collect-config and os-apply-config. Such a script could perform the preparation steps, after which Savanna would only communicate with the installed and configured management console, for example through its REST API, to finish provisioning the Hadoop services.
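The per-node script idea could look roughly like the sketch below. ExamplePlugin and get_node_script are hypothetical names, not the existing plugin SPI, and the generated commands are placeholders rather than real installation steps. The only behavior relied on is that cloud-init runs user data beginning with "#!" as a script on first boot.

```python
class ExamplePlugin:
    """Hypothetical plugin; not part of the current Savanna plugin SPI."""

    def get_node_script(self, processes):
        """Return a shell script configuring a node that runs `processes`.

        The result starts with a shebang so that cloud-init would treat it
        as a user-data script.
        """
        lines = ["#!/bin/bash", "set -e"]
        for process in processes:
            # Placeholder command: a real plugin would emit its
            # distribution-specific install and configuration steps here.
            lines.append(
                "echo 'configuring %s' >> /tmp/provision.log" % process)
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(ExamplePlugin().get_node_script(["namenode", "jobtracker"]))
```

A script produced this way could be embedded in a Heat template as instance user data, which is what would allow this part of the SPI to move into the template.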