

Revision as of 14:36, 26 April 2013 by Alexander Ignatov (talk | contribs) (Overview)


Savanna Pluggable Provisioning Mechanism aims to deploy Hadoop clusters and integrate them with 3rd party vendor management tools such as Cloudera Management Console, Hortonworks Ambari, and Intel Hadoop Distribution.

Additionally, we introduce two new objects: node processes and node types.

A Node Process is simply a process that can run on some node in the cluster. Here is a list of the supported node processes:

  1. management / mgmt
  2. jobtracker / jt
  3. namenode / nn
  4. tasktracker / tt
  5. datanode / dn

A Node Type is a description of which node processes (one or several) should run on a specific node of the cluster. Here is a list of some node types:

  1. mgmt
  2. jt+nn
  3. jt
  4. nn
  5. tt+dn
  6. tt
  7. dn
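
The relationship between node processes and node types can be sketched as follows; this is an illustrative model only, not Savanna's actual internal representation, and the helper name `node_type` is an assumption:

```python
# Supported node processes, long name -> short name (from the lists above).
NODE_PROCESSES = {"management": "mgmt", "jobtracker": "jt",
                  "namenode": "nn", "tasktracker": "tt", "datanode": "dn"}

def node_type(*processes):
    """Compose a node type name from one or several node processes."""
    unknown = [p for p in processes if p not in NODE_PROCESSES.values()]
    if unknown:
        raise ValueError("unknown node processes: %s" % unknown)
    return "+".join(processes)

# node_type("jt", "nn") -> "jt+nn"; node_type("tt", "dn") -> "tt+dn"
```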

User-Savanna-Plugin interoperability


Savanna Pluggable Mechanism consists of three components:

  1. Image Registry;
  2. VM Manager;
  3. Plugins.

Components responsibility:

  1. Image Registry:
    1. register image in Savanna;
    2. add/remove tags to/from images;
    3. get images by tags;
  2. VM Manager:
    1. launch/terminate vms;
    2. get vm status;
    3. ssh/scp/etc to vm;
  3. Plugins:
    1. get extra conf (specific for the concrete plugin);
    2. launch / terminate clusters;
    3. add / remove node;
    4. validation ops.
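
The component split above can be sketched as three interfaces; class and method names here are illustrative, mirroring the responsibility lists rather than Savanna's real API:

```python
class ImageRegistry:
    """Registers images in Savanna, manages tags, queries by tags."""
    def register_image(self, image_id):
        raise NotImplementedError

    def add_tag(self, image_id, tag):
        raise NotImplementedError

    def remove_tag(self, image_id, tag):
        raise NotImplementedError

    def get_images_by_tags(self, tags):
        raise NotImplementedError


class VMManager:
    """Launches/terminates vms, reports status, gives ssh/scp access."""
    def launch_vm(self, flavor, image):
        raise NotImplementedError

    def terminate_vm(self, vm_id):
        raise NotImplementedError

    def get_vm_status(self, vm_id):
        raise NotImplementedError

    def run_ssh(self, vm_id, command):
        raise NotImplementedError


class Plugin:
    """Plugin-specific config, cluster lifecycle, node ops, validation."""
    def get_extra_conf(self):
        raise NotImplementedError

    def launch_cluster(self, cluster):
        raise NotImplementedError

    def terminate_cluster(self, cluster):
        raise NotImplementedError

    def add_node(self, cluster, node):
        raise NotImplementedError

    def remove_node(self, cluster, node):
        raise NotImplementedError

    def validate(self, cluster):
        raise NotImplementedError
```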

Zones of responsibility

  1. Savanna:
    1. provides resources and infrastructure (pre-configured vms, dns, etc.);
    2. cluster topologies, nodes and storage placement;
    3. cluster/hadoop/tooling configurations and state storage;
  2. Plugins:
    1. cluster monitoring;
    2. additional tools installation and management (Pig, Hive, etc.);
    3. final cluster configuration and hadoop management;
    4. add/remove nodes to/from cluster (with prepared by Savanna resources).


Cluster creation workflow for User:

  1. get list of plugins;
  2. specify cluster name;
  3. choose plugin version and hadoop version (only minor variation);
  4. specify cluster configuration:
  5. choose a common cluster configuration if needed;
  6. specify flavors for job tracker and name node;
  7. [optional] choose flavor for the management node (if applicable);
  8. add worker nodes with a specific node type (data node, task tracker, or data node + task tracker) and flavor (each of them can be specified several times with different flavors or templates);
  9. [optional] fetch list of custom templates and override the cluster configuration, using these templates;
  10. [optional] override some cluster parameters;
  11. launch cluster;
  12. savanna performs basic validation and passes cluster configuration to the plugin;
  13. plugin validates the request; if it is valid, an infrastructure request is generated;
  14. infrastructure request will contain:
    1. list of tuples (flavor, image, number of instances);
    2. list of actions that are needed to be done after machine started e.g. password-less ssh, setup DNS;
  15. savanna creates and prepares infrastructure and passes description to plugin;
  16. plugin launches Hadoop cluster.
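
The infrastructure request of step 14 can be sketched as a small helper; all field names here (`worker_nodes`, `flavor`, `count`, the action names) are assumptions for illustration:

```python
def build_infra_request(cluster_request, image_id):
    """Turn worker node specs into a list of (flavor, image, count)
    tuples plus the post-vm-start actions mentioned in step 14."""
    instances = [(w["flavor"], image_id, w["count"])
                 for w in cluster_request["worker_nodes"]]
    # Actions needed after machines start, e.g. password-less ssh, DNS setup.
    actions = ["setup_passwordless_ssh", "setup_dns"]
    return {"instances": instances, "post_boot_actions": actions}
```

For example, a request with four `tt+dn` workers on flavor `m1.large` would yield one `("m1.large", image_id, 4)` tuple.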

Savanna - plugins interoperability workflow:

  1. User fetches extra cluster configs from Savanna API (Savanna delegates this call to the concrete provisioning plugin);
  2. User launches cluster (adds/removes nodes) using Savanna API;
  3. Savanna parses the request and runs common validations on it;
  4. Savanna determines which provisioning plugin should be used;
  5. Savanna runs plugin-specific validation for the current operation;
  6. Savanna creates (modifies) cluster object in DB, returns response to user and starts background job that will provision and launch cluster;
  7. User receives response with info about created (modified) cluster from Savanna API;
  8. Savanna calls in background the “launch cluster” (add/remove nodes) method of the provisioning plugin;
  9. Plugin receives cluster configuration and can start vms from tagged images optionally using VM Manager and Image Registry;
  10. VM Manager provides helpers for ssh/scp/etc to vms;
  11. Plugin should configure and start the 3rd party vendor management tool on the management vm; this tool will then control the Hadoop cluster;
  12. Plugin can update cluster status and info to expose information about it.
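
Steps 3 through 8 of this workflow can be sketched as a single dispatch function. In Savanna the launch runs as a background job; here it is called inline for brevity, and all names (`handle_launch`, the dict-backed `db`, the field names) are assumptions:

```python
def handle_launch(request, plugins, db):
    """Validate a launch request, persist the cluster, start it."""
    if not request.get("name"):                # step 3: common validation
        raise ValueError("cluster name is required")
    plugin = plugins[request["plugin"]]        # step 4: pick the plugin
    plugin.validate_cluster(request)           # step 5: plugin validation
    cluster = {"name": request["name"], "status": "Starting"}
    db[cluster["name"]] = cluster              # step 6: create cluster in DB
    plugin.start_cluster(cluster, vms=[])      # step 8: background job in Savanna
    return cluster                             # step 7: response to the user
```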

Python API level functions

Provisioning plugin functions:

  1. get_versions() - get all versions of hadoop that could be used with plugin
  2. get_configs() - list of all configs supported by plugin with descriptions, defaults and node process for which this config is applicable
  3. get_supported_types() - list of all supported NodeTypes, for example, jt+nn and tt+dn
  4. validate_cluster(cluster_description) - custom validation
  5. get_infra(cluster_description) - plugin should return a list of tuples (flavor, image, count, config=”reset_pswd, generate_keys, etc.”)
  6. configure_cluster(cluster_description, vms)
  7. start_cluster(cluster_description, vms)
  8. on_terminate_cluster(cluster_description)
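
The function list above can be sketched as an abstract base class; the method names follow the list, while the class name and docstrings are assumptions:

```python
import abc

class ProvisioningPluginBase(abc.ABC):
    """Sketch of the provisioning plugin API listed above."""

    @abc.abstractmethod
    def get_versions(self):
        """All versions of Hadoop that could be used with the plugin."""

    @abc.abstractmethod
    def get_configs(self):
        """Supported configs with descriptions, defaults and node processes."""

    @abc.abstractmethod
    def get_supported_types(self):
        """Supported NodeTypes, e.g. jt+nn and tt+dn."""

    @abc.abstractmethod
    def validate_cluster(self, cluster_description):
        """Custom validation."""

    @abc.abstractmethod
    def get_infra(self, cluster_description):
        """Return a list of (flavor, image, count, config) tuples."""

    @abc.abstractmethod
    def configure_cluster(self, cluster_description, vms):
        """Configure the cluster on the prepared vms."""

    @abc.abstractmethod
    def start_cluster(self, cluster_description, vms):
        """Start the configured cluster."""

    @abc.abstractmethod
    def on_terminate_cluster(self, cluster_description):
        """Cleanup hook called on cluster termination."""
```

A concrete plugin would subclass this and implement every method; instantiating the base class directly fails.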

The Image Registry will provide the ability to set Glance properties to store some info about an image, for example:

  1. _savanna_tag_<tag-name>: True
  2. _savanna_description: “short description”
  3. _savanna_os: “ubuntu-12.04-x86_64”
  4. _savanna_hadoop: “hadoop-1.1.1”
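
Building that `_savanna_*` property dict can be sketched as a small helper; the helper name is an assumption, but the key convention follows the examples above:

```python
def image_properties(tags, description, os, hadoop):
    """Build a Glance property dict in the _savanna_* convention."""
    props = {"_savanna_tag_%s" % tag: True for tag in tags}
    props["_savanna_description"] = description
    props["_savanna_os"] = os
    props["_savanna_hadoop"] = hadoop
    return props
```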

Image Registry functions:

  1. cluster image-related properties:
    1. base image info (applied to all nodes in cluster)
      1. base_image_tag
      2. base_image_id
    2. management image info (applied to management node only)
      1. management_image_tag
      2. management_image_id
  2. ability to register image with some tags and description
  3. ability to add/remove tag to/from image
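
The register and tag functions above can be sketched with an in-memory registry; a real registry would persist tags as Glance image properties, and all names here are illustrative:

```python
class InMemoryImageRegistry:
    """In-memory sketch of the Image Registry functions above."""

    def __init__(self):
        self._images = {}  # image_id -> {"tags": set, "description": str}

    def register(self, image_id, tags=(), description=""):
        """Register an image with some tags and a description."""
        self._images[image_id] = {"tags": set(tags),
                                  "description": description}

    def add_tag(self, image_id, tag):
        self._images[image_id]["tags"].add(tag)

    def remove_tag(self, image_id, tag):
        self._images[image_id]["tags"].discard(tag)

    def get_images_by_tags(self, tags):
        """Return ids of images carrying all of the given tags."""
        wanted = set(tags)
        return [i for i, meta in self._images.items()
                if wanted <= meta["tags"]]
```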