Revision as of 20:44, 19 April 2013 by Sergey Lukjanov (talk | contribs) (Workflows)


The Savanna Pluggable Provisioning Mechanism aims to deploy Hadoop clusters and integrate them with third-party vendor management tools (Cloudera Management Console, Hortonworks Ambari, Intel Hadoop Distribution) and monitoring tools (Nagios, Zabbix, Ganglia).

The Savanna Pluggable Mechanism consists of three components:

  1. Image Registry;
  2. VM Manager;
  3. Plugins.

Component responsibilities:

  1. Image Registry:
    1. register images in Savanna;
    2. add/remove tags to/from images;
    3. get images by tags;
  2. VM Manager:
    1. launch/terminate VMs;
    2. get VM status;
    3. SSH/SCP/etc. to a VM;
  3. Plugins:
    1. get extra configuration (specific to the concrete plugin);
    2. launch/terminate clusters;
    3. add/remove nodes;
    4. validation operations.
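The Image Registry operations listed above can be sketched as a minimal in-memory class. The class and method names here are illustrative assumptions, not the real Savanna API:

```python
class ImageRegistry:
    """In-memory sketch of the Image Registry: register images, manage tags,
    and look images up by tags (names are hypothetical)."""

    def __init__(self):
        self._images = {}  # image_id -> set of tags

    def register(self, image_id):
        # register an image in the registry
        self._images.setdefault(image_id, set())

    def add_tags(self, image_id, *tags):
        self._images[image_id].update(tags)

    def remove_tags(self, image_id, *tags):
        self._images[image_id].difference_update(tags)

    def get_by_tags(self, *tags):
        # return ids of images carrying all of the given tags
        wanted = set(tags)
        return [i for i, t in self._images.items() if wanted <= t]
```

A plugin could, for example, register an Ubuntu image, tag it with `hadoop` and a version, and later query for all images matching those tags.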

Zones of responsibility

  1. Savanna:
    1. provides resources and infrastructure (pre-configured VMs, DNS, etc.);
    2. cluster topologies, node and storage placement;
    3. cluster/Hadoop/tooling configuration and state storage;
  2. Plugins:
    1. cluster monitoring;
    2. installation and management of additional tools (Pig, Hive, etc.);
    3. final cluster configuration and Hadoop management;
    4. add/remove nodes to/from a cluster (using resources prepared by Savanna).
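The plugin side of this responsibility split can be expressed as an abstract contract: Savanna prepares the infrastructure, and every plugin must know how to validate, launch, and resize a cluster on top of it. The base class and method names below are assumptions for illustration, not the real plugin SPI:

```python
import abc


class ProvisioningPlugin(abc.ABC):
    """Hypothetical sketch of the contract a provisioning plugin fulfills."""

    @abc.abstractmethod
    def get_extra_configs(self):
        """Return plugin-specific configuration options."""

    @abc.abstractmethod
    def validate(self, cluster_spec):
        """Plugin-specific validation of a cluster request."""

    @abc.abstractmethod
    def launch_cluster(self, infrastructure):
        """Configure and start Hadoop on VMs prepared by Savanna."""

    @abc.abstractmethod
    def add_nodes(self, cluster, nodes):
        """Attach new nodes (resources already prepared by Savanna)."""

    @abc.abstractmethod
    def remove_nodes(self, cluster, nodes):
        """Detach nodes from a running cluster."""
```

A vendor integration (e.g. one wrapping a management console) would subclass this and implement each operation against its own tooling.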


Cluster creation workflow for the user:

  1. get the list of plugins;
  2. specify a cluster name;
  3. choose the plugin version and Hadoop version (only minor variation);
  4. specify the cluster configuration:
    1. choose a common cluster configuration if needed;
    2. specify flavors for the job tracker and name node;
    3. [optional] choose a flavor for the management node (if applicable);
    4. add worker nodes with a specific node type (data node, task tracker, or data node + task tracker) and flavor (each of them could be specified several times with different flavors or templates);
    5. [optional] fetch the list of custom templates and override the cluster configuration using these templates;
    6. [optional] override some cluster parameters;
  5. launch the cluster;
  6. Savanna performs basic validation and passes the cluster configuration to the plugin;
  7. the plugin validates the request; if it is valid, an infrastructure request is generated;
  8. the infrastructure request contains:
    1. a list of tuples (flavor, image, number of instances);
    2. a list of actions to perform after a machine starts, e.g. password-less SSH, DNS setup;
  9. Savanna creates and prepares the infrastructure and passes its description to the plugin;
  10. the plugin launches the Hadoop cluster.
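A cluster-creation request built from the steps above might look like the following. The field names are illustrative assumptions, not the actual Savanna API schema:

```python
# Hypothetical cluster-creation request: plugin choice, Hadoop version,
# and worker node groups with flavors and counts (field names assumed).
request = {
    "name": "demo-cluster",
    "plugin": "vanilla",
    "hadoop_version": "1.1.2",
    "node_groups": [
        {"role": "jobtracker+namenode", "flavor": "m1.large", "count": 1},
        {"role": "datanode+tasktracker", "flavor": "m1.medium", "count": 4},
    ],
    "cluster_configs": {},  # optional parameter overrides
}


def basic_validation(req):
    # the kind of basic check Savanna could perform before delegating
    # plugin-specific validation to the plugin itself
    assert req["name"] and req["plugin"] and req["hadoop_version"]
    assert any(g["count"] > 0 for g in req["node_groups"])
    return True
```

After this basic check passes, the request would be handed to the chosen plugin, which turns it into an infrastructure request (flavor/image/count tuples plus post-boot actions).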

Savanna / plugins interoperability workflow:

  1. the user fetches extra cluster configs from the Savanna API (Savanna delegates this call to the concrete provisioning plugin);
  2. the user launches a cluster (adds/removes nodes) using the Savanna API;
  3. Savanna parses the request and runs common validations on it;
  4. Savanna determines which provisioning plugin should be used;
  5. Savanna runs plugin-specific validation for the current operation;
  6. Savanna creates (modifies) the cluster object in the DB, returns a response to the user, and starts a background job that will provision and launch the cluster;
  7. the user receives a response with info about the created (modified) cluster from the Savanna API;
  8. Savanna calls the "launch cluster" (add/remove nodes) method of the provisioning plugin in the background;
  9. the plugin receives the cluster configuration and can start VMs from tagged images, optionally using the VM Manager and Image Registry;
  10. the VM Manager provides helpers for SSH/SCP/etc. to VMs;
  11. the plugin should configure and start the third-party vendor management tool on the management VM, and this tool will control the Hadoop cluster;
  12. the plugin can update the cluster status and info to expose information about it.
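The Savanna-side steps above (parse, pick a plugin, validate, persist, respond, then launch in the background) can be condensed into a sketch. All names are assumptions; the real service works through a REST API, a database, and asynchronous workers rather than a dict and a thread:

```python
import threading


def handle_launch(request, plugins, db):
    """Hypothetical sketch of Savanna's launch handling (steps 4-8 above)."""
    plugin = plugins[request["plugin"]]          # determine the plugin
    plugin.validate(request)                     # plugin-specific validation
    cluster = dict(request, status="Starting")   # create the cluster object
    db[request["name"]] = cluster                # persist it
    # background job: provision infrastructure and call into the plugin
    worker = threading.Thread(target=plugin.launch_cluster, args=(cluster,))
    worker.start()
    return cluster  # the user gets an immediate response
```

The immediate return models step 7: the user sees the cluster in a "Starting" state while the plugin brings up Hadoop in the background and later updates the status.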

Savanna may leverage a Workflow or Orchestration service once one matures. See also: Workflow Service vision.