Difference between revisions of "Sahara/SparkPlugin"

Revision as of 08:27, 20 November 2013

Introduction

Spark is an in-memory cluster computing engine, written in Scala, that generalizes the MapReduce model.
This blueprint proposes a Savanna provisioning plugin for Spark that can launch and resize Spark clusters and run EDP (Elastic Data Processing) jobs.

Requirements

Support for Spark 0.8.0 and later is planned, since these versions have relaxed dependencies on Hadoop and HDFS library versions. Only Spark in standalone mode is targeted; no support for Mesos or YARN is planned for now.

Status

A script is being developed to configure Spark and HDFS on a cluster, starting from a VM snapshot.

Later, a VM image will be created and the script will be translated into the new plugin, reusing the code from the vanilla plugin to configure HDFS.
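
As an illustration of what the Spark part of such a configuration script might generate, here is a minimal Python sketch. It is not the script described above. The function names and the host names in the usage note are hypothetical. It assumes the standalone-mode conventions of Spark 0.8.0: a `conf/slaves` file with one worker host per line, the `SPARK_MASTER_IP` and `SPARK_WORKER_MEMORY` variables in `conf/spark-env.sh`, and the default master port 7077. HDFS configuration is left out.

```python
# Hypothetical helpers: they only render file contents as strings and
# do not touch the cluster. Names and defaults are assumptions.


def render_slaves(worker_hosts):
    """Return the contents of conf/slaves: one worker hostname per line."""
    return "\n".join(worker_hosts) + "\n"


def render_spark_env(master_host, worker_memory="1g"):
    """Return a minimal conf/spark-env.sh for standalone mode."""
    lines = [
        "export SPARK_MASTER_IP=%s" % master_host,
        "export SPARK_WORKER_MEMORY=%s" % worker_memory,
    ]
    return "\n".join(lines) + "\n"


def master_url(master_host, port=7077):
    """Return the spark:// URL that workers and applications connect to."""
    return "spark://%s:%d" % (master_host, port)
```

For a cluster with a master named `master-001` and two workers, `render_slaves(["worker-001", "worker-002"])` would give the two-line `conf/slaves` file and `master_url("master-001")` would give `spark://master-001:7077`. Keeping the rendering separate from the file transfer should make it easier to move this logic into a plugin later.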

Implementation Notes

TBD

@@ Line 7: / Line 7: @@
 Support for Spark 0.8.0 and later is planned, since these versions have relaxed dependencies on Hadoop and HDFS library versions. Only Spark in ''standalone'' mode is targeted; no support for Mesos or YARN is planned for now.
+== Status ==
+A script is being developed to configure Spark and HDFS on a cluster, starting from a VM snapshot.
+Later, a VM image will be created and the script will be translated into the new plugin, reusing the code from the vanilla plugin to configure HDFS.
 == Implementation Notes ==