
Difference between revisions of "Sahara/ClusterHA"

Revision as of 06:13, 10 September 2013

Summary

It shall provide system-level HA, so that even if a component fails during Hadoop provisioning, the system can still complete the provisioning without defect or error.

Release Note

When this is implemented, the system shall be able to complete the Hadoop provisioning without defect or error.

User stories

  1. The user is not aware that the VM has been rebuilt.
  2. The operator gets the list of failed hosts from the monitoring system or from the nova network/compute state.
  3. The operator gets the list of instances on the failed hosts.
  4. The operator rebuilds those instances by using this operation.
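
The operator workflow in stories 2 to 4 can be sketched as follows. This is a minimal illustration only: the function names, the `host_states` and `instances_by_host` mappings, and the `rebuild` callback are hypothetical and do not correspond to any real nova or Sahara API.

```python
from typing import Callable, Dict, List


def failed_hosts(host_states: Dict[str, bool]) -> List[str]:
    """Return the hosts reported as down (value False), in sorted order."""
    return sorted(host for host, is_up in host_states.items() if not is_up)


def instances_on_hosts(hosts: List[str],
                       instances_by_host: Dict[str, List[str]]) -> List[str]:
    """Collect the instance IDs that were running on the given hosts."""
    found: List[str] = []
    for host in hosts:
        found.extend(instances_by_host.get(host, []))
    return found


def rebuild_failed(host_states: Dict[str, bool],
                   instances_by_host: Dict[str, List[str]],
                   rebuild: Callable[[str], None]) -> List[str]:
    """Invoke the (hypothetical) rebuild operation on every affected instance."""
    targets = instances_on_hosts(failed_hosts(host_states), instances_by_host)
    for instance_id in targets:
        rebuild(instance_id)
    return targets
```

In practice `host_states` would be populated from the monitoring system or the nova service state, and `rebuild` would wrap the operation this specification proposes.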

Design

Implementation

  • The Scheduler should call the Compute Manager to rebuild the instance.
  • The Compute Manager should get the instance's network data as a list of dictionaries.
  • The Compute Manager should update the volume DB and the instance DB.
  • The Compute Manager should set up volumes for the block device mapping by using the volume manager.
  • The Compute Manager should spawn the instance by using the virt driver.
  • The Compute Manager should associate the floating IP by using the network manager.
  • The Compute Manager should update the instance DB.
  • The Compute Manager should restart the network module.
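
The ordering of the steps above can be sketched as a simple sequential pipeline. This is an assumption-laden outline, not the actual nova code path: the step names in `REBUILD_STEPS` and the `run_step` callback are hypothetical labels for the bullets above, and a real implementation would call the volume manager, virt driver and network manager at the corresponding steps.

```python
from typing import Callable, List

# Hypothetical step labels, one per Compute Manager bullet, in the listed order.
REBUILD_STEPS: List[str] = [
    "get_network_info",               # list of dicts of network data
    "update_volume_and_instance_db",
    "setup_block_device_mapping",     # via the volume manager
    "spawn_instance",                 # via the virt driver
    "associate_floating_ip",          # via the network manager
    "update_instance_db",
    "restart_network_module",
]


def rebuild_instance(instance_id: str,
                     run_step: Callable[[str, str], None]) -> List[str]:
    """Run every rebuild step in order and return the steps that completed.

    run_step(step_name, instance_id) is a caller-supplied hook; if it raises,
    the exception propagates and the remaining steps are not attempted.
    """
    completed: List[str] = []
    for step in REBUILD_STEPS:
        run_step(step, instance_id)
        completed.append(step)
    return completed
```

Keeping the steps in one ordered list makes the sequence explicit and lets a test substitute a recording hook for the real managers.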

Code Changes

Test/Demo Plan

This need not be added or completed until the specification is nearing beta. 

Unresolved issues

TBD