Difference between revisions of "Sahara/ClusterHA"
Revision as of 06:51, 10 September 2013
Summary
It shall provide system-level HA: even if a component fails during Hadoop provisioning, the system shall be able to complete the provisioning without defect or error.
Release Note
When this is implemented, the system shall be able to complete the Hadoop provisioning without defect or error.
User stories
- The operator gets the list of failed clusters through the Savanna web interface.
- The operator clicks the resume icon.
- The failed cluster is recreated by this operation.
Design
Implementation
- Check the cluster status:
  - Instance (is it up? is it accessible?)
  - Volume creation, attachment, and mounting
  - Ambari server/agent installation and configuration
- If an error is detected, perform the following steps:
  - Update the DB used by the ClusterHA module (table name: ClusterHA)
  - Delete the instance
  - Detach and delete the volume
  - Jump to returning the value (cluster_id, status)
- Resume cluster creation
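The flow above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the planned implementation: every name here (Instance, Cluster, cluster_ha_table, check_and_clean, resume, the "Active"/"Error" status strings) is hypothetical, the ClusterHA table is stood in for by an in-memory dict, and the real instance, volume, and Ambari checks are reduced to boolean flags.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# All names below are hypothetical; none of them are Savanna/Sahara APIs.


@dataclass
class Instance:
    name: str
    accessible: bool = True         # instance is up and reachable
    volume_mounted: bool = True     # volume created, attached, and mounted
    ambari_configured: bool = True  # Ambari server/agent installed and configured


@dataclass
class Cluster:
    cluster_id: str
    instances: List[Instance] = field(default_factory=list)


# Stand-in for the ClusterHA table: cluster_id -> last recorded status.
cluster_ha_table: Dict[str, str] = {}


def is_healthy(inst: Instance) -> bool:
    """An instance is healthy only if all three checks pass."""
    return inst.accessible and inst.volume_mounted and inst.ambari_configured


def check_and_clean(cluster: Cluster) -> Tuple[str, str]:
    """Check the cluster; on error, record it and drop the failed instances."""
    if all(is_healthy(inst) for inst in cluster.instances):
        cluster_ha_table[cluster.cluster_id] = "Active"
        return cluster.cluster_id, "Active"

    # Record the failure first so that a later resume can find this cluster.
    cluster_ha_table[cluster.cluster_id] = "Error"
    # A real implementation would detach and delete each volume and delete
    # each instance here; this sketch only removes them from the list.
    cluster.instances = [inst for inst in cluster.instances if is_healthy(inst)]
    return cluster.cluster_id, "Error"


def resume(
    cluster: Cluster,
    provision: Callable[[str], Instance],
    target_size: int,
) -> Tuple[str, str]:
    """Re-create missing instances for a failed cluster, then re-check it."""
    status = cluster_ha_table.get(cluster.cluster_id, "Unknown")
    if status != "Error":
        return cluster.cluster_id, status  # nothing to resume
    while len(cluster.instances) < target_size:
        new_name = f"{cluster.cluster_id}-node-{len(cluster.instances)}"
        cluster.instances.append(provision(new_name))
    return check_and_clean(cluster)
```

In this sketch the operator's resume action would call resume() with a provisioning callback. Keeping the (cluster_id, status) return value identical for the check and resume paths means a caller can handle both the same way.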
Code Changes
- service/api.py
- service/instances.py
- service/volumes.py
- plugins/hdp/hadoopserver.py
- plugins/hdp/ambariplugin.py
- conductor/api.py
- conductor/manager.py
- db/api.py
- db/sqlalchemy/api.py
- db/sqlalchemy/models.py
...etc...
Test/Demo Plan
This need not be added or completed until the specification is nearing beta.
Unresolved issues
TBD