Nova-vm-state-management
Latest revision as of 23:30, 17 February 2013
- Launchpad Entry: Improve VM State Management to constrain state transitions (https://blueprints.launchpad.net/nova/+spec/nova-vm-state-management)
- Created: 12 Oct 2011
- Contributors: Phil Day (HP Cloud Services)
Summary
This blueprint would constrain the valid state transitions to a limited subset, and ensure that the remaining transitions lead to consistent and deterministic behavior.
Specifically:
- Limit the valid operations in each state (for example, only a paused instance can be resumed)
- Make some minor changes to the state sequence to make the above robust
- Ensure that long-running operations check the current state rather than assuming it is unchanged
Rationale
Current checks on valid state transitions are limited to a few cases, leaving multiple opportunities for non-deterministic behavior. In addition, some long-running tasks can lead to odd behavior: for example, a VM in the building state can spend a long time in image download, be terminated, and then, when the image download completes, go ahead and launch anyway.
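The race described above can be illustrated with a minimal sketch. All names here (`FakeInstance`, `download_image`, `build_instance`, the `on_chunk` hook) are hypothetical and are not Nova code; the sketch only shows a long-running step re-reading the instance state before launching, rather than assuming it is unchanged.

```python
class FakeInstance:
    """Hypothetical stand-in for an instance record; not Nova's model."""

    def __init__(self):
        self.vm_state = "building"
        self.launched = False


def download_image(instance, chunks, on_chunk=None):
    """Simulate a long download; stop early if the instance was deleted."""
    for chunk in range(chunks):
        if on_chunk is not None:
            on_chunk(chunk)  # hook that lets a caller change state mid-download
        if instance.vm_state == "deleted":
            return False  # honour the terminate request
    return True


def build_instance(instance, chunks=3, on_chunk=None):
    """Launch only if the instance is still building after the download."""
    if not download_image(instance, chunks, on_chunk):
        return instance
    # Re-check the current state instead of assuming it is unchanged.
    if instance.vm_state == "building":
        instance.vm_state = "active"
        instance.launched = True
    return instance
```

With this shape, an instance terminated during the download is never launched, while an undisturbed build proceeds to the active state.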
Design
VM State is recorded in three instance attributes:
- "power_state": derived from the hypervisor
- "vm_state": changed by Nova code, generally at the start and end of main actions
- "task_state": changed by Nova code to reflect transient steps within an action
For example, the following shows how these state values are updated during a Create action:
Node | power_state | vm_state | task_state
API | | Building |
Scheduler | | Building |
Compute | | Building |
 | | Building |
 | | Building |
 | Running | Active |
The full set of state transitions will be mapped out and provided back to the documentation team. From those already mapped, we can make the following observations:
- Most actions set vm_state and task_state early (in compute/api.py), so in-progress tasks can be determined by task_state != None
- Most actions clear task_state on completion, so many actions can be checked by a combination of vm_state and task_state = None
- At least one valid action (terminate) must always remain available
- Long-running actions (such as image download) should periodically update task_state so users can tell that progress is being made
- Long-running actions should check for and honour state changes (specifically terminated)
- The reported state should be a combination of vm_state and task_state
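Two of these observations can be sketched in a few lines: detecting an in-progress task from task_state, and reporting a state that combines vm_state and task_state. The function names and the display format are assumptions made for illustration; the blueprint says only that the reported state should combine the two values.

```python
def in_progress(task_state):
    """An action is in progress whenever task_state is set."""
    return task_state is not None


def reported_state(vm_state, task_state):
    """Combine vm_state and task_state into one user-visible string.

    The "vm_state (task_state)" format is a hypothetical choice.
    """
    if not in_progress(task_state):
        return vm_state
    return f"{vm_state} ({task_state})"
```

For instance, `reported_state("Active", "Resize_verify")` would give `Active (Resize_verify)`, while an idle instance reports its vm_state alone.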
The initial proposal for valid transitions is as follows:
vm_state | task_state |
 | !=None |
Active | Resize_verify |
Active | None |
Building | |
ReBuilding | |
Paused | |
Suspended | |
Rescued | |
Deleted | |
Stopped | |
Migrating | |
Resizing | |
Error |
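A table of this kind could be held as a lookup keyed on (vm_state, task_state). The sketch below is an assumption-laden illustration, not the blueprint's actual table: the only entries used are the two rules stated elsewhere in this document (a paused instance can be resumed, and terminate is always valid), and `ALLOWED` and `valid_actions` are hypothetical names.

```python
TERMINATE = "terminate"

# (vm_state, task_state) -> actions allowed in addition to terminate.
# Only the Paused entry comes from the text; real entries would be
# filled in from the full transition table.
ALLOWED = {
    ("Paused", None): {"resume"},
}


def valid_actions(vm_state, task_state):
    """Return the set of actions permitted for the given state pair.

    Any pair without an entry, including an unlisted in-progress task
    (task_state != None), falls back to terminate only. That fallback
    is an assumption of this sketch.
    """
    return {TERMINATE} | ALLOWED.get((vm_state, task_state), set())
```

Keeping terminate outside the table guarantees that at least one action is always valid, whatever the table contains.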
UI Changes
No changes are required to the UI.
Code Changes
The checks for valid actions will be implemented as a decorator, for example:

    @check_vm_state("delete")
    @scheduler_api.reroute_compute("delete")
    def delete(self, context, instance_id):
        """Terminate an instance."""
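One way such a decorator might work is sketched here. This is not the Nova implementation: `InvalidStateError`, the `RULES` table, and the `API` class are hypothetical, and the wrapped method receives an instance dict rather than an instance_id so that the example needs no database lookup.

```python
import functools


class InvalidStateError(Exception):
    """Hypothetical error raised when an action is not allowed."""


# Hypothetical rules: action -> predicate over (vm_state, task_state).
RULES = {
    # Terminate must always remain valid.
    "delete": lambda vm_state, task_state: True,
    # Only a paused instance with no task in progress can be resumed.
    "resume": lambda vm_state, task_state: (
        vm_state == "Paused" and task_state is None
    ),
}


def check_vm_state(action):
    """Reject the wrapped call unless `action` is valid for the instance."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, context, instance, *args, **kwargs):
            rule = RULES.get(action)
            vm_state = instance["vm_state"]
            task_state = instance["task_state"]
            if rule is None or not rule(vm_state, task_state):
                raise InvalidStateError(
                    f"{action} not allowed in {vm_state}/{task_state}"
                )
            return func(self, context, instance, *args, **kwargs)
        return wrapper
    return decorator


class API:
    """Hypothetical stand-in for the compute API class."""

    @check_vm_state("resume")
    def resume(self, context, instance):
        instance["vm_state"] = "Active"
        return instance
```

Calling `API().resume(None, {"vm_state": "Paused", "task_state": None})` succeeds, while the same call on an Active instance raises `InvalidStateError` before the method body runs.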
Some other changes may be required to ensure that vm_state and task_state are set consistently (for example, task_state is currently set to None for a short period during Rebuild, and live_migration doesn't update state at all).
Migration
TBD
Test/Demo Plan
TBD
BoF agenda and discussion
Etherpad from Boston Design Summit