
Status transition issues

Revision as of 23:30, 17 February 2013 by Ryan Lane (talk | contribs) (Text replace - "__NOTOC__" to "")

Status Transition Issues

The instance statuses in Nova don't quite match what an end user would expect. In many cases the server action in progress is not reflected by a corresponding state transition in the OSAPI:

API Issues (What currently happens via the OSAPI)

  • When creating an instance (POST /servers): BUILD --> SHUTDOWN --> ACTIVE.
    • There is sometimes a momentary SHUTDOWN state right before the instance comes online.
  • When rebuilding an instance (POST /servers/<id>/action): BUILD --> ACTIVE.
    • It should actually be ACTIVE --> REBUILD --> ACTIVE.
  • When resizing an instance (POST /servers/<id>/action): ACTIVE --> SHUTDOWN (briefly) --> RESIZE-CONFIRM.
    • There is a quick SHUTDOWN before the RESIZE-CONFIRM.
    • The state transitions for resize should be: ACTIVE --> QUEUE_RESIZE --> PREP_RESIZE --> RESIZE --> VERIFY_RESIZE.
  • When deleting a server (DELETE /servers/<id>): ACTIVE --> BUILD --> (GONE).
    • The instance should not go into the BUILD state momentarily before deletion.
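
The expected sequences listed above can be written down as data and compared with what a client observes while polling the API. The following is a minimal sketch in Python, not Nova code: the names `EXPECTED_TRANSITIONS`, `collapse_repeats`, and `first_unexpected_status` are hypothetical, and only the rebuild and resize sequences stated above are encoded.

```python
# Expected status sequences, taken from the list above.
# All names here are illustrative; none of this is part of Nova.
EXPECTED_TRANSITIONS = {
    "rebuild": ["ACTIVE", "REBUILD", "ACTIVE"],
    "resize": ["ACTIVE", "QUEUE_RESIZE", "PREP_RESIZE", "RESIZE", "VERIFY_RESIZE"],
}


def collapse_repeats(statuses):
    """Drop consecutive duplicates, since polling sees the same status many times."""
    collapsed = []
    for status in statuses:
        if not collapsed or collapsed[-1] != status:
            collapsed.append(status)
    return collapsed


def first_unexpected_status(action, observed):
    """Return the first observed status that deviates from the expected
    sequence for `action`, or None if the observations are a prefix of it
    (the action may simply still be in progress)."""
    expected = EXPECTED_TRANSITIONS[action]
    for index, status in enumerate(collapse_repeats(observed)):
        if index >= len(expected) or status != expected[index]:
            return status
    return None


# The rebuild behavior reported above starts in BUILD instead of ACTIVE.
print(first_unexpected_status("rebuild", ["BUILD", "BUILD", "ACTIVE"]))
# The resize behavior reported above passes through SHUTDOWN.
print(first_unexpected_status("resize", ["ACTIVE", "SHUTDOWN", "RESIZE-CONFIRM"]))
```

A check like this only sees the statuses a client happens to catch; a very short-lived state can be missed between two polls, so a mismatch is a hint rather than proof.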

---

Automatic state updates can cause confusion

If a catastrophic error occurs (an exception, etc.) while creating or manipulating an instance, the state may be set to ERROR. The ERROR state won't persist for long, however, because the _poll_instance_states function in the Nova compute manager polls for mismatches between instance records and VMs and corrects them. The poll runs every 2 minutes, so an instance in the ERROR state will eventually be set to SHUTDOWN. This is confusing to the end user, since the state that gives insight into what actually happened (an error) has been altered by the system.
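
The effect can be illustrated with a simplified model. The sketch below is not the actual `_poll_instance_states` code; it assumes a poll that overwrites the stored state with whatever the hypervisor reports, which is the behavior described above. The `Instance` class and `poll_instance_states` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    # Hypothetical stand-in for a row in the 'instances' table.
    name: str
    state: str


def poll_instance_states(instances, hypervisor_states):
    """Simplified model of the periodic sync: any stored state that
    differs from what the hypervisor reports is overwritten."""
    for instance in instances:
        # Assumption: a VM the hypervisor does not know about counts as SHUTDOWN.
        actual = hypervisor_states.get(instance.name, "SHUTDOWN")
        if instance.state != actual:
            instance.state = actual


failed = Instance(name="web-1", state="ERROR")
# The hypervisor has no running VM for the failed instance.
poll_instance_states([failed], hypervisor_states={})
print(failed.state)  # SHUTDOWN: the ERROR state has been lost
```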

Questions

Nova stores instance state information in the 'instances' table. The 'state' and 'state_description' columns are used.

These columns feel somewhat overloaded. Should we add a new column for the transition state?

Do we need a flag to disable _poll_instance_states for service providers who don't want states to be updated automatically?
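
One possible shape for both ideas, as a sketch only: a separate column for the action in progress, and a flag that gates the automatic sync. The names `transition_state`, `sync_instance_states`, and `sync_state` are hypothetical and are not existing Nova columns, flags, or functions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstanceRecord:
    # Hypothetical row layout: 'state' holds the last settled state,
    # 'transition_state' names the action in progress (None when idle).
    state: str
    transition_state: Optional[str] = None


def sync_state(record, hypervisor_state, sync_instance_states=True):
    """Overwrite the stored state with the hypervisor's view, unless the
    (hypothetical) sync_instance_states flag disables automatic updates."""
    if sync_instance_states:
        record.state = hypervisor_state
    return record


# With the flag off, a stored ERROR survives the periodic sync.
broken = InstanceRecord(state="ERROR")
sync_state(broken, "SHUTDOWN", sync_instance_states=False)
print(broken.state)  # ERROR

# A resize in progress is visible without touching the settled state.
resizing = InstanceRecord(state="ACTIVE", transition_state="RESIZE")
print(resizing.state, resizing.transition_state)  # ACTIVE RESIZE
```

Keeping the in-progress action in its own column would mean the settled state never has to double as a progress indicator, which is the overloading the first question is about.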