Evacuate
Revision as of 10:04, 1 August 2012
- Launchpad Entry: NovaSpec:rebuild-for-ha
- Created: 1 Aug 2012
- Contributors: Alex Glikson, Oshrit Feder, Pavel Kravchenco
Summary
High availability for VMs minimizes the effect of a nova-compute node failure. Upon failure detection, VMs whose storage is accessible from other nodes (e.g. shared storage) can be rebuilt and restarted on a target node.
Release Note
Administrators detecting a compute node failure can evacuate the node's VMs to target nodes.
Rationale
On commodity hardware, failures are common and must be accounted for in order to provide a high service level. With VM HA support, administrators can evacuate VMs from a failed node while preserving VM characteristics such as identity, volumes, networks and state, ensuring VM availability over time.
User stories
- Administrator wants to evacuate and rebuild VMs from failed nodes
Assumptions
- The VM to evacuate is down due to a node failure, and is in a started or powered-off state
- The VM's storage is accessible from other nodes (e.g. shared storage); if not, a rebuild is performed (the disk is re-created from the image)
- The administrator selected a valid target node to rebuild the VM on
- After evacuation and rebuild on the target node, the administrator is responsible for any VM inconsistency that might have occurred during the sudden node failure (e.g. partial disk writes)
Recovery from compute node failure
With several changes, the existing rebuild instance functionality can be extended to support the HA scenario.
When the administrator detects a compute node failure, all the VMs that ran on it are down. To evacuate a selected VM to a specified running target compute node, the evacuate REST API is invoked.
nova.compute.api exposes the evacuate method. The target compute node receives the evacuation request and invokes the rebuild method (nova.compute.rpcapi) with no image_ref but with the recreate flag set. When the recreate flag is set, rebuild performs several additional steps:
1. Ensures that a VM with the same name does not already exist on the target node.
2. Validates shared storage; if the instance's disk is not on shared storage, image_ref is updated with the instance's image and the process continues as a pure rebuild.
3. Re-connects volumes and networks; instance identity and state are preserved.
4. Cleans up the persistent information connecting the instance to the failed host.
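The rebuild-with-recreate flow described above can be sketched as follows. This is an illustrative stand-in, not Nova's actual code: every helper, field name, and data structure here is hypothetical.

```python
# Illustrative sketch of rebuild's extra checks when recreate is set.
# In-memory stand-ins replace the hypervisor and database.

local_instances = set()  # names of VMs already running on this (target) host

def rebuild(instance, image_ref=None, recreate=False):
    """Rebuild `instance` on this host; recreate=True marks an evacuation."""
    if recreate:
        # 1. A VM with the same name must not already exist on this host.
        if instance["name"] in local_instances:
            raise RuntimeError("instance already exists: %s" % instance["name"])
        # 2. Without shared storage, fall back to a pure rebuild:
        #    re-create the disk from the instance's original image.
        if image_ref is None and not instance["on_shared_storage"]:
            image_ref = instance["image_ref"]
    # Volumes and networks would be re-connected here; identity and
    # state persist because the instance record itself is unchanged.
    local_instances.add(instance["name"])
    instance["host"] = "target-node"  # drop the binding to the failed host
    return image_ref

vm = {"name": "web-1", "image_ref": "img-123",
      "on_shared_storage": False, "host": "failed-node"}
print(rebuild(vm, recreate=True))  # no shared storage -> 'img-123'
```

With shared storage available, `image_ref` stays unset and the instance's existing disk is reused; without it, the call degenerates to a pure rebuild from the image.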
When/if the failed node comes back online, further self-cleanup is needed (removing stale instances at the virt layer) to ensure that the recovered node is aware of the evacuated VMs and does not re-launch them. Evacuated VMs are locked to ensure a single handler, since recovery of the failed node might happen while the evacuation is still in progress (e.g. the VM has not yet been rebuilt on the target node).
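A minimal sketch of the self-cleanup a recovered node could run at startup, under the assumption (with hypothetical names) that the database records each instance's current host:

```python
# Sketch: a recovered node compares what its hypervisor still knows about
# against the database; any VM now assigned to another host was evacuated
# and must be destroyed locally rather than re-launched.

def cleanup_stale_instances(hypervisor_vms, db_records, this_host):
    """Return the names of locally-known VMs that were evacuated elsewhere."""
    stale = []
    for name in hypervisor_vms:
        record = db_records.get(name)
        if record is not None and record["host"] != this_host:
            stale.append(name)
    return stale

# Example: web-1 was evacuated to node-b while node-a was down.
print(cleanup_stale_instances(
    hypervisor_vms=["web-1", "db-1"],
    db_records={"web-1": {"host": "node-b"}, "db-1": {"host": "node-a"}},
    this_host="node-a"))  # -> ['web-1']
```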
This is just one possible design for this feature (keep that in mind). At its simplest, a server template consists of a core image and a metadata map. The metadata map defines metadata that must be collected during server creation and a list of files (on the server) that must be modified using the defined metadata.
Here is a simple example: let's assume that the server template contains a Linux server with the Apache HTTP Server installed. Apache needs to know the IP address of the server and the directory on the server that contains the HTML files.
The metadata map would look something like this:
    metadata {
        IP_ADDRESS;
        HTML_ROOT : string(1,255) : "/var/www/";
    }
    map {
        /etc/httpd/includes/server.inc
    }
In this case, the metadata section defines the metadata components required; the map section defines the files that must be parsed and have the metadata configured. Within the metadata section there are two defined items: IP_ADDRESS is a predefined (built-in) value, and HTML_ROOT is the root directory of the web server.
For HTML_ROOT, there are three sub-fields: the name, the data type, and (in this case) the default value. The token required could be used for items that must be supplied by the user.
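Assuming entries follow the colon-separated form shown above, the sub-fields could be parsed with something like the following sketch (parse_entry is a name invented here, not part of the design):

```python
# Sketch: split a metadata entry of the form 'NAME : type : default'
# into its three sub-fields; missing sub-fields become None.

def parse_entry(entry):
    """Parse 'NAME : type : default' into a dict of sub-fields."""
    parts = [p.strip().strip('"') for p in entry.split(":")]
    parts += [None] * (3 - len(parts))  # pad missing type/default
    return {"name": parts[0], "type": parts[1], "default": parts[2]}

print(parse_entry('HTML_ROOT : string(1,255) : "/var/www/"'))
# -> {'name': 'HTML_ROOT', 'type': 'string(1,255)', 'default': '/var/www/'}
```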
When the server is created, an (as-yet-undefined) process would look at the files in the map section and replace metadata tokens with the defined values. For example, the file might contain:
    <VirtualHost {{IP_ADDRESS}}:*>
        DocumentRoot "{{HTML_ROOT}}"
    </VirtualHost>
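The substitution step itself could be sketched as a simple token replacement; render is a hypothetical helper standing in for the as-yet-undefined process:

```python
import re

def render(template, metadata):
    """Replace every {{NAME}} token in `template` with metadata[NAME]."""
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: str(metadata[m.group(1)]),
                  template)

conf = ('<VirtualHost {{IP_ADDRESS}}:*>\n'
        '    DocumentRoot "{{HTML_ROOT}}"\n'
        '</VirtualHost>')
print(render(conf, {"IP_ADDRESS": "10.0.0.5", "HTML_ROOT": "/var/www/"}))
```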
Implementation
This section should describe a plan of action (the "how") to implement the changes discussed. Could include subsections like:
REST API
Admin API: POST v2/{tenant_id}/servers/{server_id}/action, where {server_id} is the server to evacuate. Parameters: action=evacuate, host=the target compute node to rebuild the server on
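For illustration only, this sketch builds the request URL and body described above; the exact JSON body shape is an assumption of this example, and evacuate_request is a hypothetical helper:

```python
import json

def evacuate_request(tenant_id, server_id, target_host):
    """Build the (assumed) URL and JSON body for an evacuate server action."""
    url = "/v2/%s/servers/%s/action" % (tenant_id, server_id)
    body = {"evacuate": {"host": target_host}}  # body shape is an assumption
    return url, json.dumps(body)

url, body = evacuate_request("tenant-1", "server-42", "node-b")
print(url)   # -> /v2/tenant-1/servers/server-42/action
print(body)
```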
Code Changes
Code changes should include an overview of what needs to change, and in some cases even the specific details.
Related entries
http://wiki.openstack.org/Rebuildforvms
Test/Demo Plan
This need not be added or completed until the specification is nearing beta.
Unresolved issues
This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself; since any specification with problems cannot be approved.