
XenServerMigrations

Revision as of 18:38, 2 February 2011 by Cerberus

Summary

Release Note

We need the ability to move XenServer instances from one host to another within the same zone. This enables, for example, replacing a dying or obsolete host with newer hardware by evacuating its instances. Migration is also a prerequisite for functional resizes.

Rationale

As above, the ability to evacuate a host for any physical reason is highly desirable. Furthermore, the ability for a customer to resize an instance (increasing the RAM and disk allotment) without snapshotting and creating a new instance is useful.

User stories

  1. As operations, I want to be able to evacuate a host with failing hardware so that I can replace the box with minimal impact to customers
  2. As a user, I want to be able to migrate my instance so that I can move to faster and/or newer hardware

Assumptions

The ability to snapshot a running XenServer instance already exists.

Design

Implementation

The abstracted websequence diagram is as follows:

File:XenServerMigrations$index.php.png

A few issues arise in this design. First, given the current scheduler implementation, it isn't safe to assume that the scheduler is aware of the ability to migrate. The "simple" scheduler implementation shows that this is possible, but I don't think we should depend on it, at least for the time being. The concession, as in the diagram above, is to cast the message to the destination host first; the destination simply identifies itself and proxies the message to the source.

We can provide for a smarter scheduler by making the initial migration call context-sensitive. If we are the source and a destination argument is also present, we simply begin the rsync. Otherwise, we cast to the source, appending ourselves to the message arguments as the destination. This feels a little inconsistent, but a migration-aware scheduler would be more efficient.

The first pass will implement the flow as the sequence diagram above indicates, with a second pass to add the context-sensitive call if deemed necessary.
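
As a rough illustration of the context-sensitive call described above, the following Python sketch models the dispatch decision. The `ComputeHost` class, the `FakeMessageBus` stand-in, and the `migrate` signature are all hypothetical names chosen for this example, not existing code. The guard that rejects a source host with no destination is likewise an assumption, added so that a host never casts to itself.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeMessageBus:
    """Hypothetical stand-in for the message queue: records casts instead of sending them."""
    sent: list = field(default_factory=list)

    def cast(self, host: str, method: str, **args) -> None:
        self.sent.append((host, method, args))


@dataclass
class ComputeHost:
    """Hypothetical compute host that can act as either source or destination."""
    name: str
    bus: FakeMessageBus
    transfers: list = field(default_factory=list)

    def migrate(self, instance_id: str, source: str,
                destination: Optional[str] = None) -> str:
        if self.name == source:
            if destination is None:
                # Assumption: a source cannot choose its own destination.
                raise ValueError("source host needs a destination argument")
            # Source with a known destination: this is where the rsync would begin.
            self.transfers.append((instance_id, destination))
            return "transfer_started"
        # Otherwise, proxy to the source and append ourselves as the destination.
        self.bus.cast(source, "migrate", instance_id=instance_id,
                      source=source, destination=self.name)
        return "proxied_to_source"


bus = FakeMessageBus()
hosts = {"host-a": ComputeHost("host-a", bus), "host-b": ComputeHost("host-b", bus)}

# First pass: the scheduler casts to the destination, which proxies to the source.
first = hosts["host-b"].migrate("instance-1", source="host-a")

# Deliver the recorded cast to the source host by hand.
target, method, args = bus.sent.pop(0)
second = getattr(hosts[target], method)(**args)
print(first, second, hosts["host-a"].transfers)
```

Under this sketch, a migration-aware scheduler would skip the proxy hop by calling `migrate` on the source directly with the `destination` argument filled in.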

Optimization

  • We could suspend the instance instead of shutting it down, and migrate the RAM VHD in the process, so that users don't lose their uptime.

Code Changes

The "action" endpoint of the OpenStack API will be modified to expose "resize" functionality, which is simply a migration with a larger RAM and disk quota.

Migration

The resize functionality already exists within the OpenStack API, but returns HTTP 501 (Not Implemented) at this time. After this change, existing API clients should be able to migrate through the "resize" functionality present in the API. Additionally, migration without resizing the instance will be exposed through the Admin API.
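
To make the client-side view concrete, here is a minimal Python sketch of building a resize action and interpreting the response status. The body shape (`{"resize": {"flavorId": ...}}`) and both helper names are assumptions for illustration and should be checked against the API specification. The only detail taken from the text above is that an unimplemented resize returns HTTP 501.

```python
import json
from http import HTTPStatus


def build_resize_action(flavor_id: int) -> str:
    """Serialize a hypothetical resize action body for the "action" endpoint."""
    return json.dumps({"resize": {"flavorId": flavor_id}})


def resize_is_implemented(status_code: int) -> bool:
    """Return False when the server reports HTTP 501 (Not Implemented)."""
    return status_code != HTTPStatus.NOT_IMPLEMENTED


body = build_resize_action(3)
print(body)
print(resize_is_implemented(501))
```

A client written this way would keep working unchanged once the endpoint stops returning 501.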

Test/Demo Plan

This need not be added or completed until the specification is nearing beta.