= XenServer Live Migration =

Here are some docs relating to XenServer live migration.

== Types of Migration ==
There are two main cases for Live Migration:
 * VM on shared storage
 * VM not on shared storage

When the VM is on shared storage, you can use XenServer's pool concept to perform the migration. However, the source and destination hosts must both be in that pool.

When not using shared storage, we have to move the VM's disk to the new server. A XenServer feature called Storage XenMotion supports this: http://wiki.xen.org/wiki/CrossPoolMigrationv3

== Live Migration with Block Migration ==
This requires XenServer 6.1 (with an Advanced license), XenServer 6.2 or later (with the free license) or XCP 1.6 and later.

When one of the above is in use and you are running the Havana release of OpenStack or newer, you do not need to enable pools (using aggregates) to get live migration with XenServer. Simply add the --block-migrate option to the migration command.
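As a minimal sketch of where the option sits in the command, the helper below builds the argument list for nova live-migration. The function name and the values my_vm and my_dest_host are hypothetical, and the positional order (server, then an optional destination host) is assumed from typical novaclient usage rather than stated on this page.

```python
def live_migration_command(server, host=None, block_migrate=False):
    """Build the argv list for `nova live-migration` (illustrative helper)."""
    argv = ["nova", "live-migration"]
    if block_migrate:
        # Needed when the VM's disk is not on shared storage.
        argv.append("--block-migrate")
    argv.append(server)
    if host is not None:
        # Assumption: the destination host follows the server name.
        argv.append(host)
    return argv


# "my_vm" and "my_dest_host" are placeholder names.
print(" ".join(live_migration_command("my_vm", "my_dest_host", block_migrate=True)))
```

The resulting list could be handed to subprocess.run on a machine where the nova CLI is installed and credentials are configured.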

== Live Migration using Host Aggregate ==
If you are using XenServer pools, configured using Host Aggregates, and you configure that pool to use shared storage as the default storage, you can migrate your VMs between the different hosts in that pool.

NOTE: Using host aggregates to create XenServer pools is not well tested. The primary recommendation is to use block migration, as described in the previous section, rather than attempting to use host aggregates with XenServer.

=== Setting up a XenServer pool using Host Aggregates ===
To set up a pool, you need at least two servers, and they must be compatible: http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/reference.html#pooling_homogeneity_requirements

There are several requirements on the nova configuration:
 * You must configure the real management IP of the XenServer host (not 169.254.0.1), and the nova DomU must also be on that network
 * You must configure nova to use the default pool storage:

Once you have OpenStack running on your servers (using XenServer/DevStack or otherwise), you need to pool them together.

Now you can create the aggregate using the nova cli:

nova aggregate-create my_test_pool my_availability_zone

nova aggregate-set-metadata  hypervisor_pool=true

nova aggregate-set-metadata  operational_state=created

You can use aggregate-list to get the id, then add your master (using the hostname of the nova compute service on that hypervisor; you can use "nova-manage service list" to check the name):

nova aggregate-add-host  my_master_host

Now you can use xe or XenCenter to add some shared storage, and make it the default pool storage repository: http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/reference.html#id1002701

Now you can add your further hosts to the pool (keeping in mind the requirements for pool compatibility noted above):

nova aggregate-add-host  my_slave_host_1

nova aggregate-add-host  my_slave_host_2
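The ordering of the steps so far can be summarised in a short sketch. It only builds the command strings for an aggregate that already exists. The function name is hypothetical, and placing the aggregate id as the first argument is an assumption based on typical novaclient usage; the id value 1 is a placeholder for whatever aggregate-list reports.

```python
def pool_setup_commands(aggregate_id, master, slaves):
    """Return the nova CLI calls, in order, for an existing aggregate."""
    commands = [
        # Mark the aggregate as a hypervisor pool.
        f"nova aggregate-set-metadata {aggregate_id} hypervisor_pool=true",
        f"nova aggregate-set-metadata {aggregate_id} operational_state=created",
        # The master joins first.
        f"nova aggregate-add-host {aggregate_id} {master}",
    ]
    # Shared storage is added with xe or XenCenter at this point (not shown),
    # then the remaining hosts join the pool.
    commands += [f"nova aggregate-add-host {aggregate_id} {s}" for s in slaves]
    return commands


# The id 1 is a placeholder; use the id reported by aggregate-list.
for line in pool_setup_commands(1, "my_master_host", ["my_slave_host_1", "my_slave_host_2"]):
    print(line)
```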

You should now see these servers shut down your nova compute VM, join your pool, then start the nova compute VM back up.

NOTE: In some versions of XenServer, you will now need to reconfigure eth0 on your nova compute VM before the VM will boot. You should add the VIF to another network. This should be fixed in future versions of XenServer.

To try this feature out, start a VM, and you should see it make use of the shared storage.

To live-migrate the VM, you make use of the usual nova cli command:

nova live-migration  

== Live Migration RPC Calls ==
Let's consider the two hosts called:
 * src (where inst runs)
 * dest

The user call maps to:
 * scheduler rpc: live_migration(block_migration, disk_over_commit, instance_id, dest)

Scheduler driver does the following:
 * check instance exists
 * check source alive
 * check destination alive, and has enough memory
 * check source + destination hypervisor type match and dest is same or newer version than src
 * compute: check_can_live_migrate_on_dest (on dest) *TODO*
 * updates instance db entry to "migrating"
 * compute: live_migration (on src)
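The scheduler checks listed above can be sketched as plain Python. This is an illustration under stated assumptions, not nova's actual code: the Host and Instance classes, their fields, and the MigrationError exception are all hypothetical stand-ins for the real database records.

```python
from dataclasses import dataclass
from typing import Optional


class MigrationError(Exception):
    """Raised when a pre-migration check fails (hypothetical exception)."""


@dataclass
class Host:
    alive: bool
    hypervisor_type: str
    hypervisor_version: int
    free_memory_mb: int


@dataclass
class Instance:
    host: str  # name of the source host
    memory_mb: int
    task_state: Optional[str] = None


def schedule_live_migration(instances, hosts, instance_id, dest):
    """Run the scheduler-side checks and return the host that migrates."""
    instance = instances.get(instance_id)
    if instance is None:
        raise MigrationError("instance does not exist")
    src_host, dest_host = hosts[instance.host], hosts[dest]
    if not src_host.alive:
        raise MigrationError("source is not alive")
    if not dest_host.alive:
        raise MigrationError("destination is not alive")
    if dest_host.free_memory_mb < instance.memory_mb:
        raise MigrationError("destination lacks memory")
    if src_host.hypervisor_type != dest_host.hypervisor_type:
        raise MigrationError("hypervisor types differ")
    if dest_host.hypervisor_version < src_host.hypervisor_version:
        raise MigrationError("destination hypervisor is older than source")
    # The compute-side destination check (marked TODO above) would run here.
    instance.task_state = "migrating"
    # live_migration is then invoked on the source host.
    return instance.host
```

Each check raises before the database entry is touched, so a failed check leaves the instance state unchanged.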

Compute:check_can_live_migrate_on_destination does the following (on dest): *TODO*
 * calls compute_driver check_can_live_migrate_on_dest
 * which can call the delegate that calls check_can_live_migrate_on_src

Compute:check_can_live_migrate_on_src does the following (on src): *TODO*
 * takes a dictionary from dest
 * possibly throws an exception if there is an issue
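The delegation between the two checks might look like the sketch below. Because these calls are marked TODO, everything concrete here is an assumption made for illustration: the dictionary keys, the shared-storage rule, the MigrationCheckError exception, and the call_src parameter that stands in for the RPC call back to src.

```python
class MigrationCheckError(Exception):
    """Hypothetical exception for a failed compatibility check."""


def check_can_live_migrate_on_src(dest_data):
    """Runs on src: takes the dictionary from dest, raises on a problem."""
    # Hypothetical rule: without block migration, storage must be shared.
    if not dest_data["block_migration"] and not dest_data["shared_storage"]:
        raise MigrationCheckError("no shared storage and no block migration")


def check_can_live_migrate_on_dest(block_migration, shared_storage,
                                   call_src=check_can_live_migrate_on_src):
    """Runs on dest: gather details, then delegate to the check on src."""
    # The keys below are illustrative; the real contents are not specified.
    dest_data = {
        "block_migration": block_migration,
        "shared_storage": shared_storage,
    }
    call_src(dest_data)  # stands in for the RPC call to the source host
    return dest_data
```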

Compute: live_migration does the following (on src):
 * check_for_export with volume
 * calls pre_live_migration on dest
 * does rollback on exception
 * calls driver

Compute_driver: live_migration (on src)
 * on success calls manager's post_live_migration
 * on failure calls manager's rollback_live_migration

Compute: post_live_migration (on src):
 * updates floating ips
 * deletes old traces of image
 * calls post_live_migration_at_destination on dest

Compute: rollback_live_migration (on src)
 * updates DB
 * sorts out volumes and networks
 * on block migration, calls rollback_live_migration_at_destination on dest

Compute: post_live_migration_at_destination (on dest)
 * sets up networking
 * sets task as complete

Compute: rollback_live_migration_at_destination (on dest)
 * does some clean up
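To make the order of the compute calls easier to follow, here is a small simulation that records each step in a list. It is a sketch only: the Src and Dest classes, the driver_succeeds flag, and the trace helper are hypothetical, and direct method calls stand in for the RPC calls between hosts.

```python
class Dest:
    """Stand-in for the compute manager on the destination host."""

    def __init__(self, log):
        self.log = log

    def pre_live_migration(self):
        self.log.append("dest: pre_live_migration")

    def post_live_migration_at_destination(self):
        self.log.append("dest: post_live_migration_at_destination")

    def rollback_live_migration_at_destination(self):
        self.log.append("dest: rollback_live_migration_at_destination")


class Src:
    """Stand-in for the compute manager and driver on the source host."""

    def __init__(self, dest, log, driver_succeeds):
        self.dest, self.log, self.driver_succeeds = dest, log, driver_succeeds

    def live_migration(self, block_migration):
        self.log.append("src: check_for_export")
        try:
            self.dest.pre_live_migration()
        except Exception:
            # Roll back if preparing the destination fails.
            self.rollback_live_migration(block_migration)
            raise
        self.driver_live_migration(block_migration)

    def driver_live_migration(self, block_migration):
        self.log.append("src: driver live_migration")
        if self.driver_succeeds:
            self.post_live_migration()
        else:
            self.rollback_live_migration(block_migration)

    def post_live_migration(self):
        self.log.append("src: post_live_migration")
        self.dest.post_live_migration_at_destination()

    def rollback_live_migration(self, block_migration):
        self.log.append("src: rollback_live_migration")
        if block_migration:
            # Only block migration needs clean-up on the destination.
            self.dest.rollback_live_migration_at_destination()


def trace(driver_succeeds, block_migration):
    """Return the ordered list of steps for one simulated migration."""
    log = []
    Src(Dest(log), log, driver_succeeds).live_migration(block_migration)
    return log
```

Calling trace(True, False) follows the success path, while trace(False, True) ends with the rollback on both hosts.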