Difference between revisions of "Obsolete:LiveMigrationUsage"
m (Fifieldt moved page LiveMigrationUsage to Obsolete:LiveMigrationUsage: this is now in the docs)
Latest revision as of 19:16, 25 July 2013
Using live migration feature
Overview of this feature
- OS: Ubuntu 10.04 (lucid) / 10.10 (maverick), for both instances and hosts.
- Shared storage: the NOVA-INST-DIR/instances directory of every nova-compute host has to be mounted from the same shared storage (tested using NFS).
- Instances: instances with iSCSI/AoE based volumes can be migrated.
- Hypervisor: KVM with libvirt
- (NOTE1:) "NOVA-INST-DIR/instances" is the directory where VM images are expected to be placed. See "flags.instances_path" in nova.compute.manager for the default value.
- (NOTE2:) This feature is admin only, since nova-manage is required.
Sample Nova Installation before starting
- Prepare at least 3 Ubuntu 10.04/10.10 hosts, say HostA, HostB and HostC.
- nova-api/nova-network/nova-volume/nova-objectstore/nova-scheduler (and other daemons) are running on HostA.
- nova-compute is running on both HostB and HostC.
- All nova daemons run as root.
- HostA exports NOVA-INST-DIR/instances; HostB and HostC mount it.
- To avoid any confusion, NOVA-INST-DIR is the same on HostA/HostB/HostC ("NOVA-INST-DIR" denotes the top of the install directory).
- A detailed description is below.
Pre-requisite settings
(a) /etc/hosts setting at HostA/HostB/HostC
Make sure the 3 hosts can resolve each other's names. Pinging each host from every other host is a good way to test this.
ping HostA
ping HostB
ping HostC
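Before pinging, it can help to confirm that every host name actually has an entry in /etc/hosts. The following is a minimal sketch, not part of the original procedure; the function name missing_hosts is hypothetical and it only inspects the file, it does not test connectivity.

```shell
# Hypothetical helper: print every host name that has no entry in the given
# hosts file (format: address first, then one or more names per line).
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then look for the name in any column after the address.
        if ! awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit(found ? 0 : 1) }' "$hosts_file"; then
            echo "$name"
        fi
    done
}

# Example (run on HostA, HostB and HostC); no output means all names are present:
#   missing_hosts /etc/hosts HostA HostB HostC
```

Any name printed by the example still needs an entry in /etc/hosts on that host.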
(b) NFS settings at HostA. Add the line below to /etc/exports.
NOVA-INST-DIR/instances HostA/255.255.0.0(rw,sync,fsid=0,no_root_squash)
Change "255.255.0.0" to the appropriate netmask, which should include HostB/HostC. Then restart the NFS server.
/etc/init.d/nfs-kernel-server restart
/etc/init.d/idmapd restart
Also, on every compute node, add the line below to /etc/fstab:
HostA:/ DIR nfs4 defaults 0 0
Then try to mount on the compute node, and check that the exported directory is mounted successfully.
mount -a -v
If this fails, try flushing the firewall rules on all hosts.
iptables -F
Also, check file/daemon permissions. All nova daemons are expected to be running as root.
root@openstack2-api:/opt/nova-2010.4# ps -ef | grep nova
root 5948 5904 9 11:29 pts/4 00:00:00 python /opt/nova-2010.4//bin/nova-api
root 5952 5908 6 11:29 pts/5 00:00:00 python /opt/nova-2010.4//bin/nova-objectstore
... (snip)
The "NOVA-INST-DIR/instances/" directory should be visible at HostA:
root@openstack:~# ls -ld NOVA-INST-DIR/instances/
drwxr-xr-x 2 root root 4096 2010-12-07 14:34 nova-install-dir/instances/
Also, at HostB and HostC:
# ls -ld NOVA-INST-DIR/instances/
drwxr-xr-x 2 root root 4096 2010-12-07 14:34 nova-install-dir/instances/
# df -k
Filesystem  1K-blocks  Used       Available  Use%  Mounted on
/dev/sda1   921514972  4180880    870523828  1%    /
none        16498340   1228       16497112   1%    /dev
none        16502856   0          16502856   0%    /dev/shm
none        16502856   368        16502488   1%    /var/run
none        16502856   0          16502856   0%    /var/lock
none        16502856   0          16502856   0%    /lib/init/rw
HostA:      921515008  101921792  772783104  12%   /opt ( <--- this line is important.)
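The "important" df line can also be checked mechanically. This is a rough sketch under stated assumptions: the function name mounted_from is hypothetical, and the mount point /opt is taken from the sample output above and will differ on other installations.

```shell
# Hypothetical helper: read df output on stdin and succeed only if the given
# mount point is served by the given NFS server (filesystem starts with "server:").
mounted_from() {
    server=$1
    mount_point=$2
    awk -v s="${server}:" -v m="$mount_point" '
        # Skip the header row; compare the last column with the mount point.
        NR > 1 && $NF == m && index($1, s) == 1 { ok = 1 }
        END { exit(ok ? 0 : 1) }'
}

# Example on HostB or HostC (-P keeps each filesystem on a single line):
#   df -kP | mounted_from HostA /opt && echo "shared storage is mounted"
```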
(c) libvirt settings (necessary to use plain TCP (qemu+tcp://)).
Modify /etc/libvirt/libvirtd.conf :
before : #listen_tls = 0
after : listen_tls = 0
before : #listen_tcp = 1
after : listen_tcp = 1
Add the following line to /etc/libvirt/libvirtd.conf :
auth_tcp = "none"
Modify /etc/init/libvirt-bin.conf
before : exec /usr/sbin/libvirtd -d
after : exec /usr/sbin/libvirtd -d -l
Modify /etc/default/libvirt-bin
before :libvirtd_opts=" -d"
after :libvirtd_opts=" -d -l"
Then, restart libvirt.
# stop libvirt-bin && start libvirt-bin
# ps -ef | grep libvirt
Make sure you get the result below.
# /opt/nova-2010.2# ps -ef | grep libvirt
root 1145 1 0 Nov27 ? 00:00:03 /usr/sbin/libvirtd -d -l
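If libvirtd does not come up listening, one thing worth ruling out is a setting that is still commented out. The sketch below assumes a hypothetical helper named has_setting and a variable conf that holds the path of the libvirt configuration file edited above; it only inspects the file and does not talk to libvirtd.

```shell
# Hypothetical helper: succeed only if the file contains an uncommented
# "key = value" line (whitespace around "=" is ignored).
has_setting() {
    conf_file=$1
    # Pass key and value through the environment so quotes survive unchanged.
    KEY=$2 VALUE=$3 awk '
        /^[ \t]*#/ { next }                      # skip commented-out settings
        {
            line = $0
            gsub(/[ \t]/, "", line)              # drop whitespace before comparing
            if (line == ENVIRON["KEY"] "=" ENVIRON["VALUE"]) found = 1
        }
        END { exit(found ? 0 : 1) }' "$conf_file"
}

# Example, with conf set to the configuration file edited above:
#   has_setting "$conf" listen_tls 0      || echo "listen_tls is not set"
#   has_setting "$conf" listen_tcp 1      || echo "listen_tcp is not set"
#   has_setting "$conf" auth_tcp '"none"' || echo "auth_tcp is not set"
```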
If you would like to use qemu+ssh, change the "live_migration_uri" flag in nova.virt.libvirt_conn and apply the appropriate settings described at http://libvirt.org/.
Usage
If a physical server is going to undergo maintenance, the admin will want to know which instances are running on it.
# euca-describe-instance
root@openstack2-api:/opt/live-migration# euca-describe-instances
Reservation:r-2raqmabo
RESERVATION r-2raqmabo admin default
INSTANCE i-00000003 ami-ubuntu-lucid a.b.c.d e.f.g.h running testkey (admin, HostB) 0 m1.small 2011-02-15 07:28:32 nova
"HostB" is the physical server name, so the admin now knows which instances should be migrated.
Next, the admin looks for a destination host to migrate the instances to.
# nova-manage service list
HostA nova-scheduler enabled :-) None
HostA nova-volume enabled :-) None
HostA nova-network enabled :-) None
HostB nova-compute enabled :-) None
HostC nova-compute enabled :-) None
Now the admin knows that HostB and HostC are available.
However, HostC may lack resources (CPU/memory/disk), so check them.
# nova-manage service updateresource HostC
# nova-manage service describeresource HostC
HOST          PROJECT  cpu  mem(mb)  disk(gb)
HostC(total)           16   32232    878
HostC(used)            13   21284    442
HostC         p1       5    10240    150
HostC         p2       5    10240    150
.....
Remember to run updateresource first, then describeresource. Otherwise, the Host(used) line is not updated.
- cpu : the number of vcpus
- mem(mb) : total amount of memory (MB)
- disk(gb) : total size of NOVA-INST-DIR/instances (GB)
- The 1st line shows the total amount of resources the physical server has.
- The 2nd line shows the resources currently in use.
- The 3rd line and below show the resources used per project.
Remember that the values in the 1st and 2nd lines are obtained from the OS/libvirt, whereas the 3rd line and below sum the values of instance_types. This means the instances do not necessarily use the resources shown in those lines; the figures are the maximum the instances can use.
The admin checks the Host(used) line first: if enough resources are available, HostC is a good candidate for the live migration destination. But if the sum of the HostC p1 and HostC p2 lines exceeds HostC(total), HostC is not a good candidate.
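The two comparisons described here (free capacity, and the per-project maximum against the total) can be computed from the describeresource output. The sketch below is an illustration only: the function name resource_summary is hypothetical, and it assumes the column layout of the sample output shown earlier (host, optional project, cpu, mem(mb), disk(gb)).

```shell
# Hypothetical helper: read `nova-manage service describeresource` output on
# stdin and print free capacity and the summed per-project maximum for a host.
resource_summary() {
    awk -v host="$1" '
        $1 == host "(total)" { tc = $2; tm = $3; td = $4; next }
        $1 == host "(used)"  { uc = $2; um = $3; ud = $4; next }
        # Per-project rows have five columns: host, project, cpu, mem, disk.
        $1 == host && NF == 5 { pc += $3; pm += $4; pd += $5 }
        END {
            printf "free %d %d %d\n", tc - uc, tm - um, td - ud
            printf "projects %d %d %d\n", pc, pm, pd
        }'
}

# Example:
#   nova-manage service describeresource HostC | resource_summary HostC
```

With the sample figures above this prints "free 3 10948 436" and "projects 10 20480 300", so the per-project maximum (10 vcpus, 20480 MB, 300 GB) stays below HostC(total).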
Now it is time to perform the live migration.
# nova-manage instance live_migration i-00000003 HostC
Migration of i-00000003 initiated. Check its progress using euca-describe-instances.
A few seconds later, confirm that the instance has moved from HostB to HostC:
# euca-describe-instance
root@openstack2-api:/opt/live-migration# euca-describe-instances
Reservation:r-2raqmabo
RESERVATION r-2raqmabo admin default
INSTANCE i-00000003 ami-ubuntu-lucid a.b.c.d e.f.g.h running testkey (admin, HostC) 0 m1.small 2011-02-15 07:28:32 nova
If the output still shows HostB, some problem may have occurred. See the Troubleshooting section.
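The host reported for an instance can be pulled out of the euca-describe-instances output for scripting. This is a minimal sketch, assuming the INSTANCE line carries "(project, host)" exactly as in the sample output; the function name instance_host is hypothetical.

```shell
# Hypothetical helper: read euca-describe-instances output on stdin and print
# the host of the given instance, taken from the "(project, host)" field.
instance_host() {
    awk -v id="$1" '$1 == "INSTANCE" && $2 == id' |
        sed -n 's/.*(\([^,]*\), *\([^)]*\)).*/\2/p'
}

# Example:
#   [ "$(euca-describe-instances | instance_host i-00000003)" = "HostC" ] \
#       && echo "migration finished"
```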
Troubleshooting
When live migration fails, error messages appear in:
- scheduler logfile
- source compute node logfile
- destination compute node logfile