Nova Support for Volume Backed Storage Repositories in Xen/XCP
Launchpad Entry: nova-support-for-xen-volume-storage-repositories
Created: https://launchpad.net/~cyoungworth
Contributors:
Summary
Upgrade Nova XCP to support the full range of Xen storage repository types.
Alter virtual hard disk (VHD) image creation for "DISK_VHD" type Nova images in the Xen Nova Plug-in so that it supports volume storage and other storage repository types that do not mount their repositories within the XCP Dom0 file system.
Release Note
No specific release note will be needed. If there is an existing restriction cited for volume based storage repositories, it should be removed.
Rationale
Customers relying on the existing Nova implementation for Xen are limited to Xen storage repositories that mount their contents within the XCP Dom0 file system space.
For "DISK_VHD" type images, the Nova XCP plugin creates files within XCP storage repositories and copies the Nova VM Glance disk data into them. It does this by relying on XCP mounting, within Dom0, the file systems associated with some storage repository (SR) objects. This is implementation-specific behavior that is not shared by all storage repository types.
Specifically, the present implementation precludes the use of Xen volume-storage-backed repositories. Further, the existing implementation relies on implementation-specific elements of Xen virtual machine image handling, Xen virtual disk image naming and layout within the mounted storage repository, and the access behaviors of the Xen virtual disk storage retrieval code.
The proposed changes rely on Xen low-level commands for manipulating virtual disk images within the Xen storage repository space. This automatically extends Nova image support to all supported storage repository types and reduces fragility for the currently available storage types.
Risk
Changes will be made to the Xen Plug-in routines for Nova virtual machine image manipulation. Any errors in the new implementation could potentially result in corrupted images and loss of data. The changes are meant as a replacement for existing code. While this extends the scope of impact to presently supported data types, it simplifies testing in that there will be one path for execution regardless of data storage type.
The scope of impact is otherwise limited. No changes are anticipated beyond those for the Xen Plug-in; therefore KVM and VMware support should be unaffected.
Usage
The system administrator will be free to use XCP-supported volume-attached storage as the default repository on Nova-mediated Xen installations.
Design
It is proposed that the Nova XCP "download_vhd" plugin be changed to prepare a virtual disk image (VDI) via an "xe vdi-create" command and that the downloaded VHD then be written to the underlying device via "dd". In the case of volume devices this can be done directly, by running "dd" with the outfile pointed at the logical volume associated with the VDI.
e.g. dd if=image.vhd of=/dev/vps/vm101_img_snapshot
Both direct data and volume devices, however, can be serviced by creating a virtual block device (VBD) linking the target VDI to Dom0 and then using the resulting device as the outfile target in a "dd" action.
After the contents of the new VHD have been written, the VBD for Dom0 (if one was created) is to be destroyed and the VDI uuid is to be returned to the caller, as in the present implementation.
The new "dd" action can be employed for all other direct manipulation of VHD contents via NOVA administrative action.
Implementation
Example: Virtual Machine creation from a NOVA glance image
Background Calling Context:
Nova virtual machine creation proceeds from a compute manager "run_instance" request to a "VMops:spawn" call. Prior to creating the virtual machine the spawn function makes a callout to "_create_disks" to create the VDI and place the Glance image data in it. "_create_disks" calls the xenapi "vm_utils.py:fetch_image" function.
The "fetch_image" function calls "_fetch_image_glance_vhd" for Nova images of type "DISK_VHD". "_fetch_image_glance_vhd" will invoke the XCP-resident Nova plugin for "download_vhd". The proper storage repository (SR) is determined by "_fetch_image_glance_vhd", and its UUID is embedded in the path that corresponds to the image's mount point on Dom0. The path is passed as a parameter on the "download_vhd" call. The SR uuid can be recovered from the path and used for our purposes.
Inside the XCP plugin, "download_vhd" calls "_download_tarball_with_retry". "_download_tarball_with_retry" sets up the URL connection to the Glance repository. At this point, we diverge from existing support.
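A minimal sketch of recovering the SR uuid from the path, assuming the uuid appears as a whole path component in the canonical 8-4-4-4-12 hexadecimal form (for example under a mount point such as /var/run/sr-mount/<sr-uuid>). The helper name sr_uuid_from_path is hypothetical and is not part of the existing plugin:

```python
import re

# Canonical textual UUID form: 8-4-4-4-12 hexadecimal digits.
_UUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def sr_uuid_from_path(sr_path):
    """Return the first path component that looks like a UUID.

    Hypothetical helper: assumes the SR uuid is a whole path
    component, as in /var/run/sr-mount/<sr-uuid>.
    """
    for component in sr_path.split("/"):
        if _UUID_RE.match(component):
            return component.lower()
    raise ValueError("no SR uuid found in path: %r" % sr_path)
```

Raising an error when no uuid is present, rather than returning an empty value, keeps a malformed path from silently selecting the wrong repository.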
Functional Outline of Design Change:
An area must be set aside for temporary image manipulation. In the present design, space is carved out of the target storage repository as a temporary directory located off the target location for the new VHDs. The new design requires space for the compressed image. A space in the general Dom0 file system may be used, but ensuring sufficient space there may be difficult. It is recommended that a fixed file system location be chosen, e.g. /var/Nova/staging_area. This area may be part of the root file system, if that is large enough, or may be the mount point for a local or remote disk. This location would be replaced by a general cache service if one is developed. In any event, choosing a location in the file system isolates the nature of the storage from the design of the download_vhd function. The download_vhd function can create the directory if it is missing; if the directory already exists, the nature of its backing storage remains hidden from the function.
Long term storage of a subset of the compressed images might be investigated as a means of caching common objects. At the other end of implementation choice, a temporary directory can be created for the glance image OVA tarball. This is the approach recommended here for the initial implementation. The temporary directory and its contents will be removed after the VDI objects are constructed.
The procurement of the temporary staging area can be done in the "_make_staging_area" function. The present invocation of "_make_staging_area" out of "_download_tarball_with_retry" should work without change.
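The staging-area handling described above might be sketched as follows. This is an illustration only: the function names echo the plugin's "_make_staging_area" and "_cleanup_staging_area", but the signatures, the "image-" prefix, and the default parent directory are assumptions rather than the existing plugin code.

```python
import os
import shutil
import tempfile


def make_staging_area(parent="/var/Nova/staging_area"):
    """Create the parent directory if missing, then a unique temp dir in it.

    The caller never needs to know whether the parent is part of the
    root file system or a mount point for local or remote disk.
    """
    if not os.path.isdir(parent):
        os.makedirs(parent)
    return tempfile.mkdtemp(prefix="image-", dir=parent)


def cleanup_staging_area(staging_path):
    """Remove the temporary directory and everything inside it."""
    shutil.rmtree(staging_path, ignore_errors=True)
```

Using a unique temporary directory per download keeps concurrent spawns from overwriting each other's tarballs.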
Once the temporary staging area is procured, the image must be downloaded into it. The "_download_tarball" function will work, but its untarring step should be suppressed. The OVA should be left in its compressed state.
Upon return from "_download_tarball" we unwind back to download_vhd. Here we call _import_vhds, which is where the bulk of the new function is added. A function to create a VDI of the proper size is substituted for the "prepare_if_exists" function. A "tar -tvf" listing of the image file is executed (the verbose form is needed to report entry sizes). If there is an entry with the proper name, e.g. image.vhd, a VDI of the same size is created. From here we can create a VBD to associate a device object in Dom0 with the target VDI. This will work for direct storage or for logical volumes. The logical volumes, however, already have accessible device objects and can be written directly by a "dd" command.
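The size lookup could also be done without shelling out to tar. The sketch below uses Python's standard tarfile module to read the entry size from the archive index without extracting anything; the function name find_vhd_size is hypothetical, and it assumes the entry is stored under exactly the name given (an archive built with a "./" prefix would need that prefix included).

```python
import tarfile


def find_vhd_size(ova_path, member_name="image.vhd"):
    """Return the size in bytes of member_name inside the tarball.

    Nothing is extracted. "r:*" lets tarfile detect the compression
    (if any) on its own. Raises KeyError if the entry is absent.
    """
    with tarfile.open(ova_path, "r:*") as archive:
        return archive.getmember(member_name).size
```

The returned value is the size of the VHD file as stored in the tarball, which is the figure the text above uses when sizing the new VDI.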
Example of VBD association and image copy for VDI, <vdi_uuid>
xe vbd-create bootable=false device=autodetect mode=RW type=Disk unpluggable=true vdi-uuid=<vdi_uuid> vm-uuid=<Dom0_uuid>
xe vbd-plug uuid=<vbd_uuid_from_above>
xe vbd-param-get uuid=<vbd_uuid_from_above> param-name=device
Let's assume that device=xvda.
Use tar to push the VHD data onto the stdout stream (the -O option) and pipe it to the dd command:
tar -xOf "staging_area/Nova.ova" image.vhd | dd of=/dev/xvda conv=notrunc
For volume devices we can alternatively use the logical volume device named after the VDI uuid as the "dd" output target, e.g.
tar -xOf "staging_area/Nova.ova" image.vhd | dd of=/dev/VG_XenStorage-<uuid-of-volume-sr>/<uuid-of-vdi> conv=notrunc
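The same tar-to-dd copy can be sketched inside the Python plugin itself. The function name copy_vhd_to_device and its parameters are hypothetical; the sketch assumes the target device (or logical volume) already exists and is at least as large as the VHD entry.

```python
import shutil
import tarfile


def copy_vhd_to_device(ova_path, device_path, member_name="image.vhd",
                       chunk_size=1024 * 1024):
    """Stream one tar member onto an existing device or file.

    The target is opened "r+b", so it is overwritten in place and never
    truncated, which mirrors dd's conv=notrunc. Returns the number of
    bytes written.
    """
    with tarfile.open(ova_path, "r:*") as archive:
        source = archive.extractfile(member_name)
        if source is None:
            raise ValueError("%s is not a regular file" % member_name)
        with source, open(device_path, "r+b") as target:
            shutil.copyfileobj(source, target, chunk_size)
            return target.tell()
```

Streaming in fixed-size chunks means the uncompressed VHD never has to be held in the staging area or in memory.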
The temporary, Dom0-oriented VBDs created to access the newly created VDIs can be destroyed in the "_cleanup_staging_area" call. The new VHD/VDIs can now be returned and incorporated into the new virtual machine just as they were with the earlier approach.
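One way to make sure the temporary VBD is always removed is to wrap its lifetime in a context manager. The sketch below is an assumption-laden illustration, not the plugin's code: run_xe and dom0_vbd are hypothetical names, the xe subcommands are the ones used in the example above plus vbd-unplug and vbd-destroy for teardown, and the command runner is injectable so the logic can be exercised without a XenServer host.

```python
import contextlib
import subprocess


def run_xe(args):
    """Run an xe command and return its stripped stdout."""
    output = subprocess.check_output(["xe"] + list(args))
    return output.decode().strip()


@contextlib.contextmanager
def dom0_vbd(vdi_uuid, dom0_uuid, run=run_xe):
    """Attach a VDI to Dom0 and yield its device path; detach on exit."""
    vbd_uuid = run(["vbd-create", "vm-uuid=" + dom0_uuid,
                    "vdi-uuid=" + vdi_uuid, "device=autodetect",
                    "bootable=false", "mode=RW", "type=Disk",
                    "unpluggable=true"])
    try:
        run(["vbd-plug", "uuid=" + vbd_uuid])
        try:
            device = run(["vbd-param-get", "uuid=" + vbd_uuid,
                          "param-name=device"])
            yield "/dev/" + device
        finally:
            # Only unplug a VBD that was successfully plugged.
            run(["vbd-unplug", "uuid=" + vbd_uuid])
    finally:
        # Always destroy the VBD record, even if plugging failed.
        run(["vbd-destroy", "uuid=" + vbd_uuid])
```

A caller would write the image inside a "with dom0_vbd(vdi_uuid, dom0_uuid) as device:" block, so the VBD is unplugged and destroyed even if the copy raises an error.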
Other users of the new _make_staging_area capability such as "upload_vhd" will benefit from the newly expanded function.
Migration from earlier Nova Versions
No change is anticipated for OVA images, so no migration step will be needed when upgrading from earlier Nova versions.
Test/Demo Plan
Running standard "spawning" tests with any storage repository type will exercise the new code.