HeterogeneousGpuAcceleratorSupport


 * Launchpad Entry: NovaSpec:heterogeneous-gpu-accelerator-support
 * Created: Brian Schott
 * Current maintainer: John Paul Walters
 * Contributors: USC Information Sciences Institute

Summary
This blueprint proposes to add support for GPU-accelerated machines as an alternative machine type in OpenStack.

The target release for this is Grizzly. We plan to have a stable branch at https://code.launchpad.net/~usc-isi/nova/hpc-testing.

The USC-ISI team has a functional prototype here:
 * https://code.launchpad.net/~usc-isi/nova/hpc-trunk

This blueprint is related to the HeterogeneousInstanceTypes blueprint here:
 * http://wiki.openstack.org/HeterogeneousInstanceTypes

An etherpad for discussion of this blueprint is available at http://etherpad.openstack.org/heterogeneousultravioletsupport

Release Note
Nova has been extended to make NVIDIA GPUs available to provisioned instances for CUDA programming.

Rationale
See HeterogeneousInstanceTypes.

The goal of this blueprint is to allow GPU-accelerated computing in OpenStack.

User stories
Jackie has a CUDA-accelerated application and wants to run it on an instance that has access to GPU hardware. She chooses the cg1.xlarge instance type, which provides access to two NVIDIA Fermi GPUs:

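$ nova flavor-list
+----+------------+-----------+------+-----------+------+-------+-------------+-----------+------------------------------------------------------------------------+
| ID | Name       | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public | extra_specs                                                            |
+----+------------+-----------+------+-----------+------+-------+-------------+-----------+------------------------------------------------------------------------+
| 9  | cg1.xlarge | 16384     | 160  | 0         |      | 8     | 1.0         | True      | {u'hypervisor': u's== LXC', u'gpus': u'= 2', u'gpu_arch':u's== fermi'} |
+----+------------+-----------+------+-----------+------+-------+-------------+-----------+------------------------------------------------------------------------+

$ nova boot --flavor 9 --key-name mykey --image 2b1509fe-b573-488a-be4d-d61d25c7ab4f gpu_test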
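
The extra_specs values in the flavor listing carry an operator prefix. The following is a minimal sketch of how such a flavor could be matched against the capabilities a compute host advertises. It assumes the scheduler convention in which 's==' means string equality and '=' means the host's numeric value is at least the requested one. The function names and the capability dictionaries are hypothetical and are not taken from the prototype.

```python
import operator

# Assumed operator semantics: "=" is numeric greater-or-equal,
# "s==" is exact string equality.
_OPS = {
    "=": lambda have, want: float(have) >= float(want),
    "s==": operator.eq,
}


def spec_matches(have, requirement):
    """Return True if one host capability value satisfies one requirement."""
    if have is None:
        # The host does not advertise this capability at all.
        return False
    words = requirement.split(None, 1)
    if len(words) == 2 and words[0] in _OPS:
        op, want = words
        return _OPS[op](str(have), want)
    # No recognised operator prefix: fall back to plain string equality.
    return str(have) == requirement


def host_satisfies(capabilities, extra_specs):
    """Return True if the host meets every requirement in extra_specs."""
    return all(spec_matches(capabilities.get(key), value)
               for key, value in extra_specs.items())


# Requirements of the cg1.xlarge flavor shown above.
cg1_xlarge = {"hypervisor": "s== LXC", "gpus": "= 2", "gpu_arch": "s== fermi"}

# Hypothetical hosts, for illustration only.
gpu_host = {"hypervisor": "LXC", "gpus": 4, "gpu_arch": "fermi"}
cpu_host = {"hypervisor": "LXC", "gpus": 0}

print(host_satisfies(gpu_host, cg1_xlarge))
print(host_satisfies(cpu_host, cg1_xlarge))
```

Under these assumptions a host with four Fermi GPUs qualifies for a flavor that asks for two, while a host that advertises no GPUs or no gpu_arch does not.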

Assumptions
To our knowledge, gVirtuS is the only approach that has successfully provided CUDA access from a KVM virtual machine. Here we instead propose giving LXC instances direct access to GPUs. We assume that the host system's kernel supports 'lxc-attach' and that the 'lxc-attach' utilities are installed.

Design
We have a new nova.virt.GPULibvirt, an extension of nova.virt.libvirt, that instantiates a GPU-enabled virtual machine when requested.


 * When an instance is spawned (or rebooted), nova starts an LXC VM.
 * The requested GPUs are marked as allocated, and their devices are created inside the LXC container using 'lxc-attach'.
 * Access permission to the GPUs is added to /cgroup.
 * Boot finalizes.
 * When an instance is terminated (destroyed), its GPUs are deallocated.
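
The lifecycle above can be sketched as follows. This is an illustration under stated assumptions, not the prototype's code: GpuPool, mknod_command and cgroup_allow_entry are hypothetical names, the container name is invented, and the device mode 666 is an arbitrary choice. The sketch assumes the usual NVIDIA device layout, in which /dev/nvidiaN is a character device with major number 195 and minor number N. It only builds the commands and cgroup entries and does not execute anything.

```python
NVIDIA_MAJOR = 195  # assumed character-device major number of the NVIDIA driver


class GpuPool:
    """Tracks which GPU indices on a host are allocated to which instance."""

    def __init__(self, num_gpus):
        self._free = set(range(num_gpus))
        self._owned = {}

    def allocate(self, instance_name, count):
        """Mark `count` free GPUs as allocated and return their indices."""
        if count > len(self._free):
            raise RuntimeError("not enough free GPUs")
        chosen = sorted(self._free)[:count]
        self._free.difference_update(chosen)
        self._owned[instance_name] = chosen
        return chosen

    def deallocate(self, instance_name):
        """Return the GPUs of a terminated instance to the free set."""
        self._free.update(self._owned.pop(instance_name, []))


def mknod_command(container, index):
    """Build an lxc-attach call that creates /dev/nvidia<index> in the container."""
    return ["lxc-attach", "-n", container, "--",
            "mknod", "-m", "666", "/dev/nvidia%d" % index,
            "c", str(NVIDIA_MAJOR), str(index)]


def cgroup_allow_entry(index):
    """Build the line to write to the container's devices.allow cgroup file."""
    return "c %d:%d rwm" % (NVIDIA_MAJOR, index)


pool = GpuPool(4)
gpus = pool.allocate("instance-00000001", 2)  # spawn: mark GPUs as allocated
for i in gpus:
    print(" ".join(mknod_command("instance-00000001", i)))
    print(cgroup_allow_entry(i))
pool.deallocate("instance-00000001")          # terminate: release the GPUs
```

A real driver would also need to expose the control device /dev/nvidiactl (minor number 255) in the same way, and would have to persist the allocation state so that it survives a restart of the compute service.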

Implementation
The USC-ISI team has a functional prototype: https://code.launchpad.net/~usc-isi/nova/hpc-trunk

Code Changes
 * added nova/virt/gpu/driver.py: inherits LibvirtDriver and extends a few methods to provision GPUs; adds a few flags to describe GPU architecture, number of GPUs, device IDs, etc.
 * added nova/virt/gpu/utils.py: GPU provisioning routines

Migration
Migration may fail unless the target host provides GPUs with the same device indices as those assigned to the instance on the source host.
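
As a minimal sketch of this constraint, a pre-migration check could compare the GPU indices held by the instance with the indices that are free on the target. The function name and its arguments are hypothetical. The prototype is not stated to perform such a check.

```python
def migration_compatible(instance_gpu_indices, target_free_indices):
    """Return (ok, missing): ok is True when every GPU index used by the
    instance is also free on the target host; missing lists the others."""
    missing = sorted(set(instance_gpu_indices) - set(target_free_indices))
    return (not missing, missing)


# An instance holding GPUs 0 and 1 can move to a host where both are free,
print(migration_compatible([0, 1], [0, 1, 2, 3]))
# but not to a host where only GPUs 1 and 2 are free.
print(migration_compatible([0, 1], [1, 2]))
```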

Test/Demo Plan
This need not be added or completed until the specification is nearing beta.

BoF agenda and discussion
Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.