HeterogeneousGpuAcceleratorSupport

Launchpad Entry: NovaSpec:heterogeneous-gpu-accelerator-support
Created: Brian Schott
Current maintainer: John Paul Walters
Contributors: USC Information Sciences Institute

Summary

This blueprint proposes to add support for GPU-accelerated machines as an alternative machine type in OpenStack.

The target release for this is Grizzly. We plan to have a stable branch at https://code.launchpad.net/~usc-isi/nova/hpc-testing.

The USC-ISI team has a functional prototype here:

https://code.launchpad.net/~usc-isi/nova/hpc-trunk

This blueprint is related to the HeterogeneousInstanceTypes blueprint here:

http://wiki.openstack.org/HeterogeneousInstanceTypes

An etherpad for discussion of this blueprint is available at http://etherpad.openstack.org/heterogeneousultravioletsupport

Release Note

Nova has been extended to make NVIDIA GPUs available to provisioned instances for CUDA programming.

Rationale

See HeterogeneousInstanceTypes.

The goal of this blueprint is to allow GPU-accelerated computing in OpenStack.

User stories

Jackie has a CUDA-accelerated application and wants to run it on an instance that has access to GPU hardware. She chooses a cg1.xlarge instance type, that provides access to two NVIDIA Fermi GPUs:

$ nova flavor-list
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+----------------------------------------------+
| ID | Name      | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public | extra_specs                                  |
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+----------------------------------------------+
| 9  | cg1.xlarge | 16384     | 160  | 0         |      | 8     | 1.0         | True      | {u'hypervisor': u's== LXC', u'gpus': u'= 2', u'gpu_arch':u's== fermi'} |
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+----------------------------------------------+
 
$ nova boot --flavor 9 --key-name mykey --image 2b1509fe-b573-488a-be4d-d61d25c7ab4f  gpu_test

Assumptions

The only approach that has been successful for CUDA access from a kvm virtual machine that we know of is gVirtuS [1]. Here we propose direct access of gpus from LXC instances. We assume that the host system's kernel supports 'lxc-attach', and the utilities for 'lxc-attach' are installed.

Design

We have have new nova.virt.GPULibvirt which is an extension of nova.virt.libvirt to instantiate a GPU-enabled virtual machine when requested.

When an instance is spawned (or rebooted), nova starts an LXC VM
The requested gpu(s) is(are) marked as allocated and its(their) device(s) is(are) created inside LXC using 'lxc-attach'
Access permission to the gpu(s) is added to /cgroup
Boot finalizes
When an instance is terminated (destroyed), the gpu(s) are deallocated.

Schema Changes

Implementation

The USC-ISI team has a functional prototype: https://code.launchpad.net/~usc-isi/nova/hpc-trunk

GPUs (NVIDIA Teslas)

Code Changes

added nova/virt/gpu/driver.py

       Inherits LibvirtDriver and extends a few methods to provision gpus
       Adds a few flags to describe gpu architecture, number of gpus, device ids, etc.

added nova/virt/gpu/utils.py

       Gpu provisioning routines

Migration

Unless the migration target supports gpus of the same indices, it may not work.

Test/Demo Plan

This need not be added or completed until the specification is nearing beta.

Unresolved issues

BoF agenda and discussion

Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.