Jump to: navigation, search

HeterogeneousGpuAcceleratorSupport

Summary

This blueprint proposes to add support for GPU-accelerated machines as an alternative machine type in OpenStack.

The target release for this is Grizzly. We plan to have a stable branch at https://code.launchpad.net/~usc-isi/nova/hpc-testing.

The USC-ISI team has a functional prototype here:

This blueprint is related to the HeterogeneousInstanceTypes blueprint here:

An etherpad for discussion of this blueprint is available at http://etherpad.openstack.org/heterogeneousultravioletsupport

Release Note

Nova has been extended to make NVIDIA GPUs available to provisioned instances for CUDA programming.

Rationale

See HeterogeneousInstanceTypes.

The goal of this blueprint is to allow GPU-accelerated computing in OpenStack.

User stories

Jackie has a CUDA-accelerated application and wants to run it on an instance that has access to GPU hardware. She chooses a cg1.xlarge instance type, that provides access to two NVIDIA Fermi GPUs:

$ nova flavor-list
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+----------------------------------------------+
| ID | Name      | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public | extra_specs                                  |
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+----------------------------------------------+
| 9  | cg1.xlarge | 16384     | 160  | 0         |      | 8     | 1.0         | True      | {u'hypervisor': u's== LXC', u'gpus': u'= 2', u'gpu_arch':u's== fermi'} |
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+----------------------------------------------+
 
$ nova boot --flavor 9 --key-name mykey --image 2b1509fe-b573-488a-be4d-d61d25c7ab4f  gpu_test


Assumptions

The only approach that has been successful for CUDA access from a kvm virtual machine that we know of is gVirtuS [1]. Here we propose direct access of gpus from LXC instances. We assume that the host system's kernel supports 'lxc-attach', and the utilities for 'lxc-attach' are installed.

Design

We have have new nova.virt.GPULibvirt which is an extension of nova.virt.libvirt to instantiate a GPU-enabled virtual machine when requested.

  • When an instance is spawned (or rebooted), nova starts an LXC VM
  • The requested gpu(s) is(are) marked as allocated and its(their) device(s) is(are) created inside LXC using 'lxc-attach'
  • Access permission to the gpu(s) is added to /cgroup
  • Boot finalizes
  • When an instance is terminated (destroyed), the gpu(s) are deallocated.

Schema Changes

Implementation

The USC-ISI team has a functional prototype: https://code.launchpad.net/~usc-isi/nova/hpc-trunk

GPUs (NVIDIA Teslas)

Code Changes

  • added nova/virt/gpu/driver.py
       Inherits LibvirtDriver and extends a few methods to provision gpus
       Adds a few flags to describe gpu architecture, number of gpus, device ids, etc.
  • added nova/virt/gpu/utils.py
       Gpu provisioning routines

Migration

Unless the migration target supports gpus of the same indices, it may not work.

Test/Demo Plan

This need not be added or completed until the specification is nearing beta.

Unresolved issues

BoF agenda and discussion

Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.