
HeterogeneousGpuAcceleratorSupport


Summary

This blueprint proposes adding support for GPU-accelerated machines as an alternative machine type in OpenStack. It depends on the schema changes described in the HeterogeneousInstanceTypes blueprint and on the scheduler described in HeterogeneousArchitectureScheduler.

The target release for this is Diablo; however, the USC-ISI team intends to have a stable test branch and deployment at the Cactus release.

The USC-ISI team has a functional prototype here: https://code.launchpad.net/~usc-isi/nova/hpc-trunk

This blueprint is closely related to the HeterogeneousInstanceTypes blueprint.

We are also drafting blueprints for other machine types.

An etherpad for discussion of this blueprint is available at http://etherpad.openstack.org/heterogeneousultravioletsupport

Release Note

Nova has been extended to make NVIDIA GPUs available to provisioned instances for CUDA and OpenCL programming.

Rationale

See HeterogeneousInstanceTypes.

The goal of this blueprint is to allow GPU-accelerated computing in OpenStack.

User stories

Jackie has a CUDA-accelerated application and wants to run it on an instance that has access to GPU hardware. She chooses the cg1.4xlarge instance type, which provides access to two NVIDIA Fermi GPUs:

euca-run-instances -t cg1.4xlarge -k jackie-keypair emi-12345678


Assumptions

This blueprint depends on cg1.4xlarge being a selectable instance type and on the scheduler knowing that an instance of this type must be routed to a machine with a GPU accelerator attached. See HeterogeneousArchitectureScheduler.

The only approach we know of that has succeeded in providing CUDA access from a KVM virtual machine is gVirtuS [1]. We are actively looking for alternative approaches with KVM or Xen. We assume that this library has been installed.
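
As a quick sanity check from inside a provisioned instance, something like the following can confirm that CUDA calls are actually being forwarded to the host GPU. This is a minimal sketch, assuming the gVirtuS frontend exposes a standard libcudart.so inside the guest; the library name and path may differ per installation.

    # Minimal guest-side check that the CUDA runtime (stood in for by the
    # gVirtuS frontend) can see at least one GPU. Hypothetical sketch only.
    import ctypes

    cudart = ctypes.CDLL("libcudart.so")      # provided by the gVirtuS frontend

    count = ctypes.c_int(0)
    err = cudart.cudaGetDeviceCount(ctypes.byref(count))   # 0 == cudaSuccess

    if err == 0 and count.value > 0:
        print("CUDA forwarding works: %d device(s) visible" % count.value)
    else:
        print("No CUDA devices visible (cudaGetDeviceCount returned %d)" % err)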

Design

We propose to add cpu_arch, cpu_info, xpu_arch, xpu_info, xpus, net_arch, net_info, and net_mbps as attributes to the instance_types, instances, and compute_nodes tables. See HeterogeneousInstanceTypes.

We have added the necessary gVirtuS hooks for libvirt and created a custom connector at nova.virt.libvirt_conn_gpu.
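
For illustration only, the kind of change involved looks roughly like the sketch below. It is hypothetical (the function name, socket path, and channel name are made up), not the actual nova.virt.libvirt_conn_gpu code, which hooks into Nova's domain XML generation directly; gVirtuS can also use a plain TCP communicator, in which case no domain XML change is needed at all.

    # Hypothetical sketch: add a virtio channel (one possible gVirtuS
    # guest/host transport) to a libvirt domain XML document. The real
    # connector hooks into Nova's XML generation rather than
    # post-processing a string like this.
    from xml.etree import ElementTree as ET

    def add_gvirtus_channel(domain_xml, socket_path="/var/lib/gvirtus/gvirtus.sock"):
        root = ET.fromstring(domain_xml)
        devices = root.find("devices")
        channel = ET.SubElement(devices, "channel", {"type": "unix"})
        ET.SubElement(channel, "source", {"mode": "bind", "path": socket_path})
        ET.SubElement(channel, "target", {"type": "virtio", "name": "org.gvirtus.0"})
        return ET.tostring(root).decode()

    if __name__ == "__main__":
        print(add_gvirtus_channel(
            "<domain type='kvm'><name>test</name><devices/></domain>"))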

Schema Changes

See HeterogeneousInstanceTypes.

We're proposing that the following default values be added to the instance_types table:


   # x86 + GPU
   # TODO: we need to identify a machine-readable string for xpu arch
   'cg1.small': dict(memory_mb=2048, vcpus=1, local_gb=20,
                     flavorid=100,
                     cpu_arch="x86_64", xpu_arch="fermi", xpus=1),
   'cg1.medium': dict(memory_mb=4096, vcpus=2, local_gb=40,
                      flavorid=101,
                      cpu_arch="x86_64", xpu_arch="fermi", xpus=1),
   'cg1.large': dict(memory_mb=8192, vcpus=4, local_gb=80,
                     flavorid=102,
                     cpu_arch="x86_64", xpu_arch="fermi", xpus=1,
                     net_mbps=1000),
   'cg1.xlarge': dict(memory_mb=16384, vcpus=8, local_gb=160,
                      flavorid=103,
                      cpu_arch="x86_64", xpu_arch="fermi", xpus=1,
                      net_mbps=1000),
   'cg1.2xlarge': dict(memory_mb=16384, vcpus=8, local_gb=320,
                       flavorid=104,
                       cpu_arch="x86_64", xpu_arch="fermi", xpus=2,
                       net_mbps=1000),
   'cg1.4xlarge': dict(memory_mb=22000, vcpus=8, local_gb=1690,
                       flavorid=105,
                       cpu_arch="x86_64", cpu_info='{"model":"Nehalem"}',
                       xpu_arch="fermi", xpus=2,
                       xpu_info='{"model":"Tesla 2050", "gcores":"448"}',
                       net_arch="ethernet", net_mbps=10000),

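For context, these extra fields are what the scheduler matches against compute node capabilities (see HeterogeneousArchitectureScheduler). A minimal sketch of that kind of matching, using hypothetical field and helper names rather than the real scheduler code:

    # Hypothetical sketch of how the xpu_* fields in an instance type can be
    # matched against a compute node's capabilities; the real logic belongs
    # to the HeterogeneousArchitectureScheduler blueprint.
    def host_can_run(instance_type, compute_node):
        """Return True if the node satisfies the type's CPU/GPU requirements."""
        if instance_type.get("cpu_arch") and \
                instance_type["cpu_arch"] != compute_node.get("cpu_arch"):
            return False
        if instance_type.get("xpu_arch") and \
                instance_type["xpu_arch"] != compute_node.get("xpu_arch"):
            return False
        return instance_type.get("xpus", 0) <= compute_node.get("xpus_available", 0)

    cg1_small = {"cpu_arch": "x86_64", "xpu_arch": "fermi", "xpus": 1}
    gpu_node = {"cpu_arch": "x86_64", "xpu_arch": "fermi", "xpus_available": 2}
    plain_node = {"cpu_arch": "x86_64", "xpu_arch": None, "xpus_available": 0}

    assert host_can_run(cg1_small, gpu_node)
    assert not host_can_run(cg1_small, plain_node)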

Implementation

The USC-ISI team has a functional prototype: https://code.launchpad.net/~usc-isi/nova/hpc-trunk

Our approach currently leverages the gVirtuS drivers: http://www.ohloh.net/p/gvirtus

UI Changes

The following will be available as new default instance types.

GPUs (NVIDIA Teslas)

Available resources per physical node: 8 cores, 20 GB of RAM (24 GB total minus 4 GB reserved for the host), and 900 GB of disk (1000 GB total minus 100 GB reserved for the host). These match the non-GPU small, medium, large, xlarge, 2xlarge, and 4xlarge definitions. In addition, cg1.4xlarge matches the Amazon GPU instance definition. The cpu_arch is "x86_64" and the xpu_arch is "fermi".
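
As a worked example of what this per-node budget implies (assuming two Fermi GPUs per physical node, which the two-GPU flavors suggest but the blueprint does not state explicitly), a node filled with cg1.small instances is GPU-limited:

    # Rough packing arithmetic for one physical node. The two-GPUs-per-node
    # figure is an assumption inferred from the two-GPU flavors.
    node = {"vcpus": 8, "memory_mb": 20 * 1024, "local_gb": 900, "xpus": 2}
    cg1_small = {"vcpus": 1, "memory_mb": 2048, "local_gb": 20, "xpus": 1}

    per_resource = dict((k, node[k] // cg1_small[k]) for k in node)
    print(per_resource)                # vcpus: 8, memory_mb: 10, local_gb: 45, xpus: 2
    print(min(per_resource.values()))  # 2 -> the node is GPU-limited to two cg1.small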

GPU small

  • API name: cg1.small
  • 1 Fermi GPU
  • 2 GB RAM (2048 MB)
  • 1 virtual core
  • 20 GB of instance storage

GPU medium

  • API name: cg1.medium
  • 1 Fermi GPU
  • 4 GB RAM (4096 MB)
  • 2 virtual cores
  • 40 GB of instance storage

GPU large

  • API name: cg1.large
  • 1 Fermi GPU
  • 8 GB RAM (8192 MB)
  • 4 virtual cores
  • 80 GB of instance storage

GPU xlarge

  • API name: cg1.xlarge
  • 1 Fermi GPU
  • 16 GB RAM (16384 MB)
  • 8 virtual cores
  • 160 GB of instance storage

GPU 2xlarge

  • API name: cg1.2xlarge
  • 2 Fermi GPUs
  • 16 GB RAM (16384 MB)
  • 8 virtual cores
  • 320 GB of instance storage

GPU 4xlarge

  • API name: cg1.4xlarge
  • 2 Fermi GPUs
  • 22 GB RAM (22000 MB)
  • 8 virtual cores
  • 1.69 TB (1690 GB) of instance storage

Code Changes

  • db/sqlalchemy/migrate_repo/versions/013_add_architecture_to_instance_types.py
  - adds the new architecture fields and the default GPU-accelerated (cg1.*) instance types; a rough sketch of such a migration follows below
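
The sketch below is hypothetical (column types are assumptions); the authoritative version is the migration file in the prototype branch, which also seeds the cg1.* defaults shown under Schema Changes.

    # Hypothetical sketch of a sqlalchemy-migrate script adding the new
    # architecture columns.
    from sqlalchemy import Column, Integer, MetaData, String, Table
    from migrate import changeset  # noqa: enables Table.create_column/drop_column

    NEW_COLUMNS = [
        Column('cpu_arch', String(255)),
        Column('cpu_info', String(255)),
        Column('xpu_arch', String(255)),
        Column('xpu_info', String(255)),
        Column('xpus', Integer()),
        Column('net_arch', String(255)),
        Column('net_info', String(255)),
        Column('net_mbps', Integer()),
    ]

    def upgrade(migrate_engine):
        meta = MetaData(bind=migrate_engine)
        instance_types = Table('instance_types', meta, autoload=True)
        for column in NEW_COLUMNS:
            instance_types.create_column(column)

    def downgrade(migrate_engine):
        meta = MetaData(bind=migrate_engine)
        instance_types = Table('instance_types', meta, autoload=True)
        for column in NEW_COLUMNS:
            instance_types.drop_column(column.name)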

Migration

Very little needs to change in the way deployments use Nova if we set sane defaults, such as assuming "x86_64" as is done today.

Test/Demo Plan

This need not be added or completed until the specification is nearing beta.

Unresolved issues

One of the challenges we have is that the flavorid field in the instance_types table is not auto-incrementing. We have selected high numbers (100 and above) to avoid collisions, but the community should discuss how flavorid should behave and the best approach for adding new instance types in the future.
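
Until that is settled, one low-tech mitigation (a hypothetical helper, not part of the prototype) is to pick new flavorids above both the existing maximum and a reserved floor:

    # Hypothetical helper: choose a flavorid above both the existing maximum
    # and a reserved floor (this blueprint starts at 100) while flavorid
    # remains a plain, manually assigned column.
    def next_flavorid(existing_flavorids, floor=100):
        return max(set(existing_flavorids) | {floor - 1}) + 1

    assert next_flavorid([1, 2, 3, 4, 5]) == 100
    assert next_flavorid([1, 2, 100, 101]) == 102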

BoF agenda and discussion

Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.