ScheduleHeterogeneousInstances

= Schedule heterogeneous instances =


 * Launchpad Entry: NovaSpec:heterogeneous-instance-types
 * Creator: Lorin Hochstein
 * Contributors: USC Information Sciences Institute

This supersedes: HeterogeneousInstanceTypes

See also HeterogeneousArchitectureScheduler

== Summary ==
Nova should support CPU architectures, accelerator architectures, and network interfaces as part of the definition of an instance type (or "flavor", in Rackspace API parlance).

== User stories ==
Mary manages a cloud datacenter. In addition to her x86 blades, she wants to advertise her POWER7 high-performance computing cloud with 40 Gbit/s QDR InfiniBand support to customers. Mary uses nova-manage instance_types create to define "p7.sippycup", "p7.tall", "p7.grande", and "p7.venti" with cpu_arch="power7" and increasing amounts of default memory, storage, cores, and reserved bandwidth. Mary also has a small number of GPU-accelerated systems, so she defines "p7f.grande" and "p7f.venti" with xpu_arch="fermi", setting xpus=1 for grande and xpus=2 for venti.

Fred wants to run an 8-core machine with one Fermi-based GPU accelerator. He reads the instance type descriptions on Mary's web site and settles on the p7f.grande virtual machine. He runs:

euca-run-instances -t p7f.grande -k fred-keypair emi-12345678

George wants to launch an instance with 3 GPUs. He runs:

euca-run-instances -t "m1.xlarge;xpu_arch=fermi;xpus=3" -k fred-keypair emi-87654321
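
George's request packs a base instance type and extra key/value requirements into a single string. This spec does not define how that string is parsed, so the following is only a minimal sketch of one possible approach; the helper name parse_instance_type is hypothetical and not part of Nova.

```python
def parse_instance_type(spec):
    """Split 'name;key=value;...' into (name, overrides).

    Hypothetical helper: the ';' and '=' separators are taken from the
    euca-run-instances example above, not from any existing Nova code.
    """
    name, *pairs = spec.split(";")
    overrides = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError("malformed override: %r" % pair)
        # Values are kept as strings; interpreting them is left to the scheduler.
        overrides[key.strip()] = value.strip()
    return name.strip(), overrides


name, overrides = parse_instance_type("m1.xlarge;xpu_arch=fermi;xpus=3")
print(name, overrides)
```

A plain type name such as "p7f.grande" yields an empty overrides dictionary, so both Fred's and George's requests could go through the same code path.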

== Design ==
Add a new InstanceTypeMetadata table to the database that stores key/value pairs describing the additional capabilities that an instance of a given type requires beyond the common cores/memory/storage requirements. The scheduler will retrieve this information by flavor_id.
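
As a rough illustration of the table described above, the sketch below uses an in-memory SQLite database. The table name, the column names (flavor_id, spec_key, spec_value), the helper functions, and the flavor_id value 101 are all assumptions made for this example; the actual Nova schema may differ.

```python
import sqlite3

# In-memory database standing in for the Nova database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE instance_type_metadata (
           flavor_id  INTEGER NOT NULL,
           spec_key   TEXT    NOT NULL,
           spec_value TEXT    NOT NULL,
           PRIMARY KEY (flavor_id, spec_key)
       )"""
)


def set_metadata(flavor_id, specs):
    """Store (or overwrite) key/value pairs for one instance type."""
    conn.executemany(
        "INSERT OR REPLACE INTO instance_type_metadata VALUES (?, ?, ?)",
        [(flavor_id, key, str(value)) for key, value in specs.items()],
    )


def get_metadata(flavor_id):
    """Return all key/value pairs for a flavor_id, as the scheduler would."""
    rows = conn.execute(
        "SELECT spec_key, spec_value FROM instance_type_metadata"
        " WHERE flavor_id = ?",
        (flavor_id,),
    )
    return dict(rows.fetchall())


# Hypothetical flavor_id for Mary's "p7f.venti" instance type.
set_metadata(101, {"cpu_arch": "power7", "xpu_arch": "fermi", "xpus": 2})
print(get_metadata(101))
```

Values are stored as text here, so a numeric value such as xpus comes back as the string "2"; a flavor with no extra requirements simply returns an empty dictionary.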

Each Nova installation can use its own data representation scheme. ISI will likely use the following scheme to represent this data for our heterogeneous cluster:


 * xpus: number of xpus (accelerators) needed
 * xpu_arch: architecture of the xpu, e.g. "fermi"
 * cpu_info: may contain vendor features
 * xpu_info: may contain vendor features
 * hypervisor_type: hypervisor type to use (kvm, xen, lxc, etc.)

The value of each key can be a multi-level dictionary in JSON format, similar to the cpu_info field currently in use in the ComputeNode table, for example:

'cpu_info': '{"arch": "x86_64", "model": "Nehalem", "vendor": "Intel", "features": ["rdtscp", "dca", "xtpr", "tm2", "est", "vmx", "ds_cpl", "monitor", "pbe", "tm", "ht", "ss", "acpi", "ds", "vme"], "topology": {"cores": "4", "threads": "1", "sockets": "2"} }'
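
Because the values are JSON-encoded, a scheduler could decode them and compare an instance type's requirements against what a compute node reports. The matching rule below (dictionaries match recursively, lists are treated as required subsets, scalars must be equal) is an assumption made for illustration and is not defined by this spec; the function name satisfies is likewise hypothetical, and the host cpu_info is a shortened version of the example above.

```python
import json

# Shortened cpu_info value, as a compute node might report it.
host_cpu_info = (
    '{"arch": "x86_64", "model": "Nehalem", "vendor": "Intel", '
    '"features": ["vmx", "ht", "acpi"], '
    '"topology": {"cores": "4", "threads": "1", "sockets": "2"}}'
)


def satisfies(required, offered):
    """Return True if `offered` covers everything in `required`."""
    if isinstance(required, dict):
        # Every required key must be present and match recursively.
        return isinstance(offered, dict) and all(
            key in offered and satisfies(value, offered[key])
            for key, value in required.items()
        )
    if isinstance(required, list):
        # Required list items (e.g. CPU features) must all be offered.
        return isinstance(offered, list) and set(required) <= set(offered)
    return required == offered


offered = json.loads(host_cpu_info)
print(satisfies({"arch": "x86_64", "features": ["vmx"]}, offered))  # True
print(satisfies({"arch": "power7"}, offered))                       # False
```

Keys that the instance type does not mention are ignored, so a host may advertise more features than any given instance type asks for.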