Difference between revisions of "ScheduleHeterogeneousInstances"


Revision as of 17:59, 25 May 2011

Schedule heterogeneous instances

This supersedes: HeterogeneousInstanceTypes

See also HeterogeneousArchitectureScheduler

Summary

Nova should support CPU architectures, accelerator architectures, and network interfaces as part of the definition of an instance type (or flavor, in Rackspace API parlance).

User stories

Mary manages a cloud datacenter. In addition to her x86 blades, she wants to advertise her power7 high performance computing cloud with 40Gbit QDR Infiniband support to customers. Mary uses nova-manage instance_types create to define "p7.sippycup", "p7.tall", "p7.grande", and "p7.venti" with cpu_arch="power7" and increasing amounts of default memory, storage, cores, and reserved bandwidth. Mary also has a small number of GPU-accelerated systems, so she defines "p7f.grande" and "p7f.venti" options with xpu_arch="fermi", setting xpus = 1 for grande and xpus = 2 for venti.

Fred wants to run an 8-core machine with 1 fermi-based GPU accelerator. He reads the text descriptions on Mary's web site and decides on the p7f.grande virtual machine. He runs:


euca-run-instances -t p7f.grande -k fred-keypair emi-12345678


George wants to launch an instance with 3 GPUs. He runs:


euca-run-instances -t "m1.xlarge;xpu_arch=fermi;xpus=3" -k fred-keypair emi-87654321
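The semicolon-separated instance type string in George's command can be split into a base type plus extra capabilities. The following is a minimal sketch of that idea; the function name parse_instance_type is hypothetical and is not taken from Nova, and the source does not specify how the string is actually parsed.

```python
def parse_instance_type(spec):
    """Split 'name;key=value;...' into (name, {key: value}).

    Hypothetical helper for illustration only; not part of Nova.
    Values are kept as strings, exactly as typed on the command line.
    """
    name, *pairs = spec.split(";")
    extra = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed capability: {pair!r}")
        extra[key.strip()] = value.strip()
    return name.strip(), extra


name, extra = parse_instance_type("m1.xlarge;xpu_arch=fermi;xpus=3")
```

Here name holds the base instance type and extra holds the requested capabilities, which a scheduler could then compare against what each host offers.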


Design

Add a new InstanceTypeMetadata table to the database that stores specs as key/value pairs: the additional capabilities that a particular instance type requires beyond the common cores/memory/storage requirements. The scheduler will retrieve this information based on flavor_id.
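As a rough illustration of the key/value layout described above, the sketch below uses an in-memory SQLite database. The table name, the column names (flavor_id, spec_key, spec_value), the flavor ids, and the helper get_extra_specs are all assumptions made for this example; the actual Nova schema may differ.

```python
import sqlite3

# In-memory database standing in for the Nova database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instance_type_metadata ("
    " flavor_id INTEGER NOT NULL,"
    " spec_key TEXT NOT NULL,"
    " spec_value TEXT NOT NULL,"
    " PRIMARY KEY (flavor_id, spec_key))"
)

# Hypothetical flavor ids: 101 for p7f.grande, 102 for p7f.venti.
conn.executemany(
    "INSERT INTO instance_type_metadata VALUES (?, ?, ?)",
    [
        (101, "xpu_arch", "fermi"),
        (101, "xpus", "1"),
        (102, "xpu_arch", "fermi"),
        (102, "xpus", "2"),
    ],
)


def get_extra_specs(conn, flavor_id):
    """Return all key/value pairs stored for one flavor as a dict."""
    rows = conn.execute(
        "SELECT spec_key, spec_value FROM instance_type_metadata"
        " WHERE flavor_id = ?",
        (flavor_id,),
    )
    return dict(rows.fetchall())
```

A flavor with no rows in the table simply yields an empty dict, so ordinary instance types need no special handling.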

Each nova installation would use its own data representation scheme. ISI will likely use the following scheme to represent this data for our heterogeneous cluster:

  • xpus: number of xpus (accelerators) needed
  • xpu_arch: architecture of the xpu, e.g. "fermi"
  • cpu_info: may contain vendor features
  • xpu_info: may contain vendor features
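One way a scheduler might use the xpus and xpu_arch keys is to compare them with the capabilities a compute host reports. The sketch below assumes hosts report a dict with the same key names; the function host_meets_specs and the host capability format are hypothetical and are not taken from the Nova scheduler.

```python
def host_meets_specs(extra_specs, host_caps):
    """Return True if a host can satisfy the requested accelerator specs.

    extra_specs: key/value pairs for the instance type (values may be strings).
    host_caps:   capabilities reported by a host (hypothetical format).
    """
    wanted_arch = extra_specs.get("xpu_arch")
    if wanted_arch is not None and host_caps.get("xpu_arch") != wanted_arch:
        return False
    # Missing keys are treated as zero accelerators requested or available.
    wanted_xpus = int(extra_specs.get("xpus", 0))
    available_xpus = int(host_caps.get("xpus", 0))
    return available_xpus >= wanted_xpus


gpu_host = {"xpu_arch": "fermi", "xpus": 4}
plain_host = {}
request = {"xpu_arch": "fermi", "xpus": "3"}
candidates = [h for h in (gpu_host, plain_host) if host_meets_specs(request, h)]
```

Only gpu_host remains in candidates, since plain_host reports no accelerators.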

The value of each key can be a multi-level dictionary in JSON format, similar to the cpu_info field currently in use in the ComputeNode table.


'cpu_info': '{"arch": "x86_64", 
              "model": "Nehalem", 
              "vendor": "Intel",               
              "features": ["rdtscp", "dca", "xtpr", "tm2", "est", "vmx", 
                           "ds_cpl", "monitor", "pbe", "tm", "ht", "ss", 
                           "acpi", "ds", "vme"], 
              "topology": {"cores": "4", "threads": "1", "sockets": "2"}
             }'
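Because cpu_info is stored as a JSON string, it has to be decoded before its nested fields can be inspected. The sketch below shows one possible check against the example above, using a shortened feature list; the function host_cpu_satisfies and its parameters are hypothetical and only illustrate the idea.

```python
import json

# Shortened copy of the cpu_info example above.
cpu_info = (
    '{"arch": "x86_64", "model": "Nehalem", "vendor": "Intel",'
    ' "features": ["rdtscp", "vmx", "ht"],'
    ' "topology": {"cores": "4", "threads": "1", "sockets": "2"}}'
)


def host_cpu_satisfies(cpu_info_json, required_arch, required_features=()):
    """Check the decoded cpu_info for an architecture and a set of features."""
    info = json.loads(cpu_info_json)
    if info.get("arch") != required_arch:
        return False
    # Every required feature must appear in the host's feature list.
    return set(required_features) <= set(info.get("features", []))


ok = host_cpu_satisfies(cpu_info, "x86_64", ["vmx"])
```

Note that the topology values in the example are strings, so any numeric comparison would need an explicit int() conversion.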