
HeterogeneousInstanceTypes

Revision as of 00:31, 4 March 2011 by BrianSchott (talk)

Summary

Nova should support CPU architectures, accelerator architectures, and network interfaces as part of the definition of an instance type (or flavor, in Rackspace API parlance).

An etherpad for discussion of this blueprint is available at http://etherpad.openstack.org/heterogeneousinstancetypes

Release Note

Nova has been extended to allow deployments to advertise and users to request specific processor, accelerator, and network interface options using instance_types (or flavors).

The nova-manage instance_types command supports additional fields:

  • cpu_arch - processor architecture. Ex: "x86_64", "i386", "P7", etc. (default x86_64)
  • cpu_info - json-formatted extended processor information
  • xpu_arch - accelerator architecture. Ex: "fermi" (default "")
  • xpu_info - json-formatted extended accelerator information
  • xpus - Number of accelerators or accelerator processors
  • net_arch - primary network interface. Ex: "ethernet", "infiniband", "myrinet"
  • net_info - json-formatted extended network information
  • net_mbps - allocated network bandwidth (megabits per second)

Amazon GPU Node Example:

  • 22 GB of memory
  • 33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
  • 2 x NVIDIA Tesla “Fermi” M2050 GPUs
  • 1690 GB of instance storage
  • 64-bit platform
  • I/O Performance: Very High (10 Gigabit Ethernet)
  • API name: cg1.4xlarge


cg1.4xlarge:
 * memory_mb = 22000
 * vcpus = 8
 * local_gb = 1690
 * cpu_arch = "x86_64"
 * cpu_info = '{"model":"Nehalem", "features":["rdtscp", "xtpr"]}'
 * xpu_arch = "gpu"
 * xpus = 2
 * xpu_info = '{"gpu_arch":"fermi", "model":"Tesla 2050", "gcores":"448"}'
 * net_arch = "ethernet"
 * net_info = '{"encap":"Ethernet", "MTU":"8000"}'
 * net_mbps = 10000
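
As a rough illustration, the same definition can be written as a Python dictionary whose *_info fields are JSON strings. This is only a sketch: the dictionary layout and the decode_info_fields helper are hypothetical and are not part of nova, and cpu_info is abridged to the model name.

```python
import json

# Field names and values follow the cg1.4xlarge example above.
# The dict layout and the helper below are hypothetical, not nova code.
CG1_4XLARGE = {
    "name": "cg1.4xlarge",
    "memory_mb": 22000,
    "vcpus": 8,
    "local_gb": 1690,
    "cpu_arch": "x86_64",
    "cpu_info": json.dumps({"model": "Nehalem"}),  # abridged
    "xpu_arch": "gpu",
    "xpus": 2,
    "xpu_info": json.dumps(
        {"gpu_arch": "fermi", "model": "Tesla 2050", "gcores": "448"}),
    "net_arch": "ethernet",
    "net_info": json.dumps({"encap": "Ethernet", "MTU": "8000"}),
    "net_mbps": 10000,
}

INFO_FIELDS = ("cpu_info", "xpu_info", "net_info")


def decode_info_fields(instance_type):
    """Return a copy with the *_info JSON strings decoded into dicts."""
    decoded = dict(instance_type)
    for field in INFO_FIELDS:
        # An empty string (the proposed default) decodes to an empty dict.
        decoded[field] = json.loads(decoded.get(field) or "{}")
    return decoded
```

Keeping the extended information as JSON strings means the three label fields and two quantity fields stay simple columns, while anything more specific travels in the *_info strings.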


Rationale

Currently AWS supports two different CPU architecture types, "i386" and "x86_64". In addition, AWS describes many other instance type attributes by reference, such as I/O performance (Moderate, High, or Very High with 10 Gigabit Ethernet), extended CPU information (Intel Xeon X5570, quad-core “Nehalem” architecture), and now GPU accelerators (2 x NVIDIA Tesla “Fermi” M2050 GPUs). To implement similar functionality in nova, we need to capture this information in a way that is accessible to advanced schedulers.

There are several related blueprints:

User stories

Mary manages a cloud datacenter. In addition to her x86 blades, she wants to advertise her power7 high performance computing cloud with 40Gbit QDR Infiniband support to customers. Mary uses nova-manage instance_types create to define "p7.sippycup", "p7.tall", "p7.grande", and "p7.venti" with cpu_arch="power7" and increasing amounts of default memory, storage, cores, and reserved bandwidth. Mary also has a small number of GPU-accelerated systems, so she defines "p7f.grande" and "p7f.venti" options with xpu_arch="gpu", xpu_info = '{"gpu_arch":"fermi"}', and xpus = 1 for grande and xpus = 2 for venti.

Fred wants to run an 8-core machine with one fermi-based GPU accelerator. He reads the instance type descriptions on Mary's web site and decides that he wants the p7f.grande virtual machine. He runs:


euca-run-instances -t p7f.grande -k fred-keypair emi-12345678


Assumptions

This assumes that someone has ported OpenStack to systems with different processor architectures and that accelerators such as GPUs can be passed through to the virtual instance. The USC-ISI team is working on this. We will link in related blueprints, but the goal is that this top-level architecture awareness stands alone.

Design

We propose to add cpu_arch, cpu_info, xpu_arch, xpu_info, xpus, net_arch, net_info, and net_mbps as attributes to the instance_types, instances, and compute_services tables. Conceptually, this information is treated the same way as the existing memory_mb, local_gb, and vcpus fields. The fields exist in instance_types and are copied as columns into the instances table as instances are created.

  • cpu_arch, xpu_arch, and net_arch are intended to be high-level label switches for fast row filtering (like "i386" or "gpu" or "infiniband").
  • xpus and net_mbps are treated as quantity fields, exactly as vcpus is used by schedulers.
  • cpu_info, xpu_info, and net_info follow the instance_migration branch example, using a JSON-formatted string to capture arbitrary requirements.
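
The copy from instance_types into a new instance record could look roughly like the sketch below. It assumes plain dictionaries instead of database rows, and the names FIELD_DEFAULTS and add_heterogeneous_options are hypothetical. The defaults mirror the ones proposed for the schema.

```python
# Field names and defaults come from this specification; the function name
# and the use of plain dicts instead of database rows are assumptions.
FIELD_DEFAULTS = {
    "cpu_arch": "x86_64",
    "cpu_info": "",
    "xpu_arch": "",
    "xpu_info": "",
    "xpus": 0,
    "net_arch": "",
    "net_info": "",
    "net_mbps": 0,
}


def add_heterogeneous_options(base_options, instance_type):
    """Return base_options extended with the eight new fields.

    Values are taken from the instance type; fields that the instance type
    does not define fall back to the proposed defaults.
    """
    options = dict(base_options)  # leave the caller's dict untouched
    for field, default in FIELD_DEFAULTS.items():
        options[field] = instance_type.get(field, default)
    return options
```

For example, an instance type that defines only cpu_arch="power7", xpu_arch="gpu", and xpus=1 would yield an instance record with those values and the defaults for the remaining five fields.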

Scheduler Flow

The basic compute scheduler flow is as follows:

  1. nova-compute starts on a host and registers its architecture, accelerator, and networking capabilities in the ComputeService table. "ADD: populate compute_services table." This functionality is provided by the [5] blueprint and is already implemented.
  2. nova-api receives a run-instances request with instance_type string "m1.small". No change here.
  3. nova-api passes instance_type to compute/api.py create() from api/ec2/cloud.py run_instances() or api/openstack/servers.py create(). No change here.
  4. nova-api compute/api.py create() reads from the instance_types table and adds rows to the instances table. "ADD: our new fields into base_options arg for instances.db.create()."
  5. nova-api does an rpc.cast() to the scheduler num_instances times, passing instance_id. No change here.
  6. nova-scheduler selects a compute_service host that matches the options specified in the instance table fields cpu_arch, cpu_info, xpu_arch, xpu_info, xpus, net_arch, net_info, and net_mbps. "ADD: resource-aware scheduler functionality." The simple scheduler will continue to work correctly and ignore these fields in a homogeneous installation.
  7. nova-scheduler does an rpc.cast() to each selected compute service. No change here.
  8. nova-compute receives the rpc.cast() with instance_id, launches the virtual machine, etc. At this point, nova-compute has the cpu_arch, cpu_info, xpu_arch, xpu_info, xpus, net_arch, net_info, and net_mbps fields in the instance record and can configure libvirt as needed. No change is required for the existing compute service. The USC-ISI team is adding GPU and other non-x86 architecture support (need to add blueprint references).
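
The host selection in the scheduler step might be sketched as follows. This is not the nova scheduler: records are plain dictionaries, and the function names are hypothetical. The matching rules are also assumptions: an empty label means "no preference", quantities are compared against advertised totals only, and each requested JSON key must equal the offered value. It does not track resources already in use.

```python
import json

LABEL_FIELDS = ("cpu_arch", "xpu_arch", "net_arch")
QUANTITY_FIELDS = ("xpus", "net_mbps")
INFO_FIELDS = ("cpu_info", "xpu_info", "net_info")


def _info_satisfied(required_json, offered_json):
    """True if every key requested in the JSON string is offered as-is."""
    required = json.loads(required_json or "{}")
    offered = json.loads(offered_json or "{}")
    return all(offered.get(key) == value for key, value in required.items())


def host_matches(instance, host):
    """Check one compute service record against one instance record."""
    # Label switches: cheap equality filter; empty means no preference.
    for field in LABEL_FIELDS:
        wanted = instance.get(field, "")
        if wanted and wanted != host.get(field, ""):
            return False
    # Quantity fields: the host must advertise at least the request.
    for field in QUANTITY_FIELDS:
        if instance.get(field, 0) > host.get(field, 0):
            return False
    # Extended requirements carried as JSON strings.
    for field in INFO_FIELDS:
        if not _info_satisfied(instance.get(field), host.get(field)):
            return False
    return True


def select_hosts(instance, hosts):
    """Return the hosts able to satisfy the instance's requirements."""
    return [host for host in hosts if host_matches(instance, host)]
```

In a homogeneous installation every host advertises the same values and instances carry only the defaults, so every host passes the filter and scheduling behaves as it does today.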

Schema Changes

InstanceTypes

The instance_types are now stored in their own table: [6]


class InstanceTypes(BASE, NovaBase):
    """Represent possible instance_types or flavor of VM offered"""
    __tablename__ = "instance_types"
    id = Column(Integer, primary_key=True)
    name = Column(String(255), unique=True)
    memory_mb = Column(Integer)
    vcpus = Column(Integer)
    local_gb = Column(Integer)
    flavorid = Column(Integer, unique=True)
    swap = Column(Integer, nullable=False, default=0)
    rxtx_quota = Column(Integer, nullable=False, default=0)
    rxtx_cap = Column(Integer, nullable=False, default=0)
+    cpu_arch = Column(String(255), default='x86_64')
+    cpu_info = Column(String(255), default='')
+    xpu_arch = Column(String(255), default='')
+    xpu_info = Column(String(255), default='')
+    xpus = Column(Integer, nullable=False, default=0)
+    net_arch = Column(String(255), default='')
+    net_info = Column(String(255), default='')
+    net_mbps = Column(Integer, nullable=False, default=0)


Compute Service

The compute service is being included by:

[7]


 class ComputeService(BASE, NovaBase):
     """Represents a running compute service on a host."""

     __tablename__ = 'compute_services'
     id = Column(Integer, primary_key=True)
     service_id = Column(Integer, ForeignKey('services.id'), nullable=True)
     service = relationship(Service,
                            backref=backref('compute_service'),
                            foreign_keys=service_id,
                            primaryjoin='and_('
                                 'ComputeService.service_id == Service.id,'
                                 'ComputeService.deleted == False)')

     vcpus = Column(Integer, nullable=True)
     memory_mb = Column(Integer, nullable=True)
     local_gb = Column(Integer, nullable=True)
     vcpus_used = Column(Integer, nullable=True)
     memory_mb_used = Column(Integer, nullable=True)
     local_gb_used = Column(Integer, nullable=True)
     hypervisor_type = Column(Text, nullable=True)
     hypervisor_version = Column(Integer, nullable=True)
+    cpu_arch = Column(String(255), default='x86_64')
+    cpu_info = Column(String(255), default='')
+    xpu_arch = Column(String(255), default='')
+    xpu_info = Column(String(255), default='')
+    xpus = Column(Integer, default=0)
+    net_arch = Column(String(255), default='')
+    net_info = Column(String(255), default='')
+    net_mbps = Column(Integer, default=0)


Instance

The instances table simply carries the additional fields.


 class Instance(BASE, NovaBase):
     """Represents a guest vm."""
.... 
     instance_type = Column(String(255))
+    cpu_arch = Column(String(255), default='x86_64')
+    cpu_info = Column(String(255), default='')
+    xpu_arch = Column(String(255), default='')
+    xpu_info = Column(String(255), default='')
+    xpus = Column(Integer, default=0)
+    net_arch = Column(String(255), default='')
+    net_info = Column(String(255), default='')
+    net_mbps = Column(Integer, default=0)


Implementation

This section should describe a plan of action (the "how") to implement the changes discussed. Could include subsections like:

UI Changes

Ideally, we should add the fields to nova-manage.

Code Changes

Code changes should include an overview of what needs to change, and in some cases even the specific details.

TBD:

Migration

Include:

  • data migration, if any
  • redirects from old URLs to new ones, if any
  • how users will be pointed to the new way of doing things, if necessary.

Test/Demo Plan

This need not be added or completed until the specification is nearing beta.

Unresolved issues

This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself; since any specification with problems cannot be approved.

BoF agenda and discussion

Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.