Difference between revisions of "HeterogeneousArchitectureScheduler"

(Fixed typos)
m (Text replace - "NovaSpec" to "NovaSpec")
 
(18 intermediate revisions by 3 users not shown)
Line 1: Line 1:
__NOTOC__
+
 
* '''Launchpad Entry''': [[NovaSpec]]:schedule-instances-on-heterogeneous-architectures
+
* '''Launchpad Entry''': NovaSpec:schedule-instances-on-heterogeneous-architectures
 
* '''Created''': [https://launchpad.net/~bfschott Brian Schott]
 
* '''Created''': [https://launchpad.net/~bfschott Brian Schott]
 +
* '''Maintained''':[https://launchpad.net/~jsuh Jinwoo "Joseph" Suh]
 
* '''Contributors''': [https://launchpad.net/~USC-ISI USC Information Sciences Institute]
 
* '''Contributors''': [https://launchpad.net/~USC-ISI USC Information Sciences Institute]
  
 
== Summary ==
 
== Summary ==
  
Nova should have support for cpu architectures, accelerator architectures, and network interfaces and be able to route run_instances() requests to a compute node capable of running that architecture.  This blueprint is dependent on the schema changes described in [[HeterogeneousInstanceTypes]] blueprint.  The target release for this is Diablo, however the USC-ISI team intends to have a stable test branch and deployment at Cactus release.
+
Nova should have support for cpu architectures, accelerator architectures, and network interfaces and be able to route run_instances() requests to a compute node capable of running that architecture.  This blueprint is dependent on the schema changes described in [[HeterogeneousInstanceTypes]] blueprint.  The target release for this is Diablo. A stable test branch and deployment available now.
  
 
The USC-ISI team has a functional prototype here:
 
The USC-ISI team has a functional prototype here:
* https://code.launchpad.net/~usc-isi/nova/hpc-trunk
+
* https://code.launchpad.net/~usc-isi/nova/hpc-trunk (more up-to-date version)
* https://code.launchpad.net/~usc-isi/nova/hpc-testing
+
* https://code.launchpad.net/~usc-isi/nova/hpc-testing (more stable version)
 +
* https://code.launchpad.net/~usc-isi/nova/hetero (merge candidate)
  
 
This blueprint is related to the [[HeterogeneousInstanceTypes]] blueprint here:
 
This blueprint is related to the [[HeterogeneousInstanceTypes]] blueprint here:
Line 32: Line 34:
 
There are several related blueprints:
 
There are several related blueprints:
 
* https://blueprints.launchpad.net/nova/+spec/compute-host-system-architecture-awareness
 
* https://blueprints.launchpad.net/nova/+spec/compute-host-system-architecture-awareness
* https://blueprints.launchpad.net/nova/+spec/frontend-heterogenous-architecture-support
 
  
 
== User stories ==
 
== User stories ==
Line 48: Line 49:
 
== Assumptions ==
 
== Assumptions ==
  
The assumption is that [[OpenStack]] runs on the target hardware architecture.  See related blueprints above for what our team is doing.
+
The assumption is that [[OpenStack]] runs on the target hardware architecture or on a proxy running on behalf of the target hardware architecture.  See related blueprints above for what our team is doing.
 +
 
 +
We also assume that instance_type_extra_spec is created. cpu_info, xpu_arch, xpu_info, xpus, net_arch, net_info, and net_mbps are example keys inserted into instance_type_extra_spec and instance_metadata tables.
  
 
== Design ==
 
== Design ==
  
We propose to add cpu_arch, cpu_info, xpu_arch, xpu_info, xpus, net_arch, net_info, and net_mbps as attributes to instance_types, instances, and compute_nodes tables.  See [[HeterogeneousInstanceTypes]].
+
See [[HeterogeneousInstanceTypes]].
  
The architecture aware scheduler will compare these additional fields when selecting target compute_nodes for the run_instances request.
+
The architecture aware scheduler will compare any key values to capability reported from zone_manager and all of them must be matched to start an instantiation.
* cpu_arch, xpu_arch, and net_arch are intended to be high-level label switches for fast row filtering (like "i386" or "fermi" or "infiniband").
 
* xpus and net_mbps are treated as quantity fields exactly like vcpus is used by schedulers
 
* the cpu_info, xpu_info, and net_info follows the instance_migration branch example using a json formatted string to capture arbitrary configurations.
 
  
 
The basic scheduler flow through nova is as follows:
 
The basic scheduler flow through nova is as follows:
  
# nova-compute starts on a host and registers architecture, accelerator, and networking capabilities in the [[ComputeNode]] table.
+
# nova-compute starts on a host and registers architecture, accelerator, and networking capabilities to the zone_manager (scheduler/zone_manager.py). The data is stored in memory (not in database). The capability information is refreshed periodically (default is 1 minute). No change here.
# nova-api receives a run-instances request with instance_type string "m1.small" or "p7g.grande". No change here.
+
# nova-api receives a run-instances request with instance_type string "m1.small" or "m1.small;xpu_arch=fermi;xpus=2". No change here.
# nova-api passes instance_type to compute/api.py create() from api/ec2/cloud.py run_instances() or api/openstack/servers.py create().
+
# api/ec2/cloud.py run_instances() gets the instance type string and retrieves detail information from instance_type table. No change here
# nova-api compute/api.py create() reads from instance_types table and adds rows to instances table.  
+
# The detail information about instance type is sent to compute/api.py create() in dictionary form. No change here.
 +
# nova-api compute/api.py create() adds rows to instances table. No change here.
 
# nova-api does an rpc.cast() to scheduler num_instances times, passing instance_id. No change here.
 
# nova-api does an rpc.cast() to scheduler num_instances times, passing instance_id. No change here.
# '''nova-scheduler as architecture scheduler selects compute_service host that matches the options specified in the instance table fields. The arch scheduler filters available compute_nodes by  cpu_arch, cpu_info, xpu_arch, xpu_info, xpus, net_arch, net_info, and net_mbps with the same fields. '''
+
# '''nova-scheduler as architecture scheduler selects compute_service host that matches the options specified in the instance_type_extra_spec table, instance table, and instance_metadata fields. The arch scheduler filters available compute_nodes by  fields in instance_type table, and all criteria in instance_type_extra_spec table. '''
 
# nova-scheduler rpc.cast() to each selected compute service.
 
# nova-scheduler rpc.cast() to each selected compute service.
 
# nova-compute receives rpc.cast() with instance_id, launches the virtual machine, etc.  
 
# nova-compute receives rpc.cast() with instance_id, launches the virtual machine, etc.  
Line 72: Line 73:
 
=== Schema Changes ===
 
=== Schema Changes ===
  
See [[HeterogeneousInstanceTypes]].
+
A new  table instance_type_extra_table is added to specify extra fields needed. The table has id, key, value, instance_type_id, and instance_type fields.
  
 
== Implementation ==
 
== Implementation ==
  
 
The USC-ISI team has a functional prototype:
 
The USC-ISI team has a functional prototype:
 +
 
https://code.launchpad.net/~usc-isi/nova/hpc-trunk
 
https://code.launchpad.net/~usc-isi/nova/hpc-trunk
 +
 +
https://code.launchpad.net/~usc-isi/nova/hpc-testing
 +
 +
https://code.launchpad.net/~usc-isi/nova/hetero
  
 
=== UI Changes ===
 
=== UI Changes ===
Line 87: Line 93:
 
</nowiki></pre>
 
</nowiki></pre>
  
 +
 +
Additional constraint can be specified in instance_type_extra_specs table. User does not need to be aware of the instance_type_extra_specs table. User simply uses an instance type as normal, e.g., "cg1.small." since cloud provider fills in the table and cloud software uses the table automatically.
 +
 +
=== Limitations ===
 +
 +
All new constraints in instance_type_extra_specs are currently match based. No inequality comparison can be done at this time. We plan to add the ability shortly.
  
 
=== Code Changes ===
 
=== Code Changes ===
Line 94: Line 106:
 
* nova/scheduler/arch.py
 
* nova/scheduler/arch.py
 
     - Implements the architecture aware scheduler.
 
     - Implements the architecture aware scheduler.
    - def hosts_up_with_arch(self, context, topic, instance_id):
+
* api/ec2/cloud.py
    - def schedule(self, context, topic, *_args, **_kwargs):
 
* nova/db/api.py
 
    - def compute_node_get_by_arch(context, cpu_arch, xpu_arch, session=None):
 
    - def compute_node_get_by_cpu_arch(context, cpu_arch, session=None):
 
    - def compute_node_get_by_xpu_arch(context, xpu_arch, session=None):
 
    - def instance_get_all_by_cpu_arch(context, cpu_arch):
 
    - def instance_get_all_by_xpu_arch(context, xpu_arch):
 
  
 
=== Migration ===
 
=== Migration ===
  
Very little needs to change in terms of the way deployments will use this if we set sane defaults like "x86_64" as assumed today.
+
Very little needs to be changed in terms of the way deployments will use this.
  
 
== Test/Demo Plan ==
 
== Test/Demo Plan ==
Line 113: Line 118:
 
== Unresolved issues ==
 
== Unresolved issues ==
  
This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself; since any specification with problems cannot be approved.
+
None.
  
 
== BoF agenda and discussion ==
 
== BoF agenda and discussion ==

Latest revision as of 23:31, 17 February 2013

Summary

Nova should support CPU architectures, accelerator architectures, and network interfaces, and be able to route run_instances() requests to a compute node capable of running the requested architecture. This blueprint depends on the schema changes described in the HeterogeneousInstanceTypes blueprint. The target release is Diablo. A stable test branch and deployment are available now.

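The USC-ISI team has a functional prototype here:

  • https://code.launchpad.net/~usc-isi/nova/hpc-trunk (more up-to-date version)
  • https://code.launchpad.net/~usc-isi/nova/hpc-testing (more stable version)
  • https://code.launchpad.net/~usc-isi/nova/hetero (merge candidate)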

This blueprint is related to the HeterogeneousInstanceTypes blueprint here:

We are also drafting blueprints for three machine types:

An etherpad for discussion of this blueprint is available at http://etherpad.openstack.org/heterogeneousarchitecturescheduler

Release Note

Nova has been extended to allow deployments to advertise, and users to request, specific processor, accelerator, and network interface options using instance_types (or flavors) as the primary mechanism. This blueprint is for a scheduler plugin that routes run_instances requests to the appropriate physical compute node.

Rationale

See HeterogeneousInstanceTypes. The short answer is that real deployments will have heterogeneous resources.

There are several related blueprints:

  • https://blueprints.launchpad.net/nova/+spec/compute-host-system-architecture-awareness

User stories

See HeterogeneousInstanceTypes.

George has two different processing clusters, one x86_64, the other Power7. These two run_instances commands need to go to the appropriate compute nodes. In addition, nova should prevent a user from inadvertently specifying an x86_64 machine image to run on a Power7 compute node or vice-versa. The scheduler should check for inconsistencies.

euca-run-instances -t p7f.grande -k fred-keypair emi-12345678
euca-run-instances -t m1.xlarge -k fred-keypair emi-87654321


Assumptions

The assumption is that OpenStack runs on the target hardware architecture or on a proxy running on behalf of the target hardware architecture. See related blueprints above for what our team is doing.

We also assume that the instance_type_extra_specs table has been created. cpu_info, xpu_arch, xpu_info, xpus, net_arch, net_info, and net_mbps are example keys inserted into the instance_type_extra_specs and instance_metadata tables.

Design

See HeterogeneousInstanceTypes.

The architecture-aware scheduler compares each requested key's value against the capabilities reported by zone_manager. Every requested key must match before an instance can be started on a host.
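The matching rule can be illustrated with a minimal Python sketch. This is not the code in nova/scheduler/arch.py; the function name host_matches, the host names, and the capability dictionaries are hypothetical, and values are compared as strings on the assumption that extra-spec values are stored as text.

```python
def host_matches(requested, capabilities):
    """Return True only if every requested key is reported with an equal value.

    Values are compared as strings (an assumption made for this sketch).
    """
    return all(
        key in capabilities and str(capabilities[key]) == str(value)
        for key, value in requested.items()
    )


# Hypothetical capabilities, keyed by host name.
hosts = {
    "gpu-node-1": {"cpu_arch": "x86_64", "xpu_arch": "fermi", "xpus": 2},
    "cpu-node-1": {"cpu_arch": "x86_64"},
}

requested = {"xpu_arch": "fermi", "xpus": "2"}
candidates = [name for name, caps in hosts.items() if host_matches(requested, caps)]
print(candidates)
```

A host that lacks any requested key, or reports a different value for it, is excluded from the candidate list.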

The basic scheduler flow through nova is as follows:

  1. nova-compute starts on a host and registers its architecture, accelerator, and networking capabilities with the zone_manager (scheduler/zone_manager.py). The data is stored in memory (not in the database). The capability information is refreshed periodically (the default is 1 minute). No change here.
  2. nova-api receives a run-instances request with an instance_type string such as "m1.small" or "m1.small;xpu_arch=fermi;xpus=2". No change here.
  3. api/ec2/cloud.py run_instances() gets the instance type string and retrieves the detailed information from the instance_type table. No change here.
  4. The detailed information about the instance type is sent to compute/api.py create() in dictionary form. No change here.
  5. nova-api compute/api.py create() adds rows to the instances table. No change here.
  6. nova-api does an rpc.cast() to the scheduler num_instances times, passing instance_id. No change here.
  7. nova-scheduler, acting as the architecture scheduler, selects a compute_service host that matches the options specified in the instance_type_extra_specs table, the instance table, and the instance_metadata fields. The arch scheduler filters the available compute_nodes by the fields in the instance_type table and by all criteria in the instance_type_extra_specs table.
  8. nova-scheduler does an rpc.cast() to each selected compute service.
  9. nova-compute receives the rpc.cast() with instance_id, launches the virtual machine, etc.
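Steps 1, 2, and 7 of the flow above can be sketched in Python as follows. This is an illustration under stated assumptions, not the prototype's code: parse_instance_type and CapabilityStore are hypothetical names, the "name;key=value" parsing follows the example string in step 2, and dropping reports older than 60 seconds is an assumption layered on the one-minute refresh interval.

```python
import time


def parse_instance_type(spec):
    """Split 'm1.small;xpu_arch=fermi;xpus=2' into ('m1.small', {...})."""
    name, *pairs = spec.split(";")
    extra = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        extra[key.strip()] = value.strip()
    return name.strip(), extra


class CapabilityStore:
    """In-memory map of host -> (capabilities, time of last report)."""

    def __init__(self, max_age_seconds=60.0):
        # Assumption: reports older than one refresh interval are ignored.
        self.max_age_seconds = max_age_seconds
        self._reports = {}

    def update(self, host, capabilities, now=None):
        """Record the latest capability report from a compute host."""
        seen = time.time() if now is None else now
        self._reports[host] = (dict(capabilities), seen)

    def fresh_hosts(self, now=None):
        """Return capabilities of hosts that reported recently enough."""
        now = time.time() if now is None else now
        return {
            host: caps
            for host, (caps, seen) in self._reports.items()
            if now - seen <= self.max_age_seconds
        }


store = CapabilityStore()
store.update("gpu-node-1", {"xpu_arch": "fermi", "xpus": "2"}, now=0.0)
store.update("cpu-node-1", {"cpu_arch": "x86_64"}, now=0.0)

name, extra = parse_instance_type("m1.small;xpu_arch=fermi;xpus=2")
selected = [
    host
    for host, caps in store.fresh_hosts(now=30.0).items()
    if all(caps.get(key) == value for key, value in extra.items())
]
print(name, extra, selected)
```

The explicit now argument exists only to make the sketch deterministic; a real service would rely on the clock.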

Schema Changes

A new table, instance_type_extra_specs, is added to hold the extra fields needed. The table has id, key, value, instance_type_id, and instance_type fields.
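As a rough illustration of the table's shape, the sketch below builds an in-memory SQLite version using Python's standard sqlite3 module. The table is named instance_type_extra_specs here, as in the UI Changes section. The column types, the simplified instance_types table, the sample rows, and the helper extra_specs_for are assumptions made for the example, and the instance_type field is modeled as a join rather than a stored column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE instance_types (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE instance_type_extra_specs (
        id               INTEGER PRIMARY KEY,
        key              TEXT NOT NULL,
        value            TEXT,
        instance_type_id INTEGER NOT NULL REFERENCES instance_types(id)
    );
    """
)

# Hypothetical rows that a cloud provider might insert for a GPU flavor.
conn.execute("INSERT INTO instance_types (id, name) VALUES (1, 'cg1.small')")
conn.executemany(
    "INSERT INTO instance_type_extra_specs (key, value, instance_type_id) "
    "VALUES (?, ?, ?)",
    [("xpu_arch", "fermi", 1), ("xpus", "2", 1)],
)


def extra_specs_for(conn, type_name):
    """Return the extra specs of one instance type as a {key: value} dict."""
    rows = conn.execute(
        "SELECT s.key, s.value FROM instance_type_extra_specs AS s "
        "JOIN instance_types AS t ON t.id = s.instance_type_id "
        "WHERE t.name = ?",
        (type_name,),
    )
    return dict(rows.fetchall())


print(extra_specs_for(conn, "cg1.small"))
```

Storing one key/value pair per row lets a provider add new constraint keys without altering the instance_types schema.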

Implementation

The USC-ISI team has a functional prototype:

https://code.launchpad.net/~usc-isi/nova/hpc-trunk

https://code.launchpad.net/~usc-isi/nova/hpc-testing

https://code.launchpad.net/~usc-isi/nova/hetero

UI Changes

The functionality is enabled by selecting the scheduler in nova.conf:

scheduler_driver = nova.scheduler.arch.ArchitectureScheduler


Additional constraints can be specified in the instance_type_extra_specs table. Users do not need to be aware of this table. They simply use an instance type as normal, e.g., "cg1.small", since the cloud provider fills in the table and the cloud software uses it automatically.

Limitations

All new constraints in instance_type_extra_specs are currently match-based: a requested value must equal the value reported by the host. Inequality comparisons are not supported at this time. We plan to add this ability shortly.
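The practical effect of this limitation can be seen in a small Python sketch. Both functions are hypothetical: exact_match mirrors the match-based behavior described here, while at_least only illustrates what an inequality comparison might look like and is not part of the prototype.

```python
def exact_match(requested, reported):
    """Current behavior as described: values must be equal."""
    return str(requested) == str(reported)


def at_least(requested, reported):
    """Hypothetical inequality rule: the host must offer at least the request."""
    return float(reported) >= float(requested)


# A request for 2 accelerators against a host reporting 4.
print(exact_match("2", "4"))  # the host is rejected under exact matching
print(at_least("2", "4"))     # the host would qualify under an inequality rule
```

Under exact matching, a host with more accelerators than requested is rejected, so quantity-style keys such as xpus or net_mbps need flavors whose values equal what hosts report.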

Code Changes

Summary of changes:

  • nova/scheduler/arch.py
   - Implements the architecture aware scheduler.
  • api/ec2/cloud.py

Migration

Very little needs to be changed in terms of the way deployments will use this.

Test/Demo Plan

This need not be added or completed until the specification is nearing beta.

Unresolved issues

None.

BoF agenda and discussion

Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.