Difference between revisions of "Nova Compute Load Averages"

Latest revision as of 09:36, 25 February 2014

Introduction

The system load averages of a compute host could be used to define policies when scheduling an instance. Unix-based systems such as Linux provide a text file, '/proc/loadavg', that exposes the system load averages for the last 1, 5, and 15 minutes.

Overview

Currently the base class `virt.driver.ComputeDriver` defines a method "get_host_uptime". The method asks the driver to call the command 'uptime'. This command reports the system load averages along with some other information. The problem is that the method does not define a consistent output format, so it is worth reconsidering whether this API really needs to be kept.

libvirt

get_host_uptime: 13:37:30 up 7 days,  4:29, 19 users,  load average: 0.04, 0.04, 0.0

vmware

get_host_uptime: Please refer to {HOST_IP} for the uptime

xenapi

get_host_uptime: 13:37:30 up 7 days,  4:29, 19 users,  load average: 0.04, 0.04, 0.0
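
The driver outputs above show why a free-form string is awkward to consume. As a minimal sketch (the function name `parse_uptime_loadavg` and the regular expression are illustrative assumptions, not part of Nova), a caller would have to scrape the load averages out of the text and cope with drivers that return something else entirely:

```python
import re

# Matches the trailing "load average: a, b, c" part of `uptime` output.
# Assumption: values use "." as the decimal separator.
_LOADAVG_RE = re.compile(
    r"load averages?:\s*([0-9.]+),?\s+([0-9.]+),?\s+([0-9.]+)"
)


def parse_uptime_loadavg(uptime_output):
    """Return (1 min, 5 min, 15 min) as floats, or None if absent."""
    match = _LOADAVG_RE.search(uptime_output)
    if match is None:
        # e.g. a driver that returns a message instead of `uptime` output
        return None
    return tuple(float(value) for value in match.groups())
```

With the libvirt and xenapi strings this yields three floats, while the vmware string yields `None`, which is the inconsistency a dedicated API would avoid.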

Proposal

The proposal is to define a new API, 'loadavg', that returns a tuple of the system load averages for the last 1, 5, and 15 minutes. Each load average is a value between 0 and 1.

   {
       "hypervisor": {
           "hypervisor_hostname": "fake-mini",
           "id": %(hypervisor_id)s,
           "loadavg": [0.20, 0.12, 0.14]
       }
   }

Why

  • Operators can get information about the system load over the past 1, 5, and 15 minutes
  • Monitoring systems (like Munin) could use this information to produce charts
  • The scheduler could use this information when making scheduling decisions

Implementation

  • A new method will be added to 'nova.virt.driver' called 'get_host_loadavg'
  • A new API will be added to 'nova.api.openstack.hypervisors' called 'loadavg'

Unix based system

For Unix-based systems, the load averages can be read from '/proc/loadavg'.
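
A minimal sketch of reading that file, assuming the usual Linux layout in which the first three whitespace-separated fields are the 1, 5, and 15 minute averages. The helper names `parse_loadavg` and `read_loadavg` are hypothetical and are not taken from the code under review:

```python
def parse_loadavg(text):
    """Parse the content of /proc/loadavg into a 3-tuple of floats."""
    fields = text.split()
    if len(fields) < 3:
        raise ValueError("unexpected loadavg content: %r" % text)
    # Only the first three fields are load averages; the rest is ignored.
    return tuple(float(field) for field in fields[:3])


def read_loadavg(path="/proc/loadavg"):
    """Read and parse the load averages from `path` (Linux only)."""
    with open(path) as handle:
        return parse_loadavg(handle.read())
```

Python's standard library also offers `os.getloadavg()`, which returns the same three values on Unix platforms without reading the file directly.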

Windows based system

TODO(sahid): It looks like there are some equivalents.

Usage

  • V2 REST API: not implemented.
  • V3 REST API: /v3/os-hypervisors/{ID}/loadavg
  • If a driver does not implement the load average, a 'NotImplementedError' exception will be raised
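
The behaviour described above could be sketched as follows. Everything here is a self-contained stand-in: `ComputeDriver` is a simplified placeholder for the real base class, and `FakeDriver` and `loadavg_view` are hypothetical names chosen for illustration, so this is not the implementation under review.

```python
class ComputeDriver(object):
    """Stand-in for the base driver class (not the real Nova class)."""

    def get_host_loadavg(self):
        # Drivers that cannot report load averages keep this default.
        raise NotImplementedError()


class FakeDriver(ComputeDriver):
    """Hypothetical driver that returns fixed load averages."""

    def get_host_loadavg(self):
        return (0.20, 0.12, 0.14)


def loadavg_view(hypervisor_id, hostname, driver):
    """Build a response body shaped like the one in the proposal."""
    loadavg = driver.get_host_loadavg()  # may raise NotImplementedError
    return {
        "hypervisor": {
            "hypervisor_hostname": hostname,
            "id": hypervisor_id,
            "loadavg": list(loadavg),
        }
    }
```

Calling `loadavg_view(1, "fake-mini", FakeDriver())` produces a body in the proposed shape, while passing a plain `ComputeDriver()` lets the `NotImplementedError` propagate to the caller.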


The blueprint can be seen at: https://blueprints.launchpad.net/nova/+spec/compute-loadavg

Code in review: https://review.openstack.org/#/c/74109/