Difference between revisions of "Nova Compute Load Averages"

Latest revision as of 09:36, 25 February 2014

Introduction

The system load averages of a compute host could be used to define policies when scheduling an instance. Unix-based systems such as Linux provide a text file, '/proc/loadavg', that exposes the system load averages for the last 1, 5, and 15 minutes.

Overview

Currently the base class `virt.driver.ComputeDriver` defines a method "get_host_uptime". The method asks the driver to run the command 'uptime'. This command provides the system load averages along with some other information. The problem is that the method does not define a proper output format, and it is worth reconsidering whether this API really needs to be kept.

libvirt

get_host_uptime: 13:37:30 up 7 days,  4:29, 19 users,  load average: 0.04, 0.04, 0.0

vmware

get_host_uptime: Please refer to {HOST_IP} for the uptime

xenapi

get_host_uptime: 13:37:30 up 7 days,  4:29, 19 users,  load average: 0.04, 0.04, 0.0
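
The outputs above show why the free-form string is awkward to consume: a caller has to scrape the load averages out of text, and some drivers return no numbers at all. A minimal sketch of such scraping, assuming the 'uptime' format shown above (the function name `parse_uptime_loadavg` is hypothetical and not part of Nova):

```python
import re

# Matches the three numbers that follow "load average:" in the 'uptime'
# format shown above. Other platforms or locales may format this differently.
_LOAD_RE = re.compile(
    r"load averages?:\s*(\d+\.\d+),?\s+(\d+\.\d+),?\s+(\d+\.\d+)"
)


def parse_uptime_loadavg(output):
    """Return the (1, 5, 15) minute load averages, or None if absent.

    Hypothetical helper for illustration only.
    """
    match = _LOAD_RE.search(output)
    if match is None:
        # For example the vmware driver output, which contains no numbers.
        return None
    return tuple(float(value) for value in match.groups())
```

A dedicated method that returns numbers directly would make this kind of parsing unnecessary.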

Proposal

The proposal is to define a new API 'loadavg' that returns a tuple of the system load averages for the last 1, 5 and 15 minutes. The load average is a value between 0 and 1.

   {
       "hypervisor": {
           "hypervisor_hostname": "fake-mini",
           "id": %(hypervisor_id)s,
           "loadavg": [0.20, 0.12, 0.14]
       }
   }

Why

Operators can get information about the system load over the past 1, 5 and 15 minutes
Monitoring systems such as Munin could use this information to produce charts
The scheduler could use this information when scheduling an instance

Implementation

A new method will be added to 'nova.virt.driver' called 'get_host_loadavg'
A new API will be added to 'nova.api.openstack.hypervisors' called 'loadavg'
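
The two items above could fit together roughly as follows. This is only a sketch: the classes are stand-ins rather than the real Nova code, and `FakeDriver` and `loadavg_response` are hypothetical names introduced for illustration.

```python
class ComputeDriver(object):
    """Stand-in for the base driver class; not the real Nova code."""

    def get_host_loadavg(self):
        """Return the (1, 5, 15) minute load averages as a tuple of floats."""
        # Drivers that do not support load averages keep this default.
        raise NotImplementedError()


class FakeDriver(ComputeDriver):
    """Hypothetical driver returning fixed values, for illustration only."""

    def get_host_loadavg(self):
        return (0.20, 0.12, 0.14)


def loadavg_response(driver, hypervisor_id, hostname):
    """Build a body shaped like the example in the Proposal section."""
    return {
        "hypervisor": {
            "hypervisor_hostname": hostname,
            "id": hypervisor_id,
            # JSON has no tuple type, so the values are emitted as a list.
            "loadavg": list(driver.get_host_loadavg()),
        }
    }
```

Calling `loadavg_response(FakeDriver(), 1, "fake-mini")` yields a dictionary with the same keys as the proposed response.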

Unix based system

For Unix-based systems the load averages can be read from '/proc/loadavg'
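
As a rough illustration of reading that file on Linux, where its first three whitespace-separated fields are the 1, 5 and 15 minute load averages, a driver could do something like the following. The helper names `parse_loadavg` and `read_loadavg` are assumptions made for this sketch, not the code under review.

```python
def parse_loadavg(text):
    """Parse the contents of '/proc/loadavg' into a tuple of three floats.

    Only the first three fields are used; the remaining fields (process
    counts and the last PID on Linux) are ignored.
    """
    fields = text.split()
    return tuple(float(value) for value in fields[:3])


def read_loadavg(path="/proc/loadavg"):
    """Read and parse the load averages from the given file."""
    with open(path) as handle:
        return parse_loadavg(handle.read())
```

Python's standard library also offers `os.getloadavg()`, which returns the same three values on Unix platforms without reading the file directly.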

Windows based system

TODO(sahid): It looks like there are some equivalents.

Usage

V2 REST API: not implemented.
V3 REST API: /v3/os-hypervisors/{ID}/loadavg
If a driver does not implement the load average, a 'NotImplementedError' exception will be raised

The blueprint can be seen at: https://blueprints.launchpad.net/nova/+spec/compute-loadavg
Code in review: https://review.openstack.org/#/c/74109/

@@ Line 1: / Line 1: @@
 == Introduction ==
-The system load averages for the past 1, 5 and 15 minutes are returns by the command "uptime" on linux based system. The load averages are not normalized for the number of CPUs in a system, so a load average of 1 means a single CPU system is loaded all the time while on a 4 CPU system it means it was idle 75% of the time.
+The system load averages for a compute could be used to define policies during the scheduler of an instance. Unix based system provides a text file '/proc/loadavg' that allow to get information of the system load averages for the last 1, 5, and 15 minutes.
 == Overview ==
-Currently the base classe `virt.driver.ComputeDriver` defines a method "get_host_uptime". The method asks the driver to call the command uptime but does not define a correct output format.
+Currently the base class `virt.driver.ComputeDriver` defines a method "get_host_uptime". The method asks the driver to call the command 'uptime'. This command provides the system load averages and some other information, the problem is that, this method does not define a correct output format and a reflexion is necessary about the really necessity to keep the API.
 === libvirt ===
@@ Line 12: / Line 13: @@
 == Proposal ==
-The proposal is to have the drivers returns the following information in a dictionary. The load average will be in a tuple of the 1, 5 and 15 minutes based system load averages normalized by the number of CPUs in the system. So a load average of 1 means the system is loaded all the time even if there is 1 or n CPUs .
+The proposal is to define a new API 'loadavg' that return a tuple for the last 1, 5 and 15 minutes of the system load averages. The load average is a value between 0 and 1.
      {
-       "time": 15:57:03,
+        "hypervisor": {
-       "up": 5:22,
+            "hypervisor_hostname": "fake-mini",
-       "users": 3,
+            "id": %(hypervisor_id)s,
-       "loadavg": (0.2, 0.55, 0.01)
+            "loadavg": [0.20, 0.12, 0.14]
+        }
      }
-=== Implementation Details ===
+=== Why ===
-An additional method `get_host_loadavg` will be added to `virt.driver.ComputeDriver`. This method will returns a tuple of the last 1, 5, and 15minutes of the system load average normalized by the number of CPUs by invoking the standard command `uptime`
+* Operator can get information about the load system from the past 1, 5, 15 minutes
+* Monitoring system like (Munin) could use this information to product charts
+* The scheduler could use this information for a scheduling
+=== Implementation ===
+* A new method will be added to 'nova.virt.driver' called 'get_host_loadavg'
+* A new API will be added to 'nova.api.openstack.hypervisors' called 'loadavg'
+==== Unix based system ====
+For Unix based system the load average could be found from 'proc/loadavg'
+==== Windows based system ====
+TODO(sahid): It looks there are some equivalences.
+=== Usage ===
+* V2 REST API: not implemented.
+* V3 REST API: /v3/os-hypervisors/{ID}/loadavg
+* If a driver does not implement the load average an exception will be raised 'NotImplmentedError'
+The blueprint can be seen at: https://blueprints.launchpad.net/nova/+spec/compute-loadavg
+Code in review: https://review.openstack.org/#/c/74109/