Ceilometer/blueprints/remove-ceilometer-nova-notifier

Revision as of 11:59, 23 October 2013 by Sileht

Context

Ceilometer retrieves different meters from every OpenStack component.

For nova instance meters, some are retrieved by polling the nova-compute driver (i.e. libvirt or hyperv for now), and others by reading the nova notifications sent over RPC.

The meters retrieved with the polling system are:

  • cpu
  • disk.read.requests
  • disk.read.bytes
  • disk.write.requests
  • disk.write.bytes
  • instance
  • instance:<flavor_name>
  • network.incoming.bytes
  • network.incoming.packets
  • network.outgoing.bytes
  • network.outgoing.packets


The instance and instance:<flavor_name> meters are also retrieved over RPC, so the rest of this BP does not deal with them.

Problem

The pollster runs on every nova-compute node and connects to libvirt or hyperv.

When we delete an instance, we need to retrieve the pollster meters one last time to have the latest samples of the meters for billing.

But the pollster can't know when the instance is deleted: it is a dumb periodic task.

A nova notifier has been written in ceilometer that catches only the 'compute.instance.delete.start' notification and builds another one, 'compute.instance.delete.samples', that contains one last sample of each pollster meter.

This code in ceilometer depends on internal nova code and data formats, so it is very difficult to maintain in ceilometer and it often breaks.


Solution

On the nova.virt.driver part

Interesting nova.virt.driver API methods (implemented by):

  • interface_stats() (libvirt), could be used for network.* meters
  • block_stats() (libvirt), could be used for disk.* meters
  • get_all_volume_usage() (libvirt), could be used for disk.* meters
  • get_all_bw_counters() (xenapi), could be used for network.* meters
  • get_info() cpu_time field (libvirt, hyperv, powervm), could be used for cpu meter
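As a rough illustration of how the raw counters returned by such methods could be mapped onto the meter names listed in the Context section, here is a minimal sketch. The helper name meters_from_driver_stats and the assumed field order of the counter tuples are hypothetical; the real return formats of the driver methods would have to be checked in nova.

```python
# Hypothetical sketch: the tuple layouts below are assumptions made for
# illustration, not the documented return format of any nova driver method.
BLOCK_FIELDS = (
    "disk.read.requests",
    "disk.read.bytes",
    "disk.write.requests",
    "disk.write.bytes",
)
NET_FIELDS = (
    "network.incoming.bytes",
    "network.incoming.packets",
    "network.outgoing.bytes",
    "network.outgoing.packets",
)


def meters_from_driver_stats(block_stats, interface_stats, cpu_time):
    """Build a {meter_name: value} dict from raw driver counters.

    block_stats and interface_stats are assumed to be sequences whose
    first four items follow BLOCK_FIELDS and NET_FIELDS respectively.
    """
    meters = {"cpu": cpu_time}
    meters.update(zip(BLOCK_FIELDS, block_stats))
    meters.update(zip(NET_FIELDS, interface_stats))
    return meters


if __name__ == "__main__":
    # Made-up counter values, only to show the shape of the result.
    print(meters_from_driver_stats((1, 2, 3, 4), (5, 6, 7, 8), 9))
```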


The problematic meters are supported by Ceilometer in libvirt and hyperv only. So to keep the same functionality, the methods above must be implemented for both these drivers.

The current ceilometer code can be reused to implement these API methods.

On the nova.manager part

If configured (CONF.bandwidth_poll_interval and CONF.volume_usage_poll_interval), polling tasks already exist to cache the results of the drivers' get_all_volume_usage/get_all_bw_counters methods.

So, we need to write a polling task for the cpu usage.
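A cpu usage polling task could look roughly like the sketch below. It is only an outline under stated assumptions: CpuUsageCache and its methods are hypothetical names, and get_cpu_time stands in for whatever call ends up reading the cpu_time field of the driver's get_info() result. Scheduling through nova's periodic task machinery is left out.

```python
class CpuUsageCache:
    """Hypothetical cache of the last known cpu_time per instance."""

    def __init__(self, get_cpu_time):
        # get_cpu_time is an assumed callable: instance_id -> cpu time.
        self._get_cpu_time = get_cpu_time
        self._cache = {}

    def poll(self, instance_ids):
        """Body of the periodic task: refresh every known instance."""
        for instance_id in instance_ids:
            self.refresh(instance_id)

    def refresh(self, instance_id):
        """Refresh and return the value for one instance."""
        value = self._get_cpu_time(instance_id)
        self._cache[instance_id] = value
        return value

    def get(self, instance_id):
        """Return the cached value, or None if never polled."""
        return self._cache.get(instance_id)


if __name__ == "__main__":
    # Fake data source standing in for the virt driver.
    fake_cpu_times = {"inst-1": 1000, "inst-2": 2500}
    cache = CpuUsageCache(fake_cpu_times.__getitem__)
    cache.poll(fake_cpu_times)
    print(cache.get("inst-1"))
```

Keeping refresh() separate from poll() lets the deletion path force one last update for a single instance without waiting for the next polling interval.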

The current 'instance.exists' notification can then carry the volume and cpu usage in addition to the bw usage it already includes.

And when an instance is deleted, we need to update all usage caches (volume/bw/cpu) and then:

  • Send an 'instance.delete.start' notification with extra information, as 'instance.exists' does.
  • Or just notify 'instance.exists' one last time, just before 'instance.delete.start'.
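The second option (one last 'instance.exists' just before 'instance.delete.start') can be sketched as follows. Everything here is illustrative: notify_final_usage, the refreshers mapping, the payload layout and the notify callable are hypothetical, and the "compute." prefix on the event types is assumed to match the notification names used elsewhere in this BP.

```python
def notify_final_usage(instance_id, refreshers, notify):
    """Refresh all usage caches, then emit the two notifications in order.

    refreshers: assumed mapping {usage_name: callable(instance_id)} that
                forces a cache update and returns the fresh value.
    notify:     assumed callable(event_type, payload) that sends a
                notification.
    """
    # 1. Update every usage cache (volume/bw/cpu) one last time.
    usage = {name: refresh(instance_id)
             for name, refresh in refreshers.items()}

    # 2. Emit a final 'exists' notification carrying the usage data.
    payload = {"instance_id": instance_id, "usage": usage}
    notify("compute.instance.exists", payload)

    # 3. Only then announce that deletion starts.
    notify("compute.instance.delete.start", {"instance_id": instance_id})
    return payload


if __name__ == "__main__":
    sent = []
    notify_final_usage(
        "inst-1",
        {"cpu": lambda _id: 1000, "bw": lambda _id: {"rx": 1, "tx": 2}},
        lambda event_type, payload: sent.append((event_type, payload)),
    )
    for event_type, payload in sent:
        print(event_type, payload)
```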


On the ceilometer part

We just need to write the consumer for the new information in the notification message, as is done for the other meters that come from notifications.
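A minimal sketch of such a consumer is shown below. It assumes a hypothetical payload with a "usage" dict keyed by meter name; Sample is a simplified stand-in for ceilometer's sample class, and the type and unit values in METER_INFO are assumptions to be checked against the existing pollster meters.

```python
from collections import namedtuple

# Simplified stand-in for ceilometer's sample object (assumption).
Sample = namedtuple("Sample", "name type unit volume resource_id")

# Assumed (type, unit) per meter; verify against the current pollsters.
METER_INFO = {
    "cpu": ("cumulative", "ns"),
    "disk.read.requests": ("cumulative", "request"),
    "disk.read.bytes": ("cumulative", "B"),
    "disk.write.requests": ("cumulative", "request"),
    "disk.write.bytes": ("cumulative", "B"),
    "network.incoming.bytes": ("cumulative", "B"),
    "network.incoming.packets": ("cumulative", "packet"),
    "network.outgoing.bytes": ("cumulative", "B"),
    "network.outgoing.packets": ("cumulative", "packet"),
}


def samples_from_notification(message):
    """Turn the usage data of one notification into a list of samples.

    Unknown usage keys are ignored so that extra payload fields do not
    break the consumer.
    """
    payload = message["payload"]
    usage = payload.get("usage", {})
    samples = []
    for name in sorted(METER_INFO):
        if name in usage:
            meter_type, unit = METER_INFO[name]
            samples.append(Sample(name, meter_type, unit,
                                  usage[name], payload["instance_id"]))
    return samples


if __name__ == "__main__":
    message = {
        "event_type": "compute.instance.exists",
        "payload": {"instance_id": "inst-1",
                    "usage": {"cpu": 1000, "disk.read.bytes": 4096}},
    }
    for sample in samples_from_notification(message):
        print(sample)
```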