
RaxCeilometerRequirements

Revision as of 19:34, 3 January 2013 by SandyWalsh (talk)

We plan to bring the following functionality to Ceilometer:

  1. Double-entry accounting verification of OpenStack usage before handoff to billing (finance is our primary customer).
  2. 90+ days storage of high-volume raw notifications (planning for at least 2 billion rows).
  3. Secondary aggregation/rollups of the raw data with support for third-party hooks into the notification pipeline (tricky with schema changes).
  4. Support for downstream consumers via PubSubHubbub/Atom mechanisms (such as AtomHopper).
  5. Monitoring of instance state for detailed debugging and SLA tracking.
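As a rough illustration of item 1, the sketch below compares usage computed from two independent sources and reports every instance where they disagree. Everything in it is an assumption made for illustration: the function names (usage_seconds, reconcile), the (instance_id, start, end) tuple layout, the one-second tolerance, and the choice of a hypervisor poll as the second source. It is not Ceilometer code.

```python
from collections import defaultdict


def usage_seconds(events):
    """Sum usage seconds per instance from (instance_id, start, end) tuples."""
    totals = defaultdict(float)
    for instance_id, start, end in events:
        totals[instance_id] += end - start
    return dict(totals)


def reconcile(primary, secondary, tolerance=1.0):
    """Return {instance_id: (primary_seconds, secondary_seconds)} for mismatches.

    An instance missing from one source is reported with None on that side.
    """
    a = usage_seconds(primary)
    b = usage_seconds(secondary)
    mismatches = {}
    for instance_id in sorted(set(a) | set(b)):
        pa = a.get(instance_id)
        pb = b.get(instance_id)
        if pa is None or pb is None or abs(pa - pb) > tolerance:
            mismatches[instance_id] = (pa, pb)
    return mismatches


# Hypothetical sample data: usage derived from notifications vs. a second source.
notifications = [("inst-1", 0.0, 3600.0), ("inst-2", 100.0, 700.0)]
hypervisor_poll = [
    ("inst-1", 0.0, 3600.5),
    ("inst-2", 100.0, 1000.0),
    ("inst-3", 0.0, 60.0),
]

result = reconcile(notifications, hypervisor_poll)
print(result)
```

Only instances that pass this check would be handed off to billing; the mismatches would be held back for investigation.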

These requirements imply changes to Ceilometer, specifically:

  • It appears that once the data is collected, it goes back onto another RabbitMQ queue for the Ceilometer Collector to process. The Collector stores the raw data, potentially many times over, as different metrics; for example, there are objects for CPU, IP, Disk, Bandwidth, etc. It is not clear what that buys us compared with keeping raw plus rolled-up data. Much of the raw data is then discarded, but we need to keep it. Getting a single picture of an instance may require many queries; we need to look up by Request ID, Instance ID, Host, Tenant ID and time range.
  • There is no double-entry accounting. The raw event is stored, but it is not consolidated. Whether to poll the hypervisor (HV) directly as a second source is still an open question.
  • We need to post-process the raw data beyond the initial collection. We need a queue after the initial collection to allow for the "settling time" and for multiple workers (otherwise you have a bottleneck or staggered data).
  • We need support for error notifications, and we need to capture full state from all .start/.end messages. As it stands today, there could be significant miscounts.
  • Millisecond timing resolution, regardless of the database.
  • Stop using the nova RPC mechanism, which ACKs immediately regardless of whether the event was handled properly.
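The last point above, acknowledging a message only after it has been handled, can be sketched as follows. This is a minimal, self-contained simulation: InMemoryQueue is a hypothetical stand-in for a broker queue, and consume, flaky_store, the retry limit, and the "compute.instance.bad" event type are all invented for illustration. It does not use the nova RPC code or any real messaging library.

```python
from collections import deque


class InMemoryQueue:
    """Hypothetical stand-in for a broker queue with explicit ack/requeue."""

    def __init__(self, messages):
        self.pending = deque(messages)
        self.acked = []

    def get(self):
        return self.pending.popleft() if self.pending else None

    def ack(self, message):
        self.acked.append(message)

    def requeue(self, message):
        self.pending.append(message)


def consume(queue, handler, max_attempts=3):
    """Ack each message only after handler() succeeds; requeue on failure.

    Messages that fail max_attempts times are returned instead of dropped.
    """
    attempts = {}
    failed = []
    while True:
        message = queue.get()
        if message is None:
            break
        key = message["message_id"]
        try:
            handler(message)
        except Exception:
            attempts[key] = attempts.get(key, 0) + 1
            if attempts[key] < max_attempts:
                queue.requeue(message)
            else:
                failed.append(message)
        else:
            queue.ack(message)  # acknowledged only once handling succeeded
    return failed


stored = []
seen = set()


def flaky_store(message):
    """Illustrative handler: one permanent failure, one transient failure."""
    if message["event_type"] == "compute.instance.bad":
        raise ValueError("cannot parse")
    if message["message_id"] == "m2" and "m2" not in seen:
        seen.add("m2")
        raise IOError("database unavailable")
    stored.append(message["message_id"])


queue = InMemoryQueue([
    {"message_id": "m1", "event_type": "compute.instance.create.end"},
    {"message_id": "m2", "event_type": "compute.instance.delete.end"},
    {"message_id": "m3", "event_type": "compute.instance.bad"},
])
failed = consume(queue, flaky_store)
print(stored, [m["message_id"] for m in failed])
```

With an immediate ACK, the transient failure on m2 would have silently lost the event; here it is retried and stored, and the message that can never be handled is surfaced rather than discarded.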

Other Minor Nits:

  • Ceilometer uses the openstack.common library extensively, and I'm not sure what this really buys us. There seems to be a lot of boilerplate just to work with it. This could be a lot easier.