Difference between revisions of "EfficientMetering"

Revision as of 14:53, 30 April 2012

Efficient Metering in OpenStack Blueprint

Project and code : https://launchpad.net/ceilometer

Meetings : http://wiki.openstack.org/Meetings/MeteringAgenda

Use cases

  • need a tool to collect per customer usage
  • need an API to query collected data from existing billing system
  • data needed per customer, with hourly granularity, includes:
    • Compute - Nova:
      • instances (type, availability zone) - hourly usage
      • cpu - hourly usage
      • ram - hourly usage
      • nova volume block device (type, availability zone) - hourly usage
        • reserved
        • used
    • network (data in/out, availability zone) - hourly bytes + total bytes
      • differentiate between internal and external end-points
      • External floating IP - hourly bytes + total bytes
    • Storage - Swift
      • total data stored
      • data in/out - hourly bytes + total bytes
      • differentiate between internal and external end-points

Proposed design

Counters

The following is a first list of counters that need to be collected so that billing systems can perform their tasks. The list must be expandable over time, and each administrator must be able to enable or disable each counter according to local needs.

   | Counter name | Id          | Component    | Volume unit | Secondary
c1 | instance     | instance id | nova compute | minute      | type
c2 | cpu          | instance id | nova compute | minute      | type
c3 | ram          | instance id | nova compute | Megabyte    |
c4 | disk         | instance id | nova compute | Megabyte    |
c5 | io           | instance id | nova compute | Megabyte    |
v1 | bd_reserved  | volume id   | nova volume  | Megabyte    |
v2 | bd_used      | volume id   | nova volume  | Megabyte    |
n1 | net_in_int   | ip          | nova network | Kbytes      |
n2 | net_in_ext   | ip          | nova network | Kbytes      |
n3 | net_out_int  | ip          | nova network | Kbytes      |
n4 | net_out_ext  | ip          | nova network | Kbytes      |
n5 | net_float    |             | nova network | minute      | type
o1 | obj_volume   |             | swift        | Megabytes   |
o2 | obj_in_int   | object id   | swift        | Kbytes      |
o3 | obj_in_ext   | object id   | swift        | Kbytes      |
o4 | obj_out_int  | object id   | swift        | Kbytes      |
o5 | obj_out_ext  | object id   | swift        | Kbytes      |

Other possible counters:

  • service handlers (load balancer, databases, queues...)
  • service usage

Note for the id column: the id is the unique ID of a resource. If the id is missing, the counter is aggregated over all resources of that kind. For instance, if the ip of n4 is missing, the value is the total of Kbytes sent, regardless of the IP from which it originated, as long as the IP has been allocated to the tenant. There may be multiple counters with the same name: some with an id, and one with no id that aggregates all measures.
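The id rule above can be illustrated with a minimal sketch. The measure tuples, the sample IP addresses and the `aggregate` helper are hypothetical and only meant to show how a counter without an id could total the measures of the counters that carry one.

```python
from collections import defaultdict

# Hypothetical measures: (counter name, resource id, volume in Kbytes).
measures = [
    ("net_out_ext", "10.0.0.4", 120),
    ("net_out_ext", "10.0.0.5", 80),
    ("net_out_ext", "10.0.0.4", 30),
]


def aggregate(measures):
    """Sum volumes per (name, id) and per (name, None) for the id-less counter."""
    totals = defaultdict(int)
    for name, resource_id, volume in measures:
        totals[(name, resource_id)] += volume  # counter with an id
        totals[(name, None)] += volume         # counter with no id: all resources
    return dict(totals)


totals = aggregate(measures)
```

Here `totals[("net_out_ext", None)]` plays the role of the n4 counter with a missing ip.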

Note for network counters (n1-n4): the distinction between internal and external traffic requires that internal networks be explicitly listed in the agent configuration.
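As a rough sketch of what such a configuration could drive, the snippet below picks a counter name from the remote address of a flow. The `INTERNAL_NETWORKS` list, its example ranges and the helper names are assumptions made for illustration, not part of the blueprint.

```python
import ipaddress

# Assumed agent configuration: the explicitly listed internal networks.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_internal(address):
    """Return True when the address belongs to a configured internal network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in INTERNAL_NETWORKS)


def counter_for(direction, remote_address):
    """Map a direction ('in' or 'out') and a remote address to a counter name."""
    scope = "int" if is_internal(remote_address) else "ext"
    return f"net_{direction}_{scope}"
```

For example, `counter_for("out", "10.1.2.3")` selects the n3 counter name, while traffic to an address outside the listed networks is counted as external.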

Storage

  • Data is stored per account, in one database per availability zone
  • Per account records hold
    • account_id (same as keystone’s)
    • account_state (enabled, credit disabled, admin disabled)
  • Per event records hold
    • account_id
    • counter_type
    • counter_volume
    • counter_duration
    • counter_datetime
    • message_signature
    • message_id
  • the database is not directly accessible by any means other than the API
  • a process must collect messages from agent and store data
  • a process may validate counters against nova event database
  • a process may verify that messages were not lost
  • a process may verify that account states are in sync with keystone
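The two record types listed above could be modelled as follows. This is only a sketch: the field names come from the list, but the Python types, the `ACCOUNT_STATES` constant and the state check are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# States named in the per account record description.
ACCOUNT_STATES = ("enabled", "credit disabled", "admin disabled")


@dataclass(frozen=True)
class Account:
    account_id: str      # same id as in keystone
    account_state: str   # one of ACCOUNT_STATES

    def __post_init__(self):
        if self.account_state not in ACCOUNT_STATES:
            raise ValueError(f"unknown account state: {self.account_state}")


@dataclass(frozen=True)
class Event:
    account_id: str
    counter_type: str            # e.g. "instance" or "net_out_ext"
    counter_volume: int          # assumed integer, in the counter's volume unit
    counter_duration: timedelta  # assumed to be the period covered by the event
    counter_datetime: datetime
    message_signature: str
    message_id: str


account = Account(account_id="tenant-1", account_state="enabled")
event = Event(
    account_id=account.account_id,
    counter_type="instance",
    counter_volume=60,
    counter_duration=timedelta(hours=1),
    counter_datetime=datetime(2012, 4, 30, 14, 0),
    message_signature="(signature)",
    message_id="msg-0001",
)
```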

Alternative gauge design

During the Folsom ODS session, an alternate design was discussed in which events, instead of recording deltas, would record the absolute value of a gauge. That would require extending the event to include the 'object id' (instance, network, volume) associated with the counter.

The deltas can be derived from the absolute values, which makes the absolute model resilient to missed delta registrations.
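A small sketch of that derivation, assuming samples are (timestamp, absolute value) pairs sorted by time and that the gauge never resets; the function name and sample values are illustrative only.

```python
def deltas_from_gauge(samples):
    """Turn sorted (timestamp, absolute_value) samples into (timestamp, delta) pairs."""
    deltas = []
    for (_, previous), (timestamp, current) in zip(samples, samples[1:]):
        deltas.append((timestamp, current - previous))
    return deltas


complete = [(0, 100), (60, 160), (120, 250)]
with_gap = [(0, 100), (120, 250)]  # the sample taken at t=60 was lost

complete_deltas = deltas_from_gauge(complete)
gap_deltas = deltas_from_gauge(with_gap)
```

Both series add up to the same total usage (150), which is the resilience property: a lost absolute sample only coarsens the time resolution, whereas a lost delta would be lost usage.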

Agents

  • Agent on each nova compute node to accumulate and send counters for c1, c2, c3, c4, c5, n1, n2, n3, n4. The agent is likely to be pulling this information from libvirt.
    • c5 could get disk I/O stats with libvirt's virDomainBlockStats
    • n3 / n4 could use iptables accounting rules? (for external traffic?)
    • n1 / n2 could use libvirt's virDomainInterfaceStats? (for all traffic?)
  • Agent on each nova volume node to accumulate and send counters for v1, v2
  • Agent on each swift proxy to forward existing accounting data o1 and accumulate and send o2-o5

Note: nova network nodes need not accumulate and send counters for n5, because they can be pulled directly from the nova database (see nova-manage floating list, for instance).

Architecture

  • An agent runs on each OpenStack node (bare metal machine) and harvests the data locally
  • A storage daemon communicates with the agents to collect their data and aggregate them
  • The data is sent from agents to the storage daemon via a trusted messaging system (RabbitMQ?)
  • The message queue is separate from other queues (such as the nova queue)
  • The messages in the queue are signed and non-repudiable (http://en.wikipedia.org/wiki/Non-repudiation)
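To make the signing requirement concrete, here is a minimal sketch using an HMAC over a canonical JSON encoding of the message. The shared key, the message fields and the `sign` / `verify` helpers are assumptions. Note that a shared-key HMAC only provides integrity and authenticity; strict non-repudiation would need an asymmetric signature scheme, which this sketch does not implement.

```python
import hashlib
import hmac
import json


def sign(message, key):
    """Return a hex HMAC-SHA256 signature over a canonical JSON encoding."""
    payload = json.dumps(message, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(message, signature, key):
    """Check a signature with a constant-time comparison."""
    return hmac.compare_digest(sign(message, key), signature)


key = b"hypothetical-shared-secret"  # assumption: provisioned on agent and daemon
message = {"message_id": "msg-0001", "counter_type": "cpu", "counter_volume": 42}
signature = sign(message, key)

tampered = dict(message, counter_volume=4200)
```

The storage daemon would call `verify` before storing an event and reject messages such as `tampered`, whose content no longer matches the signature.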

Note: document some use case scenarios to really nail down the architecture. Who signals the metering service? The API service or nova, quantum, swift, glance, volume?

Messaging use cases

Instance creation

  • An instance is created, nova issues a message ( http://wiki.openstack.org/SystemUsageData )
  • The metering storage agent listens on the nova queue and picks up the creation message
  • The metering storage agent stores the creation event locally, with a timestamp
  • The metering storage daemon is notified by the agent that the instance was created five minutes ago and aggregates this information into the tenant records

API

  • Database can only be queried via a REST API (i.e. the database schema is not a supported API and can change in a non backward compatible way from one version to the other).
  • Requests must be authenticated (separate from keystone, or only linked to accounting type account)
  • API Server must be able to be redundant
  • Requests allow clients to
    • GET account_id list
    • GET list of counter_type
    • GET list of events per account
      • optional start and end for counter_datetime
      • optional counter_type
    • GET sum of (counter_volume, counter_duration) for counter_type and account_id
      • optional start and end for counter_datetime
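The sum request could behave like the following in-memory sketch. The event dictionaries, the `sum_counters` name and the choice of an inclusive start and exclusive end are assumptions; the blueprint does not specify them.

```python
from datetime import datetime, timedelta


def sum_counters(events, account_id, counter_type, start=None, end=None):
    """Sum counter_volume and counter_duration for one account and counter type.

    Assumption: start is inclusive and end is exclusive.
    """
    total_volume = 0
    total_duration = timedelta()
    for event in events:
        if event["account_id"] != account_id or event["counter_type"] != counter_type:
            continue
        when = event["counter_datetime"]
        if start is not None and when < start:
            continue
        if end is not None and when >= end:
            continue
        total_volume += event["counter_volume"]
        total_duration += event["counter_duration"]
    return total_volume, total_duration


# Hypothetical stored events.
events = [
    {"account_id": "a1", "counter_type": "cpu", "counter_volume": 10,
     "counter_duration": timedelta(minutes=60), "counter_datetime": datetime(2012, 4, 30, 10)},
    {"account_id": "a1", "counter_type": "cpu", "counter_volume": 20,
     "counter_duration": timedelta(minutes=60), "counter_datetime": datetime(2012, 4, 30, 11)},
    {"account_id": "a2", "counter_type": "cpu", "counter_volume": 99,
     "counter_duration": timedelta(minutes=60), "counter_datetime": datetime(2012, 4, 30, 11)},
]

all_time = sum_counters(events, "a1", "cpu")
from_eleven = sum_counters(events, "a1", "cpu", start=datetime(2012, 4, 30, 11))
```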

Note: At the Folsom design session, the SET account_id call, designed to change the status of the tenant in keystone, was pointed out as a wart at this stage, since the billing system will need to talk to the Keystone API anyway to make sense of the account id.

Free Software Billing Systems

A list of the billing system implementations that could use the Metering system when it becomes available.

Related resources

FAQ

Q: Why reinvent the wheel? XXXX already does it.

A: please mail about the tool you think does the work, unless it is listed below.

  • http://wiki.openstack.org/SystemUsageData for instance is specific to nova while the metering aims at aggregating all OpenStack components
  • collectd, munin, etc. each have pieces of the puzzle, but none has all of them; they are not designed with billing in mind and are not a good fit for this blueprint