Efficient Metering in OpenStack Blueprint

Project Home Page: Ceilometer

Meetings: http://wiki.openstack.org/Meetings/MeteringAgenda

Use cases

A tool is needed to collect usage per customer.
An API is needed so that existing billing systems can query the collected data.
The data needed per customer, at hourly granularity, includes:
- Compute - Nova:
  - instances (type, availability zone) - hourly usage
  - cpu - hourly usage
  - ram - hourly usage
  - nova volume block device (type, availability zone) - hourly usage
    - reserved
    - used
- network (data in/out, availability zone) - hourly bytes + total bytes
  - differentiate between internal and external end-points
  - External floating IP - hourly bytes + total bytes
- Storage - Swift:
  - total data stored
  - data in/out - hourly bytes + total bytes
  - differentiate between internal and external end-points

Proposed design

Meters

A more current list of implemented meters is available at http://docs.openstack.org/developer/ceilometer/measurements.html

The following is a first list of meters that need to be collected to allow billing systems to perform their tasks. This list must be expandable over time, and each administrator must be able to enable or disable each meter according to local needs.

ID	Meter name	Component	Resource ID	Volume unit	Payload	Note
c1	instance	nova compute	instance id	minute	type	type is the instance flavor id used
c2	cpu	nova compute	instance id	minute	type	type is one of Arm, x86 or x86_64
c3	ram	nova compute	instance id	Megabyte
c4	disk	nova compute	instance id	Megabyte		system disks persist when the instance is shutdown but not terminated and must be accounted for
c5	io	nova compute	instance id	Megabyte		disk IO in megabyte per second has a high impact on the service availability and could be billed separately
v1	bd_reserved	nova volume	volume id	Megabyte
v2	bd_used	nova volume	volume id	Megabyte		(optional)
n1	net_in_int	nova network	IP address	Kbytes		volume of data received from internal network source
n2	net_in_ext	nova network	IP address	Kbytes		volume of data received from external network source
n3	net_out_int	nova network	IP address	Kbytes		volume of data sent to internal network dest
n4	net_out_ext	nova network	IP address	Kbytes		volume of data sent to external network destinations
n5	net_float	nova network	IP address	minute	type	The type distinguishes public IPs depending on their allocation policy, for instance IPv6, IPv4_FROM_RIPE or IPv4_FROM_OVH. The acquisition or maintenance cost of a floating IP may depend on its allocation policy.
o1	obj_volume	swift	swift account id	Megabytes		total object volume stored
o2	obj_in_int	swift	swift account id	Kbytes		volume of data received from internal network source
o3	obj_in_ext	swift	swift account id	Kbytes		volume of data received from external network source
o4	obj_out_int	swift	swift account id	Kbytes		volume of data sent to internal network dest
o5	obj_out_ext	swift	swift account id	Kbytes		volume of data sent to external network destinations
o6	obj_number	swift	swift account id		container	Number of objects stored for a container. The resource_id is the container id.
o7	obj_containers	swift	swift account id			Number of containers
o8	obj_requests	swift	swift account id		type	Number of HTTP requests, type being the request type (GET/HEAD/PUT/POST…)
i1	image_upload	glance	image id	(discrete)	1	Lets us count the number of images created, or charge a flat rate for each upload

Other possible meters:

service handlers (load balancer, databases, queues...)
service usage

Note for network meters (n1-n4): the distinction between internal and external traffic requires that internal networks be explicitly listed in the agent configuration.

Note(dhellmann): That isn't going to scale to a real system where tenants may create their own networks. We should just collect the data for each network, and let the billing system decide on the rate at which to charge (possibly $0 for internal networks).
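
To illustrate the suggestion above, here is a minimal sketch of how a billing system (not ceilometer itself) might price traffic collected per network. The function name, the network identifiers and the rates are all hypothetical; the only idea taken from the note is that an internal network can simply carry a rate of 0.

```python
from decimal import Decimal
from typing import Dict, Iterable, Tuple

BYTES_PER_GB = Decimal(10**9)


def network_charges(
    samples: Iterable[Tuple[str, int]],
    rates_per_gb: Dict[str, Decimal],
    default_rate: Decimal = Decimal("0"),
) -> Dict[str, Decimal]:
    """Sum the bytes seen on each network and apply that network's rate."""
    totals: Dict[str, int] = {}
    for network_id, nbytes in samples:
        totals[network_id] = totals.get(network_id, 0) + nbytes
    return {
        net: Decimal(total) / BYTES_PER_GB * rates_per_gb.get(net, default_rate)
        for net, total in totals.items()
    }


# Hypothetical networks and rates: internal traffic is free, external is not.
rates = {"private-net": Decimal("0"), "public-net": Decimal("0.10")}
samples = [
    ("private-net", 2_000_000_000),
    ("public-net", 3_000_000_000),
    ("public-net", 2_000_000_000),
]
charges = network_charges(samples, rates)
```

With this split, the agent needs no list of internal networks; the pricing policy lives entirely in the rate table.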

Storage

Field name	Type
source	?
user_id	String
project_id	String
resource_id	String
resource_metadata	String
meter_type	String
meter_volume	Number
meter_duration	Integer
meter_datetime	Timestamp
message_signature	String
message_id
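
As an illustration only, the fields above could be modelled as follows. The Python types are a guess where the table gives none (source, message_id), the unit of meter_duration is assumed to be seconds, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class MeterRecord:
    """One stored meter event; field names follow the storage table."""
    source: str                # type is "?" in the table; str is an assumption
    user_id: str
    project_id: str
    resource_id: str
    resource_metadata: str
    meter_type: str            # a meter name such as "instance" or "net_in_ext"
    meter_volume: float
    meter_duration: int        # assumed to be expressed in seconds
    meter_datetime: datetime
    message_signature: str
    message_id: Optional[str] = None  # no type is given in the table


# Hypothetical sample record for meter c1 (instance).
example = MeterRecord(
    source="example-source",
    user_id="user-1",
    project_id="project-1",
    resource_id="instance-1",
    resource_metadata='{"flavor": "m1.small"}',
    meter_type="instance",
    meter_volume=1,
    meter_duration=60,
    meter_datetime=datetime(2012, 7, 1, 12, 0, 0),
    message_signature="",
)
```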

The database is accessible only through the API.
A process must collect messages from the agents and store the data.
A process may validate meters against the nova event database.
A process may verify that no messages were lost.
A process may verify that account states are in sync with keystone.

Note: The content of the resource_metadata field is duplicated for each meter. For instance, it will be duplicated for all c? meters. Storage optimization is to be dealt with in future versions of ceilometer.

Note: As an optimisation, either the storage or the API may collapse records to reduce the amount of information that is returned. For instance, if all other fields of two consecutive c1 records are equal and the records are adjacent in time (i.e. meter_datetime[second] - meter_datetime[first] == meter_duration[second] - meter_duration[first]), then the first record can be removed because it is redundant.
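
A minimal sketch of such a collapse step, under the assumption that meter_duration is cumulative, so that two records are adjacent when the gap between their timestamps equals the growth in duration. The helper names are hypothetical, records are plain dictionaries, and the input is expected to hold the records of a single meter for a single resource.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List

# These two fields are compared through the adjacency rule, not for equality.
TIMING_FIELDS = {"meter_datetime", "meter_duration"}


def is_redundant(first: Dict[str, Any], second: Dict[str, Any]) -> bool:
    """True if `second` covers everything `first` says."""
    if first.keys() != second.keys():
        return False
    if any(first[k] != second[k] for k in first if k not in TIMING_FIELDS):
        return False
    gap = (second["meter_datetime"] - first["meter_datetime"]).total_seconds()
    growth = second["meter_duration"] - first["meter_duration"]
    return gap == growth


def collapse(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop each record that is made redundant by the one following it."""
    ordered = sorted(records, key=lambda r: r["meter_datetime"])
    kept = []
    for i, rec in enumerate(ordered):
        following = ordered[i + 1] if i + 1 < len(ordered) else None
        if following is not None and is_redundant(rec, following):
            continue
        kept.append(rec)
    return kept


t0 = datetime(2012, 7, 1, 12, 0, 0)
a = {"resource_id": "i-1", "flavor": "m1", "meter_datetime": t0, "meter_duration": 60}
b = {"resource_id": "i-1", "flavor": "m1",
     "meter_datetime": t0 + timedelta(seconds=60), "meter_duration": 120}
c = {"resource_id": "i-1", "flavor": "m2",
     "meter_datetime": t0 + timedelta(seconds=120), "meter_duration": 60}
collapsed = collapse([a, b, c])
```

Here record a is dropped because b extends it without a gap, while b is kept because c differs in another field.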

Alternative gauge design

During the Folsom ODS session, an alternate design was discussed in which events, instead of recording deltas, would record the absolute value of a gauge. That would require extending the event to include the 'object id' (instance, network, volume) associated with the meter.

The delta model can be derived from the absolute model, which makes the absolute model resilient to a missed registration: the next reading still carries the full value.
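
The following sketch shows why absolute readings tolerate loss. The function name and the sample readings are hypothetical, and gauge resets (for instance after an instance reboot) are not handled.

```python
from typing import List, Tuple


def deltas_from_gauge(readings: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Turn (timestamp, absolute_value) readings into (timestamp, delta) pairs."""
    ordered = sorted(readings)
    return [(t1, v1 - v0) for (_, v0), (t1, v1) in zip(ordered, ordered[1:])]


# Hypothetical readings of a cumulative gauge, one per minute.
complete = [(0, 100), (60, 150), (120, 180), (180, 260)]
# The same series with the reading at t=120 lost in transit.
lossy = [(0, 100), (60, 150), (180, 260)]

total_complete = sum(d for _, d in deltas_from_gauge(complete))
total_lossy = sum(d for _, d in deltas_from_gauge(lossy))
```

Both totals are equal: losing a reading costs time resolution but not usage. Had the agent sent deltas instead, the lost message would have removed that usage from the bill.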

Agents

Agent on each nova compute node to accumulate and send meters for c1, c2, c3, c4, c5, n1, n2, n3, n4. The agent is likely to be pulling this information from libvirt.
- c5 could get disk I/O stats with libvirt's virDomainBlockStats
- n3 / n4 could use iptables accounting rules ? (for external traffic ?)
- n1 / n2 could use libvirt's virDomainInterfaceStats ? (for all traffic ?)
Agent on each nova volume node to accumulate and send meters for v1, v2
Agent on each swift proxy to forward existing accounting data o1 and accumulate and send o2-o5

Note: nova network nodes need not accumulate and send the n5 meter because it can be pulled directly from the nova database (see nova-manage floating list, for instance).

Architecture

An agent runs on each OpenStack node (bare metal machine) and harvests the data locally
- If a meter is available from the existing OpenStack component it should be used
- A standalone ceilometer agent implements the meters that are not yet available from the existing OpenStack components
A storage daemon communicates with the agents to collect their data and aggregate them
The agents collecting data are authenticated to avoid pollution of the metering service
The data is sent from agents to the storage daemon via a trusted messaging system (RabbitMQ?)
The data / messages exchanged between agents and the storage daemon use a common message format
The content of the storage is made available through a REST API providing aggregation
The message queue is separate from other queues (such as the nova queue)
The messages in the queue are signed and non-repudiable (http://en.wikipedia.org/wiki/Non-repudiation)
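
As a rough sketch of message signing, the example below uses HMAC-SHA256 over a canonical JSON encoding. This is an assumption, not the design chosen by the blueprint: a shared-secret HMAC authenticates the sender and detects tampering, but because it is symmetric it does not by itself provide non-repudiation, which would require asymmetric signatures. The function names and the secret are hypothetical.

```python
import hashlib
import hmac
import json


def sign_message(message: dict, secret: bytes) -> str:
    """Return a hex HMAC-SHA256 over every field except the signature itself."""
    payload = {k: v for k, v in message.items() if k != "message_signature"}
    # Sorted keys and fixed separators give the same bytes on both sides.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_message(message: dict, secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_message(message, secret)
    received = str(message.get("message_signature", ""))
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


secret = b"hypothetical-shared-secret"
msg = {"meter_type": "instance", "resource_id": "instance-1", "meter_volume": 1}
msg["message_signature"] = sign_message(msg, secret)

tampered = dict(msg, meter_volume=100)
```

The storage daemon would call verify_message on each incoming message and discard those that fail, which keeps unauthenticated agents from polluting the metering data.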

Note(jking-6): The messaging format should use protocol buffers. JSON bytestrings take up too much bandwidth and time to parse.

Note: document some use case scenarios to really nail down the architecture. Who signals the metering service? The API service or nova, quantum, swift, glance, volume?

Note: ideally, all meters are available from the OpenStack component responsible for a given resource (for instance, the disk I/O for an ephemeral disk is made available in nova). However, it is not realistic to assume this will always be the case. Standalone ceilometer agents running on OpenStack nodes provide access to the meters when the OpenStack component does not. Meters implemented in ceilometer agents should always be contributed back to the OpenStack component. This kind of incubation for each meter (first implemented in ceilometer agents, then in the OpenStack component) is both practical in the short term and a sound long-term practice that avoids forking code.

Messaging use cases

Instance creation

An instance is created and nova issues a message ( http://wiki.openstack.org/SystemUsageData )
The metering storage agent listens on the nova queue and picks up the creation message
The metering storage agent stores the creation event locally, with a timestamp
The agent notifies the metering storage daemon that the instance was created five minutes ago, and the daemon aggregates this information into the tenant records
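
The first three steps can be sketched as a small handler. The event type string and the payload keys below are assumptions modelled on nova usage notifications and should be checked against the messages actually received; the function name and the shape of the stored event are hypothetical.

```python
import time
from typing import Optional

# Assumed event type for a completed instance creation.
CREATE_EVENT = "compute.instance.create.end"


def handle_notification(notification: dict, now: Optional[float] = None) -> Optional[dict]:
    """Turn an instance creation notification into an event to store locally."""
    if notification.get("event_type") != CREATE_EVENT:
        return None  # not a creation message, ignore it
    payload = notification.get("payload", {})
    return {
        "meter_type": "instance",
        "project_id": payload.get("tenant_id"),    # assumed payload key
        "user_id": payload.get("user_id"),         # assumed payload key
        "resource_id": payload.get("instance_id"), # assumed payload key
        # Timestamp recorded when the agent picked up the message.
        "received_at": now if now is not None else time.time(),
    }


sample = {
    "event_type": CREATE_EVENT,
    "payload": {"tenant_id": "project-1", "user_id": "user-1", "instance_id": "instance-1"},
}
event = handle_notification(sample, now=1341144000.0)
ignored = handle_notification({"event_type": "compute.instance.delete.end"})
```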

API

Volume of data

A metering system will always generate massive amounts of data. In order to estimate the amounts that your cloud may generate, a Google spreadsheet has been proposed.
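
A back-of-the-envelope version of such an estimate can be written in a few lines. Every input below (number of resources, meters per resource, sampling rate, record size) is a hypothetical placeholder to be replaced with figures from your own deployment; none of them comes from the spreadsheet.

```python
def daily_bytes(resources: int, meters_per_resource: int,
                samples_per_hour: int, bytes_per_record: int) -> int:
    """Raw storage generated per day, before any collapsing or compression."""
    records_per_day = resources * meters_per_resource * samples_per_hour * 24
    return records_per_day * bytes_per_record


# Hypothetical cloud: 1000 instances, 5 meters each, one sample per minute,
# 300 bytes per stored record.
estimate = daily_bytes(resources=1000, meters_per_resource=5,
                       samples_per_hour=60, bytes_per_record=300)
estimate_gb = estimate / 10**9
```

Under these assumptions the cloud produces 7.2 million records, or about 2.16 GB, per day.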

Contributing to Ceilometer

The developer documentation is starting to take shape within the source and is also published at http://ceilometer.readthedocs.org in a more friendly format.

The project team hangs out on Freenode in the #openstack-metering channel. Feel free to drop by and stay as long as you want to discuss your future implementation. We use the OpenStack General Mailing List for our email discussions, tagging the subject with [metering].

If you wonder what you could contribute to ceilometer, here is a list of features that we are missing.

Roadmap

See EfficientMetering/RoadMap

Free Software Billing Systems

A list of the billing system implementations that could use the Metering system when it becomes available.

Dough https://github.com/lzyeval/dough
trystack.org billing https://github.com/trystack/dash_billing
nova-billing https://github.com/griddynamics/nova-billing

Related resources

Definition of a Storage Accounting Record http://www.ogf.org/Public_Comment_Docs/Documents/2012-02/EMI-StAR-OGF-info-doc-v2.pdf
UsageRecord format http://www.ogf.org/documents/GFD.98.pdf
Capturing exchanges https://github.com/rackspace/stacktach
Messages about system usage http://wiki.openstack.org/SystemUsageData
http://etherpad.openstack.org/EfficientMetering
Use https://github.com/stackforge
lzyeval codebase:
- billing https://github.com/lzyeval/dough
- metering https://github.com/lzyeval/kanyun
trystack.org codebase:
- https://github.com/trystack/dash_billing
http://wiki.openstack.org/utilizationdata
Nova billing https://github.com/griddynamics/nova-billing
Swift
- Retrieve Account Metadata http://docs.openstack.org/trunk/openstack-object-storage/developer/content/retrieve-account-metadata.html
- swift middlewares examples :
  - https://github.com/spilgames/swprobe (https://lists.launchpad.net/openstack/msg07794.html)
  - https://github.com/pandemicsyn/swift-informant (https://lists.launchpad.net/openstack/msg07795.html)
April 2012 mailing list thread on billing https://lists.launchpad.net/openstack/msg10334.html
Virgo (scriptable agent for meter collection): https://github.com/racker/virgo
- Contact Brandon Philips at Rackspace - brandon.philips@rackspace.com
Ovirt DWH http://www.ovirt.org/wiki/Ovirt_DWH and associated database schema http://gerrit.ovirt.org/gitweb?p=ovirt-dwh.git;a=blob;f=data-warehouse/historydbscripts_postgres/create_tables.sql;h=2e05299a2de1b79634e862e5f1811dda3f303a96;hb=0271e5205ad29109c2e2313e7f6fb900e76a757a#l377
Swift http://folsomdesignsummit2012.sched.org/event/d9135eabdd775432c74c3f1d32a325d3 and http://etherpad.openstack.org/FolsomSwiftStatsd
Collecting meters from libvirt https://github.com/ss7pro/rescnt
Doug Hellmann sandbox https://github.com/dhellmann/metering-prototype/
Prototype ceilometer implementation http://github.com/woorea/ceilometer-java and discussion https://lists.launchpad.net/openstack/msg11410.html

Resources

A slide deck that Julien Danjou used to present Ceilometer in July 2012.

FAQ

Q: Why reinvent the wheel? XXXX already does it.

A: Please mail about the tool you think does the work, unless it is listed below.

http://wiki.openstack.org/SystemUsageData, for instance, is specific to nova, while the metering project aims at aggregating data from all OpenStack components
collectd, munin, etc. each have some pieces of the puzzle but not all of them; they are not designed with billing in mind and are not a good fit for this blueprint
Riemann -- http://aphyr.github.com/riemann/concepts.html I was able to get a basic dashboard up in an afternoon. Even if it's not a good fit for this project there are plenty of good ideas worth pilfering: protocol buffers, push-based dataflow graphs, extremely simple APIs (a stream processor is just a function that takes a single argument, an event message).
