Ceilometer/blueprints/monitoring
- Launchpad Entry: CeilometerSpec:monitoring
- Created: 28 Nov 2012
- Contributors: Angus Salkeld
Summary
Release Note
Rationale
User stories
The purpose of Alarms is to notify a user when a meter matches certain criteria.
Some examples
- "Tell me when the maximum disk utilization exceeds 90%"
- "Tell me when the average CPU utilization exceeds 80% over 120 seconds"
- "Tell me when my web app is becoming unresponsive" (loadbalancer latency meter)
- "Tell me when my httpd daemon dies" (custom user script that checks daemon health)
How can you use Alarms
Create an alarm
{
    'period': '300',
    'eval_periods': '2',
    'meter': 'CPUUtilization',
    'function': 'average',
    'operator': 'gt',
    'threshold': '50',
    'resource_id': 'inst-002',
    'source': 'OS/compute',
    'alarm_actions': ['rpc/my_notify_topic', 'http://bla.com/bla'],
    'ok_actions': ['rpc/my_notify_topic']
}
This checks the "CPUUtilization" meter every 300 seconds.
If the average CPUUtilization for inst-002 was greater than 50% in each of the
last two 300-second periods, it sends an RPC notification on the "my_notify_topic" topic
and posts the alarm details to http://bla.com/bla.
When the average later falls back to or below the threshold, it runs the "ok_actions".
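The evaluation described above can be sketched in a few lines of Python. This is only an illustration of the intended semantics, not the implementation: the function names, the state names 'alarm' and 'ok', the extra operators and functions beyond 'gt' and 'average', and the rule that a period without samples counts as "not breaching" are all assumptions. Samples are assumed to be (timestamp_in_seconds, value) pairs already filtered to the alarm's meter and resource_id.

```python
import operator
from statistics import mean

# Assumed mappings; the spec itself only shows 'gt' and 'average'.
OPERATORS = {'gt': operator.gt, 'ge': operator.ge,
             'lt': operator.lt, 'le': operator.le}
FUNCTIONS = {'average': mean, 'maximum': max, 'minimum': min, 'sum': sum}


def evaluate_alarm(alarm, samples, now):
    """Return 'alarm' if every one of the last eval_periods breaches, else 'ok'.

    samples: iterable of (timestamp, value) pairs for one meter and resource.
    now: end of the most recent period, in the same units as the timestamps.
    """
    period = int(alarm['period'])
    eval_periods = int(alarm['eval_periods'])
    threshold = float(alarm['threshold'])
    aggregate = FUNCTIONS[alarm['function']]
    compare = OPERATORS[alarm['operator']]

    for i in range(eval_periods):
        end = now - i * period
        start = end - period
        values = [v for t, v in samples if start < t <= end]
        # Assumption: a period with no samples is treated as not breaching.
        if not values or not compare(aggregate(values), threshold):
            return 'ok'
    return 'alarm'


def actions_for_transition(alarm, old_state, new_state):
    """Return the action list to run when the state changes, else nothing."""
    if new_state == old_state:
        return []
    key = 'alarm_actions' if new_state == 'alarm' else 'ok_actions'
    return alarm.get(key, [])
```

With the example alarm, two consecutive 300-second periods averaging above 50 move the state to 'alarm' and select the 'alarm_actions' list; once a period no longer breaches, the state returns to 'ok' and the 'ok_actions' list is selected. Actions fire only on a state change, so a sustained breach does not notify repeatedly.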
Assumptions
- We are trying to deliver CloudWatch-like functionality, but in an "OpenStack way" that can be extended.
- Kinds of metrics to monitor: http://docs.amazonwebservices.com/AmazonCloudWatch/latest/DeveloperGuide/CW_Support_For_AWS.html
These are essentially the same kinds of meters that ceilometer currently samples.
- Sample at intervals of between 10s and 60s, and transmit at intervals of between 1min and 5min.
- Try to reuse as much of the current ceilometer code as possible, so that the feature enhances ceilometer.
Design
Implementation
This section should describe a plan of action (the "how") to implement the changes discussed. Could include subsections like:
- https://blueprints.launchpad.net/ceilometer/+spec/user-api
- https://blueprints.launchpad.net/ceilometer/+spec/meter-post-api
- https://blueprints.launchpad.net/ceilometer/+spec/multi-publisher
- https://blueprints.launchpad.net/ceilometer/+spec/api-aggregate-average
- https://blueprints.launchpad.net/ceilometer/+spec/multi-dimensions
API Changes
- A new alarm REST resource
- A new alarm history REST resource
- Changes to make statistics aggregation more flexible
- A new API to post meter data
- A new API to list meters
Code Changes
Code changes should include an overview of what needs to change, and in some cases even the specific details.
Migration
Include:
- data migration, if any
- redirects from old URLs to new ones, if any
- how users will be pointed to the new way of doing things, if necessary.
Test/Demo Plan
This need not be added or completed until the specification is nearing beta.
Unresolved issues
This should highlight any issues that should be addressed in further specifications, not problems with the specification itself, since any specification with problems cannot be approved.
BoF agenda and discussion
Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.