- Launchpad Entry: NovaSpec:nova-instrumentation-metrics-monitoring
- Created: 23 Oct 2012
- Drafter: Tim Daly Jr, Joshua Harlow, Jeff Budzinski
- Drafters Email: [AT yahoo-inc.com], [AT yahoo-inc.com], [AT yahoo-inc.com]
- 1 Summary
- 2 Release Note
- 3 Rationale
- 4 User stories
- 5 Requirements/Constraints
- 6 Design
- 7 Implementation
- 8 Test/Demo Plan
- 9 Unresolved issues
- 10 Meeting Logs
- 11 Related
- 12 BoF agenda and discussion
To operate OpenStack effectively at larger scale, we propose deeper instrumentation and timing of key activities *within* the processing daemons, both at external I/O points and at key points in the processing flow. This will help in monitoring system performance and triaging issues at a deeper level. The proposal is to add a generalized mechanism for measuring processing events, I/O events, and other key metrics inside the daemons.
This section should include a paragraph describing the end-user impact of this change. It is meant to be included in the release notes of the first release in which it is implemented. (Not all of these will actually be included in the release notes, at the release manager's discretion; but writing them is a useful exercise.)
It is mandatory.
Our experience has shown that deeper levels of instrumentation and monitoring are valuable for tracking scale and availability issues, monitoring intra-service issues and component errors, and managing component health.
Examples of what could be done with this instrumentation data:
- establish performance baselines and characteristics for acceptance testing
- determine scalability characteristics of various system components
- tune configuration parameters for installation-specific sizing, e.g. connection pools, greenpools, wsgi backlog, etc.
- set alerts on certain types of metrics: connection pool utilization, connection errors, timeouts
- perform offline, large-scale analysis of system usage patterns and performance using Hadoop, in support of advanced scheduling, prediction, and resource balancing
- request receive time
- request receive error
- request receive timeout
- request receive bytes
- response send time
- response send error
- response send timeout
- response send bytes
- request processing time
- response processing time
- dispatch time
- function time (native/not native)
- idle/blocking time
- function time
- model query time
- model write/update time
- session establishment time
- connection errors
- connection count
- ping listener errors
- connection pool used
- connection pool free
- message reply time
- message reply errors
- pack context time
- unpack context time
- multicall wait time
- cast time
- fanout cast time
- cast to server time
- fanout cast to server time
- notify send time
- remote errors
- rpc timeout
- pool used
- pool free
- greenpool used
- greenpool free
- waiter count
- The solution should add a minimum of overhead whether activated or not.
- The solution should *never* compete with command and control message priorities.
- Emission and collection of data should be compatible with existing agents and pluggable where practical.
- Aggregation/correlation should be separate from data emission. Different tastes in collection and analysis should be supported.
- Data transmission is best effort and ok to be lossy in some scenarios.
- It is desirable to have different levels of instrumentation, since some operators will not want to go as deep as others.
- It is desirable to be able to aggregate stats on 'dimensions', e.g. region, zone, tenant, etc.
- We do not want this data going via RPC since it should never interfere or compete for resources with RPC-driven operations.
The current plan is to:
- Create a set of decorators that wrap functions in order to measure execution time and to emit numeric counts and raw events.
- Extend the nova logger with a distinct log level and log handler that divert metrics to a different data sink.
- Use the decorators to create metrics for some subset of nova.
- Create some examples of metric aggregation, both using statsd via datagram and using batch log analysis with Hadoop.
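The decorator item in the plan above could look roughly like the following. This is a minimal sketch, not the prototype's code: the `timed` name, the `metrics` logger name, the sample metric name, and the message layout are all assumptions made for illustration.

```python
import functools
import logging
import time

# Hypothetical logger name; the real sink would be whatever the nova logger provides.
LOG = logging.getLogger("metrics")


def timed(metric_name):
    """Return a decorator that logs how long each call takes, in milliseconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on both normal return and exception, so failures are timed too.
                elapsed_ms = (time.monotonic() - start) * 1000.0
                LOG.info("%s %.3f ms", metric_name, elapsed_ms)
        return wrapper
    return decorator


@timed("db.model_query_time")  # illustrative metric name
def get_instance(instance_id):
    """Stand-in for a real model query."""
    return {"id": instance_id}
```

Because the measurement sits in a `finally` clause, a call that raises is still timed, and the exception propagates unchanged to the caller.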
We will flesh this out with more details as we complete our prototype work. Here is a sketch for discussion:
(see attachments for graffle and visio xml versions)
- Eventlet backdoor: https://github.com/openstack/openstack-common/blob/7695f967/openstack/common/eventlet_backdoor.py
- Grizzly Design Summit etherpad @ https://etherpad.openstack.org/grizzly-common-instrumentation
- add metrics gauges/decorators to nova/common
- add METRIC log level, metric format, configurable metric handler to nova/log.py
- instrument a couple of key modules to start
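As a rough illustration of the METRIC log level and configurable handler listed above, the standard `logging` module already allows custom levels. In this sketch the numeric value 25, the `nova.metrics` logger name, and the format string are assumptions chosen for the example, not part of the spec.

```python
import io
import logging

# Assumed value: between INFO (20) and WARNING (30).
METRIC = 25
logging.addLevelName(METRIC, "METRIC")

# Assumed metric line format.
METRIC_FORMAT = "%(levelname)s %(name)s %(message)s"


def configure_metric_logger(handler, name="nova.metrics"):
    """Attach `handler` as the only sink of a dedicated metrics logger."""
    handler.setFormatter(logging.Formatter(METRIC_FORMAT))
    logger = logging.getLogger(name)
    logger.setLevel(METRIC)
    logger.propagate = False  # keep metrics out of the ordinary log stream
    logger.addHandler(handler)
    return logger


# Any handler can be plugged in; an in-memory stream stands in for a real sink.
sink = io.StringIO()
metrics = configure_metric_logger(logging.StreamHandler(sink))
metrics.log(METRIC, "rpc.cast_time=%.1fms", 12.5)
metrics.info("this record is below METRIC and is dropped")
```

After these calls `sink` holds the single line `METRIC nova.metrics rpc.cast_time=12.5ms`. Swapping the `StreamHandler` for a file or network handler changes the data sink without touching the instrumented code.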
None at this time.
This need not be added or completed until the specification is nearing beta.
- Leveraging ceilometer: we certainly don't want to carry this data via RPC but may want to leverage log agent and collector.
- Compatibility with stacktach (see https://github.com/rackspace/stacktach and http://www.sandywalsh.com/2012/09/openstack-nova-internals-pt2-services.html)
- Consideration/evolution of https://blueprints.launchpad.net/nova/+spec/nova-instrumentation-v1 and impacted code if it gets approved
- on the one hand, there are clear similarities between things being measured by ceilometer and monitoring data
- BUT, ceilometer was not built for monitoring; it was built for metering and NOT losing critical billing messages
- AND, putting a bunch of best effort delivery messages through ceilometer and the RPC fabric does not seem to make sense
- is it possible to utilize ceilometer agents and service but with a lighter-weight transport?
- good news is code is already well-covered with logger objects and we are very likely to want to instrument at many levels: per-request, periodic, high-level, low-level
- injecting instrumentation data into the logging stream would be relatively straightforward using a metrics log adapter
- filter would be added to select only metrics events
- an additional logging handler could be created to take stuff out of the stream and emit it over the net, e.g. using DatagramHandler
- TBD: understand performance implications of utilizing log stream
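The adapter, filter, and handler notes above can be sketched with the standard `logging` module. The class names, the `is_metric` record attribute, and the choice to send plain text rather than the pickled records that `DatagramHandler` sends by default are all assumptions for illustration.

```python
import logging
import logging.handlers


class MetricsAdapter(logging.LoggerAdapter):
    """Tag every record logged through this adapter as a metric event."""

    def process(self, msg, kwargs):
        extra = dict(kwargs.get("extra") or {})
        extra["is_metric"] = True  # hypothetical marker attribute
        kwargs["extra"] = extra
        return msg, kwargs


class MetricsOnlyFilter(logging.Filter):
    """Pass only records carrying the metric marker."""

    def filter(self, record):
        return bool(getattr(record, "is_metric", False))


class PlainDatagramHandler(logging.handlers.DatagramHandler):
    """Send the formatted message as UTF-8 text instead of a pickled record."""

    def makePickle(self, record):
        return self.format(record).encode("utf-8")


def attach_metrics_sink(logger, host, port):
    """Add a UDP handler to `logger` that forwards metric records only."""
    handler = PlainDatagramHandler(host, port)
    handler.addFilter(MetricsOnlyFilter())
    logger.addHandler(handler)
    return handler
```

With this arrangement, existing `LOG.info(...)` calls are ignored by the UDP handler, while `MetricsAdapter(LOG, {}).info("api.request_time:12|ms")` is forwarded. The cost of building and filtering log records on every call is exactly the open performance question noted above.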
Making it low overhead
- instrumented code must be cheap/free when inactive
- could possibly be handled via macros or preprocessing, though that is kind of a pain
- technologies to consider:
- deepest level of instrumentation could be guarded by if __debug__: and optimized away by running Python with -O
- don't want to flood the network with datagrams
- consolidate request metrics into single event
- batch send
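One way to read the "batch send" note above is a small buffer that joins several metric lines into one datagram. This is a sketch under assumptions: the class name, the 1400-byte limit, and the newline-separated payload (which requires a receiver that accepts several lines per packet) are not specified anywhere in this proposal.

```python
import socket


class BatchingUdpEmitter(object):
    """Buffer metric lines and send them as newline-joined UDP datagrams."""

    def __init__(self, host, port, max_bytes=1400, sock=None):
        self._addr = (host, port)
        self._max_bytes = max_bytes  # assumed limit, chosen to fit a typical MTU
        if sock is None:
            sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self._sock = sock
        self._lines = []
        self._size = 0

    def add(self, line):
        """Queue one metric line, flushing first if it would overflow the batch."""
        data = line.encode("utf-8")
        # The extra byte accounts for the newline separator.
        if self._lines and self._size + 1 + len(data) > self._max_bytes:
            self.flush()
        self._size += len(data) + (1 if self._lines else 0)
        self._lines.append(data)

    def flush(self):
        """Send everything buffered so far as a single datagram."""
        if not self._lines:
            return
        payload = b"\n".join(self._lines)
        self._lines, self._size = [], 0
        try:
            self._sock.sendto(payload, self._addr)
        except OSError:
            pass  # best effort: a failed send drops the batch
```

A request handler could call `add()` for each measurement and `flush()` once at the end of the request, so that each request produces at most a few datagrams.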
How does this fit with stacktach
- Stacktach starts with dequeuing from AMQP so that doesn’t fit with the desire to not put this stuff over queue-based RPC
- BUT, there is clear overlap here since stacktach seems to be designed to collect timing for things of interest. Perhaps the answer is that for measuring RPC flow, stacktach and instrumentation are not mutually exclusive?
- notes on stacktach:
- Tach - monkey patching library
- Used monkey patching to avoid ugly-ifying the code
- Monkey patching the RPC code? via config of nova.compute.queue_receive, method-by-method
- Only patch calls
- Decorators to catch/emit on exception
- configurable notifier (e.g. statsd)
- Essentially wraps functions and sends a UDP datagram to statsd upon each RPC call
- has another set of stacktach workers that listen to the queue and make REST calls up to StackTach for insertion into the db (this is the v1 implementation). multitenant for devs, but has perf issues.
- v2 writes to the db directly, but is also having trouble with perf
- django app gives you a view of recent activity (notifications)
- also has a cli to interrogate the REST-based interface to stacktach
- suitable for production? seems to be used in rackspace prod envs
- traceability by uuid (nice) and perhaps request id
- can also get metrics: count, min, max, avg for the instrumented events (e.g. compute.instance.shutdown, compute.instance.delete, compute.instance.reboot, etc.)
- looks at request start/end request id pairs to compute times
- statsd choice: superfast, udp, no black holes
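The notes above mention pairing start and end events by request id and reporting count, min, max, and avg per event. A minimal sketch of that aggregation follows; the tuple layout `(request_id, event_name, phase, timestamp)` is a hypothetical format for illustration and is not StackTach's actual schema.

```python
from collections import defaultdict


def summarize_durations(events):
    """Pair start/end events per request id and summarize durations per event name.

    `events` is an iterable of (request_id, event_name, phase, timestamp)
    tuples, where phase is "start" or "end". End events without a matching
    start are ignored.
    """
    starts = {}
    durations = defaultdict(list)
    for request_id, name, phase, timestamp in events:
        key = (request_id, name)
        if phase == "start":
            starts[key] = timestamp
        elif phase == "end" and key in starts:
            durations[name].append(timestamp - starts.pop(key))
    return {
        name: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }
        for name, values in durations.items()
    }


sample = [
    ("req-1", "compute.instance.reboot", "start", 10.0),
    ("req-2", "compute.instance.reboot", "start", 11.0),
    ("req-1", "compute.instance.reboot", "end", 12.5),
    ("req-2", "compute.instance.reboot", "end", 12.0),
    ("req-3", "compute.instance.reboot", "end", 99.0),  # no start: ignored
]
summary = summarize_durations(sample)
```

For the sample data, the two completed reboots take 2.5 and 1.0 seconds, so the summary reports a count of 2, a min of 1.0, a max of 2.5, and an avg of 1.75.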
- InstrumentationMetricsMonitoring10292012 - IRC Meeting Log 10/29/2012
BoF agenda and discussion
Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.