Monasca/Instrumentation

Self-instrumenting Monasca

Goal: instrument Monasca components in the same way end users might instrument their own applications, with an emphasis on usability in Kubernetes environments.

Example Metrics

API latency per endpoint
API batch insert time, per message
API metric insert rate
Persister metric read and write rates
Kafka partition lag
Influx per-metric write latency

Options

monasca-statsd

Pros:

Already part of Monasca: https://github.com/openstack/monasca-statsd
Good developer API

Cons:

statsd dialects are not compatible between backends (e.g. Monasca, Prometheus, Datadog, and statsd proper)
- As a result, each backend needs its own client library in every language. The only Monasca-compatible library is for Python (monasca-statsd)
In a k8s environment, a sidecar would need to run with every pod being instrumented
Requires authenticated writes to Monasca, so extra configuration is needed
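
To illustrate the dialect problem, here is a minimal sketch of how the same counter is encoded for plain statsd and for Datadog's DogStatsD. The function names are hypothetical and exist only for this example. The Monasca dialect is deliberately not shown, since its exact wire encoding is not described on this page.

```python
def plain_statsd(name, value, metric_type="c"):
    """Encode a sample in the original statsd line format: name:value|type."""
    return "%s:%s|%s" % (name, value, metric_type)


def dogstatsd(name, value, metric_type="c", tags=None):
    """Encode a sample in the DogStatsD format, which appends |#key:value tags."""
    line = plain_statsd(name, value, metric_type)
    if tags:
        # Tags are a DogStatsD extension; a plain statsd server does not expect them.
        pairs = ",".join("%s:%s" % (k, v) for k, v in sorted(tags.items()))
        line += "|#" + pairs
    return line
```

A client library written for one of these formats cannot simply be pointed at a server that expects the other, which is why each backend tends to ship its own library per language.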

prometheus

Pros:

Simple data format: easy to write a simple metrics endpoint from scratch in any language
Can take advantage of existing client libraries for most languages
Can be scraped with zero configuration or authentication (in k8s environments)
Can easily inspect the current application state at runtime (just browse to /metrics)

Cons:

May require a sidecar in certain situations, which means new code (see the Sidecar section below)

See blueprint: https://blueprints.launchpad.net/monasca/+spec/prometheus-instrumentation

See PoC patch: https://review.openstack.org/417163

Sidecar

Prometheus monitoring of multiprocess applications, such as the Python implementations of monasca-api and monasca-persister, raises several issues:

Multiprocessing support in the Python prometheus_client is not ideal
- Requires significant modifications to gunicorn and Keystone middleware, plus custom Falcon endpoints...
- Shares state between processes through the filesystem
Overloads the '/metrics' endpoint (in monasca-api, which already exposes a metrics resource of its own)

To work around these limitations, a sidecar server would:

Ingest metrics from application processes using language-native Prometheus clients
- Clients would use either a small client plugin (supported in the Python client) or an alternative registration method
Perform simple aggregations to hide individual subprocesses/workers
Publish aggregated metrics as standard Prometheus /metrics
Have a minimal system footprint
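
The aggregation step could be sketched roughly as follows. This is not taken from the PoC. It assumes, for illustration only, that each worker reports its samples as a dict keyed by (metric name, label pairs), and that summing is the right aggregation, which holds for counters but not necessarily for gauges.

```python
from collections import defaultdict


def aggregate(per_worker):
    """Sum samples across workers so individual processes are hidden.

    per_worker maps a worker id to {(name, labels): value}, where labels is
    a tuple of (key, value) pairs. Both shapes are assumptions for this sketch.
    """
    totals = defaultdict(float)
    for samples in per_worker.values():
        for key, value in samples.items():
            # The worker id is dropped here: only name and labels identify a series.
            totals[key] += value
    return dict(totals)
```

The sidecar would then publish the aggregated result on its own /metrics endpoint, so a scraper sees one series per metric rather than one per worker.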

See blueprint: https://blueprints.launchpad.net/monasca/+spec/monasca-sidecar

See PoC: https://github.com/timothyb89/monasca-sidecar

Questions

Could monasca-statsd be adapted to work around the listed cons?
Could sidecar functionality be integrated into monasca-statsd?
Could monasca-statsd functionality be integrated into sidecar?
Is there any way to avoid needing a sidecar in the first place?