Monasca/MetricNamingConventionProposal
Mid-cycle Meeting Discussion
https://etherpad.openstack.org/p/monasca_liberty_mid_cycle_metricnames
Metric Names
Although metric names in the Monasca API can be any string, the Monasca Agent follows several naming conventions:
All characters are lowercase. '.' is used to group names hierarchically. This is done for compatibility with Graphite, which assumes '.' as a delimiter. '_' is used to separate words in long names that are not meant to be hierarchical.
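The conventions above can be checked mechanically. The following is a minimal sketch, not part of the Monasca Agent: the function name `follows_agent_convention` is hypothetical, and allowing digits alongside lowercase letters is an assumption, since the text only says "lowercase characters".

```python
import re

# Assumption: digits are permitted in addition to lowercase letters.
# A segment is one or more words joined by '_'; segments are joined by '.'.
_SEGMENT = r"[a-z0-9]+(?:_[a-z0-9]+)*"
METRIC_NAME_RE = re.compile(rf"{_SEGMENT}(?:\.{_SEGMENT})*")


def follows_agent_convention(name: str) -> bool:
    """Return True if name is lowercase, '.'-grouped, with '_' between words."""
    return METRIC_NAME_RE.fullmatch(name) is not None


print(follows_agent_convention("apache.net.kbytes_sec"))         # True
print(follows_agent_convention("system.memory_heap.usedBytes"))  # False (uppercase)
print(follows_agent_convention("acker.emit-count.metrics"))      # False (hyphen)
```

Because the API itself accepts any string, a check like this would only be advisory, for example in a lint step for new agent plugins.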
Metric Name Categories
Building each dot-delimited metric name from the following five attributes will allow future aggregation, filtering, parsing, querying, and pattern matching.
- proper_noun
- noun
- measurement (status, capacity, throughput, latency)
- type (hits, kilobytes, bytes, count, percent, state)
- qualifier (total, second, minute, hour, percent, current)
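To illustrate why a fixed five-field layout helps with parsing and filtering, here is a small sketch that splits a name into the attributes listed above. The `MetricName` type and `parse_metric_name` function are hypothetical helpers, not existing Monasca code, and the sample name follows the pattern proposed on this page.

```python
from typing import NamedTuple


class MetricName(NamedTuple):
    """The five attributes of a dot-delimited metric name."""
    proper_noun: str
    noun: str
    measurement: str
    type: str
    qualifier: str


def parse_metric_name(name: str) -> MetricName:
    """Split a name into its five fields, rejecting any other field count."""
    parts = name.split(".")
    if len(parts) != 5:
        raise ValueError(
            f"expected 5 dot-delimited fields, got {len(parts)}: {name!r}"
        )
    return MetricName(*parts)


parsed = parse_metric_name("apache.net.throughput.kbytes.second")
print(parsed.measurement)  # throughput
print(parsed.qualifier)    # second
```

With names parsed this way, selecting every throughput metric becomes a comparison on one field rather than a pattern match over free-form strings.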
An abbreviated naming convention is desired in order to reduce the resources required to collect, transmit, and store the metrics. For example, using perc in place of percent saves three bytes in every metric name that contains it.
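The saving can be checked with a short sketch. The `ABBREVIATIONS` table and `abbreviate` function are hypothetical; only the percent to perc mapping comes from the text, and any further entries would need to be agreed on separately.

```python
# Only this one abbreviation is given in the text above.
ABBREVIATIONS = {"percent": "perc"}


def abbreviate(name: str) -> str:
    """Replace each dot-delimited field that has a known abbreviation."""
    return ".".join(ABBREVIATIONS.get(part, part) for part in name.split("."))


full = "system.cpu.capacity.percent.total"
short = abbreviate(full)
print(short)  # system.cpu.capacity.perc.total
print(len(full.encode("utf-8")) - len(short.encode("utf-8")))  # 3
```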
Common measurement categories and terms
Proper Noun
- apache
- system
- kafka
- mysql
- zookeeper
Noun
- net
- cpu
- busy_worker
- idle_worker
Measurement
- Throughput
  - amount per time
- Status
  - up, down, alive, dead, crashed, recovering, rebooting, shutting down
- Capacity
  - full, percentage, empty
- Latency
  - a time based measurement
- Count
  - a quantity of some measurement
Type
The type field further explains what is being measured by specifying the unit of measure.
- hits
- kbytes
- capacity
- percent
- state
- count
Qualifier
The final field completes the name by specifying the quantity for the unit of measure (UOM) given in the type field.
- total
- second
- current
Other Considerations
Dimensions
The metric names can be further complemented with dimensions. This preserves the metric naming patterns while leaving the flexibility to annotate the metrics further. See Dimensions.
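As a sketch of how dimensions keep per-instance detail out of the name, the example below selects metrics by name and dimension. The sample data, the dimension keys (`hostname`, `service`), and the `select` helper are all illustrative assumptions rather than an existing API.

```python
# Hypothetical sample data: dimension keys and values are illustrative only.
metrics = [
    {"name": "system.cpu.capacity.percent.total",
     "dimensions": {"hostname": "host-1", "service": "compute"}},
    {"name": "system.cpu.capacity.percent.total",
     "dimensions": {"hostname": "host-2", "service": "compute"}},
    {"name": "apache.net.throughput.hits.total",
     "dimensions": {"hostname": "host-1", "service": "web"}},
]


def select(metrics, name, **dimensions):
    """Return metrics with this name whose dimensions include all given pairs."""
    return [
        m for m in metrics
        if m["name"] == name
        and all(m["dimensions"].get(k) == v for k, v in dimensions.items())
    ]


matches = select(metrics, "system.cpu.capacity.percent.total", hostname="host-2")
print(len(matches))  # 1
```

The same name is shared by both hosts, so the five-field pattern stays intact while the host is carried as a dimension.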
Value_Meta
The value data can be further annotated using the value_meta field, without the need to encode extra metadata in the metric name. See Value_Metadata.
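A minimal sketch of a metric carrying value_meta follows. The field layout (name, dimensions, timestamp in milliseconds, value, value_meta) is an assumption modeled on the Monasca API metric body and should be checked against the API specification; `build_metric` and the sample values are hypothetical.

```python
import json
import time


def build_metric(name, value, dimensions=None, value_meta=None, timestamp_ms=None):
    """Assemble one metric as a plain dict (the field layout is an assumption)."""
    metric = {
        "name": name,
        "dimensions": dimensions or {},
        # Assumption: timestamps are expressed in milliseconds since the epoch.
        "timestamp": timestamp_ms if timestamp_ms is not None else int(time.time() * 1000),
        "value": value,
    }
    if value_meta:
        # Extra detail travels with the value instead of being encoded in the name.
        metric["value_meta"] = value_meta
    return metric


metric = build_metric(
    "apache.system.status.state.current",
    1,
    dimensions={"hostname": "host-1"},
    value_meta={"detail": "connection refused"},
)
print(json.dumps(metric, indent=2))
```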
Existing and Proposed Metric Names
Proposed pattern format for metric names
- proper_noun.noun.measurement.type.qualifier
Existing Name | Proposed Name |
---|---|
apache.net.hits | apache.net.throughput.hits.total |
apache.net.kbytes_sec | apache.net.throughput.kbytes.second |
apache.net.requests_sec | apache.net.throughput.requests.second |
apache.net.total_kbytes | apache.net.throughput.kbytes.total |
apache.performance.busy_worker_count | apache.busy_worker.capacity.count.total |
apache.performance.cpu_load_perc | apache.cpu.capacity.percent.total |
apache.performance.idle_worker_count | apache.idle_worker.capacity.count.total |
apache.status | apache.system.status.state.current |
cpu.idle_perc | system.cpu.capacity.percent.total |
... | ... |
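The renames in the table above could be applied with a simple lookup. This is a sketch only: `PROPOSED_NAMES` contains just the rows listed in the table, which is itself partial, and `to_proposed` is a hypothetical helper that leaves unlisted names unchanged.

```python
# Rows copied from the table above; the table is partial, so this map is too.
PROPOSED_NAMES = {
    "apache.net.hits": "apache.net.throughput.hits.total",
    "apache.net.kbytes_sec": "apache.net.throughput.kbytes.second",
    "apache.net.requests_sec": "apache.net.throughput.requests.second",
    "apache.net.total_kbytes": "apache.net.throughput.kbytes.total",
    "apache.performance.busy_worker_count": "apache.busy_worker.capacity.count.total",
    "apache.performance.cpu_load_perc": "apache.cpu.capacity.percent.total",
    "apache.performance.idle_worker_count": "apache.idle_worker.capacity.count.total",
    "apache.status": "apache.system.status.state.current",
    "cpu.idle_perc": "system.cpu.capacity.percent.total",
}


def to_proposed(name: str) -> str:
    """Return the proposed name, or the input unchanged if it is not listed."""
    return PROPOSED_NAMES.get(name, name)


# Every proposed name should have exactly five dot-delimited fields.
assert all(len(new.split(".")) == 5 for new in PROPOSED_NAMES.values())

print(to_proposed("cpu.idle_perc"))  # system.cpu.capacity.percent.total
```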
Raw Metrics / Published Metrics
Internal Name | Published Name |
---|---|
Vertica | |
total_session_count | |
running_query_count | |
request_queue_depth | |
InfluxDB | |
response_time | |
http_status | |
broadcastMessageTx | |
writeSeriesMessageTx | |
queriesExecuted | |
queriesRx | |
shardsCreated | |
broadcastMessageRx | |
batchWriteRx | |
pointWriteRx | |
Storm / Thresh | |
acker.emit-count.metrics | |
acker.receive.capacity | |
acker.receive.population | |
acker.receive.read_pos | |
acker.receive.write_pos | |
acker.sendqueue.capacity | |
acker.sendqueue.population | |
acker.sendqueue.read_pos | |
acker.sendqueue.write_pos | |
acker.transfer-count.metrics | |
aggregation-bolt.ack-count.alarm-creation-bolt_alarm-creation-stream | |
aggregation-bolt.ack-count.event-bolt_metric-sub-alarm-events | |
aggregation-bolt.ack-count.filtering-bolt_default | |
aggregation-bolt.ack-count.system_tick | |
aggregation-bolt.emit-count.default | |
aggregation-bolt.emit-count.metrics | |
aggregation-bolt.emit-count.system | |
aggregation-bolt.execute-count.alarm-creation-bolt_alarm-creation-stream | |
aggregation-bolt.execute-count.event-bolt_metric-sub-alarm-events | |
aggregation-bolt.execute-count.filtering-bolt_default | |
aggregation-bolt.execute-count.system_tick | |
aggregation-bolt.execute-latency.alarm-creation-bolt_alarm-creation-stream | |
aggregation-bolt.execute-latency.event-bolt_metric-sub-alarm-events | |
aggregation-bolt.execute-latency.filtering-bolt_default | |
aggregation-bolt.execute-latency.system_tick | |
aggregation-bolt.process-latency.alarm-creation-bolt_alarm-creation-stream | |
aggregation-bolt.process-latency.event-bolt_metric-sub-alarm-events | |
aggregation-bolt.process-latency.filtering-bolt_default | |
aggregation-bolt.process-latency.system_tick | |
aggregation-bolt.receive.capacity | |
aggregation-bolt.receive.population | |
aggregation-bolt.receive.read_pos | |
aggregation-bolt.receive.write_pos | |
aggregation-bolt.sendqueue.capacity | |
aggregation-bolt.sendqueue.population | |
aggregation-bolt.sendqueue.read_pos | |
aggregation-bolt.sendqueue.write_pos | |
aggregation-bolt.transfer-count.default | |
aggregation-bolt.transfer-count.metrics | |
aggregation-bolt.transfer-count.system | |
alarm-creation-bolt.ack-count.event-bolt_alarm-definition-events | |
alarm-creation-bolt.ack-count.filtering-bolt_newMetricForAlarmDefinitionStream | |
alarm-creation-bolt.emit-count.alarm-creation-stream | |
alarm-creation-bolt.emit-count.metrics | |
alarm-creation-bolt.execute-count.event-bolt_alarm-definition-events | |
alarm-creation-bolt.execute-count.filtering-bolt_newMetricForAlarmDefinitionStream | |
alarm-creation-bolt.execute-latency.event-bolt_alarm-definition-events | |
alarm-creation-bolt.execute-latency.filtering-bolt_newMetricForAlarmDefinitionStream | |
alarm-creation-bolt.process-latency.event-bolt_alarm-definition-events | |
alarm-creation-bolt.process-latency.filtering-bolt_newMetricForAlarmDefinitionStream | |
alarm-creation-bolt.receive.capacity | |
alarm-creation-bolt.receive.population | |
alarm-creation-bolt.receive.read_pos | |
alarm-creation-bolt.receive.write_pos | |
alarm-creation-bolt.sendqueue.capacity | |
alarm-creation-bolt.sendqueue.population | |
alarm-creation-bolt.sendqueue.read_pos | |
alarm-creation-bolt.sendqueue.write_pos | |
alarm-creation-bolt.transfer-count.alarm-creation-stream | |
alarm-creation-bolt.transfer-count.metrics | |
event-bolt.emit-count.alarm-definition-events | |
event-bolt.emit-count.metrics | |
event-bolt.execute-count.event-spout_default | |
event-bolt.execute-latency.event-spout_default | |
event-bolt.receive.capacity | |
event-bolt.receive.population | |
event-bolt.receive.read_pos | |
event-bolt.receive.write_pos | |
event-bolt.sendqueue.capacity | |
event-bolt.sendqueue.population | |
event-bolt.sendqueue.read_pos | |
event-bolt.sendqueue.write_pos | |
event-bolt.transfer-count.alarm-definition-events | |
event-bolt.transfer-count.metrics | |
event-spout.emit-count.default | |
event-spout.emit-count.metrics | |
event-spout.receive.capacity | |
event-spout.receive.population | |
event-spout.receive.read_pos | |
event-spout.receive.write_pos | |
event-spout.sendqueue.capacity | |
event-spout.sendqueue.population | |
event-spout.sendqueue.read_pos | |
event-spout.sendqueue.write_pos | |
event-spout.transfer-count.default | |
event-spout.transfer-count.metrics | |
filtering-bolt.ack-count.event-bolt_alarm-definition-events | |
filtering-bolt.ack-count.metrics-spout_default | |
filtering-bolt.emit-count.default | |
filtering-bolt.emit-count.metrics | |
filtering-bolt.emit-count.newMetricForAlarmDefinitionStream | |
filtering-bolt.execute-count.event-bolt_alarm-definition-events | |
filtering-bolt.execute-count.metrics-spout_default | |
filtering-bolt.execute-latency.event-bolt_alarm-definition-events | |
filtering-bolt.execute-latency.metrics-spout_default | |
filtering-bolt.process-latency.event-bolt_alarm-definition-events | |
filtering-bolt.process-latency.metrics-spout_default | |
filtering-bolt.receive.capacity | |
filtering-bolt.receive.population | |
filtering-bolt.receive.read_pos | |
filtering-bolt.receive.write_pos | |
filtering-bolt.sendqueue.capacity | |
filtering-bolt.sendqueue.population | |
filtering-bolt.sendqueue.read_pos | |
filtering-bolt.sendqueue.write_pos | |
filtering-bolt.transfer-count.default | |
filtering-bolt.transfer-count.metrics | |
filtering-bolt.transfer-count.newMetricForAlarmDefinitionStream | |
metrics-spout.emit-count.default | |
metrics-spout.emit-count.metrics | |
metrics-spout.receive.capacity | |
metrics-spout.receive.population | |
metrics-spout.receive.read_pos | |
metrics-spout.receive.write_pos | |
metrics-spout.sendqueue.capacity | |
metrics-spout.sendqueue.population | |
metrics-spout.sendqueue.read_pos | |
metrics-spout.sendqueue.write_pos | |
metrics-spout.transfer-count.default | |
metrics-spout.transfer-count.metrics | |
system.emit-count.metrics | |
system.GC_ConcurrentMarkSweep.count | |
system.GC_ConcurrentMarkSweep.timeMs | |
system.GC_ParNew.count | |
system.GC_ParNew.timeMs | |
system.memory_heap.committedBytes | |
system.memory_heap.initBytes | |
system.memory_heap.maxBytes | |
system.memory_heap.unusedBytes | |
system.memory_heap.usedBytes | |
system.memory_heap.virtualFreeBytes | |
system.memory_nonHeap.committedBytes | |
system.memory_nonHeap.initBytes | |
system.memory_nonHeap.maxBytes | |
system.memory_nonHeap.unusedBytes | |
system.memory_nonHeap.usedBytes | |
system.memory_nonHeap.virtualFreeBytes | |
system.newWorkerEvent | |
system.receive.capacity | |
system.receive.population | |
system.receive.read_pos | |
system.receive.write_pos | |
system.sendqueue.capacity | |
system.sendqueue.population | |
system.sendqueue.read_pos | |
system.sendqueue.write_pos | |
system.startTimeSecs | |
system.transfer.capacity | |
system.transfer-count.metrics | |
system.transfer.population | |
system.transfer.read_pos | |
system.transfer.write_pos | |
system.uptimeSecs | |
thresholding-bolt.ack-count.aggregation-bolt_default | |
thresholding-bolt.ack-count.event-bolt_alarm-definition-events | |
thresholding-bolt.ack-count.event-bolt_metric-sub-alarm-events | |
thresholding-bolt.emit-count.metrics | |
thresholding-bolt.execute-count.aggregation-bolt_default | |
thresholding-bolt.execute-count.event-bolt_alarm-definition-events | |
thresholding-bolt.execute-count.event-bolt_metric-sub-alarm-events | |
thresholding-bolt.execute-latency.aggregation-bolt_default | |
thresholding-bolt.execute-latency.event-bolt_alarm-definition-events | |
thresholding-bolt.execute-latency.event-bolt_metric-sub-alarm-events | |
thresholding-bolt.process-latency.aggregation-bolt_default | |
thresholding-bolt.process-latency.event-bolt_alarm-definition-events | |
thresholding-bolt.process-latency.event-bolt_metric-sub-alarm-events | |
thresholding-bolt.receive.capacity | |
thresholding-bolt.receive.population | |
thresholding-bolt.receive.read_pos | |
thresholding-bolt.receive.write_pos | |
thresholding-bolt.sendqueue.capacity | |
thresholding-bolt.sendqueue.population | |
thresholding-bolt.sendqueue.read_pos | |
thresholding-bolt.sendqueue.write_pos | |
thresholding-bolt.transfer-count.metrics |