Monasca/Monitoring Of Monasca

Revision as of 03:14, 2 February 2015

Goals and Deliverables

  • Out-of-the-box, general-purpose monitoring metrics and alarms for all parts (services, applications, OS) that make up a Monasca installation.
  • A dashboard for monitoring the health of the Monasca-specific components.
  • Each component should have metrics that give a view of the service useful for thresholds, debugging, and capacity planning.
  • CLI tools that complement the UI and can display Monasca details:
    • monasca-collector info
    • monasca-forwarder info
  • Metrics
    • Pre-configured alarm definitions for all core services, with reasonable general-purpose thresholds.
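
As a rough illustration of the last goal, a pre-configured alarm definition can be thought of as a name, a metric, a comparison, and a default threshold. The sketch below is not Monasca code: the class, the metric names, and the threshold values are all hypothetical and chosen only to show the shape of the idea.

```python
import operator
from dataclasses import dataclass

# Comparison operators a definition may use.
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass(frozen=True)
class AlarmDefinition:
    name: str
    metric: str       # hypothetical metric name, not a real Monasca metric
    op: str
    threshold: float  # illustrative value, not a recommended default

    def is_triggered(self, value: float) -> bool:
        """Return True when the measured value crosses the threshold."""
        return _OPS[self.op](value, self.threshold)


# Hypothetical defaults that would ship with an installation.
DEFAULTS = [
    AlarmDefinition("Disk Usage", "disk_used_percent", ">", 90.0),
    AlarmDefinition("Memory Usage", "memory_used_percent", ">", 90.0),
]


def evaluate(measurements):
    """Return the names of default definitions triggered by the measurements."""
    return [
        d.name
        for d in DEFAULTS
        if d.metric in measurements and d.is_triggered(measurements[d.metric])
    ]
```

Calling `evaluate` with a dictionary of current measurements returns the names of the definitions whose thresholds are exceeded; metrics that are missing from the dictionary are skipped rather than treated as alarms.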

User Stories

  • As an end user, the first thing I want to see after installing Monasca is a dashboard showing the status, capacity, and latency of my Monasca installation.
  • As an end user deploying Monasca individually, via CI, via Vagrant, or with the installer, I want an initial dashboard showing the status of Monasca.
  • As an operator, I want a simple, concise view of the health of the Monasca service.
  • As an operator or provider, I want metrics for all Monasca components that describe the status, capacity, and latency of each component.


Architectural Components

Off-the-shelf open-source components

  • Apache Kafka (message queue)
  • MySQL (alarm, notifications database)
  • InfluxDB (metrics, logging, events database)
  • Apache Storm (real-time stream processor)
  • Apache ZooKeeper (resource coordinator)
  • Operating System


Monasca components

  • API
  • Agent
  • Notification engine
  • Threshold engine
  • Persister


| #  | Alarm Definition Name                            | Category  | Provider    | Component            | Subcomponent | Type (status, capacity, throughput, latency) | Measurement |
|----|--------------------------------------------------|-----------|-------------|----------------------|--------------|----------------------------------------------|-------------|
| 1  | HTTP Status Alarm                                | System    | Application | Monasca              | API          | Status                                       | Up / Down   |
| 2  | Host Alive Alarm                                 | System    | OS          | Processor            | Hardware     | Status                                       | Up / Down   |
| 3  | Disk Usage                                       | System    | OS          | Disk                 | Hardware     | Capacity                                     | Percentage  |
| 4  | Disk Inode Usage                                 | System    | OS          | Disk                 | Hardware     | Capacity                                     | Percentage  |
| 5  | High CPU Usage                                   | System    | OS          | Processor            | Hardware     | Capacity                                     | Percentage  |
| 6  | Network Errors                                   | System    | OS          | Network              | Hardware     | Status                                       | Count       |
| 7  | Memory Usage                                     | System    | OS          | Memory               | Hardware     | Capacity                                     | Percentage  |
| 8  | Kafka Consumer Lag                               | Monasca   | Application | Message Queue        | Consumer     | Latency                                      | Time        |
| 9  | Monasca Agent emit time                          | Monasca   | Application | Monasca              | Agent        | Latency                                      | Time        |
| 10 | Monasca Notification Configuration DB query time | Monasca   | Application | Monasca              | Notification | Latency                                      | Time        |
| 11 | Monasca Agent collection time                    | Monasca   | Application | Monasca              | Agent        | Latency                                      | Time        |
| 12 | Zookeeper Average Latency                        | Monasca   | Application | Resource Coordinator | ?            | Latency                                      | Time        |
| 13 | Monasca Notification email time                  | Monasca   | Application | Monasca              | Notification | Latency                                      | Time        |
| 14 | Process not found                                | System    | OS          | Processor            | Process      | Status                                       | Count       |
| 15 | VM CPU usage                                     | OpenStack | OS          | Processor            | Hardware     | Capacity                                     | Percentage  |
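
The classification above lends itself to grouping alarm definitions by type when laying out a dashboard. The following is a minimal sketch under that assumption; the `AlarmRow` type and `group_by_type` helper are hypothetical names, and only a few rows from the table are copied in for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class AlarmRow(NamedTuple):
    name: str
    category: str
    provider: str
    type: str


# A few rows copied from the table above (not the full list).
ROWS = [
    AlarmRow("Disk Usage", "System", "OS", "Capacity"),
    AlarmRow("Host Alive Alarm", "System", "OS", "Status"),
    AlarmRow("Kafka Consumer Lag", "Monasca", "Application", "Latency"),
    AlarmRow("Memory Usage", "System", "OS", "Capacity"),
]


def group_by_type(rows):
    """Map each alarm type to the names of its definitions, in input order."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.type].append(row.name)
    return dict(groups)
```

A dashboard could then render one panel per key of `group_by_type(ROWS)`, for example a capacity panel listing the disk and memory alarms.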