Monasca/Monitoring Of Monasca
Revision as of 03:20, 2 February 2015
Goals and Deliverables
- Out-of-the-box, general-purpose monitoring metrics and alarms for all parts (services, applications, OS) that make up a Monasca installation.
- A dashboard for monitoring the health of the Monasca-specific components.
- Each component should expose metrics that give a view of the service useful for thresholds, debugging, and capacity planning.
- CLI tools that complement the UI and can display Monasca details:
  - monasca-collector info
  - monasca-forwarder info
- Metrics
- Pre-configured alarm definitions for all core services, with reasonable general-purpose thresholds.
User Stories
- As an end user, the first thing I want to see after installing Monasca is a dashboard showing the status, capacity, and latency of my Monasca installation.
- As an end user deploying Monasca, whether individually, via CI, with Vagrant, or with the installer, I want an initial dashboard showing the status of Monasca.
- As an operator I want a simple and concise view of the health of the Monasca service.
- As an operator or provider I want metrics for all Monasca components that will describe the status, capacity, and latency of each component.
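The "simple and concise view of the health" asked for in these stories could be derived by rolling alarm states up per component. The following is only a minimal sketch under stated assumptions: `OK`, `UNDETERMINED`, and `ALARM` are the Monasca alarm states, but the ranking between them, the helper names `roll_up` and `overall`, and the sample data are hypothetical and not taken from this page.

```python
# Rank of each alarm state; treating UNDETERMINED as worse than OK but
# better than ALARM is an assumption made for this sketch.
STATE_RANK = {"OK": 0, "UNDETERMINED": 1, "ALARM": 2}


def roll_up(alarms):
    """Return the worst alarm state seen for each component.

    `alarms` is an iterable of (component, state) pairs.
    """
    worst = {}
    for component, state in alarms:
        current = worst.get(component, "OK")
        # Keep whichever of the two states ranks higher.
        worst[component] = max(current, state, key=STATE_RANK.__getitem__)
    return worst


def overall(worst):
    """Collapse per-component states into one installation-wide state."""
    return max(worst.values(), key=STATE_RANK.__getitem__, default="OK")


if __name__ == "__main__":
    # Hypothetical sample data, not real measurements.
    sample = [
        ("API", "OK"),
        ("Kafka", "ALARM"),
        ("Kafka", "OK"),
        ("Persister", "UNDETERMINED"),
    ]
    per_component = roll_up(sample)
    print(per_component)
    print(overall(per_component))
```

A dashboard built this way would show one state per component plus a single overall state, which is one possible reading of the operator story above.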
StackForge / OpenStack
Blueprints
Bugs
Reviews / Repos
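- Gerrit: Moved to setting up alarms with a role so they can be used more widely (https://review.openstack.org/#/c/150509/)
- GitHub: Added the default alarms role (https://github.com/hpcloud-mon/monasca-installer/pull/18)
- GitHub: New monasca-vagrant role for global alarms (https://github.com/hpcloud-mon/ansible-monasca-default-alarms)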
Architectural Components
Off-the-shelf open components
- Apache Kafka (message queue)
- MySQL (alarm, notifications database)
- InfluxDB (metrics, logging, events database)
- Apache Storm (real-time stream processor)
- Apache ZooKeeper (resource coordinator)
- Operating System
Monasca components
- API
- Agent
- Notification engine
- Threshold engine
- Persister
# | Alarm Definition Name | Category | Provider | Component | Subcomponent | Type (status, capacity, throughput, latency) | Measurement |
---|---|---|---|---|---|---|---|
1 | HTTP Status Alarm | System | Application | Monasca | API | Status | Up / Down |
2 | Host Alive Alarm | System | OS | Processor | Hardware | Status | Up / Down |
3 | Disk Usage | System | OS | Disk | Hardware | Capacity | Percentage |
4 | Disk Inode Usage | System | OS | Disk | Hardware | Capacity | Percentage |
5 | High CPU Usage | System | OS | Processor | Hardware | Capacity | Percentage |
6 | Network Errors | System | OS | Network | Hardware | Status | Count |
7 | Memory Usage | System | OS | Memory | Hardware | Capacity | Percentage |
8 | Kafka Consumer Lag | Monasca | Application | Message Queue | Consumer | Latency | Time |
9 | Monasca Agent emit time | Monasca | Application | Monasca | Agent | Latency | Time |
10 | Monasca Notification Configuration DB query time | Monasca | Application | Monasca | Notification | Latency | Time |
11 | Monasca Agent collection time | Monasca | Application | Monasca | Agent | Latency | Time |
12 | Zookeeper Average Latency | Monasca | Application | Resource Coordinator | ? | Latency | Time |
13 | Monasca Notification email time | Monasca | Application | Monasca | Notification | Latency | Time |
14 | Process not found | System | OS | Processor | Process | Status | Count |
15 | VM Cpu usage | OpenStack | OS | Processor | Hardware | Capacity | Percentage |
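Rows of this table could be turned into alarm definition request bodies roughly as follows. This is a sketch, not the project's implementation: the `AlarmSpec` class and `to_request_body` helper are hypothetical, the metric names and thresholds in the expressions are illustrative assumptions, and the body fields (`name`, `description`, `expression`, `match_by`) are assumed to follow the Monasca API's alarm definition resource.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmSpec:
    """One row of the alarm definition table, plus an assumed expression."""
    name: str
    category: str
    alarm_type: str
    expression: str  # assumption: metric name and threshold are illustrative


# Rows 1 to 3 of the table; the expressions are NOT from the source page.
SPECS = [
    AlarmSpec("HTTP Status Alarm", "System", "Status", "http_status > 0"),
    AlarmSpec("Host Alive Alarm", "System", "Status", "host_alive_status > 0"),
    AlarmSpec("Disk Usage", "System", "Capacity", "disk.space_used_perc > 90"),
]


def to_request_body(spec):
    """Build a dict shaped like an alarm definition request body (assumed shape)."""
    return {
        "name": spec.name,
        "description": f"{spec.category} {spec.alarm_type.lower()} alarm",
        "expression": spec.expression,
        # Assumption: one alarm per host, so match on the hostname dimension.
        "match_by": ["hostname"],
    }


if __name__ == "__main__":
    for spec in SPECS:
        print(json.dumps(to_request_body(spec), indent=2))
```

Keeping the catalogue as data rather than hand-written requests makes it easier to review thresholds in one place, which suits the goal of shipping pre-configured definitions.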