Monasca/Monitoring Of Monasca
Goals and Deliverables
- Out-of-the-box, general-purpose monitoring metrics and alarms for all parts (services, applications, OS) that make up a Monasca installation.
- A dashboard for monitoring the health of the Monasca-specific components.
- Metrics for each component that give a view of the service useful for thresholds, debugging, and capacity planning.
- CLI tools that complement the UI and can display Monasca details:
  - monasca-collector info
  - monasca-forwarder info
- Metrics
- Pre-configured alarm definitions for all core services, with reasonable general-purpose thresholds.
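
A pre-configured alarm definition pairs a metric with a threshold expression. The following is a minimal sketch of how such a definition might be assembled as a request body, assuming the Monasca alarm-definition expression form `function(metric) operator threshold times periods`. The helper `build_alarm_definition`, the metric name `disk.space_used_perc`, the `hostname` dimension, and the 80 percent threshold are illustrative assumptions, not values taken from this page.

```python
import json

# Statistical functions and comparison operators accepted by this sketch.
VALID_FUNCTIONS = {"avg", "min", "max", "sum", "count"}
VALID_OPERATORS = {"<", "<=", ">", ">="}


def build_alarm_definition(name, metric, operator, threshold,
                           function="avg", periods=1, match_by=("hostname",)):
    """Return a dict shaped like an alarm-definition body (hypothetical helper)."""
    if function not in VALID_FUNCTIONS:
        raise ValueError(f"unsupported function: {function}")
    if operator not in VALID_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    if periods < 1:
        raise ValueError("periods must be at least 1")

    expression = f"{function}({metric}) {operator} {threshold}"
    if periods > 1:
        # Require the threshold to be crossed for several consecutive periods.
        expression += f" times {periods}"

    return {
        "name": name,
        "expression": expression,
        "match_by": list(match_by),
    }


# Assumed metric name and threshold, chosen only for illustration.
disk_usage = build_alarm_definition(
    "Disk Usage", "disk.space_used_perc", ">", 80, function="max", periods=3
)
print(json.dumps(disk_usage, indent=2))
```

Requiring several consecutive periods is one way to keep a general-purpose threshold from firing on short spikes; the right values depend on the installation.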
User Stories
- As an end user, the first thing I want to see after installing Monasca is a dashboard showing the status, capacity, and latency of my Monasca installation.
- As an end user deploying Monasca individually, via CI, with Vagrant, or using the installer, I want an initial dashboard showing the status of Monasca.
- As an operator, I want a simple and concise view of the health of the Monasca service.
- As an operator or provider, I want metrics for all Monasca components that describe the status, capacity, and latency of each component.
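
The operator stories above ask for one concise health view per component. One way to get there is to reduce the states of a component's alarms to the worst one. This is a minimal sketch, assuming each alarm is in one of the states `OK`, `UNDETERMINED`, or `ALARM`. The `(component, state)` tuples and the function name `rollup_health` are hypothetical, and ranking `UNDETERMINED` between `OK` and `ALARM` is a design choice made here, not something this page specifies.

```python
# Rank alarm states from healthiest (0) to worst (2).
STATE_RANK = {"OK": 0, "UNDETERMINED": 1, "ALARM": 2}


def rollup_health(alarms):
    """Map each component to the worst state among its alarms."""
    health = {}
    for component, state in alarms:
        if state not in STATE_RANK:
            raise ValueError(f"unknown alarm state: {state}")
        current = health.get(component, "OK")
        if STATE_RANK[state] > STATE_RANK[current]:
            current = state
        health[component] = current
    return health


# Hypothetical sample data for three Monasca components.
sample = [
    ("API", "OK"),
    ("Persister", "OK"),
    ("Persister", "ALARM"),
    ("Agent", "UNDETERMINED"),
    ("Agent", "OK"),
]

summary = rollup_health(sample)
for component, state in sorted(summary.items()):
    print(f"{component:<10} {state}")
```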
Architectural Components
Off-the-shelf open source components
- Apache Kafka (message queue)
- MySQL (alarm, notifications database)
- InfluxDB (metrics, logging, events database)
- Apache Storm (real-time stream processor)
- Apache ZooKeeper (resource coordinator)
- Operating System
Monasca components
- API
- Agent
- Notification engine
- Threshold engine
- Persister
# | Alarm Definition Name | Category | Provider | Component | Subcomponent | Type (status, capacity, throughput, latency) | Measurement |
---|---|---|---|---|---|---|---|
1 | HTTP Status Alarm | System | Application | Monasca | API | Status | Up / Down |
2 | Host Alive Alarm | System | OS | Processor | Hardware | Status | Up / Down |
3 | Disk Usage | System | OS | Disk | Hardware | Capacity | Percentage |
4 | Disk Inode Usage | System | OS | Disk | Hardware | Capacity | Percentage |
5 | High CPU Usage | System | OS | Processor | Hardware | Capacity | Percentage |
6 | Network Errors | System | OS | Network | Hardware | Status | Count |
7 | Memory Usage | System | OS | Memory | Hardware | Capacity | Percentage |
8 | Kafka Consumer Lag | Monasca | Application | Message Queue | Consumer | Latency | Time |
9 | Monasca Agent emit time | Monasca | Application | Monasca | Agent | Latency | Time |
10 | Monasca Notification Configuration DB query time | Monasca | Application | Monasca | Notification | Latency | Time |
11 | Monasca Agent collection time | Monasca | Application | Monasca | Agent | Latency | Time |
12 | Zookeeper Average Latency | Monasca | Application | Resource Coordinator | ? | Latency | Time |
13 | Monasca Notification email time | Monasca | Application | Monasca | Notification | Latency | Time |
14 | Process not found | System | OS | Processor | Process | Status | Count |
15 | VM CPU usage | OpenStack | OS | Processor | Hardware | Capacity | Percentage |
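
To check how the alarm definitions in the table are spread across the four types named in the column header, the Type column can be tallied per row. The sketch below copies only the row numbers and types from the table; the names `ALARM_TYPE_BY_ROW` and `uncovered` are illustrative.

```python
from collections import Counter

# The four types named in the table header.
ALARM_TYPES = ("Status", "Capacity", "Throughput", "Latency")

# Row number -> Type, copied from the table.
ALARM_TYPE_BY_ROW = {
    1: "Status", 2: "Status", 3: "Capacity", 4: "Capacity", 5: "Capacity",
    6: "Status", 7: "Capacity", 8: "Latency", 9: "Latency", 10: "Latency",
    11: "Latency", 12: "Latency", 13: "Latency", 14: "Status", 15: "Capacity",
}

counts = Counter(ALARM_TYPE_BY_ROW.values())

# Types listed in the header that no row currently uses.
uncovered = [t for t in ALARM_TYPES if counts[t] == 0]

for alarm_type in ALARM_TYPES:
    print(f"{alarm_type:<10} {counts[alarm_type]}")
print("Types with no alarm definition:", uncovered)
```

The tally shows that none of the 15 rows is of type Throughput, so that dimension has no alarm definition in the table as it stands.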