The information on this page is outdated and no longer maintained; see http://monasca.io/ for current information.
Monasca is a monitoring system with many parts that can scale horizontally to service large cloud deployments. The Monasca components fall roughly into two categories: those that are part of the server cluster and those that interact only with the Monasca API. In the standard data flow, Monasca agents send measurements into the system, where they are processed by the threshold engine and stored for later retrieval and graphing.
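To make the data flow concrete, here is a minimal Python sketch of a single measurement being prepared for the Monasca API. It assumes the API accepts a POST to `/v2.0/metrics` with `name`, `dimensions`, `timestamp` (milliseconds) and `value` fields, authenticated with an `X-Auth-Token` header; the endpoint URL, token and metric name below are placeholders, so check them against the current API documentation. In practice the agent does this work for you.

```python
import json
import time
import urllib.request

# Placeholder values for illustration only; neither comes from this page.
API_URL = "http://monasca.example.com:8070/v2.0"  # hypothetical endpoint
TOKEN = "replace-with-keystone-token"             # hypothetical token


def build_metric_request(name, value, dimensions, api_url=API_URL, token=TOKEN):
    """Build (but do not send) a POST request carrying one measurement."""
    body = {
        "name": name,
        "dimensions": dimensions,
        "timestamp": int(time.time() * 1000),  # assumed to be milliseconds
        "value": value,
    }
    return urllib.request.Request(
        url=f"{api_url}/metrics",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


request = build_metric_request("cpu.user_perc", 42.0, {"hostname": "host-1"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it; that call is left out so the sketch has no side effects.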
The Monasca threshold engine evaluates metrics according to alarm definitions. As measurements come into the system, alarms are created according to how they match the alarm definitions. Each alarm definition can be associated with a notification method, which is triggered when the alarm changes state.
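As an illustration of what an alarm definition might look like, the following Python sketch builds a definition as a plain dictionary. The field names (`expression`, `match_by`, `alarm_actions`, `ok_actions`, `undetermined_actions`) reflect my understanding of the body expected by a POST to `/v2.0/alarm-definitions` and should be verified against the API specification; the metric name, threshold and notification ID are hypothetical.

```python
import json


def build_alarm_definition(name, expression, notification_ids,
                           severity="HIGH", match_by=None):
    """Return a dictionary describing one alarm definition.

    notification_ids are the IDs of previously created notification
    methods; here they fire on both the ALARM and OK transitions.
    """
    return {
        "name": name,
        "expression": expression,
        "severity": severity,
        "match_by": list(match_by or []),
        "alarm_actions": list(notification_ids),
        "ok_actions": list(notification_ids),
        "undetermined_actions": [],
    }


cpu_alarm = build_alarm_definition(
    name="High CPU usage",
    expression="avg(cpu.user_perc) > 90",            # hypothetical threshold
    notification_ids=["hypothetical-notification-id"],
    match_by=["hostname"],                           # one alarm per host
)
print(json.dumps(cpu_alarm, indent=2))
```

With `match_by` set to `hostname`, the intent is that each host whose measurements match the expression gets its own alarm, which fits the description above of alarms being created as measurements match a definition.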
Monasca is fully multi-tenant, so each project must configure the various agents and alarm definitions to drive the system.
Alarm Definition Configuration
An Ansible module to assist in alarm definition creation, and a role with many default alarms already defined, are both available at https://github.com/hpcloud-mon/ansible-monasca-default-alarms.
The Monasca agent is highly configurable; it can collect measurements from many sources and can be extended. For information on direct configuration, refer to the agent documentation.
An Ansible role for installing and configuring the agent is available at https://github.com/hpcloud-mon/ansible-monasca-agent. It includes an Ansible module (monasca_agent_plugin) for running specific monasca-setup detection plugins, as well as examples of how to add custom plugins with Ansible.
Server Installation and Configuration
The entire server stack with all of its components can be built and configured using Ansible. The team development environment does this on a small scale. The various roles have also been used in fully clustered deployments of Monasca.
Some teams have also configured Monasca via Puppet.
MoM - Monitoring of Monasca
Monasca itself needs to be monitored and is fully capable of monitoring itself. In non-production installations this can be done by having the agent running on the Monasca boxes report back to the Monasca API. For production installations I recommend that the agent running on the Monasca nodes report to another installation of Monasca, possibly a single VM 'mini-mon', which is itself monitored by the primary installation. This avoids dependency loops.
As components of Monasca are developed, metrics for monitoring each component need to be added, along with alarm definitions and, finally, default graphs for viewing the metrics. The most basic metrics are used in simple up/down alarms; more advanced metrics are used for thresholds and graphs that aid in failure prediction and capacity planning.
Here are the MoM alarms broken down by component. The exact alarms can be found at https://github.com/hpcloud-mon/ansible-monasca-default-alarms/blob/master/tasks/monasca.yml
| Component | Alarms |
|---|---|
| zookeeper | pid check, average latency, zookeeper connections_count |
| kafka | pid check, consumer lag |
| mysql | pid check, slow queries |
| notification | pid check, config db time, email time |
| thresh/storm | pid check of nimbus, supervisor and workers |
| Agent | emit time, collection time |
| Persister | For the Java version a healthcheck on the admin url |
| API | For the Java version a healthcheck on the admin url |
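The pid checks listed above are the simple up/down style of alarm. As a rough sketch of how such definitions could be generated, the Python below builds one expression per monitored process. It assumes the agent reports a `process.pid_count` metric with a `process_name` dimension; the process names are illustrative guesses and are not taken from the default-alarms role, which remains the authoritative source.

```python
# Processes to pid-check per component. The process names are illustrative
# guesses, not copied from the ansible-monasca-default-alarms role.
PID_CHECKED = {
    "zookeeper": ["zookeeper"],
    "kafka": ["kafka"],
    "mysql": ["mysqld"],
    "notification": ["monasca-notification"],
    "thresh/storm": ["nimbus", "supervisor", "worker"],
}


def pid_check_expression(process_name):
    """Alarm when no process with this name is running.

    Assumes a `process.pid_count` metric with a `process_name` dimension.
    """
    return f"process.pid_count{{process_name={process_name}}} < 1"


def pid_check_definitions(components=PID_CHECKED):
    """Build one up/down alarm definition per monitored process."""
    definitions = []
    for component, processes in components.items():
        for process in processes:
            definitions.append({
                "name": f"{component}: {process} process down",
                "expression": pid_check_expression(process),
                "severity": "HIGH",
            })
    return definitions


for definition in pid_check_definitions():
    print(definition["name"], "->", definition["expression"])
```

Threshold-style alarms such as consumer lag or average latency would follow the same pattern, with a different metric and a limit chosen for the deployment.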