Difference between revisions of "Distributed Monitoring"

Revision as of 07:37, 25 September 2017

Distributed Monitoring

Overview

Monitoring and its applications are becoming key factors in service lifecycle management for systems such as NFV and cloud-native platforms. The distributed monitoring approach is a framework that enables flexible, scalable monitoring and works with the current OpenStack telemetry and monitoring frameworks. This documentation describes the architecture and how to implement the framework in your environment.

Who will be interested?

  • Infrastructure operators who want to collect detailed data at short intervals on their compute nodes.
  • NFV operators who want to detect abnormal behavior of Virtual Network Functions.

Architecture

This architecture includes several functions: monitoring, a collector, an in-memory database, an analysis engine and notification. The picture below shows the architecture of distributed monitoring. Each compute node has its own monitoring function following this architecture.

Function

  • The Poller/Notification process collects data from guest OSs and the host OS using SNMP, the libvirt API, the OpenStack API, etc.
  • The Collector formats the data for the in-memory database and inserts it into the database.
  • The Analytics Engine analyzes the data in the in-memory database. You can use various analytics engines, such as machine learning libraries, or use an evaluator that performs threshold monitoring directly.
  • The Transmitter sends analytics results and threshold alarms to the Operation Support System, the OpenStack API and the Orchestrator.

Feature

Short interval, scalable, MQ-independent (does not depend on a message queue)
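To illustrate why the short interval matters, the following sketch compares per-second samples with a one-minute average of the same data. The utilisation figures, the 80% threshold and the helper name window_averages are invented for this example.

```python
def window_averages(samples, window):
    """Average consecutive, non-overlapping windows of `window` samples."""
    return [
        sum(samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]


# Hypothetical link utilisation in percent, one sample per second for 60 s.
per_second = [10.0] * 60
per_second[30] = 100.0  # a one-second burst

per_minute = window_averages(per_second, 60)  # what a 60 s poller would see

threshold = 80.0
burst_seen_at_1s = any(v > threshold for v in per_second)
burst_seen_at_60s = any(v > threshold for v in per_minute)
```

With these numbers the one-minute average is 11.5%, so the burst is visible only in the per-second data.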

Use Cases

  • Micro Burst Traffic
  • Memory Leak
  • Abnormal behaviour of software/hardware
  • ...
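As one possible way to approach the memory leak case, an evaluator could fit a straight line to recent memory usage samples and flag a steadily rising trend. The sketch below uses a plain least-squares slope; the function names and the min_slope parameter are assumptions for illustration, and a real evaluator would need tuning to the workload.

```python
def slope(values):
    """Least-squares slope of values against their index (units per sample)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def looks_like_leak(values, min_slope):
    """Flag a window whose memory usage grows by at least min_slope per sample."""
    return slope(values) >= min_slope
```

For example, usage samples of 100, 102, 104 and 106 MB give a slope of 2 MB per sample, while a series that only oscillates around 100 MB gives a slope close to zero.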

Implementation

Several open source software tools are useful for distributed monitoring. ...

Tool(OSS?)

collectd

collectd is a powerful tool for collecting time-based data... collectd also has several plugins, and you can use its threshold, notification and python plugins.

collectd's official page: https://collectd.org/wiki/index.php

redis

redis is a lightweight in-memory database...

redis's official page: https://redis.io/

scikit-learn

On this page, scikit-learn is recommended as a lightweight machine learning library for the analytics engine.

scikit-learn's official page: http://scikit-learn.org/stable/
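As a rough idea of what an analytics engine built on scikit-learn might look like, the sketch below trains an IsolationForest on CPU utilisation samples and labels new values. The choice of IsolationForest, the synthetic data and the contamination value are assumptions for this example; the page itself does not prescribe a particular model.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic training data (assumption): CPU utilisation around 20%,
# plus one abnormal sample at 95%.
rng = np.random.default_rng(0)
normal_cpu = rng.normal(loc=20.0, scale=2.0, size=(200, 1))
training = np.vstack([normal_cpu, [[95.0]]])

# contamination is the expected share of abnormal samples in the data.
model = IsolationForest(contamination=0.01, random_state=0)
model.fit(training)

# predict() returns 1 for inliers and -1 for outliers.
labels = model.predict(np.array([[20.0], [95.0]]))
```

In the architecture above, the training window would come from the in-memory database, and samples labelled -1 would be handed to the Transmitter.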

Setup

On compute nodes, controller nodes and any other nodes that you want to monitor, you can follow the setup example below. In this example, Ubuntu 16.04 is used as the OS on each node.

  1. Set up an OpenStack environment using DevStack or a manual installation.
  2. Install collectd, redis and some other Python libraries.
  3. Get the demo code of DMA.

# apt install collectd redis python-pip python-dev

collectd

collectd.conf

 <Plugin "write_redis">
   <Node "dma">
       Host "localhost"
       Port "6379"
       Timeout 1000
   </Node>
 </Plugin>
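Once collectd is writing to redis, an analytics engine needs to read the values back. The sketch below assumes the storage layout described in the collectd write_redis documentation: one sorted set per metric under a "collectd/" key prefix, scored by timestamp, with members of the form "time:value[:value...]". Verify this against your collectd version. The function names and the example key are hypothetical, and client is any object with a zrange method, such as a redis-py client.

```python
def parse_member(member):
    """Split a 'time:value[:value...]' member into (timestamp, [values])."""
    if isinstance(member, bytes):
        member = member.decode("utf-8")
    timestamp, *values = member.split(":")
    return float(timestamp), [float(v) for v in values]


def latest_sample(client, key):
    """Return the newest (timestamp, values) stored under `key`, or None."""
    # Index -1 is the member with the highest score, i.e. the newest timestamp.
    members = client.zrange(key, -1, -1)
    return parse_member(members[0]) if members else None
```

With the redis-py package installed, a client for the configuration above could be created with redis.Redis(host="localhost", port=6379) and passed in as client.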

redis

redis.conf

#save 300 10
#save 60 10000

References

Who is contributing to this guide?