Difference between revisions of "Rubick"

Revision as of 21:15, 19 November 2013

Project Name

Codename: Rubick

Overview

The typical OpenStack cloud life cycle consists of 2 phases:

  • initial deployment and
  • operation maintenance


OpenStack cloud operators usually rely on deployment tools to configure all the platform components correctly and efficiently during the initial deployment phase. Multiple OpenStack projects cover that area: TripleO/Tuskar, Fuel and DevStack, to name a few.

However, once the cloud has been installed and started, platform configurations and operational conditions begin to change. These changes can break the consistency and integration of cloud platform components. Keeping the cloud up and running is the essence of the operation maintenance phase.

A cloud operator must quickly and efficiently identify the root cause of such failures and respond to it. To do so, the operator must check whether the OpenStack configuration is sane and consistent. These checks can be thought of as the rules of a diagnostic system.
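To illustrate the idea of checks as rules, here is a minimal sketch in Python. It is not Rubick's actual rule engine (see Rubick/Rules engine for that); the Rule class, the run_rules helper, the auth_url option and the example host are all hypothetical names chosen for illustration. Each rule is a named predicate over a flat dictionary of configuration options.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: a configuration is modelled as a flat option -> value mapping.
Config = Dict[str, str]


@dataclass
class Rule:
    """A hypothetical diagnostic rule: a name plus a predicate."""
    name: str
    check: Callable[[Config], bool]  # returns True when the config passes


def run_rules(rules: List[Rule], config: Config) -> List[str]:
    """Return the names of the rules that the configuration violates."""
    return [rule.name for rule in rules if not rule.check(config)]


# Illustrative rules only; "auth_url" is an assumed option name.
RULES = [
    Rule("auth_url is set", lambda c: bool(c.get("auth_url"))),
    Rule("auth_url uses https",
         lambda c: c.get("auth_url", "").startswith("https://")),
]

failed = run_rules(RULES, {"auth_url": "http://keystone.example:5000/v2.0"})
print(failed)
```

Keeping rules as plain data (a name and a function) means a rule set can be stored, extended and re-run against later configuration changes without modifying the code that evaluates it.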

Few projects in the OpenStack ecosystem aim to increase the reliability and resilience of the cloud at the operation stage. With this proposal we want to introduce a project that will help operators diagnose their OpenStack platform, reduce response time to known and unknown failures, and effectively support the desired SLA.

Mission

Diagnostics' mission is to provide OpenStack cloud operators with tools that minimize the time and effort needed to identify and fix errors in the operation maintenance phase of the cloud life cycle.

Use Cases

More on use cases: Rubick/OpenStack Integration

  1. Stand-alone tool for validating the configuration of an individual service instance for consistency with other OpenStack services. For example, if we want to start nova-compute with the VMware driver, Nova will talk to Rubick during initialization, send in its configuration file and request validation of it. If the configuration is inconsistent with the configurations of other services (e.g. the Keystone or Glance endpoints are not correct), Nova startup should fail with a message.
  2. Validation of configurations of the whole OpenStack platform as a pre- or post-deployment action. For example, TripleO could include the validation as a final step pri
  3. Diagnostic API to increase the debuggability of OpenStack. For example, whitebox testing in Tempest could be simplified if the Diagnostic API allowed a test class to track the actual state of resources after it performs a Nova API request.
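
The first use case, checking one service's configuration against what other services expect, can be sketched roughly as follows. This is only an illustration using Python's standard configparser module, not Rubick's implementation or API: the helper functions, the option names image_endpoint and identity_endpoint, the [endpoints] section and the addresses are all assumptions made for the example.

```python
import configparser


def load_section(text, section):
    """Parse INI-style text and return one section as a plain dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser[section])


def find_mismatches(service_options, reference_options, keys):
    """Return {key: (service_value, reference_value)} for keys that differ."""
    mismatches = {}
    for key in keys:
        ours = service_options.get(key)
        theirs = reference_options.get(key)
        if ours != theirs:
            mismatches[key] = (ours, theirs)
    return mismatches


# Hypothetical configuration submitted by a service at startup.
SERVICE_CONF = """
[DEFAULT]
image_endpoint = http://10.0.0.5:9292
identity_endpoint = http://10.0.0.2:5000
"""

# Hypothetical record of the endpoints the other services actually expose.
KNOWN_ENDPOINTS = """
[endpoints]
image_endpoint = http://10.0.0.6:9292
identity_endpoint = http://10.0.0.2:5000
"""

mismatches = find_mismatches(
    load_section(SERVICE_CONF, "DEFAULT"),
    load_section(KNOWN_ENDPOINTS, "endpoints"),
    ["image_endpoint", "identity_endpoint"],
)
for key, (ours, theirs) in mismatches.items():
    print(f"{key}: service has {ours}, expected {theirs}")
```

In the scenario described above, a non-empty result is the point at which the service startup would be aborted with a message naming the inconsistent options.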

Design

Service architecture: Rubick/Service architecture

Links

  1. Source code in Stackforge on GitHub: https://github.com/stackforge/rubick
  2. Patches in review in Gerrit: https://review.openstack.org/#/q/status:open+project:stackforge/rubick,n,z
  3. Launchpad project: https://launchpad.net/Rubick