Revision as of 13:29, 1 December 2014


What is Rally?

If you are here, you are probably familiar with OpenStack, and you also know that it is a really huge ecosystem of cooperating services. When something fails, performs slowly or doesn't scale, it is really hard to answer the questions of what happened, why it happened and where. Another reason you could be here is that you would like to build an OpenStack CI/CD system that will allow you to continuously improve the SLA, performance and stability of OpenStack.

The OpenStack QA team mostly works on CI/CD that ensures that new patches don't break a specific single-node installation of OpenStack. On the other hand, it is clear that such CI/CD is only an indication and does not cover all cases (e.g. if a cloud works well on a single-node installation, it doesn't mean that it will continue to do so on a 1k-server installation under high load). Rally aims to fix this and help us answer the question "How does OpenStack work at scale?". To make this possible, we are going to automate and unify all the steps required for benchmarking OpenStack at scale: multi-node OS deployment, verification, benchmarking & profiling.

Rally-Actions.png
  • Deploy engine is not yet another OpenStack deployer, but a pluggable mechanism that unifies & simplifies work with different deployers such as DevStack, Fuel and Anvil on the hardware/VMs that you have.
  • Verification - (work in progress) uses Tempest to verify the functionality of a deployed OpenStack cloud. In the future, Rally will support other OS verifiers.
  • Benchmark engine - allows you to create parameterized load on the cloud based on a big repository of benchmarks.

For more information about how it works, take a look at Rally Architecture.


Use Cases

Before diving deep into Rally's architecture, let's take a look at three major high-level Rally use cases:

Rally-UseCases.png


Typical cases where Rally aims to help are:

  1. Automate measuring & profiling focused on how new code changes affect OS performance;
  2. Use the Rally profiler to detect scaling & performance issues;
  3. Investigate how different deployments affect OS performance:
    • Find the set of suitable OpenStack deployment architectures;
    • Create deployment specifications for different loads (number of controllers, Swift nodes, etc.);
  4. Automate the search for the hardware best suited for a particular OpenStack cloud;
  5. Automate production cloud specification generation:
    • Determine terminal loads for basic cloud operations: VM start & stop, block device create/destroy & various OpenStack API methods;
    • Check the performance of basic cloud operations under different loads.


Architecture

OpenStack projects are usually provided as-a-Service, so Rally supports this approach as well as a CLI-driven approach that does not require a daemon:

  1. Rally as-a-Service: run Rally as a set of daemons that present a Web UI (work in progress), so one RaaS instance can be used by the whole team.
  2. Rally as-an-App: Rally as just a lightweight CLI app (without any daemons), which makes it simple to develop & much more portable.


How is this possible? Take a look at the diagram below:

Rally Architecture.png

So what is behind Rally?


Rally Components

Rally consists of 4 main components:

  1. Server Providers - provide servers (virtual servers), with SSH access, in one L3 network.
  2. Deploy Engines - deploy an OpenStack cloud on the servers presented by Server Providers.
  3. Verification - component that runs Tempest (or another specific set of tests) against a deployed cloud, collects the results & presents them in a human-readable form.
  4. Benchmark engine - allows you to write parameterized benchmark scenarios & run them against the cloud.


But why does Rally need these components?
It becomes really clear if we try to imagine how we would benchmark a cloud at scale if ...

Rally QA.png


TO BE CONTINUED

Rally in action

How amqp_rpc_single_reply_queue affects performance

To show Rally's capabilities and potential, we used the NovaServers.boot_and_destroy scenario to see how the amqp_rpc_single_reply_queue option affects VM boot-up time. Some time ago it was shown that cloud performance can be boosted by turning this option on, so naturally we decided to check this result. To run this test, we issued requests for booting up and deleting VMs for different numbers of concurrent users, ranging from 1 to 30, with and without this option set. For each group of users, a total of 200 requests was issued. The averaged time per request is shown below:

amqp_rpc_single_reply_queue
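As a rough illustration of the post-processing behind such a plot, here is a minimal sketch that averages request durations per concurrency level. The variable names and all the numbers are invented for illustration; this is not Rally's actual code.

```python
# Hypothetical post-processing sketch: average the duration of the
# requests issued at each concurrency level. Numbers are made up.
raw_durations = {  # concurrent users -> per-request durations (seconds)
    1: [2.0, 2.2, 2.4],
    5: [2.9, 3.1],
    30: [5.5, 6.0, 5.9],
}

# One averaged data point per concurrency level, as plotted above.
averages = {
    users: sum(times) / len(times)
    for users, times in raw_durations.items()
}

for users in sorted(averages):
    print(users, round(averages[users], 2))
```

In the real run, each concurrency level would contribute 200 measured durations rather than a handful of invented ones.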

So apparently this option affects cloud performance, but not in the way it was thought before.


Performance of Nova instance list command

Context: 1 OpenStack user

Scenario: 1) boot a VM as this user 2) list the VMs

Runner: Repeat 200 times.

As a result, on every next iteration the user has more and more VMs, and the performance of the VM list operation degrades quite quickly:

nova vm list performance
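The degradation is easy to reproduce in miniature. Here is a toy sketch (not Rally code; FakeNova is a made-up stand-in for the real client) of why each iteration gets slower:

```python
# Toy model of the scenario above: every iteration boots one more VM,
# so the "list" call has to return an ever-growing collection. With a
# real API, building, serializing and transferring that collection is
# what makes each iteration slower than the last.
class FakeNova:
    def __init__(self):
        self._servers = []

    def boot(self, name):
        self._servers.append(name)

    def list(self):
        # Cost grows with the number of servers already booted.
        return list(self._servers)

nova = FakeNova()
listed_sizes = []
for i in range(200):
    nova.boot("vm-%d" % i)
    listed_sizes.append(len(nova.list()))

print(listed_sizes[0], listed_sizes[-1])  # 1 200
```

By iteration 200 the list call is handling 200 times as much data as on the first iteration, which matches the degradation curve Rally measured.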

Complex scenarios & detailed information

For example, NovaServers.snapshot contains a lot of "atomic" actions:

  1. boot VM
  2. snapshot VM
  3. delete VM
  4. boot VM from snapshot
  5. delete VM
  6. delete snapshot

Fortunately, Rally collects information about the duration of all these operations for every iteration.

As a result, we can generate beautiful graphs:

snapshot detailed performance
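The per-action timing behind such graphs can be pictured with a small sketch. This is a hypothetical helper, not Rally's actual implementation; the function and action names are made up, and sleeps stand in for real API calls.

```python
import time
from contextlib import contextmanager

# Hypothetical timing helper, loosely modeled on the idea of atomic
# actions: each named step records how long it took.
durations = {}

@contextmanager
def atomic_action(name):
    start = time.monotonic()
    try:
        yield
    finally:
        durations[name] = time.monotonic() - start

# One iteration of a snapshot-style scenario.
for action in ("boot_vm", "snapshot_vm", "delete_vm",
               "boot_vm_from_snapshot", "delete_vm_2", "delete_snapshot"):
    with atomic_action(action):
        time.sleep(0.001)  # placeholder for a real API call

print(sorted(durations))
```

Collecting one such dictionary per iteration is all that is needed to plot a per-action breakdown over the whole run.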

How To

Actually, there are only a few steps that should be interesting for you:

  1. Install Rally
  2. Use Rally
  3. Add Rally performance jobs to your project
  4. Main concepts of Rally
  5. Improve Rally
    1. Main directions of work
    2. Where to begin
    3. How to contribute

Updates

Periodically, we write up on a special updates page what sort of things have been accomplished in Rally recently and what our plans are for the future. Below you can find the most recent report (December 1, 2014).

It's been a while since our last post here, and we've done quite a nice job in Rally during November. Let us share with you new things about Rally:

  • Autogenerated HTML benchmark reports in Rally (which can be created by the "rally task report" command after a benchmark task has completed) have been further improved within the last month. As of now, the report page contains an overview table, detailed information about whether SLA (service-level agreement) checks were successful, and also detailed error logs, if any. Rally reports have become a wonderful tool for analysing benchmarking data as well as for sharing your results with others!
  • Similar improvements have been made for HTML reports generated for the Tempest cloud verification ("rally verify results --html --output_file <file>"). New enhanced report pages have improved styling and refactored JS code.
  • We have changed the way context classes in Rally are declared: the newly introduced @context decorator makes them much easier to write and more readable.
  • There is a new "servers" context that allows you to create temporary servers before benchmark scenarios start and use these servers for testing inside these scenarios.
  • New benchmark scenarios in Rally include one for Nova live migration and also a Cinder stress scenario.
  • Command-line interface improvements include the ability to refer to deployments not only by UUID but also by name. Please note that the syntax has changed a bit: you now have to supply the --deployment parameter to commands like "rally use deployment" (instead of --uuid).
  • There has been some major refactoring of the most critical parts of Rally code: the cleanup mechanism and the "users" context code. We are sure that after refactoring, this code has become both cleaner and less error-prone (as well as very pluggable in the case of cleanups).


Current work includes further code refactoring (e.g. in the Benchmark engine part), further CLI improvements (e.g. for the "rally task list" command) and new benchmark scenarios (e.g. for Murano). We are also going to make it possible to build Rally images for Docker.

We encourage you to take a look at new patches in Rally pending for review and to help us make Rally better!

The source code for Rally is hosted on GitHub: https://github.com/stackforge/rally
You can track the overall progress in Rally via Stackalytics: http://stackalytics.com/?release=kilo&metric=commits&project_type=all&module=rally
Open reviews for Rally: https://review.openstack.org/#/q/status:open+rally,n,z


Stay tuned!


Regards,
The Rally team


Weekly Updates Archives

Rally in the World

Date | Authors | Title | Location
29/May/2014 | Andrey Kurilin | Rally: OpenStack Tempest Testing Made Simple(r) | https://www.mirantis.com/blog
01/May/2014 | Boden Russell | KVM and Docker LXC Benchmarking with OpenStack | http://bodenr.blogspot.ru/
01/Mar/2014 | C.B. Ananth (cbpadman at cisco.com), Rahul Upadhyaya (rahuupad at cisco.com) | Benchmark as a Service OpenStack-Rally | OpenStack Meetup Bangalore
28/Feb/2014 | Peeyush Gupta | Benchmarking OpenStack With Rally | http://www.thegeekyway.com/
26/Feb/2014 | Oleg Gelbukh | Benchmarking OpenStack at megascale: How we tested Mirantis OpenStack at SoftLayer | http://www.mirantis.com/blog/
07/Nov/2013 | Boris Pavlovic | Benchmark OpenStack at Scale | OpenStack summit Hong Kong

Project Info

Useful links

NOTE: To become a member of the Trello board, please write an email to (boris at pavlovic.me) or ping boris-42 in IRC.

How to track project status?

The main directions of work in Rally are documented via blueprints. The most high-level ones are *-base blueprints, while more specific tasks are defined in derived blueprints (for an example of such a dependency tree, see the base blueprint for Benchmarks). Each “base” blueprint description contains a link to a Google doc with detailed information about its contents.

While each blueprint has an assignee, single patchsets that implement it may be owned by different developers. We use a Trello board to track the distribution of tasks among developers. The tasks are structured there both by labels (corresponding to top-level blueprints) and by their completion progress. Please note the Good for start category containing very simple tasks, which can serve as a perfect introduction to Rally for newcomers.

Where can I discuss & propose changes?

  1. Our IRC channel: #openstack-rally on irc.freenode.net;
  2. Weekly Rally team meeting: held on Tuesdays at 1700 UTC in IRC, in the #openstack-meeting channel (irc.freenode.net);
  3. OpenStack mailing list: openstack-dev@lists.openstack.org (see subscription and usage instructions);
  4. Rally team on Launchpad: Answers/Bugs/Blueprints.