
GSoC2014/Rally/BenchmarksVirtualMachinesOpenStack

== Introduction ==

The ability to benchmark a Virtual Machine is an important activity that more and more developers will need to perform as they host their SaaS applications in a cloud. The aim of this project is to integrate into the [[Rally]] project the ability to run, easily and in an automated manner, various benchmarks that measure the performance of the Virtual Machines deployed in an OpenStack cloud.
  
== Description ==
The project can be divided into two parts. The first part is the development of an architecture (a small framework) that defines a standard and easy way of porting different benchmarks to Rally; the second part uses this framework to port existing popular benchmarks that measure the performance of different aspects of a computer system.
  
For the first part, a new benchmark context (benchmark_image) has been developed that generates an image with all the programs required to run the specified benchmark already installed. The context takes an image, a flavor and some other necessary information from the task configuration file of the benchmark scenario and boots a virtual machine. Then, using the already created users with their keypairs and security groups, it gains access to the virtual machine over SSH and executes the setup script of the specified benchmark. The setup script is a Bash script that installs the benchmark (and its dependencies) in the virtual machine. Finally, the context takes a snapshot of that virtual machine and returns the name of the newly created benchmark-ready image. The benchmark scenario then boots its virtual machine from the image returned by the context, gains access to it over SSH, and executes the run script of the specified benchmark. The run script is a Python script that executes the benchmark and returns its results in JSON format.
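
To make the workflow concrete, here is a minimal sketch of what such a context does, written against the plain python-novaclient and paramiko APIs rather than Rally's internal helpers; the function name, arguments and clean-up details are illustrative assumptions and not the code of the actual patch.

<source lang="python">
# Simplified illustration of the benchmark_image context workflow.
# This is NOT the actual Rally code; names and details are assumptions.
import time

import paramiko


def build_benchmark_image(nova, base_image, flavor, keypair, key_file,
                          ssh_user, setup_script, benchmark):
    """Boot a VM, install a benchmark in it and snapshot it.

    `nova` is an authenticated python-novaclient client.
    """
    # 1. Boot a throwaway VM from the base image named in the task file.
    server = nova.servers.create(name="benchmark-builder", image=base_image,
                                 flavor=flavor, key_name=keypair)
    while nova.servers.get(server.id).status != "ACTIVE":
        time.sleep(5)
    ip = list(nova.servers.get(server.id).networks.values())[0][0]

    # 2. SSH into the VM with the user's keypair and run the Bash setup
    #    script that installs the benchmark and its dependencies.
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(ip, username=ssh_user, key_filename=key_file)
    sftp = ssh.open_sftp()
    sftp.put(setup_script, "/tmp/setup_%s.sh" % benchmark)
    _, stdout, _ = ssh.exec_command("bash /tmp/setup_%s.sh" % benchmark)
    stdout.channel.recv_exit_status()  # wait for the installation to finish
    ssh.close()

    # 3. Snapshot the prepared VM and return the name of the new image,
    #    which the benchmark scenario will use to boot its own VMs.
    image_name = "benchmark-image-%s" % benchmark
    server.create_image(image_name)
    return image_name
</source>
The real context additionally reuses the users, keypairs and security groups that Rally has already created for the deployment, as described above.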
  
For the second part, thanks to the architecture defined in the first part, porting a benchmark only requires developing the setup script that installs it and the run script that executes it and returns the result in JSON format. This is done for every benchmark that needs to be ported to Rally so that it can be executed in a virtual machine of an OpenStack cloud.
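
As an illustration of how small such a pair of scripts can be, a run script for Blogbench could look roughly like the sketch below; the command-line flag and the output parsing are assumptions made for this example, and the real script is the one in the patch listed under Source Code.

<source lang="python">
# Illustrative run script: execute Blogbench and print its results as JSON.
# The blogbench flag and the output parsing are assumptions, not the
# reviewed code.
import json
import os
import re
import subprocess
import sys


def main():
    # Run the benchmark against a scratch directory inside the VM.
    workdir = "/tmp/blogbench"
    if not os.path.isdir(workdir):
        os.makedirs(workdir)
    output = subprocess.check_output(["blogbench", "-d", workdir],
                                     stderr=subprocess.STDOUT)
    output = output.decode("utf-8", "replace")

    # Blogbench prints two final scores, one for reads and one for writes.
    reads = re.search(r"Final score for reads:\s*(\d+)", output)
    writes = re.search(r"Final score for writes:\s*(\d+)", output)
    if not (reads and writes):
        sys.exit("Could not parse the Blogbench output")

    # The scenario reads this JSON from stdout and stores it in Rally's
    # database as the scenario-specific result.
    print(json.dumps({"blogbench.reads_score": int(reads.group(1)),
                      "blogbench.writes_score": int(writes.group(1))}))


if __name__ == "__main__":
    main()
</source>
Keeping all the execution logic in this single file is what allows new benchmarks to be added without polluting Rally's own codebase with benchmark-specific functions.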
== Source Code ==
  
This section lists all the code that was written during the official GSoC period for the development of this project.
  
=== Under Review ===
* Add the benchmark 'Blogbench' for the Virtual Machines (https://review.openstack.org/#/c/97030/)
* Add the VM scenario 'boot_benchmark_delete' (https://review.openstack.org/#/c/98172/)
* Add the context 'benchmark_image' (https://review.openstack.org/#/c/104564/)
  
=== Merged ===
* Modify install_rally.sh to install under BSDs (https://review.openstack.org/#/c/95341/)
* Add error handling to install_rally.sh (https://review.openstack.org/#/c/98399/)
* Change the default value of scenario_output (https://review.openstack.org/#/c/104180/)
* Change the default value of use_public_urls (https://review.openstack.org/#/c/104924/)
* Modify config semantic validation of benchmark engine (https://review.openstack.org/#/c/112981/)
* Add required_contexts validator (https://review.openstack.org/#/c/111603/)
* Decrease jobs time in gates (https://review.openstack.org/#/c/114839/)
* Fix semantic validation of context images (https://review.openstack.org/#/c/113904/)
  
== Example: Blogbench ==
This example demonstrates the execution of the Blogbench benchmark in a virtual machine of an OpenStack cloud (version: Icehouse) using Rally. Below you can find the output of the whole procedure, once with the logging level set to warning and once with it set to debug.
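
For reference, the task file driving such a run could look roughly like the following sketch, built here as a Python dict and dumped to JSON. The scenario (boot_benchmark_delete) and context (benchmark_image) names come from the patches listed under Source Code; the VMTasks class name, the argument keys and their values are illustrative assumptions.

<source lang="python">
# Hypothetical task file for a single Blogbench run; keys and values are
# assumptions made for illustration, not the exact ones in the patches.
import json

task = {
    "VMTasks.boot_benchmark_delete": [
        {
            "args": {
                "flavor": {"name": "m1.small"},
                "benchmark": "blogbench"
            },
            "runner": {"type": "constant", "times": 1, "concurrency": 1},
            "context": {
                "users": {"tenants": 1, "users_per_tenant": 1},
                "benchmark_image": {
                    "image": {"name": "ubuntu-14.04"},
                    "flavor": {"name": "m1.small"},
                    "benchmark": "blogbench"
                }
            }
        }
    ]
}

with open("blogbench.json", "w") as task_file:
    json.dump(task, task_file, indent=4)
</source>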
  
[http://aetos.it.teithe.gr/~tzabal/files/rally/blogbench-log-warning.txt blogbench-warning]
 
[http://aetos.it.teithe.gr/~tzabal/files/rally/blogbench-log-debug.txt blogbench-debug]
  
At the end of both outputs there is a table called "Scenario Specific Results" that shows the results of the benchmark that ran in the virtual machine(s). In this example the benchmark was set to be executed only once in a single virtual machine, so the maximum, average, minimum, 90th percentile and 95th percentile all share the same value. The Blogbench benchmark outputs a final score for reads from the disk and a final score for writes to the disk, accumulated during the 5 minutes the benchmark runs (the bigger the number, the better the score).
  
Note: the `rally-tool -r` command that is executed at the beginning is not part of Rally but a helper script that I used during development to test my work faster. What it actually does is automate (with hard-coded values) the steps needed to run any Rally benchmark scenario on my test environment. You may find it [https://github.com/tzabal/rally-tool here].
 
== Links ==
 
* Blogbench, http://www.pureftpd.org/project/blogbench
