
Difference between revisions of "Rally/RoadMap"

  
= Benchmarking Engine =

=== Add support of Users & Tenants out of the box ===

At the moment we support the following 3 parameters:
# timeout - the timeout of one scenario loop
# times - how many loops of the scenario to run
# concurrent - how many loops should run simultaneously
All tests are run as a single user, so the load does not reflect real situations.

We are going to add two new parameters:
# tenants - how many tenants to create
# users_pro_tenant - how many users should be in each tenant

The Benchmark Engine will create all tenants & users and prepare the OpenStack Python clients before the benchmarking starts.
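To make the parameter set concrete, here is a hypothetical configuration sketch; the overall layout and the scenario name are assumptions for illustration, while the five parameter names come from the list above:

```python
# Hypothetical benchmark configuration combining the three existing
# parameters with the two proposed ones (the layout is a sketch, not
# the final Rally config format).
config = {
    "NovaServers.boot_and_destroy": [{
        "timeout": 600,         # timeout of one scenario loop, in seconds
        "times": 100,           # how many loops of the scenario to run
        "concurrent": 10,       # how many loops run simultaneously
        "tenants": 2,           # proposed: how many tenants to create
        "users_pro_tenant": 5,  # proposed: how many users in each tenant
    }]
}

args = config["NovaServers.boot_and_destroy"][0]
# With 2 tenants and 5 users per tenant the engine would prepare
# 10 users (and their clients) before the benchmark starts.
total_users = args["tenants"] * args["users_pro_tenant"]
```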
=== Add generic cleanup mechanism ===

In benchmarks we create a lot of different resources: tenants, users, VMs, snapshots, block devices.

If something goes wrong, or a test is not well written, we end up with a lot of leftover resources that could influence the next benchmarks. So we should clean up our OpenStack.

Such a generic cleanup can be implemented easily, because we create all resources using the tenants and users set up before the benchmark scenario runs.

We need to make only 2 steps:
# Purge the resources of each user
# Destroy all users & tenants
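A minimal sketch of these two steps, with hypothetical callables standing in for the real OpenStack client calls (the helper names are made up for illustration):

```python
# Sketch of the generic cleanup: first purge every user's resources,
# then destroy all users & tenants. The three callables are stand-ins
# for the real keystone/nova/cinder client calls.
def cleanup_benchmark_users(tenants, purge_user_resources,
                            delete_user, delete_tenant):
    """`tenants` maps tenant name -> list of user names."""
    # Step 1: purge resources (VMs, snapshots, block devices) per user.
    for tenant, users in tenants.items():
        for user in users:
            purge_user_resources(user)
    # Step 2: destroy all users & tenants.
    for tenant, users in tenants.items():
        for user in users:
            delete_user(user)
        delete_tenant(tenant)
```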
 
 
 
 
 
=== Run multiple scenarios simultaneously ===

Okay, we now know how to put load on Nova: the Boot & Destroy VM scenario.

But how will a huge load from another scenario, e.g. Create & Destroy Block Device, influence the Nova scenario?

This could also be easily extended. For example, we could use a special name for such composed benchmarks:
 
<pre>
benchmark: {
    "@composed": {
        "NovaServers.boot_and_destroy": [...],
        ...
    }
}
</pre>
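One way such a composed benchmark could run its sub-scenarios at the same time is sketched below; the thread-based approach and the function names are assumptions, not the actual Rally implementation:

```python
# Rough sketch: run each sub-scenario of a "@composed" benchmark in its
# own thread so that their loads overlap, then wait for all of them.
import threading

def run_composed(scenarios):
    """Run every scenario callable simultaneously and wait for all."""
    threads = [threading.Thread(target=scenario) for scenario in scenarios]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

results = []
run_composed([lambda: results.append("boot_and_destroy"),
              lambda: results.append("create_and_delete_volume")])
```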
 
 
 
=== More scenarios ===

More benchmark scenarios - for different parts of the OpenStack API - are to be implemented:
 
 
 
====Nova====

Assuming that we have prepared a server (and possibly also some floating IPs) in init(), which will then be deleted in cleanup(), like:

    def init(cls, config):
        context = {'prepared_server': cls.nova.servers.create(...)}
        ...
        return context

    ...

    def cleanup(cls, context):
        context['prepared_server'].delete()
        ...

- the following benchmark scenarios will be implemented in the near future:
 
 
 
'''rebooting server'''
    def reboot_server(cls, context):
        context['prepared_server'].reboot()

'''suspending/resuming server'''
    def suspend_and_resume_server(cls, context):
        context['prepared_server'].suspend()
        context['prepared_server'].resume()

'''shelving/unshelving server'''
    def shelve_and_unshelve_server(cls, context):
        context['prepared_server'].shelve()
        context['prepared_server'].unshelve()

'''associating floating ips'''
    def associate_floating_ips(cls, context):
        for floating_ip in context['prepared_floating_ips']:
            context['prepared_server'].add_floating_ip(floating_ip)
        for floating_ip in context['prepared_floating_ips']:
            context['prepared_server'].remove_floating_ip(floating_ip)
 
 
 
 
 
====Cinder====

Assuming that we have prepared a server and a volume in init(), which will then be deleted in cleanup(), like:

    def init(cls, config):
        context = {'prepared_server': cls.nova.servers.create(...),
                   'prepared_volume': cls.cinder.volumes.create(...)}
        ...
        return context

    ...

    def cleanup(cls, context):
        context['prepared_server'].delete()
        context['prepared_volume'].delete()
        ...

- the following benchmark scenarios will be implemented in the near future:
 
 
 
'''create/delete volume'''
    def create_and_delete_volume(cls, context):
        volume = cls.cinder.volumes.create(...)
        volume.delete()

'''attach/detach volume'''
    def attach_and_detach_volume(cls, context):
        context['prepared_volume'].attach(context['prepared_server'], ...)
        context['prepared_volume'].detach()

'''reserve/unreserve volume'''
    def reserve_and_unreserve_volume(cls, context):
        context['prepared_volume'].reserve()
        context['prepared_volume'].unreserve()

'''set/delete metadata'''
    def set_and_delete_metadata(cls, context):
        cls.cinder.volumes.set_metadata(context['prepared_volume'], fake_meta)
        cls.cinder.volumes.delete_metadata(context['prepared_volume'], fake_meta.keys())

'''extend volume'''
    def extend_volume(cls, context):
        cls.cinder.volumes.extend(context['prepared_volume'], context['prepared_volume'].size * 2)
        cls.cinder.volumes.extend(context['prepared_volume'], context['prepared_volume'].size / 2)
 
  
 

Revision as of 14:12, 16 July 2014

= Benchmarking =
TBD

== Context ==
TBD

== Runners ==
TBD

== Scenarios ==
TBD

== Production Ready Clean Up ==
TBD

== Non Admin support ==
TBD

== Pre Created Users ==
TBD

= CLI =
TBD

= Rally-as-a-Service =
TBD

= Verification =
TBD

= CI/CD =
TBD

= Unit & Functional testing =
TBD

= Data processing =

At the moment the only thing we have is tables with min, max and avg values. Good as a first step =) But we need more!
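The current aggregation boils down to this (durations are made-up sample values):

```python
# The min/max/avg table we currently produce, per benchmark scenario,
# over the measured per-loop durations (sample values are made up).
durations = [1.2, 0.8, 1.5, 1.1]  # seconds per scenario loop

row = {
    "min": min(durations),
    "max": max(durations),
    "avg": sum(durations) / len(durations),
}
```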


== Data aggregation ==

to be done...


== Graphics & Plots ==

* Simple plot: time of loop / iteration

''Rally plot'' (figure)

* Histogram of loop times

''Rally histogram'' (figure)
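The histogram idea amounts to bucketing loop durations into fixed-width bins; a pure-Python sketch (the real chart would be drawn by Rally's plotting code, and the function name here is made up):

```python
# Sketch of the "histogram of loop times": bucket each duration into a
# fixed-width bin and count occurrences per bin index.
def loop_time_histogram(durations, bin_width):
    counts = {}
    for d in durations:
        b = int(d // bin_width)  # bin index: [b*bin_width, (b+1)*bin_width)
        counts[b] = counts.get(b, 0) + 1
    return counts
```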

= Profiling =

== Improve & Merge Tomograph into upstream ==

To collect profiling data we use Tomograph, a small library that was created to be used with OpenStack (it needs to be improved as well). Profiling data is collected by inserting profiling/log points into the OpenStack source code and by adding event listeners/hooks for events from 3rd-party libraries that support them (e.g. sqlalchemy). Currently our patches are applied to the OpenStack code during cloud deployment. For easier maintenance and better profiling results, the profiler could be integrated as an oslo component, in which case its patches could be merged upstream. Profiling itself would be managed by configuration options.
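A toy illustration of the "profiling point" idea: a decorator that records start/stop events around a call, similar in spirit to this kind of tracing (this is not the Tomograph API; all names below are made up):

```python
# Toy tracing decorator: emit a start event before the wrapped call and
# a stop event after it, which is the essence of a profiling/log point.
events = []

def trace(name):
    def deco(fn):
        def wrapper(*args, **kwargs):
            events.append(("start", name))
            try:
                return fn(*args, **kwargs)
            finally:
                events.append(("stop", name))
        return wrapper
    return deco

@trace("db_call")
def fake_db_call():
    return 42
```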


== Improve Zipkin or use something else ==

Currently we use Zipkin as a collector and visualization service, but in the future we plan to replace it with something more suitable in terms of load (possibly Ceilometer?) and to improve the visualization format (we need better charts).

You can see some early results here:

== Make it work out of the box ==

A few things should be done:

# Merge into upstream the Tomograph changes that send logs to Zipkin
# Bind Tomograph to the Benchmark Engine
# Automate the installation of Zipkin from Rally


= Server providing =

# Improve VirshProvider
## Implement netinstall of Linux on a VM (currently only cloning of an existing VM is implemented).
## Add zfs/lvm2 support for fast cloning
# Implement LxcProvider
## This provider is intended for fast deployment of a large number of instances.
## Support zfs clones for fast deployment.
# Implement AmazonProvider
## Get your VMs from Amazon

= Deployers =

# Implement MultihostEngine - this engine will deploy a multihost configuration using existing engines.
# Implement a dev FUEL-based engine - deploy OpenStack using FUEL on existing servers or VMs.
# Implement a full FUEL-based engine - deploy OpenStack with FUEL on bare-metal nodes.
# Implement a TripleO-based engine - deploy OpenStack on bare-metal nodes using TripleO.