TripleO/TripleOCloud/Regions

What is it

The TripleO cloud is a contributor-run, continuously deployed cloud, operated by the TripleO CD admins team, a team of trusted members of the TripleO community.

Each region of this cloud contains:

  • a bare metal cloud used by the TripleO admins team to deploy things in the region
  • a hypervisor based cloud (e.g. KVM) deployed on top of the bare metal cloud
  • one or more test environments for emulated bare metal testing deployed on top of the bare metal cloud

What is it for?

Primarily for testing the deployment and production readiness of OpenStack. A set of test environments for testing OpenStack deployment logic is deployed on the bare metal cloud using OpenStack's bare metal deployment facilities. These are then used to run 'check' tests with emulated bare metal. Secondly, when we have enough capacity, we plan to deploy OpenStack to bare metal and then run tempest against the deployed cloud, validating hypervisor configuration and the ability to deploy to bare metal on that hardware.

Finally, we permit active contributors to TripleO to use slack capacity from this cloud, because having users on the cloud finds issues :).

Regions

  • HP: 48 machines, 24 cores / 96GB RAM / 2TB RAID 1 disk
  • RedHat: 15 machines, 24 cores / 94GB RAM / 6 x 300GB 10K SAS 2.5" disks

How big is it / does it need to be?

There is currently one region, which is big enough to run single-node TripleO KVM gate runs (which is our current development goal). It's not big enough to run multi-node tests (so no live migration tests etc.), nor to run dedicated nova-BM functional tests alongside TripleO gate checks, nor to cover other hypervisors or additional projects that cannot run within a nested virt environment. We will be able to calculate this much better once we have a full set of runs in place, and obviously we will work on optimising to get the most out of the hardware that we can.

As a data point, testing just the gate for bare metal deployed KVM clouds would have required 80 machines during the H release period: roughly 20 patches landed per hour, but re-tests when things failed in the gate mean more like 40 runs per hour, at 2 machines per run and an hour budget per bare metal deployed run (40 runs/hour x 2 machines x 1 hour = 80 machines). We can distribute test load across multiple regions, so contributing even a fairly small region will still help the project!
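
As a rough, illustrative sketch of that estimate (the function name is ours; the inputs are just the H release figures quoted above, not a project tool):

 # Rough steady-state capacity estimate for bare metal deployed gate runs.
 # Illustrative only - the inputs are the H release figures quoted above.
 def machines_needed(runs_per_hour, machines_per_run, run_hours):
     # Concurrent runs at steady state = runs_per_hour * run_hours;
     # each run holds machines_per_run machines for its whole duration.
     return runs_per_hour * run_hours * machines_per_run

 print(machines_needed(runs_per_hour=40, machines_per_run=2, run_hours=1))  # -> 80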

Contributing a region to the TripleOCloud

Regions are contributed by interested parties. Generally speaking, the testing load is spread over all the regions - only tests that require specific hardware are localised to specific regions.

Contact points

The TripleO PTL or anyone in the TripleO CD admins team can be contacted for information about contributing a region. The current TripleO PTL is Robert Collins - rbtcollins@hp.com

Location

Contributors should provide the machines in an operational environment of their own with their own DC operations staff available to correct hardware and network issues with the environment. No specific SLA is needed - by being multi-region we have intrinsic resiliency and no need to panic over issues in any single region.

Hardware type

Any hardware that either trunk Nova baremetal or trunk Ironic can deploy to can be contributed. However, we're volunteering to operate the test environment, not to do driver development - that should be done prior to contributing the hardware to the test cloud. If you believe your hardware will work, but don't know, Ironic folk are happy to help you check that out - come see them on #openstack-ironic on irc.freenode.net.

Hardware size

Each region has a minimum size of 10 machines, but their size varies substantially. For instance, the first region we brought online has ~50 machines. Each machine needs to be large enough to run a production OpenStack configuration - not necessarily scaled up.

Right now we say 2 cores, 8GB of RAM and 1TB of disk is the minimum node size for sensible use. We don't have a preferred config because the needs pull in different directions: scaling out gate tests requires lots of small nodes to meet the concurrency of the gate pipeline, while running check tests requires large nodes that can handle many VMs effectively, as each test environment runs multiple VMs on one node.

Right now, all our test runs are run virtually, so there is a slight preference for high density hardware, but in reality any contribution is good, and obviously the more machines the better - we're ramping up to test all of OpenStack at the moment, and that needs many, many more machines than we currently have.

Networking

We require a reasonable number of public IP addresses (a /25 IPv4 or IPv6 range) to permit openstack-infra's nodepool to spin up tests and drive them.
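
For a rough sense of that address budget, a /25 block contains 128 addresses; a minimal sketch (the example network is from the IPv4 documentation range, not a real allocation):

 # Illustrative only: address count available to nodepool from a /25 block.
 import ipaddress

 net = ipaddress.ip_network("203.0.113.0/25")  # documentation-range example, not ours
 print(net.num_addresses)         # 128 addresses in the block
 print(len(list(net.hosts())))    # 126 usable hosts (network/broadcast excluded)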

The IPMI (or equivalent) endpoints on each machine need to be reachable from the data plane in the rack, so that we can use Ironic/Nova baremetal to deploy to the same machines. Obviously this should be partitioned off from your own network.

10Gbps is preferred for the server LAN, but we can work with 1Gbps too.

There should be no other network services running in the rack as our Undercloud will be serving DHCP and net-booting the machines.

Sysops

The TripleO CD admins team will do all machine system operations other than hardware interventions - that's something your local team needs to provide.

Initial configuration

One machine should be configured by the team contributing the region. It should have Ubuntu or Fedora plus the SSH key of one of the tripleo-cd admins, who can then bring up the undercloud on it and scale the region out.
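
A minimal sketch of that initial step, assuming the admin's public key is delivered out of band (the key string and paths below are placeholders, not project requirements):

 # Illustrative only: append a tripleo-cd admin's public key on the seed machine.
 from pathlib import Path

 admin_key = "ssh-rsa AAAA... tripleo-cd-admin"   # placeholder public key
 ssh_dir = Path.home() / ".ssh"
 ssh_dir.mkdir(mode=0o700, exist_ok=True)
 authorized_keys = ssh_dir / "authorized_keys"
 with authorized_keys.open("a") as f:
     f.write(admin_key + "\n")
 authorized_keys.chmod(0o600)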