Revision as of 21:19, 7 May 2012

Real deployments

This page documents details of real OpenStack deployments.

Wikimedia

Contact: Ryan Lane

Documentation

The ODP file of the FOSDEM talk has full notes if you switch to notes view.

Deployment scripts

Puppet repository, which contains OpenStack manifests (for Swift and Nova) and scripts for managing Gluster, NFS, and Ganglia on a per-project basis:

Blog posts

Ryan Lane's blog about how certain things are handled:

Wikimedia blog about design decisions:

Argonne National Laboratory (DOE Magellan)

Current Diablo environment

  • Ubuntu 11.10 (Oneiric)
  • Bcfg2 configuration management
  • OpenStack Diablo via the Managed IT PPA
  • nova-network with 10GigE, VLAN manager
  • nova-volume service using iSCSI over IPoIB
  • nginx load balancer / HA (frontend for all client API connections)
  • 2 x nova-api servers, each with 4 instances
  • Glance on Gluster (over native InfiniBand to the compute nodes)
  • keystone
  • dashboard
  • euca2ools via the EC2 API
  • 500 compute nodes
  • IBM iDataPlex
  • 2 x 2.6 GHz Intel Nehalem
  • 24GB memory
  • 1GigE NIC
  • QDR InfiniBand (currently used only for storage)
  • ~100 users spread across ~15 tenants

Planned Essex environment

  • TBD

TryStack Dell Region

The first region established for TryStack features server hardware from Dell. There are 20 servers contained in five (5) Dell C6105 2U server enclosures, four (4) servers per enclosure. Each server contains:

  • 96GB RAM
  • Two (2) 6-core Intel Xeon X5650 or AMD Opteron 4176 HE processors (12 cores per server)
  • Two (2) 1GbE network interface cards
  • ~5 TB usable disk space -- managed in a RAID10 setup

One (1) server -- freecloud-mgmt -- is used as a management server and runs the following services:

  • dnsmasq -- Used by all compute nodes to determine VM IP addressing
  • chef-server -- http://localhost:4040/ -- The configuration management server used to deploy services onto the service nodes. user/passwd: admin/openstack
  • munin -- http://localhost:8081/ -- A networked resource monitoring tool useful for tracking performance and resource usage. user/passwd: munin/openstack
  • nagios -- http://localhost:8082/ -- Another resource monitoring tool that, unlike Munin, does not need an agent installed on the tracked nodes. user/passwd: munin/openstack
  • jenkins -- http://localhost:8080/ -- A continuous integration and deployment platform, used for running automated tasks
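
As a quick sanity check, the management web UIs listed above can be probed from the freecloud-mgmt host. The Python sketch below is illustrative only and is not part of the TryStack tooling; it simply assumes the ports listed above are in use:

 # check_mgmt_services.py -- illustrative sketch, not part of the TryStack tooling.
 # Probes the management web UIs listed above from the freecloud-mgmt host.
 import urllib.request
 import urllib.error

 # Service name -> URL, taken from the list above.
 SERVICES = {
     "chef-server": "http://localhost:4040/",
     "munin":       "http://localhost:8081/",
     "nagios":      "http://localhost:8082/",
     "jenkins":     "http://localhost:8080/",
 }

 def probe(name, url):
     """Report whether the service answers HTTP at all (any response counts as up)."""
     try:
         with urllib.request.urlopen(url, timeout=5) as resp:
             print(f"{name:12s} UP   (HTTP {resp.status})")
     except urllib.error.HTTPError as err:
         # 401/403 and similar still mean the daemon is listening.
         print(f"{name:12s} UP   (HTTP {err.code})")
     except (urllib.error.URLError, OSError) as err:
         print(f"{name:12s} DOWN ({err})")

 if __name__ == "__main__":
     for name, url in SERVICES.items():
         probe(name, url)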

In addition to the above services, the management server is also responsible for:

  • git repositories:
     -  The TryStack Chef cookbooks and recipes are at /root/openstack-chef/
     -  The canonical repo is available on GitHub: https://github.com/trystack/openstack-chef/
  • Base operating system deploys onto the other service nodes
     -  Done using PXE installs

The other nineteen (19) servers are used as service nodes and run a variety of OpenStack services, along with the services OpenStack depends on. Each service node may run one or more of the following:

  • mysql-server -- A MySQL database server
  • rabbitmq-server -- A RabbitMQ message queueing service
  • nova-api -- The OpenStack Compute API server
  • nova-scheduler -- The OpenStack Compute instance scheduling service
  • nova-compute -- The OpenStack Compute VM management service -- listens for messages sent from the nova-scheduler service and is responsible for performing actions such as launching, terminating or rebooting virtual machines
  • nova-network -- The OpenStack Compute networking service -- responds to messages sent from the nova-scheduler and nova-compute services to handle setting up of networking information for virtual machines
  • keystone -- The OpenStack Identity API server
  • glance-api -- The OpenStack Images API server
  • glance-registry -- The OpenStack Images Registry server
  • dashboard -- The OpenStack Dashboard server -- web-based console for users and administrators of TryStack

Network Architecture

A single Cisco 4948-10GE switch routes the private management network for the 20 server nodes and also provides access to the public Internet.

The freecloud-mgmt server runs dnsmasq and publishes a gateway for the other host machines at 10.0.100.1. The other 19 hosts set their default gateway to 10.0.100.1, and their eth0 interfaces are assigned 10.0.100.101 through 10.0.100.118, forming the management network. The eth1 interfaces are used for the public network addresses of nodes, where needed.
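
To make the addressing scheme concrete, the Python sketch below generates dnsmasq-style static host entries matching the layout described above. The hostnames and MAC addresses are placeholders, and pinning addresses with dhcp-host entries is an assumption about how this could be done, not a description of the actual TryStack dnsmasq configuration:

 # gen_mgmt_hosts.py -- illustrative sketch of the 10.0.100.x management layout.
 # Hostnames and MAC addresses are placeholders, not real TryStack values.
 import ipaddress

 GATEWAY = ipaddress.ip_address("10.0.100.1")       # published by freecloud-mgmt via dnsmasq
 FIRST_HOST = ipaddress.ip_address("10.0.100.101")  # first eth0 address on the service nodes

 def dhcp_host_lines(macs):
     """Yield dnsmasq 'dhcp-host=' lines assigning sequential eth0 addresses."""
     for index, mac in enumerate(macs):
         addr = FIRST_HOST + index
         hostname = f"freecloud-node{index + 1:02d}"  # placeholder naming scheme
         yield f"dhcp-host={mac},{hostname},{addr}"

 if __name__ == "__main__":
     # Placeholder MACs covering the .101 through .118 range mentioned above.
     macs = [f"52:54:00:00:00:{n:02x}" for n in range(1, 19)]
     print(f"dhcp-option=option:router,{GATEWAY}")
     for line in dhcp_host_lines(macs):
         print(line)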

Below is a diagram illustrating the overall topology of the Dell Region, including active/standby groups of servers and private/public IP addressing in the cluster.

High Availability (HA) Service Configuration

There are six (6) service nodes deployed with Heartbeat and DRBD. Three (3) nodes act as active servers and three as standby servers. Each set of critical OpenStack services therefore runs on a pair of servers, with Heartbeat monitoring the health of the active server and, on failure, redirecting traffic from the failed node's IP address to the standby node.

The pairs of active/standby servers act as redundant nodes providing a given set of related services:

  • “Front-end Web” -- nova-api, nova-scheduler, keystone, horizon
  • “Database and Message Queue Server” -- mysql-server, rabbitmq-server
  • “Image Service” -- glance-api, glance-registry
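
A minimal sketch of this active/standby arrangement is shown below. The node names are hypothetical and the logic is only a toy model of the decision Heartbeat makes (which node of a pair should hold the shared service IP); it is not the actual Heartbeat/DRBD configuration used by TryStack:

 # ha_model.py -- toy model of the active/standby pairing described above.
 # Node names are hypothetical; real failover is handled by Heartbeat + DRBD.
 from dataclasses import dataclass

 @dataclass
 class ServiceGroup:
     name: str
     services: list
     active: str    # hypothetical node currently holding the shared IP
     standby: str   # hypothetical node that takes over on failure

 GROUPS = [
     ServiceGroup("Front-end Web",
                  ["nova-api", "nova-scheduler", "keystone", "horizon"],
                  active="node01", standby="node02"),
     ServiceGroup("Database and Message Queue Server",
                  ["mysql-server", "rabbitmq-server"],
                  active="node03", standby="node04"),
     ServiceGroup("Image Service",
                  ["glance-api", "glance-registry"],
                  active="node05", standby="node06"),
 ]

 def owner(group, active_is_healthy):
     """Return the node that should hold the group's IP, as Heartbeat would decide."""
     return group.active if active_is_healthy else group.standby

 if __name__ == "__main__":
     for group in GROUPS:
         print(f"{group.name}: normally on {owner(group, True)}, "
               f"fails over to {owner(group, False)} "
               f"({', '.join(group.services)})")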

Fig1. Physical setup of TryStack