Real deployments

This page documents details of real OpenStack deployments.

Wikimedia

Documentation

The ODP file of the FOSDEM talk has full notes if you switch to notes view.

Deployment scripts

The Puppet repository contains OpenStack manifests (for Swift and Nova) and some scripts used for managing Gluster, NFS and Ganglia on a per-project basis:

Blog posts

Ryan Lane's blog about how certain things are handled:

Wikimedia blog about design decisions:

http://blog.wikimedia.org/2012/04/16/introduction-to-wikimedia-labs/

Argonne National Labs (DOE Magellan)

Current Diablo environment

Ubuntu 11.10 (Oneiric)
Bcfg2 configuration management
OpenStack Diablo via managed IT PPA
nova network 10GigE, VLAN manager
nova volume service using iSCSI over IPoIB
nginx load balancer / HA (frontend for all client API connections)
2 x nova api servers, each with 4 instances
glance on gluster (over native ib to compute nodes)
keystone
dashboard
euca2ools via EC2 api
500 compute nodes
IBM iDataplex
2 x 2.6 GHz Intel Nehalem
24GB memory
1GigE NIC
QDR InfiniBand (currently used only for storage)
~100 users spread across ~15 tenants
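
As a rough illustration of the API front end listed above (a load balancer in front of 2 nova api servers with 4 instances each), the sketch below spreads requests over the eight resulting backends in simple rotation. The host names, the starting port, and the round-robin policy are assumptions made for the example; the actual nginx configuration is not described here.

```python
from itertools import cycle

# Hypothetical host names; the list above only says "2 x nova api servers,
# each with 4 instances" behind an nginx front end.
API_SERVERS = ["nova-api-1", "nova-api-2"]
INSTANCES_PER_SERVER = 4
BASE_PORT = 8774  # assumed starting port, for illustration only


def build_backends(servers, instances, base_port):
    """Return one (host, port) pair per API instance."""
    return [(host, base_port + i) for host in servers for i in range(instances)]


def round_robin(backends):
    """Return an endless iterator that hands out backends in rotation."""
    return cycle(backends)


backends = build_backends(API_SERVERS, INSTANCES_PER_SERVER, BASE_PORT)
picker = round_robin(backends)

# Ten consecutive requests: the ninth wraps around to the first backend.
first_ten = [next(picker) for _ in range(10)]
print(len(backends), "backends")
for host, port in first_ten:
    print(host, port)
```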

Planned Essex environment

TBD

TryStack Dell Region

Contact: JayPipes

The first region established for TryStack features server hardware from Dell. There are 20 servers contained in five (5) Dell C6105 2U server enclosures. Each server (four (4) in each C6105 enclosure) contains:

96GB RAM
2 12-core Intel Xeon processors X5650 or AMD Opteron 4176HE
Two (2) 1Gb network interface cards
~5 TB usable disk space -- managed in a RAID10 setup
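
The aggregate capacity of the region follows directly from these per-server figures. The short calculation below is only a sketch: the variable names are made up for the example, and the disk total is approximate because the per-server figure is given as "~5 TB".

```python
# Figures taken from the hardware description above.
ENCLOSURES = 5
SERVERS_PER_ENCLOSURE = 4
RAM_GB_PER_SERVER = 96
USABLE_DISK_TB_PER_SERVER = 5  # "~5 TB", so the total is approximate

total_servers = ENCLOSURES * SERVERS_PER_ENCLOSURE
total_ram_gb = total_servers * RAM_GB_PER_SERVER
total_disk_tb = total_servers * USABLE_DISK_TB_PER_SERVER

print("servers:", total_servers)
print("RAM (GB):", total_ram_gb)
print("usable disk (TB, approx.):", total_disk_tb)
```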

One (1) server -- freecloud-mgmt -- is used as a management server and runs the following services:

dnsmasq -- Used by all compute nodes to determine VM IP addressing
chef-server -- http://localhost:4040/ -- The configuration management server used to deploy services onto the service nodes. user/passwd: admin/openstack
munin -- http://localhost:8081/ -- A networked resource monitoring tool useful for tracking performance and resource usage. user/passwd: munin/openstack
nagios -- http://localhost:8082/ -- A different resource monitoring tool that, unlike Munin, does not need an agent installed on the tracked nodes. user/passwd: munin/openstack
jenkins -- http://localhost:8080/ -- A continuous integration and deployment platform, used for running automated tasks

In addition to the above services, the management server is also responsible for:

Hosting git repositories:

 -       The TryStack Chef cookbooks and recipes are at /root/openstack-chef/
 -       The canonical repo is available on GitHub: https://github.com/trystack/openstack-chef/

Deploying the base operating system onto the other service nodes

 -       Done using PXE installs

The other nineteen (19) servers are used as service nodes and run a variety of OpenStack servers and services that OpenStack depends on. These services may include one or more of the following:

mysql-server -- A MySQL database server
rabbitmq-server -- A RabbitMQ message queueing service
nova-api -- The OpenStack Compute API server
nova-scheduler -- The OpenStack Compute instance scheduling service
nova-compute -- The OpenStack Compute VM management service -- listens for messages sent from the nova-scheduler service and is responsible for performing actions such as launching, terminating or rebooting virtual machines
nova-network -- The OpenStack Compute networking service -- responds to messages sent from the nova-scheduler and nova-compute services to handle setting up of networking information for virtual machines
keystone -- The OpenStack Identity API server
glance-api -- The OpenStack Images API server
glance-registry -- The OpenStack Images Registry server
dashboard -- The OpenStack Dashboard server -- web-based console for users and administrators of TryStack

Network Architecture

A single Cisco 4948-10GE switch routes a private management network for the 20 server nodes and provides access to the public Internet.

The freecloud-mgmt server runs a dnsmasq server and publishes a gateway for the other host machines at 10.0.100.1. The other 19 hosts set their default gateway to 10.0.100.1, and their eth0 interfaces are set to 10.0.100.101 through 10.0.100.118, forming the management network. eth1 interfaces are used for the public network addresses of nodes, if any are needed.
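
The management address plan can be enumerated with Python's standard ipaddress module. This is a minimal sketch: the /24 netmask is an assumption, since only individual addresses are given above. Note that the stated eth0 range yields 18 addresses, and the sketch does not try to reconcile that with the count of 19 hosts.

```python
import ipaddress

GATEWAY = ipaddress.ip_address("10.0.100.1")
FIRST = ipaddress.ip_address("10.0.100.101")
LAST = ipaddress.ip_address("10.0.100.118")

# Assumed netmask; the description gives addresses but no prefix length.
MGMT_NET = ipaddress.ip_network("10.0.100.0/24")


def eth0_addresses(first, last):
    """Return every address from first to last, inclusive."""
    return [ipaddress.ip_address(n) for n in range(int(first), int(last) + 1)]


addrs = eth0_addresses(FIRST, LAST)

print("gateway:", GATEWAY)
print("eth0 addresses in range:", len(addrs))
for addr in addrs:
    print(addr, "default gateway", GATEWAY)
```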

High Availability (HA) Service Configuration

There are six (6) service nodes that are deployed with heartbeat and DRBD. Three (3) nodes are set as the active servers and three (3) are set as the standby servers. Thus, each combination of critical OpenStack services runs on a pair of servers. Heartbeat monitors the health of the active server and, if it fails, redirects traffic from the IP address of the failed node to the standby node.

The pairs of active/standby servers act as redundant nodes providing a given set of related services:

   “Front-end Web” -- nova-api, nova-scheduler, keystone, horizon
   “Database and Message Queue Server” -- mysql-server, rabbitmq-server
   “Image Service” -- glance-api, glance-registry
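
The active/standby behaviour of the three pairs can be modelled as a toy simulation. This is only a sketch: the node names and the health check are hypothetical, it does not use heartbeat's real configuration or API, and DRBD replication is not modelled at all.

```python
from dataclasses import dataclass


@dataclass
class HAPair:
    role: str
    services: tuple
    active: str   # node currently holding the service IP
    standby: str  # node waiting to take over

    def failover(self):
        """Swap roles, as happens when the active node stops responding."""
        self.active, self.standby = self.standby, self.active


def check_and_failover(pair, is_healthy):
    """Fail over if the active node is unhealthy; return the serving node."""
    if not is_healthy(pair.active):
        pair.failover()
    return pair.active


# Node names are hypothetical; roles and services come from the list above.
PAIRS = [
    HAPair("Front-end Web",
           ("nova-api", "nova-scheduler", "keystone", "horizon"),
           "node-a1", "node-s1"),
    HAPair("Database and Message Queue Server",
           ("mysql-server", "rabbitmq-server"),
           "node-a2", "node-s2"),
    HAPair("Image Service",
           ("glance-api", "glance-registry"),
           "node-a3", "node-s3"),
]

# Simulate the active database node going down.
down = {"node-a2"}
serving = [check_and_failover(p, lambda node: node not in down) for p in PAIRS]

for pair, node in zip(PAIRS, serving):
    print(pair.role, "->", node)
```

Only the pair whose active node is down changes hands; the other two keep serving from their original active nodes.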

CERN

Contact: Tim Bell (tim.bell@cern.ch)

Presentations and Documentation

San Diego 2012 Summit - http://www.slideshare.net/noggin143/20121017-openstack-accelerating-science
Overall project description (including other components) - http://cern.ch/go/N8wp
User guide for the facility is at http://clouddocs.web.cern.ch/clouddocs/

Deployment

The environment is largely based on Scientific Linux 6, which is Red Hat compatible. We use KVM as our primary hypervisor although tests are ongoing with Hyper-V on Windows Server 2008.

We use the puppetlabs OpenStack modules to configure Nova, Glance, Keystone and Horizon. Puppet is also used widely for guest configuration, and Foreman serves as a GUI for reporting and VM provisioning.

Users and Groups are managed through Active Directory and imported into Keystone using LDAP.

The Nova and euca2ools command-line clients are available.

Areas currently being investigated

Block storage for live migration and Cinder
Integration with CERN Single Sign On

Current Status

We are currently running around 250 hypervisors with around 1000 VMs.
