
= Real deployments =

This page documents details of real OpenStack deployments.

Mediawiki
Contact: Ryan Lane

Documentation

 * http://www.mediawiki.org/wiki/Wikimedia_Labs
 * http://wikitech.wikimedia.org/view/OpenStack
 * http://www.mediawiki.org/wiki/Extension:OpenStackManager
 * http://ryandlane.com/blog/wp-content/uploads/2012/02/Infrastructure-as-an-Open-Source-Project-FOSDEM-publish.odp

The ODP file of the FOSDEM talk has full notes if you switch to notes view.

Deployment scripts
The Puppet repository contains OpenStack manifests (for Swift and Nova) and some scripts used for managing Gluster, NFS and Ganglia on a per-project basis:


 * https://gerrit.wikimedia.org/r/gitweb?p=operations/puppet.git;a=summary
 * https://wikitech.wikimedia.org/wiki/Help:Git#Restrictions_and_Anonymous_access

Blog posts
Ryan Lane's blog about how certain things are handled:


 * http://ryandlane.com/blog/2012/04/24/per-project-sudo-policies-using-sudo-ldap-and-puppet/
 * http://ryandlane.com/blog/2011/11/01/sharing-home-directories-to-instances-within-a-project-using-puppet-ldap-autofs-and-nova/
 * http://ryandlane.com/blog/2011/11/02/a-process-for-puppetization-of-a-service-using-nova/
 * http://ryandlane.com/blog/2011/01/24/announcing-openstackmanager-extension-for-mediawiki/
 * http://ryandlane.com/blog/2011/01/02/building-a-test-and-development-infrastructure-using-openstack/

Wikimedia blog about design decisions:


 * http://blog.wikimedia.org/2012/04/16/introduction-to-wikimedia-labs/

Current Diablo environment

 * Ubuntu 11.10 (Oneiric)
 * Bcfg2 configuration management
 * openstack Diablo via managed IT PPA
 * nova network 10GigE, VLAN manager
 * nova volume service using iSCSI over IPoIB
 * nginx load balancer / HA (frontend for all client API connections)
 * 2 x nova api servers, each with 4 instances
 * glance on gluster (over native ib to compute nodes)
 * keystone
 * dashboard
 * euca2ools via EC2 api
 * 500 compute nodes
 * IBM iDataplex
 * 2 x 2.6 GHz Intel Nehalem
 * 24GB memory
 * 1GigE NIC
 * QDR InfiniBand (currently used only for storage)
 * ~100 users spread across ~15 tenants

Planned Essex environment

 * TBD

TryStack Dell Region
Contact: JayPipes

The first region established for TryStack features server hardware from Dell. There are 20 servers contained in five (5) Dell C6105 2U server enclosures. Each server (four (4) in each C6105 enclosure) contains:


 * 96GB RAM
 * 2 12-core Intel Xeon processors X5650 or AMD Opteron 4176HE
 * Two (2) 1 Gb network interface cards
 * ~5 TB usable disk space -- managed in a RAID10 setup
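The per-server figures above can be rolled up into region-wide totals. The following is a minimal sketch of that arithmetic only; the function name and constants are illustrative, and the disk figure is approximate because the source gives it as "~5 TB".

```python
# Rough capacity roll-up for the TryStack Dell region, using only the
# figures stated on this page. Names are illustrative, not from TryStack.

ENCLOSURES = 5              # Dell C6105 2U enclosures
SERVERS_PER_ENCLOSURE = 4   # servers in each enclosure
RAM_GB_PER_SERVER = 96
DISK_TB_PER_SERVER = 5      # approximate usable space (RAID10)


def region_totals(enclosures=ENCLOSURES,
                  servers_per_enclosure=SERVERS_PER_ENCLOSURE,
                  ram_gb=RAM_GB_PER_SERVER,
                  disk_tb=DISK_TB_PER_SERVER):
    """Return total server count, RAM (GB) and usable disk (TB)."""
    servers = enclosures * servers_per_enclosure
    return {
        "servers": servers,
        "ram_gb": servers * ram_gb,
        "disk_tb": servers * disk_tb,
    }


if __name__ == "__main__":
    print(region_totals())
```

These totals include the management server, so the capacity available to service nodes is somewhat lower.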

One (1) server -- freecloud-mgmt -- is used as a management server and runs the following services:


 * dnsmasq -- Used by all compute nodes to determine VM IP addressing
 * chef-server -- http://localhost:4040/ -- The configuration management server used to deploy services onto the service nodes (user/passwd: admin/openstack)
 * munin -- http://localhost:8081/ -- A networked resource monitoring tool useful for tracking performance and resource usage (user/passwd: munin/openstack)
 * nagios -- http://localhost:8082/ -- A different resource monitoring tool that, unlike Munin, does not need an agent installed on the tracked nodes (user/passwd: munin/openstack)
 * jenkins -- http://localhost:8080/ -- A continuous integration and deployment platform, used for running automated tasks

In addition to the above services, the management server is also responsible for:

 * git repositories for:
   - The TryStack Chef cookbooks and recipes are at /root/openstack-chef/
   - The canonical repo is available on GitHub: https://github.com/trystack/openstack-chef/
 * Doing base operating system deploys into other service nodes
   - Done using PXE installs

The other nineteen (19) servers are used as service nodes and run a variety of OpenStack servers and services that OpenStack depends on. These services may include one or more of the following:


 * mysql-server -- A MySQL database server
 * rabbitmq-server -- A RabbitMQ message queueing service
 * nova-api -- The OpenStack Compute API server
 * nova-scheduler -- The OpenStack Compute instance scheduling service
 * nova-compute -- The OpenStack Compute VM management service -- listens for messages sent from the nova-scheduler service and is responsible for performing actions such as launching, terminating or rebooting virtual machines
 * nova-network -- The OpenStack Compute networking service -- responds to messages sent from the nova-scheduler and nova-compute services to handle setting up of networking information for virtual machines
 * keystone -- The OpenStack Identity API server
 * glance-api -- The OpenStack Images API server
 * glance-registry -- The OpenStack Images Registry server
 * dashboard -- The OpenStack Dashboard server -- web-based console for users and administrators of TryStack

Network Architecture
A single Cisco 4948-10GE switch routes a private management network for the 20 server nodes and provides access to the public Internet.

The freecloud-mgmt server runs a dnsmasq server and publishes a gateway for the other host machines at 10.0.100.1. The other 19 hosts set their default gateway to 10.0.100.1, and their eth0 interfaces are set to 10.0.100.101 through 10.0.100.118, forming the management network. The eth1 interfaces are used for the public network addresses of nodes, if any are needed.
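The management addressing described above can be enumerated with Python's standard `ipaddress` module. This is only a sketch: the /24 prefix is an assumption (the page does not state a netmask), and the function names are illustrative. Note that the stated range, 10.0.100.101 through 10.0.100.118, covers 18 addresses, one fewer than the 19 hosts mentioned; the sketch reproduces the range as written rather than guessing which figure is intended.

```python
import ipaddress

# Assumption: the management network is a /24; the page gives no netmask.
MGMT_NETWORK = ipaddress.ip_network("10.0.100.0/24")
GATEWAY = ipaddress.ip_address("10.0.100.1")      # freecloud-mgmt
FIRST_NODE = ipaddress.ip_address("10.0.100.101")
LAST_NODE = ipaddress.ip_address("10.0.100.118")


def node_addresses(first=FIRST_NODE, last=LAST_NODE):
    """Return every address from first to last, inclusive."""
    return [ipaddress.ip_address(n) for n in range(int(first), int(last) + 1)]


def eth0_plan(gateway=GATEWAY):
    """Pair each service-node eth0 address with the default gateway."""
    return [{"address": str(addr), "gateway": str(gateway)}
            for addr in node_addresses()]


if __name__ == "__main__":
    plan = eth0_plan()
    print(len(plan), "eth0 addresses in the stated range")
    for entry in plan:
        print(entry["address"], "via", entry["gateway"])
```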

High Availability (HA) Service Configuration
There are six (6) service nodes that are deployed with heartbeat and DRBD. Three (3) nodes are set as the active servers and three (3) are set as the standby servers. Thus, each combination of critical OpenStack services runs on a pair of servers: heartbeat monitors the health of the active server and, if it fails, redirects traffic from the IP address of the failed node to the standby node.

The pairs of active/standby servers act as redundant nodes providing a given set of related services:

 * “Front-end Web” -- nova-api, nova-scheduler, keystone, horizon
 * “Database and Message Queue Server” -- mysql-server, rabbitmq-server
 * “Image Service” -- glance-api, glance-registry
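The active/standby arrangement can be modelled as a small data structure. The sketch below illustrates the idea only and is not heartbeat or DRBD configuration: the node names (`node-a1`, `node-s1`, and so on) are hypothetical, and the health check is reduced to a boolean passed in by the caller.

```python
from dataclasses import dataclass


@dataclass
class HaPair:
    """One active/standby pair providing a set of related services."""
    role: str
    services: tuple
    active: str    # hypothetical node name
    standby: str   # hypothetical node name

    def fail_over(self):
        """Swap roles so the standby node takes over the services."""
        self.active, self.standby = self.standby, self.active
        return self.active


def build_pairs():
    """Return the three service groupings listed above."""
    return [
        HaPair("Front-end Web",
               ("nova-api", "nova-scheduler", "keystone", "horizon"),
               "node-a1", "node-s1"),
        HaPair("Database and Message Queue Server",
               ("mysql-server", "rabbitmq-server"),
               "node-a2", "node-s2"),
        HaPair("Image Service",
               ("glance-api", "glance-registry"),
               "node-a3", "node-s3"),
    ]


def handle_health_report(pair, active_is_healthy):
    """Fail over when the active node is unhealthy; return the serving node."""
    if not active_is_healthy:
        pair.fail_over()
    return pair.active


if __name__ == "__main__":
    pairs = build_pairs()
    # Simulate a failure of the active image-service node.
    print(handle_health_report(pairs[2], active_is_healthy=False))
```

In the real deployment the redirection happens at the IP level; the swap here stands in for that address takeover.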

CERN
Contact: Tim Bell (tim.bell@cern.ch)

Presentations and Documentation

 * San Diego 2012 Summit - http://www.slideshare.net/noggin143/20121017-openstack-accelerating-science
 * Overall project description (including other components) - http://cern.ch/go/N8wp
 * User guide for the facility is at http://clouddocs.web.cern.ch/clouddocs/

Deployment
The environment is largely based on Scientific Linux 6, which is Red Hat compatible. We use KVM as our primary hypervisor although tests are ongoing with Hyper-V on Windows Server 2008.

We use the puppetlabs OpenStack modules to configure Nova, Glance, Keystone and Horizon. Puppet is also widely used for guest configuration, with Foreman as a GUI for reporting and VM provisioning.

Users and Groups are managed through Active Directory and imported into Keystone using LDAP.

Command-line access is available through the Nova CLI and euca2ools.

Areas currently being investigated

 * Block storage for live migration and Cinder
 * Integration with CERN Single Sign On

Current Status
We are currently running around 250 hypervisors with around 1000 VMs.