- 1 Real deployments
- 1.1 Mediawiki
- 1.2 Argonne National Labs (DOE Magellan)
- 1.3 TryStack Dell Region
- 1.4 CERN
This page documents details of real OpenStack deployments.
Mediawiki
Contact: Ryan Lane
The ODP file of the FOSDEM talk has full notes if you switch to notes view.
Puppet repository, which has OpenStack manifests (for swift and nova) and some scripts used for managing gluster, nfs and ganglia in a per-project way:
Ryan Lane's blog about how certain things are handled:
Wikimedia blog about design decisions:
Argonne National Labs (DOE Magellan)
Current Diablo environment
- Ubuntu 11.10 (Oneiric)
- Bcfg2 configuration management
- OpenStack Diablo via Managed IT PPA
- nova-network over 10GigE, VLAN manager
- nova-volume service using iSCSI over IPoIB
- nginx load balancer / HA (frontend for all client API connections)
- 2 x nova api servers, each with 4 instances
- Glance on Gluster (over native IB to compute nodes)
- euca2ools via EC2 API
- 500 compute nodes
- IBM iDataplex
- 2 x 2.6 GHz Intel Nehalem
- 24GB memory
- 1GigE NIC
- QDR InfiniBand (currently used only for storage)
- ~100 users spread across ~15 tenants
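The API front end listed above (nginx in front of 2 nova-api servers with 4 instances each) amounts to 8 backends. The following is a minimal Python sketch of how requests would spread across them, assuming plain round-robin distribution; the source does not say which balancing method is configured, and the host names and port numbers are hypothetical.

```python
from collections import Counter
from itertools import cycle, islice

API_SERVERS = 2            # from the list above
INSTANCES_PER_SERVER = 4   # from the list above
BASE_PORT = 8774           # hypothetical starting port for the instances


def build_backends():
    """Return one address per nova-api instance (names are hypothetical)."""
    return [
        f"nova-api{server}.example:{BASE_PORT + instance}"
        for server in range(1, API_SERVERS + 1)
        for instance in range(INSTANCES_PER_SERVER)
    ]


def dispatch(backends, n_requests):
    """Assign n_requests to the backends in round-robin order."""
    return list(islice(cycle(backends), n_requests))


backends = build_backends()
assignments = dispatch(backends, 16)
per_backend = Counter(assignments)

print(len(backends))              # number of backends behind the balancer
print(per_backend[backends[0]])   # requests handled by the first backend
```

With 16 requests and 8 backends, each instance receives the same share, which is the property that makes round-robin a reasonable default when all backends are identical.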
Planned Essex environment
TryStack Dell Region
The first region established for TryStack features server hardware from Dell. There are 20 servers housed in five (5) Dell C6105 2U server enclosures. Each server (four (4) in each C6105 enclosure) contains:
- 96GB RAM
- 2 12-core Intel Xeon processors X5650 or AMD Opteron 4176HE
- Two (2) 1 Gb network interface cards
- ~5 TB usable disk space -- managed in a RAID10 setup
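As a quick check of the figures above, the aggregate capacity of the region can be worked out as follows. This is a small Python sketch using only the numbers given in the text; the disk figure is approximate ("~5 TB"), and the raw-disk estimate relies on the general property that RAID10 mirrors every block, so usable space is half of raw.

```python
ENCLOSURES = 5
SERVERS_PER_ENCLOSURE = 4
RAM_GB_PER_SERVER = 96
USABLE_DISK_TB_PER_SERVER = 5   # approximate, stated as "~5 TB"
NICS_PER_SERVER = 2

servers = ENCLOSURES * SERVERS_PER_ENCLOSURE
total_ram_gb = servers * RAM_GB_PER_SERVER
total_usable_disk_tb = servers * USABLE_DISK_TB_PER_SERVER
total_nics = servers * NICS_PER_SERVER

# RAID10 stores two copies of all data, so raw capacity is twice the usable figure.
raw_disk_tb_per_server = USABLE_DISK_TB_PER_SERVER * 2

print(f"servers:           {servers}")
print(f"total RAM (GB):    {total_ram_gb}")
print(f"usable disk (TB):  ~{total_usable_disk_tb}")
print(f"raw disk/server:   ~{raw_disk_tb_per_server} TB")
print(f"network ports:     {total_nics}")
```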
One (1) server -- freecloud-mgmt -- is used as a management server and runs the following services:
- dnsmasq -- Used by all compute nodes to determine VM IP addressing
- chef-server -- http://localhost:4040/ -- The configuration management server used to deploy services onto the service nodes (user/passwd: admin/openstack)
- munin -- http://localhost:8081/ -- A networked resource monitoring tool useful for tracking performance and resource usage (user/passwd: munin/openstack)
- nagios -- http://localhost:8082/ -- A different resource monitoring tool that, unlike Munin, does not need an agent installed on the tracked nodes (user/passwd: munin/openstack)
- jenkins -- http://localhost:8080/ -- A continuous integration and deployment platform, used for running automated tasks
In addition to the above services, the management server is also responsible for:
- git repositories for:
  - The TryStack Chef cookbooks and recipes, at /root/openstack-chef/ (the canonical repo is available on GitHub: https://github.com/trystack/openstack-chef/)
- Doing base operating system deploys onto other service nodes
  - Done using PXE installs
- mysql-server -- A MySQL database server
- rabbitmq-server -- A RabbitMQ message queueing service
- nova-api -- The OpenStack Compute API server
- nova-scheduler -- The OpenStack Compute instance scheduling service
- nova-compute -- The OpenStack Compute VM management service -- listens for messages sent from the nova-scheduler service and is responsible for performing actions such as launching, terminating or rebooting virtual machines
- nova-network -- The OpenStack Compute networking service -- responds to messages sent from the nova-scheduler and nova-compute services to handle setting up of networking information for virtual machines
- keystone -- The OpenStack Identity API server
- glance-api -- The OpenStack Images API server
- glance-registry -- The OpenStack Images Registry server
- dashboard -- The OpenStack Dashboard server -- web-based console for users and administrators of TryStack
A single Cisco 4948-10GE switch routes a private management network for the 20 server nodes and provides access to the public Internet.
The freecloud-mgmt server runs a dnsmasq server and publishes a gateway for the other host machines at 10.0.100.1. The other 19 hosts set their default gateway to 10.0.100.1, and their eth0 interfaces are set to 10.0.100.101 through 10.0.100.118; together these form the management network. The eth1 interfaces are used for the public network addresses of nodes, if any are needed.
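The management addressing described above can be enumerated with Python's standard `ipaddress` module. This is a sketch only: the /24 prefix is an assumption (the text gives no netmask), and the range is taken exactly as written. Note that 10.0.100.101 through 10.0.100.118 inclusive holds 18 addresses, so the range as stated covers one host fewer than the 19 mentioned.

```python
import ipaddress

GATEWAY = ipaddress.ip_address("10.0.100.1")        # freecloud-mgmt
FIRST_HOST = ipaddress.ip_address("10.0.100.101")   # first eth0 address
LAST_HOST = ipaddress.ip_address("10.0.100.118")    # last eth0 address
# Assumption: the prefix length is not given in the text; /24 is a guess.
MGMT_NET = ipaddress.ip_network("10.0.100.0/24")


def eth0_addresses(first, last):
    """Return every address from first to last, inclusive."""
    return [ipaddress.ip_address(n) for n in range(int(first), int(last) + 1)]


hosts = eth0_addresses(FIRST_HOST, LAST_HOST)

# Every host address and the gateway must sit inside the management network.
all_in_network = all(h in MGMT_NET for h in hosts) and GATEWAY in MGMT_NET

print(len(hosts))        # number of addresses in the stated range
print(hosts[0], hosts[-1])
print(all_in_network)
```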
High Availability (HA) Service Configuration
There are six (6) service nodes that are deployed with heartbeat and DRBD. Three (3) nodes are set as the active servers and three (3) are set as the standby servers. Thus, each combination of critical OpenStack services runs on a pair of servers. Heartbeat monitors the health of the active server and, if the active server fails, redirects traffic from the IP address of the failed node to the standby node.
The pairs of active/standby servers act as redundant nodes providing a given set of related services:
- “Front-end Web” -- nova-api, nova-scheduler, keystone, horizon
- “Database and Message Queue Server” -- mysql-server, rabbitmq-server
- “Image Service” -- glance-api, glance-registry
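The active/standby behaviour of the three pairs can be illustrated with a toy model in Python. This is not how heartbeat itself is implemented or configured; it only mimics the decision described above (move the service address to the standby when the active node fails). The node names are hypothetical, and DRBD data replication is not modelled.

```python
class HAPair:
    """Toy model of one active/standby pair of service nodes."""

    def __init__(self, name, services, active, standby):
        self.name = name
        self.services = services
        self.active = active
        self.standby = standby
        self.healthy = {active: True, standby: True}

    def mark_failed(self, node):
        """Record that a node has stopped responding."""
        self.healthy[node] = False

    def check(self):
        """Fail over if the active node is down and the standby is up.

        Returns the node that currently holds the service IP address.
        """
        if not self.healthy[self.active] and self.healthy[self.standby]:
            self.active, self.standby = self.standby, self.active
        return self.active


# Service groupings are from the list above; node names are hypothetical.
PAIRS = [
    HAPair("Front-end Web",
           ("nova-api", "nova-scheduler", "keystone", "horizon"),
           "node-a1", "node-s1"),
    HAPair("Database and Message Queue Server",
           ("mysql-server", "rabbitmq-server"),
           "node-a2", "node-s2"),
    HAPair("Image Service",
           ("glance-api", "glance-registry"),
           "node-a3", "node-s3"),
]

web = PAIRS[0]
web.mark_failed("node-a1")
owner = web.check()
print(f"{web.name} is now served by {owner}")
```

Because each pair fails over independently, losing the front-end node in this model leaves the database and image pairs on their original active nodes.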
CERN
Contact: Tim Bell (firstname.lastname@example.org)
Presentations and Documentation
- San Diego 2012 Summit - http://www.slideshare.net/noggin143/20121017-openstack-accelerating-science
- Overall project description (including other components) - http://cern.ch/go/N8wp
- User guide for the facility is at http://clouddocs.web.cern.ch/clouddocs/
The environment is largely based on Scientific Linux 6, which is Red Hat compatible. We use KVM as our primary hypervisor although tests are ongoing with Hyper-V on Windows Server 2008.
We use the puppetlabs OpenStack modules to configure Nova, Glance, Keystone and Horizon. Puppet is also used widely for guest configuration, with Foreman as a GUI for reporting and VM provisioning.
Users and Groups are managed through Active Directory and imported into Keystone using LDAP.
CLIs are available for Nova and Euca2ools.
Areas currently being investigated
- Block storage for live migration and Cinder
- Integration with CERN Single Sign On
We are currently running around 250 hypervisors with around 1000 VMs.