
= DevOps at the Folsom OpenStack Design Summit =

== Accepted Tracks ==

 * Instrumenting OpenStack
  * https://blueprints.launchpad.net/nova/+spec/resource-monitor-alerts-and-notifications
  * https://blueprints.launchpad.net/nova/+spec/utilizationdata
  * https://blueprints.launchpad.net/nova/+spec/cloud-inventory-manager
 * Common image properties
 * OpenStack and Operations: Getting Real
 * High Availability in OpenStack
 * Federated Zones—meta-affinity with ServiceCatalogs
 * Making the configuration of OpenStack easier
 * Base Packaging Guidelines
 * Efficient metering for Nova and Swift

== Ops-related Sessions in Other Tracks ==

 * Ops pain points
 * Improvements to Nova Service Management
 * Setting up an OpenStack CI install
 * Swift Cluster Monitoring With StatsD
 * Performance Testing OpenStack
 * Smoke Testing Realistic Deployments
 * Test Strategy, Processes, and Quality Metrics
 * Automating complex OpenStack deployment testing
 * Dough: OpenStack Billing Project
 * Security improvements in Nova for Folsom
 * Puppet/OpenStack
 * Chef and OpenStack
 * OpenStack client tools (or unified tool)
 * Performance and Caching in Nova
 * High Availability in OpenStack
 * Performance Evaluation for OpenStack
 * Scaling OpenStack
 * Guest agents support and implementation
 * OpenStack Notifications System and Yagi

== Proposed DevOps Tracks ==

 * Instrumenting OpenStack
 * Common image properties
 * OpenStack and Operations: Getting Real
 * High Availability in OpenStack
 * Federated Zones—meta-affinity with ServiceCatalogs
 * Making the configuration of OpenStack easier
 * Base Packaging Guidelines
 * Efficient metering for Nova and Swift

== Refused DevOps-related Tracks ==

 * Templated Jenkins
 * How to make OpenStack rock even more on Ubuntu
 * Enhancing OpenStack for Federated OpenStack
 * Challenges in Enterprise Cloud Deployments
 * Kanyun: OpenStack Monitoring Project
 * Dodai

== Mail List Input ==

From Matt Ray:

 * +1 for providing a documented monitoring API for the various components
  * other tools would know what is exposed and why
 * Closely related: providing standardized logging and documenting error conditions
  * various tools could then be applied to the logs (Splunk, syslog, logstash, etc.)
 * Making OpenStack operationally consistent would be a boon to anyone doing tooling
  * rather than everyone having to rediscover what to look for
  * not sure it calls for another project (because of the cross-cutting concerns)
 * OpenStack operations need to be given more visibility and forethought
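
The standardized-logging point lends itself to a small sketch: if each component emitted one machine-parseable record per line, tools like logstash or Splunk could index fields directly instead of maintaining per-project regexes. This is an illustrative Python example, not an existing OpenStack format; the "JsonFormatter" name and field set are assumptions.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line so log shippers (syslog,
    logstash, Splunk) can parse fields without per-project regexes."""

    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "component": record.name,       # e.g. "nova.compute"
            "event": record.getMessage(),
        })

# Wire the formatter to a stream handler, as a service might at startup.
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("nova.compute")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.warning("instance spawn failed")
```

Documented error conditions would then amount to an agreed vocabulary for the "event" field, rather than free-form text.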

From Tim Bell:

Splitting monitoring into:

 * 1) Gathering of metrics (availability, performance) and reporting them in a standard fashion should be part of OpenStack.
 * 2) Best-practice sensors should sample the metrics and raise alarms for issues that could cause service impacts. Posting of these alarms to a monitoring system should be based on plug-ins.
 * 3) Reference implementations for standard monitoring systems such as Nagios should be available that query the data above and feed it into the selected package.

No individual site wants to be involved in defining the best practice. Equally, no monitoring system should need an intimate understanding of OpenStack to produce a red/green light. The components for 1) and 2) fall under the associated OpenStack component; component 3) belongs to the monitoring solution provider.
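
The three-way split can be sketched in a few lines of Python. All names here ("AlarmBackend", "check_metric", "report") are hypothetical, intended only to show how pluggable alarm posting keeps the sensors independent of whichever monitoring system a site runs.

```python
# Hypothetical sketch of the split above: sensors (item 2) turn sampled
# metrics into red/green statuses, and pluggable backends decide where
# the alarm is posted. Nothing here is an existing OpenStack API.

class AlarmBackend:
    """Plug-in point: posting alarms is backend-specific."""
    def post(self, service, status, detail):
        raise NotImplementedError

class ListBackend(AlarmBackend):
    """Trivial backend that keeps alarms in memory; a Nagios backend
    (item 3) would translate them to passive check results instead."""
    def __init__(self):
        self.posted = []
    def post(self, service, status, detail):
        self.posted.append((service, status, detail))

def check_metric(value, threshold):
    """Best-practice sensor: compare one sampled metric to its threshold."""
    return "green" if value < threshold else "red"

def report(backend, service, metrics, thresholds):
    """Feed sensor results to whichever backend was plugged in."""
    for name, value in metrics.items():
        status = check_metric(value, thresholds[name])
        backend.post(service, status, "%s=%s" % (name, value))

backend = ListBackend()
report(backend, "nova-compute",
       {"api_latency_ms": 250, "queue_depth": 3},
       {"api_latency_ms": 200, "queue_depth": 100})
```

The monitoring system only ever sees (service, status, detail) tuples, so it needs no knowledge of OpenStack internals to show a red/green light.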

From David Kranz:


 * cluster health and monitoring
 * I did a bunch of stuff with Swift before turning to nova
 * really appreciated the way each swift service has a "healthcheck" call that can be used by a monitoring system
 * I don't think providing a production-ready monitoring system should be part of core OpenStack
 * However, it is the core architects who really know what needs to be checked
 * it would be a big improvement for deployers if:
 * each openstack service provided healthcheck apis
 * these were based on expert knowledge of what is supposed to be happening inside
 * this would also insulate deployers from changes in the code that might impact what it means to be running properly
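
As a sketch of the idea, the following shows how a service could expose a WSGI healthcheck endpoint, loosely modeled on Swift's healthcheck middleware (which answers GET /healthcheck with "OK"). The "make_healthcheck_app" helper and the example checks are hypothetical; the point is that the service author encodes the expert knowledge, and the monitoring system only reads an HTTP status.

```python
# Minimal sketch of a per-service healthcheck endpoint (hypothetical
# helper, loosely modeled on Swift's healthcheck middleware).

def make_healthcheck_app(checks):
    """checks maps a name to a zero-argument callable returning True
    when that internal condition is healthy."""
    def app(environ, start_response):
        if environ.get("PATH_INFO") == "/healthcheck":
            failed = [name for name, check in checks.items() if not check()]
            if failed:
                start_response("503 Service Unavailable",
                               [("Content-Type", "text/plain")])
                return [("FAILED: " + ", ".join(failed)).encode()]
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [b"OK"]
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b""]
    return app

# The checks themselves capture what "running properly" means inside;
# these two are placeholders for real internal conditions.
app = make_healthcheck_app({
    "queue_reachable": lambda: True,
    "disk_writable": lambda: True,
})
```

A Nagios or similar probe then only needs to poll the URL and map 200 to green and anything else to red.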

From Matt Joyce:


 * security governance in openstack
 * "encrypt all REST APIs" is not really acceptable
 * there needs to be some standardization of what is supported as an encryption procedure
 * important for people developing hooks into those APIs
 * this is not a problem with the APIs; it's a problem with security governance
 * poorly defined procedure for proposing, discussing, and approving a standard
 * we need something akin to an RFC model
 * possibly based on PEP approach?
 * current process is a mix of
 * developer specific conversations
 * no single authoritative source for information clearly identified
 * on the rare occasion there is, it changes fairly quickly
 * seems to be a lot of "act first, ask for input later" going on in development
 * maybe we can see some integration with groups like the IEEE
 * for specifying some standards for cloud orchestration, e.g.

From Ed Conzel:

I would like to see the topic expanded to propose a new sub-project around operational and host management.

This is one of the areas where OpenStack is weak, and it helps fuel the perception of immaturity mentioned in some recent articles.

I think a formal project for operational and host management capabilities would go a long way toward making OpenStack more attractive and competitive in this space. I know from [professional, customer experience] that having to re-invent the wheel for every deployment is, to say the least, aggravating. System integrators love it the way it is, since they can charge clients for the tools they developed for previous clients. But to make the OpenStack community project more attractive, these tools should be available as part of the full release, IMHO.

From Duncan McGreggor:

 * API pains
 * changes to existing tools
 * the creation of new tools
 * the proposal of a new sub-project around operational and host management

From MikePittaro:

I would like to determine whether there's any interest in adding a feature to log directly to one or more message queues. This would allow distributed logging, with the ability to consolidate all logs into a single view, or into a stream for archival.
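
The mechanism can be sketched with the standard library alone: Python's logging.handlers.QueueHandler decouples log emission from delivery. Swapping the in-process queue.Queue for an AMQP publisher (e.g. to RabbitMQ) is the assumed deployment step, not something shown here, and "nova.scheduler" is just an illustrative logger name.

```python
# Sketch of "log directly to a message queue": QueueHandler puts each
# LogRecord on a queue instead of writing it locally. A real deployment
# would replace queue.Queue with a broker publisher (an assumption).
import logging
import queue
from logging.handlers import QueueHandler

log_queue = queue.Queue()           # stand-in for the message broker
log = logging.getLogger("nova.scheduler")
log.addHandler(QueueHandler(log_queue))
log.setLevel(logging.INFO)

log.info("scheduling instance %s", "i-0001")

# A consolidation service on the other side of the queue can merge
# streams from every host into one view or an archival stream.
record = log_queue.get_nowait()
print(record.getMessage())
```

With a broker in the middle, every host publishes to the same exchange and the consolidated view falls out of a single consumer.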