Launchpad Entry: http://blueprints.launchpad.net/openstack-devel/+spec/openstack-dbaas
Created:
Contributors: Michael Basnight (That dumb guy with glasses), Daniel Salinas (MySQL/OpenVZ idol), Edward Konetzko (Linux Deity), Nirmal Ranganathan (Code Ninja), Tim Simpson (Test Zealot/Voiceover), Chris Sackmann (HP LeftHand/Windows expert, somebody has to do it)
Summary
Database as a Service is a scalable relational database service that allows users to quickly and easily use the features of a relational database without the burden of handling complex administrative tasks. Cloud users and database administrators can provision and manage multiple database instances as needed. Initially, the service will focus on providing resource isolation at high performance while automating complex administrative tasks including deployment, configuration, patching, backups, restores, and monitoring.
Using the container-based virtualization provided by OpenVZ (http://wiki.openvz.org/Main_Page), we are able to provide a superior level of service with QoS guarantees and resource isolation. We effectively prevent the bad neighbor effect while maintaining the level of service needed for a performant MySQL database. An in-rack, network RAID, block-level storage system rounds out the service and allows for seamless migrations and disaster recovery between database racks.
Release Note
We have codenamed the project Red Dwarf. This is both a play on the star theme and an homage to an amazing television show from the late 1980s. For in-depth awesomeness, see http://www.youtube.com/watch?v=LRq_SAuQDec "I toast, therefore I AM"
Rationale
MySQL in a pure virtualization environment is not efficient. Users pay a disk abstraction (VDI) penalty and typically compensate with overly large table caches, which increases the amount of RAM needed just to bypass the disk penalty. The result is RAM overused for caching merely to match the service level that a bare metal or container-based solution would achieve. OpenVZ uses beancounters (http://wiki.openvz.org/UBC) to let admins and applications keep a container from encroaching on its neighbors on a host. We are choosing to mount the MySQL data volumes on a remote SAN using the volume manager. We currently use the HP SAN driver in the volume manager, but any volume driver implementation can be used to provide remote data storage.
User stories
TBD
Assumptions
- A guest agent (nova-guest) must reside within the container or VM. We are currently using package management to install MySQL and the Nova components on the guest, but nothing within the Nova codebase or the OpenVZ branch precludes the user from 1) using another virtualization technology and bypassing OpenVZ, or 2) setting up MySQL and a guest agent within the virtual image before installing it. Once the guest is installed as per the git repo (yet to be defined) and the platform/db API is installed, the functions of the API can be used against any MySQL image/VM.
- Some remote storage mechanism is used. We are using the HP SAN driver, but with the Volume Manager, this is pretty generic. Any remote mount can be used for MySQL data storage.
Design
Database as a service will be a hosted service within Nova. We are using every component of Nova, so any user of this system can customize the drivers they want to use. There is no reason another provider of this service cannot use Xen or KVM to accomplish the same end. Based on our assumptions above, the guest agent is the communication channel to the service within the container/VM.
Guest Agent
Currently we are using the default manager.Manager (or worker, in Nova terms) from service.py. It handles periodic tasks as well as communication with the message bus. Periodic tasks are not very extensible in Nova yet, so this is a change we will have to take on. The guest API is written so that the driver can be switched and a different database server can be used, on Linux today and on Windows in the future. The first supported server is MySQL. As of now (discussion pending on the list) we are using the built-in MySQL user auth and not the authN/Z functionality that is still being discussed. This can change in the future, and it has already been raised with us directly. We welcome any help to make this happen!
The guest is dumb. It is told by the API which services to install, so that in the future we can install multiple technologies inside a single VM. If a user wants a memcache service within the database container, that can be accomplished by preparing the guest to understand memcache. "Prepare" is the API call that installs the packages.
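To make the "dumb guest" idea concrete, here is a minimal sketch of a guest whose prepare call installs whatever the API asks for through switchable drivers. It is not the Red Dwarf code: the class names, the installer callable, and the package names are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class ServiceDriver(ABC):
    """Hypothetical per-service driver; one subclass per technology."""

    @abstractmethod
    def packages(self):
        """Return the package names this service needs."""


class MySqlDriver(ServiceDriver):
    def packages(self):
        return ["mysql-server"]  # package name is an assumption


class MemcacheDriver(ServiceDriver):
    def packages(self):
        return ["memcached"]  # package name is an assumption


class GuestAgent:
    """A 'dumb' guest: it only installs what the API tells it to."""

    def __init__(self, drivers, installer):
        self._drivers = drivers      # service name -> ServiceDriver
        self._installer = installer  # callable taking a list of package names
        self.installed = []

    def prepare(self, services):
        """Install the packages for each requested service, in order."""
        # Reject the whole request before installing anything.
        for name in services:
            if name not in self._drivers:
                raise ValueError("guest does not understand service: %s" % name)
        for name in services:
            self._installer(self._drivers[name].packages())
            self.installed.append(name)
        return list(self.installed)
```

Teaching the guest about a new technology then amounts to registering one more driver; the prepare call itself does not change. In a real guest the installer would shell out to the package manager, while a test can pass in a list's append method to record what would have been installed.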
Package Management
Instead of using custom images to accomplish the same end, we decided to go with a generic image with no Nova components installed. The package management system (possibly a PPA) hosts the packages needed to install the common Nova codebase as well as the Nova guest components. Anyone can use a custom image and install the guest themselves, or use our firstboot methodology (see below) to do so. Package management is also integral to installing MySQL, or any other service such as memcache, in a container. It is not a required component, and custom images can be used instead. The problem with custom images (in our mind, let's talk it out) is that they can be limited to a single service within an image. If consensus can be reached, we can develop a management/provisioning system that Nova proper can use moving forward.
FirstBoot
The 'custom' part of the image here is the firstboot script. A firstboot package (currently part of Red Dwarf) runs when the image first boots and installs the guest and Nova common. It also copies a production nova.conf with the flags the guest needs. The guest connects to both RabbitMQ and the Nova database, in much the same way every manager.Manager does.
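The firstboot flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual firstboot package: the package names, the use of apt-get, the marker file, and the function signature are all hypothetical.

```python
import os
import shutil
import subprocess

# Assumed names: the spec does not fix package names or file locations.
GUEST_PACKAGES = ["nova-common", "nova-guest"]


def firstboot(conf_src, conf_dst, marker, run=subprocess.check_call):
    """Install the guest once, then drop a marker so later boots are no-ops."""
    if os.path.exists(marker):
        return False  # already provisioned on an earlier boot
    # Install Nova common and the guest agent from the package repository.
    run(["apt-get", "install", "-y"] + GUEST_PACKAGES)
    # Put the production nova.conf where the guest expects to find it.
    shutil.copyfile(conf_src, conf_dst)
    with open(marker, "w") as handle:
        handle.write("done\n")
    return True
```

Passing the command runner in as an argument keeps the sketch testable without touching the package manager; a real script would simply use the default.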
Volume Management
In order to provide a robust MySQL service, the data directory should be stored remotely. The overhead of some remote storage systems can slow down reads and writes, but a SAN can be used to provide a highly available, robust level of service. We are using the HP SAN driver in the Volume Manager, but it is up to the operations team installing Red Dwarf to decide which remote storage device to use, and how much they want to spend on such a device!
Implementation
Much of the implementation is finished for the initial create/destroy of containers, databases, and users, as well as enabling and disabling a MySQL root user. This can be found in our cloned branch of cactus (https://github.com/rackspace/reddwarf). The platform API is inspired by the OpenStack API, and the guest agent is inspired by manager.Manager from service.py in Nova. The Design section above explains the reasoning for each part of Red Dwarf; each of its subsections is a major component that we use in addition to stock Nova. Please refer to the current codebase for implementation-specific details. We would like to add some high-level sequence diagrams showing the flows, how we expect to use Nova, and how we build upon it. These are in the works and will be added as they are finished. If there are any points of confusion or contention, we can build this section out and outline the implementation specifics here.
Code Changes
We decided (for now) not to use the extensions API. We have stood up a platform API which encompasses our API functionality. A few Nova internal changes are needed to accomplish this. Nothing major, just a small amount of 'platform api' installation code (diffs/BPs to be provided soon).
Test/Demo Plan
We have created a CI environment using Vagrant and scripts that install packages inside a VirtualBox image from the source you (or Jenkins) provide. This is useful for testing uncommitted changes to the guest/API without harming other developers or checking code in blindly. We are wrapping nose with a small library that allows method-level test chaining and skips tests that depend on a failed test. This is very useful for setting up a Nova environment in a fresh Vagrant box and creating setup dependencies across a suite of tests.
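The chaining-and-skipping behaviour can be illustrated with a small standalone sketch. This is not the library we wrap around nose; the decorator name, the registry, and the result strings are assumptions chosen for the example.

```python
_REGISTRY = []  # (test function, dependencies) in registration order


def chained_test(depends_on=()):
    """Register a test function along with the tests it depends on."""
    def decorator(func):
        _REGISTRY.append((func, tuple(depends_on)))
        return func
    return decorator


def run_all(registry=None):
    """Run tests in registration order; skip any whose dependency did not pass."""
    results = {}
    for func, deps in (_REGISTRY if registry is None else registry):
        if any(results.get(dep.__name__) != "pass" for dep in deps):
            results[func.__name__] = "skip"
            continue
        try:
            func()
            results[func.__name__] = "pass"
        except Exception:
            results[func.__name__] = "fail"
    return results


@chained_test()
def create_container():
    raise AssertionError("container did not start")  # simulated failure


@chained_test(depends_on=[create_container])
def create_database():
    pass  # never runs, because create_container failed


@chained_test()
def list_flavors():
    pass  # independent of the failure above
```

Calling run_all() on these three sample tests reports create_container as failed, create_database as skipped, and list_flavors as passed, so a broken setup step does not produce a wall of misleading failures further down the suite.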
Unresolved issues
- Package management - Should we use package management?
- Guest Agent - The guest is not secure. We are punting on this because we do not allow SSH access, but once a secure guest is created for Nova, we will use it.
- Extensions API - To Extend or not to Extend...
BoF agenda and discussion
Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.
Project Links
Project Pages
Blueprint
Code
Test and Deployment Images