Trove/snapshot-design


 * Launchpad Entry: Reddwarf:server-templates
 * Created: 13 Feb 2013
 * Contributors: Don Kehn (dkehn)

= Backups Design =

Overview
The goal of this blueprint is to implement DBaaS backups in Reddwarf, which can in turn support related database capabilities (e.g. backups, cloning, etc.). Backups will be supported in the Reddwarf API (guest_agent) and will provide snapshots that present a consistent view of the database, so that a snapshot can be used for replication setup, for cloning of the master DB, and/or as a backup from which a point-in-time restoration can be performed.

There are many methods of implementing backups, such as mysqldump with the master-data option, Percona's XtraBackup, OpenStack's volume snapshots, etc. Each has its advantages and disadvantages. The design presented here therefore provides a pluggable interface, so that new capabilities can be incorporated as they are developed without changes to the existing code. One entity's SLA may allow only a certain type of snapshot, for example because of the acceptable downtime. A pluggable API lets each deployment pick the best-of-breed mechanism for its needs and so offers the most robust snapshot implementation possible.

Design Goals
The following are the design goals:


 * Pluggable interface to the lower snapshot implementation layer in order to support multiple snapshot mechanisms, e.g. Xtrabackup, mysqldump, OpenStack's volume support, and others. This allows for a variety of snapshot implementations that could define different levels of service. For example:
   * For high availability, you would use Xtrabackup.
   * For a lower level of service, mysqldump.
 * Database support for information pertaining to the backup, e.g. name, time of the backup, current state of the snapshot, tenant information, etc.
 * Use of ACLs to support backup security and to control who can access which backup.

Storage
Backups that are produced as artifacts will be uploaded to Swift. The auth token presented to Reddwarf will be forwarded to Swift, and the snapshot will live within a container in the user's Swift account. It is assumed that the user has the necessary Swift role when attempting to create a snapshot. Reddwarf will perform a HEAD request against Swift before accepting the user's create request.
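The pre-check described above could look roughly like the following Python sketch. It only builds the HEAD request and interprets a status code; it does not contact Swift. The helper names, the `storage_url` and `auth_token` parameters, and the choice to treat any 2xx status as success are assumptions made for illustration and are not part of this blueprint.

```python
import urllib.request


def build_swift_head_request(storage_url, auth_token):
    """Build (but do not send) a HEAD request for the user's Swift account.

    storage_url and auth_token are hypothetical inputs: the account URL from
    the service catalog and the token the user presented to Reddwarf.
    """
    return urllib.request.Request(
        storage_url,
        method="HEAD",
        headers={"X-Auth-Token": auth_token},  # forward the user's token
    )


def swift_access_ok(status_code):
    """Assumption: any 2xx answer means the user can reach their Swift account."""
    return 200 <= status_code < 300
```

A create request would only be accepted when the HEAD request succeeds; otherwise the API could reject it before any agent work starts.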

ACL Design
The ACL design will use the OpenStack Storage (Swift) ACLs. ACLs allow us to have greater control over individual objects and containers without requiring full read/write access to a particular container. Access control is granted on a per-container basis and at the role level: a user creates a container under the role they hold, and other users can be granted access by adding further roles to the container. In our case the following hierarchical model applies. As the diagram below shows, each tenant has one BackupManager role, and many users can be associated with that role. Each of these users can create a snapshot object in Swift and access it in order to restore a VM's database from it. These users have to be pre-assigned to the BackupManager role, prior to use, by a user that has administration duties for that tenant. This way we can ensure security and isolation between tenants.

http://www.scion-eng.com/Pictures/hp/ACLPic1.jpg

Please note that users who are to have the authority to perform backups and restores must be assigned to the BackupManager role before attempting any backup or restore.

Once assigned to the BackupManager role, a user can use their credentials to create a backup and to restore from a previously created backup. In addition, a listing of the available backups performed with those credentials can be obtained, but only for a tenant that the user belongs to.
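As a rough illustration of the role-based model, the sketch below builds the container ACL headers and checks a user's roles. `X-Container-Read` and `X-Container-Write` are the standard Swift container ACL headers; whether a bare role name is accepted as an ACL element depends on the auth middleware in use, so that part is an assumption, as are the helper names.

```python
BACKUP_ROLE = "BackupManager"  # role name taken from the text above


def container_acl_headers(roles=(BACKUP_ROLE,)):
    """Return Swift container ACL headers granting read and write to the roles.

    Assumption: the deployment's auth middleware accepts role names as
    ACL elements.
    """
    acl = ",".join(roles)
    return {"X-Container-Read": acl, "X-Container-Write": acl}


def may_use_backups(user_roles, required_role=BACKUP_ROLE):
    """A user may back up or restore only if pre-assigned the required role."""
    return required_role in user_roles
```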

Pluggable Interface
To support a variety of snapshot mechanisms, a pluggable interface to the underlying capabilities is the most efficient way of extending the snapshot implementation. A directory will be created under ~/reddwarf/reddwarf/guestagent/plugins that will contain the various snapshot implementations, each using its own specific method (e.g. xtrabackup, mysqldump, volume snapshots, etc.).

The call from the Reddwarf API shall expose a type that will be used to load the supported implementation.
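One way to picture the type-to-implementation lookup is the minimal Python sketch below. It uses an in-process registry instead of scanning the plugins directory, which is a simplification; the class names, the `backup` method, and the registry itself are hypothetical and not defined by this blueprint.

```python
_BACKUP_PLUGINS = {}  # maps the API-level type string to a plugin class


def register_backup_plugin(type_name):
    """Class decorator that records a plugin under its type name."""
    def decorator(cls):
        _BACKUP_PLUGINS[type_name] = cls
        return cls
    return decorator


class BackupPlugin:
    """Hypothetical base interface every snapshot mechanism would implement."""

    def backup(self, instance_id):
        raise NotImplementedError


@register_backup_plugin("xtrabackup")
class XtrabackupPlugin(BackupPlugin):
    def backup(self, instance_id):
        # Placeholder: a real plugin would run the backup tool here.
        return "xtrabackup:%s" % instance_id


@register_backup_plugin("mysqldump")
class MysqldumpPlugin(BackupPlugin):
    def backup(self, instance_id):
        # Placeholder: a real plugin would run the backup tool here.
        return "mysqldump:%s" % instance_id


def load_backup_plugin(type_name):
    """Instantiate the plugin registered for the type given in the API call."""
    try:
        return _BACKUP_PLUGINS[type_name]()
    except KeyError:
        raise ValueError("unsupported backup type: %r" % type_name) from None
```

With this shape, adding a new mechanism means adding one registered class; the calling code that handles the API request does not change.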

Database Design
To support snapshots, a table will be created in the reddwarf database to record the following information:


 * TENANT_ID - this will be the most granular predicate of who can access which snapshot from Swift. The TENANT_ID, ROLE_ID, USER, and PASSWORD will be the necessary elements to determine whether you can access a container in Swift.
 * Name - the name of the snapshot; it will be unique within a tenant.
 * Size of the snapshot - this should reflect the entire size of the snapshot. If, for example, the snapshot consists of the data directories, this size should reflect all files involved.
 * Timestamp of the snapshot - the point in time at which the snapshot is consistent with the database it was taken from.
 * State of the snapshot - the current state of the snapshot. During the snapshot process this will be updated by the agent to reflect the state. The possible states are:
   * started - indicates that the API and agent have the request and are starting.
   * running - indicates that the agent is in the process of performing the snapshot; depending on the size of the database, a snapshot request could remain in this state for some time.
   * completed - indicates that the snapshot is complete and the container is in Swift.
   * failed - indicates that an error has occurred; refer to the notes column for details on the failure.
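The four states can be read as a small state machine. The sketch below is one possible interpretation in Python; the blueprint lists only the states, so the allowed transitions (and the function name) are assumptions.

```python
# Assumed transitions: the text names the states but not the legal moves.
ALLOWED_TRANSITIONS = {
    "started": {"running", "failed"},
    "running": {"completed", "failed"},
    "completed": set(),  # terminal
    "failed": set(),     # terminal
}


def advance_state(current, new):
    """Return the new state, or raise if the move is not allowed."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError("illegal transition: %s -> %s" % (current, new))
    return new
```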

An example definition of the database table is as follows:

    DROP TABLE IF EXISTS backups;
    CREATE TABLE backups (
        id            int(10) unsigned NOT NULL auto_increment,
        name          varchar(128) NOT NULL DEFAULT '',
        location      varchar(1024) NOT NULL DEFAULT '',
        tenant_id     varchar(36) NOT NULL DEFAULT '00000000-0000-0000-0000-000000000000',
        bkup_type     varchar(32) NOT NULL DEFAULT '',
        size          float unsigned NOT NULL DEFAULT 0.0,
        deleted       tinyint(1) NOT NULL DEFAULT 0,
        created_ts    timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
        deleted_ts    timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
        state         varchar(32) NOT NULL DEFAULT 'started',
        notes         text,  -- free-text column described below; the type is an assumption
        last_updated  timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
        PRIMARY KEY (id),
        UNIQUE KEY (name, tenant_id),
        KEY idx_location (location),
        KEY idx_bkup_type (bkup_type)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;


 * id - Primary key.
 * name - Unique name identifying the snapshot amongst others within a tenant.
 * location - The location of this snapshot in the Swift system, for easy retrieval.
 * tenant_id - The 36-character TENANT_ID.
 * bkup_type - Defines how the snapshot is represented, e.g. Xtrabackup, mysqldump, LVM, etc.
 * size - The total size of the snapshot in bytes; in the case of xtrabackup this is the total of all the sub-directories, etc.
 * deleted - 1 = the snapshot has been removed from the Swift system, 0 = the snapshot is located in the Swift system.
 * created_ts - Timestamp when the snapshot was created.
 * deleted_ts - Timestamp when the snapshot was deleted.
 * state - The current state of the snapshot. During the snapshot process this will be updated by the agent to reflect the state. The possible states are:
   * started - indicates that the API and agent have the request and are starting the snapshot process.
   * running - the actual snapshot is in progress.
   * completed - the snapshot process has finished.
   * failed - indicates that an error has occurred; refer to the notes column for details on the failure.
 * notes - A free-text area providing further information on the nature of a failure, or further status information during the "started" and "running" states. It will contain the total running time of the snapshot operation when the state is "completed".
 * last_updated - Holds the timestamp of the last update on this record.

State Diagrams for Xtrabackup
To demonstrate the code structure, state diagrams for the various processes are shown below:

Backup request using Xtrabackup


Backup Request Diagram

Restore Request Using Xtrabackup


Restore Snapshot

Backup List


List Backups

= API Spec =

Create Backup
As a Reddwarf User, I need the ability to create a backup of my instance stored in my Swift account so that I can have a full backup of all MySQL data from my instance.

POST /backups

Creates a new instance backup.

Parameters

name - optional - if a name is not specified one will be autopopulated

instanceid - ID of the instance to create a backup from

description - optional - description of the backup

Request Body:

    {
      "backup": {
        "name": "My Backup",
        "instanceid": "4c6ad8e3-d857-46e2-aca4-dcbbfdab8526",
        "description": "Backups for my production database."
      }
    }

Response:

    {
      "backup": {
        "id": "56b80958-cf34-4e9e-b5b0-b84fbe5e4ecc",
        "name": "My Backup",
        "locationRef": "https:/ / / / / ",
        "status": "STARTED"
      }
    }
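A client could assemble the request body for this call along the lines of the following Python sketch. The function name is hypothetical; the field names come from the parameters listed above, and the optional fields are simply omitted when not supplied, on the assumption that the service then fills in a name itself.

```python
import json


def build_create_backup_body(instance_id, name=None, description=None):
    """Return the JSON body for POST /backups.

    Only instanceid is always sent; name and description are optional.
    """
    backup = {"instanceid": instance_id}
    if name is not None:
        backup["name"] = name
    if description is not None:
        backup["description"] = description
    return json.dumps({"backup": backup})
```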

List All Backups
As a Reddwarf User, I need the ability to view a list of all my available backups for all my instances so that I can easily locate previously saved data across all my instances.

GET /backups

Lists all backups for a given tenant id.

Response:

    {
      "backups": [
        {
          "id": "56b80958-cf34-4e9e-b5b0-b84fbe5e4ecc",
          "name": "My Backup",
          "description": "Backups for my production database.",
          "locationRef": "https:/ / / / / ",
          "instanceRef": "https://service/v1.0/1234/instances/28d1b8f3-172a-4f6d-983d-36021508444a",
          "created": "2012-03-28T21:31:02Z",
          "updated": "2012-03-28T21:34:25Z",
          "status": "COMPLETED"
        },
        {
          "id": "56b80958-cf34-4e9e-b5b0-b84fbe5e4eea",
          "name": "My other backup",
          "description": "Backups for my production database.",
          "locationRef": "https:/ / / / / ",
          "instanceRef": "https://service/v1.0/1234/instances/28d1b8f3-172a-4f6d-983d-36021508444a",
          "created": "2012-03-28T21:31:02Z",
          "updated": "2012-03-28T21:34:25Z",
          "deleted": "2012-03-28T21:34:25Z",
          "status": "DELETED"
        },
        {
          "id": "45x453467-df99-4j5k-a4a5-v873455e4wwt",
          "name": "My other backup",
          "description": "Backups for my production database.",
          "locationRef": "https:/ / / / / ",
          "instanceRef": "https://service/v1.0/1234/instances/28d1b8f3-172a-4f6d-983d-36021508444a",
          "created": "2012-03-28T21:31:02Z",
          "updated": "2012-03-28T21:34:25Z",
          "status": "FAILED"
        }
      ]
    }

List Backups for a Specified Instance
As a Reddwarf User, I need the ability to view a list of backups available for a single instance, so that I can locate previously saved data from my instance.

GET /instance/instanceId/backups

Returns the list of backups for the instance.

The response has the same format as the list of all backups for the tenant.
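Since the response format is shared, the per-instance list can be thought of as the tenant-wide list filtered by instance. The Python sketch below illustrates that relationship on already-parsed response data; the function name is hypothetical, and matching on the tail of instanceRef is an assumption about how the reference is formed.

```python
def backups_for_instance(backups, instance_id):
    """Keep only the backups whose instanceRef points at the given instance."""
    suffix = "/instances/" + instance_id
    return [
        b for b in backups
        if b.get("instanceRef", "").rstrip("/").endswith(suffix)
    ]
```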

Delete Backup
As a Reddwarf User, I need the ability to delete a backup, or multiple backups, so that I can remove data no longer needed from my Swift account to save space/money.

DELETE /backups/{uid}

Deletes the specified backup.

Create Instance from Backup
As a Reddwarf User, I need the ability to create a new Reddwarf instance from a snapshot, so that I can quickly create a copy of an existing database instance.

POST /instances

Creates a new database instance.

New Attributes

snapshotRef -

Response Codes: same as current call

Error Codes: same as current call

Description:

Request Body:

    {
      "instance": {
        "flavorRef": "https://service/v1.0/1234/flavors/1",
        "name": "my_db_inst",
        "volume": { "size": 2 },
        "restorePoint": {
          "backupRef": "https://service/v1.0/1234/snapshots/56b80958-cf34-4e9e-b5b0-b84fbe5e4ecc"
        }
      }
    }

The backupRef value may be given either as the full URL shown above or as the bare backup ID, '56b80958-cf34-4e9e-b5b0-b84fbe5e4ecc'.

Response:

    {
      "instance": {
        "created": "2012-01-25T21:53:09Z",
        "flavor": {
          "id": "1",
          "links": [ ... ]
        },
        "hostname": "192.168.1.1",
        "id": "dea5a2f7-3ec7-4496-adab-0abb5a42d635",
        "links": [ ... ],
        "name": "my_db_inst",
        "status": "BUILD",
        "updated": "2012-01-25T21:53:10Z",
        "volume": { "size": 2 }
      }
    }
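Because the request accepts backupRef either as a full URL or as a bare ID, the service needs to reduce both forms to the backup ID. A minimal Python sketch of that normalization follows; the function name is hypothetical, and it assumes the ID is always the last path segment of the URL form.

```python
from urllib.parse import urlparse


def backup_id_from_ref(backup_ref):
    """Return the backup ID from either a full backupRef URL or a bare ID."""
    # For a bare ID, urlparse puts the whole string into .path.
    path = urlparse(backup_ref).path
    return path.rstrip("/").rsplit("/", 1)[-1]
```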