
Trove/incremental-backups

Revision as of 15:55, 20 November 2013 by Robert Myers (talk | contribs)

Incremental Backups

Adding support for incremental backups and restores. The main difference from the current manual backups is that incremental backups will be deleted when the instance is deleted. This is intended to be run as a backend scheduled task.

https://wiki.openstack.org/wiki/Trove/scheduled-tasks

Storing Metadata

In order to create incremental backups we need to store a little extra information with both full and incremental backups. There are two options for storing this metadata.

  1. Add metadata columns to the backups table in trove:
ALTER TABLE `backups`
    ADD COLUMN `meta` varchar(1024) DEFAULT NULL,
    ADD COLUMN `parent_id` varchar(36) DEFAULT NULL,
    ADD COLUMN `automated` tinyint(1) DEFAULT 0;
  • PRO: More secure, as the data will be stored only in the trove database.
  • CON: The metadata will need to be passed to and from the guest, making the logic more complex.
  2. Use swift object metadata (http://docs.openstack.org/api/openstack-object-storage/1.0/content/object-metadata.html) to store any extra metadata for the backup:
curl -X POST -i \
    -H "X-Auth-Token: my-token" \
    -H "X-Object-Meta-Parent: https://storage.com/v1/XXX/path_to_parent_backup" \
    -H "X-Object-Meta-LSN: 123456789" \
    https://storage.com/v1/XXX/path_to_incremental_backup
  • PRO: No database change, and no need to pass extra info to the guest: a simple HEAD call on the swift object returns the metadata, and the guest can easily update the metadata on the swift objects it writes.
  • CON: Less secure, as users can change these fields (though the field could be encrypted with the same key as the backups).
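With option 2, the guest would set and read these headers itself. A minimal sketch of that round trip, assuming the header names from the curl example above (build_meta_headers and parse_backup_meta are hypothetical helpers; real Trove code would go through python-swiftclient rather than plain dicts):

```python
def build_meta_headers(parent_location, lsn):
    """Headers to POST on the incremental backup's swift object."""
    return {
        'X-Object-Meta-Parent': parent_location,
        'X-Object-Meta-LSN': str(lsn),
    }

def parse_backup_meta(head_response):
    """Extract backup metadata from a swift HEAD response (a header dict)."""
    # Normalize case, since swift returns header names lowercased.
    headers = {k.lower(): v for k, v in head_response.items()}
    return {
        'parent_location': headers.get('x-object-meta-parent'),
        'lsn': headers.get('x-object-meta-lsn'),
    }
```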

Incremental Workflow

API:

  1. A full backup must be made by the user or an automated system (this backup's ID is used as the parent ID).
  2. An API call is made to run an incremental backup (two new optional fields):
POST /backups
{'backup_type': 'incremental',
 'instance_id': '<instance_id>',
 'parent_id': '<parent_id>'}
  3. An entry in the backups table is created.
  4. The backup handler looks up the information on the parent backup (and errors if it doesn't exist).
  5. A message is sent to the guest to run the backup:
backup_info = {
    'backup_type': 'incremental',
    'meta': <meta_info_from_parent>,
    'parent_location': <parent_location>,
    'parent_id': <parent_id>}
  
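Steps 4 and 5 above can be sketched as follows (build_backup_info and the backups_by_id lookup are hypothetical stand-ins for the backup handler and the backups table, assuming the option 1 schema):

```python
class ParentNotFound(Exception):
    """Raised when the requested parent backup does not exist."""

def build_backup_info(backup, backups_by_id):
    """Build the message sent to the guest for an incremental backup.

    backups_by_id stands in for the backups-table lookup; each record is
    a dict with 'id', 'location' and 'meta' fields (an assumed layout).
    """
    parent = backups_by_id.get(backup['parent_id'])
    if parent is None:
        raise ParentNotFound(backup['parent_id'])
    return {
        'backup_type': 'incremental',
        'meta': parent['meta'],
        'parent_location': parent['location'],
        'parent_id': parent['id'],
    }
```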

GUEST:

  1. The backup agent will be changed to accept a type passed in by the API.
  2. A new backup type of incremental uses the metadata to perform the incremental backup. For xtrabackup this requires passing an LSN to the backup command:
/usr/bin/innobackupex  --stream --ibbackup=xtrabackup \
   --incremental --incremental-lsn=%(lsn)s
  3. After a successful backup, record the metadata along with the other backup info (location, checksum, etc.). For xtrabackup that will include the last LSN of the backup.
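Rendering the incremental command from the parent's stored metadata might look like this (a sketch, assuming the metadata carries the parent's last LSN under an 'lsn' key; the exact layout is still open):

```python
def incremental_backup_cmd(meta):
    """Build the xtrabackup incremental command for a given parent LSN.

    meta is the parent backup's stored metadata; the 'lsn' key is an
    assumption about its layout.
    """
    cmd = ('/usr/bin/innobackupex --stream --ibbackup=xtrabackup '
           '--incremental --incremental-lsn=%(lsn)s')
    return cmd % {'lsn': meta['lsn']}
```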

Full backup changes (innobackupex)

Change the current full backup process to record the metadata produced by the backup process.

def check_process():
    # check the innobackupex output for 'completed OK!'
    # parse the last LSN out of the output
    # update the meta data property on the backup
    return True

Restore Workflow (metadata in db)

During the create instance call, if a backup ID is passed in restorePoint, the API will need to change slightly:

API:

  1. Check that the backup file exists and the checksum matches.
  2. Check if the backup has a parent:
     a. Check that the parent exists and its checksum matches, and add it to the backup_info object.
     b. Check if the parent backup has a parent, and repeat.
  3. Send the backup_info to the guest to do the restore.
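The parent walk in step 2 amounts to following parent_id links until the full backup is reached. A sketch (backup_chain is a hypothetical helper; backups_by_id stands in for the backups table):

```python
def backup_chain(backup_id, backups_by_id):
    """Return backups in restore order: the full backup first, then
    each incremental down to the requested backup."""
    chain = []
    current = backups_by_id[backup_id]
    while current is not None:
        chain.append(current)
        parent_id = current.get('parent_id')
        current = backups_by_id.get(parent_id) if parent_id else None
    chain.reverse()  # walk collected child-first; restore needs root-first
    return chain
```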

GUEST:

  1. Read the backup_info and select the correct restore logic by backup_type.
  2. Perform the restore on all parents, if they exist.
  3. Perform the restore on the backup.
  4. Finish the guest install.