Difference between revisions of "Trove/cassandra-incremental-backups"

Revision as of 11:54, 4 May 2014

Incremental Backups

Adding support for incremental backups and restores. The main difference between this and the current manual backups will be that these will be deleted when the instance is deleted. This is intended to be run as a backend scheduled task.

https://wiki.openstack.org/wiki/Trove/scheduled-tasks

Storing Metadata

To create incremental backups, a little extra information must be stored with both full and incremental backups.

Add a 'parent_id' field to the backups table in Trove:

         ALTER TABLE `backups`
           ADD COLUMN `parent_id` varchar(36) DEFAULT NULL;
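
Because each incremental backup points at its parent through 'parent_id', the set of backups needed for a restore can be found by following those links back to a full backup. The following is a minimal sketch of that idea, not Trove code: the function name and the in-memory `backups` dict (standing in for rows of the backups table) are hypothetical.

```python
def resolve_backup_chain(backups, backup_id):
    """Return backup IDs ordered from the full backup to `backup_id`.

    `backups` is a hypothetical stand-in for the backups table: a dict
    mapping backup ID -> row dict with an optional 'parent_id' key.
    """
    chain = []
    seen = set()
    current = backup_id
    while current is not None:
        if current in seen:
            # Guard against a corrupted table with a parent_id loop.
            raise ValueError("cycle detected at backup %s" % current)
        if current not in backups:
            raise KeyError("backup %s does not exist" % current)
        seen.add(current)
        chain.append(current)
        # A full backup has parent_id NULL (None here), which ends the walk.
        current = backups[current].get("parent_id")
    chain.reverse()
    return chain


example_backups = {
    "full-1": {"parent_id": None},
    "incr-1": {"parent_id": "full-1"},
    "incr-2": {"parent_id": "incr-1"},
}
example_chain = resolve_backup_chain(example_backups, "incr-2")
```

Here `example_chain` lists the full backup first, followed by each incremental in the order it would have to be applied.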

Use Swift object metadata (http://docs.openstack.org/api/openstack-object-storage/1.0/content/object-metadata.html) to store any extra metadata for the backup:

        curl -X POST -i \
            -H "X-Auth-Token: my-token" \
            -H "X-Object-Meta-Parent: https://storage.com/v1/XXX/path_to_parent_backup" \
            -H "X-Object-Meta-LSN: 123456789" \
            https://storage.com/v1/XXX/path_to_incremental_backup
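
The same headers could be assembled programmatically before the request is sent. This is only a sketch: the helper name is hypothetical, and it builds the header dict without performing any network call. The 'X-Object-Meta-' prefix and the 'Parent' and 'LSN' keys are taken from the curl example above.

```python
def build_swift_metadata_headers(token, metadata):
    """Build the headers for a Swift object-metadata POST.

    Each metadata key is sent as an 'X-Object-Meta-<key>' header.
    The function name and signature are illustrative only.
    """
    headers = {"X-Auth-Token": token}
    for key, value in metadata.items():
        # HTTP header values are strings, so numbers are converted.
        headers["X-Object-Meta-" + key] = str(value)
    return headers


example_headers = build_swift_metadata_headers(
    "my-token",
    {
        "Parent": "https://storage.com/v1/XXX/path_to_parent_backup",
        "LSN": 123456789,
    },
)
```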

Incremental Workflow

API:

Another full or incremental backup must first be made by the user or an automated system. (The ID of this backup is used as the parent ID.)
An API call is made to run an incremental backup (one new optional field: 'parent_id'):

      POST /backups
      {"name": "incremental-backup-name",
       "instance_id": "instance_id",
       "parent_id": "<parent_id>"}

An entry in the backups table is created.
The backup handler looks up the information on the parent backup (and returns an error if it doesn't exist).
A message is sent to the guest to run the backup:

      backup_info = {
          'backup_type': 'incremental',
          'backup_location': <location>,
          'parent': {
              'location': <parent location>,
              'checksum': <parent file checksum>
          }
      }
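
The API-side steps above (look up the parent, fail if it is missing, build the guest message) can be sketched as follows. This is an illustration under stated assumptions rather than the actual handler: the function, the exception class, and the in-memory `backups` dict are hypothetical, and treating a request without 'parent_id' as a 'full' backup is an assumption not spelled out in this page.

```python
class ParentBackupNotFound(Exception):
    """Raised when the requested parent backup does not exist (hypothetical)."""


def build_backup_info(backups, location, parent_id=None):
    """Build the backup_info message sent to the guest.

    `backups` is a hypothetical stand-in for the backups table: a dict
    mapping backup ID -> row dict with 'location' and 'checksum' keys.
    """
    if parent_id is None:
        # Assumption: no parent_id means an ordinary full backup.
        return {"backup_type": "full", "backup_location": location}

    parent = backups.get(parent_id)
    if parent is None:
        raise ParentBackupNotFound(parent_id)

    return {
        "backup_type": "incremental",
        "backup_location": location,
        "parent": {
            "location": parent["location"],
            "checksum": parent["checksum"],
        },
    }


example_table = {
    "parent-1": {"location": "swift://parent", "checksum": "abc123"},
}
example_info = build_backup_info(example_table, "swift://child", "parent-1")
```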

GUEST:

The backup agent will be changed to accept a type passed in by the API.
A new backup type of incremental uses the metadata to perform the restore. For xtrabackup this requires passing an 'lsn' number to the backup command:

        Place all snapshots in the /var/lib/cassandra/data directory and restart Cassandra

After a successful backup, record the metadata along with the other backup info (location, checksum, etc.). For xtrabackup, that will include the last LSN of the backup.

Full backup changes

None

Restore Workflow (metadata in db)

During the create-instance call, if a backup ID is passed in restorePoint, the API will need to change slightly: