Trove/scheduled-tasks


 * Created: 13 Aug 2013
 * Contributors: Craig Vyvial (cp16net)

Update from Trove mid-cycle (Kilo) in Seattle.
A number of comments were made about this specification and the proposed implementation. These comments and the discussion were recorded at https://etherpad.openstack.org/p/trove-kilo-sprint-scheduled-tasks and an action item was given to dougshelley66 to review this specification and revise it based on our current understanding of the proposed scope.

= Automated/Scheduled Backups Design =

Overview
The goal of this blueprint is to implement capabilities in Trove to support automated/scheduled backups. This will be supported in the Trove API as well as the guest agent to use the existing snapshot-design in an automated way. The automated backups shall allow a user to restore or clone to a new instance at a point in time.

There are many different concerns when attempting to automate the backup of instances in a deployment, e.g. Stagger Backups, Maintenance Time Window, and others. The design of this automated system should allow each of these concerns to be addressed. This shall lead to a pluggable or configurable interface that allows a deployer to choose which type of strategy they would like to use without a code change.

The user shall define a Maintenance Time Window in which the automated backup maintenance shall occur. The Maintenance Time Window can be generalized as a maintenance window in which a user would like scheduled backups, updates, or other maintenance-related operations to occur.

Handling automated backup strategies in a pluggable way requires a Trove scheduler that can determine, with simple or complex logic, the time within the general maintenance window at which the backup must occur. The scheduler shall send a message to the guest agent to run the automated backup.
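As an illustration of the pluggable scheduling idea, the following is a minimal sketch and not part of the proposed implementation. The class names, the `STRATEGIES` registry, and `load_strategy` are all hypothetical; the sketch only assumes that a strategy receives an instance id and a window and returns a run time inside that window.

```python
import hashlib
from abc import ABC, abstractmethod
from datetime import datetime, timedelta, timezone


class BackupStrategy(ABC):
    """Hypothetical plug-in interface: pick a run time inside a window."""

    @abstractmethod
    def pick_run_time(self, instance_id, start, end):
        """Return a datetime in [start, end) for the given instance."""


class WindowStartStrategy(BackupStrategy):
    """Simple logic: run as soon as the window opens."""

    def pick_run_time(self, instance_id, start, end):
        return start


class StaggeredStrategy(BackupStrategy):
    """Spread instances across the window using a stable hash of the id."""

    def pick_run_time(self, instance_id, start, end):
        span = int((end - start).total_seconds())
        if span <= 0:
            return start
        digest = hashlib.sha256(instance_id.encode("utf-8")).digest()
        # The same instance always gets the same offset within the window.
        offset = int.from_bytes(digest[:8], "big") % span
        return start + timedelta(seconds=offset)


# Hypothetical registry; a deployer would select a key through configuration.
STRATEGIES = {
    "window_start": WindowStartStrategy,
    "staggered": StaggeredStrategy,
}


def load_strategy(name):
    """Instantiate the strategy registered under the configured name."""
    return STRATEGIES[name]()


if __name__ == "__main__":
    window_start = datetime(2012, 3, 28, 22, 0, tzinfo=timezone.utc)
    window_end = datetime(2012, 3, 28, 23, 0, tzinfo=timezone.utc)
    strategy = load_strategy("staggered")
    print(strategy.pick_run_time("56b80958-cf34-4e9e-b5b0-b84fbe5e4ecc",
                                 window_start, window_end))
```

Switching between strategies is then a configuration change rather than a code change, which is the property the design asks for.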

cp16net (talk) 15:07, 15 August 2013 (UTC) There is a debate on where the scheduled tasks will be run from: either the scheduler sends them to the guest, or the guest handles the runs itself. Either way there will be messages sent back and forth on the bus.

Design Goals
The following are the design goals:


 * Pluggable interface to determine the type of strategy(s) to support. (Stagger Backups, Maintenance Time Window, and others)
 * Manage network bandwidth and connections from backups
 * Database support for the schedule set and listing of the backups
 * Maintenance window of when to run scheduled tasks like backups

Maintenance Time Window
The Maintenance Time Window can be set by the user for when they would like package updates, backups of their data, service restarts, system maintenance, or other related tasks to occur. This shall allow users, rather than the deployer, to determine when a disruption of the service occurs.

The user shall be able to create multiple time windows with different types that allow for specific types of maintenance or tasks to occur when the user specifies.

New Objects

 * Type of Scheduled Tasks
 * Scheduled Tasks

API Spec
User shall be able to create a new scheduled task for an instance and have basic CRUD operations on the scheduled task.

Type of Scheduled Tasks
The Schedule type needs to be configurable from the mgmt side but visible to the user.

GET /scheduledtasktypes

Response:

    {
        "scheduledtasktypes": [
            {
                "type": "backup",
                "description": "backup description",
                "enabled": true
            },
            {
                "type": "restart",
                "description": "restart description",
                "enabled": false
            },
            {
                "type": "update",
                "description": "update description",
                "enabled": true
            }
        ]
    }

cp16net (talk) 20:54, 22 August 2013 (UTC) May need some extra fields about each task type here. (optional)

The enabled flag could be controlled by the deployer if they would like to stop certain tasks from happening during a maintenance period.

Create a Scheduled Task
User shall be able to create a new scheduled task on an instance.

POST /scheduledtasks Creates a new scheduled task for an instance.

Parameters

name - optional - if a name is not specified one will be autopopulated

instance_id - the instance against which to run the task

enabled - True if the task is enabled, False if not

type - ID of the type of schedule

frequency - Frequency of the task. (hourly|daily|weekly|monthly)

time_window - Time window of when task can occur

description - optional - description of the schedule

metadata - optional - a dictionary of metadata

cp16net (talk) 21:49, 20 August 2013 (UTC) Some of this data needs to be ironed out still. The frequency could be moved to a metadata block. Notification recipient could be a list instead of a single email address.

Jimbobhickville (talk) 19:46, 19 December 2013 (UTC) I disagree on the frequency being moved to metadata, and I don't think notifications belongs there either. Anything that describes the task belongs in the main body, the metadata should be things that are only applicable to a specific task type. Rather than a single field for time_window, shouldn't we use two fields so we can still use date math?

Jimbobhickville (talk) 19:42, 15 January 2014 (UTC) After some discussion, the notification parameters were removed, to be replaced by broadcast messages that external programs like ceilometer can listen for and handle notifying the customer.

Request:

    {
        "scheduledtask": {
            "name": "My Auto Backup",
            "enabled": "True|False",
            "type": "1",
            "frequency": "hourly|daily|weekly|monthly",
            "time_window": "2012-03-28T22:00Z/2012-03-28T23:00Z",
            "description": "Auto Backup for my production database.",
            "metadata": {
                "retentionPeriod": "7|14|21|28"
            }
        }
    }

Response:

    {
        "scheduledtask": {
            "id": "56b80958-cf34-4e9e-b5b0-b84fbe5e4ecc",
            "name": "My Auto Backup",
            "enabled": "True|False",
            "type": "1",
            "frequency": "hourly|daily|weekly|monthly",
            "time_window": "2012-03-28T22:00Z/2012-03-28T23:00Z",
            "description": "Auto Backup for my production database.",
            "metadata": {
                "retentionPeriod": "7|14|21|28"
            }
        }
    }
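The `time_window` value in the example is a single string holding a start and an end timestamp separated by `/`. A rough sketch of how a create request might be checked is given below. The function names are hypothetical, and the set of required fields is an assumption drawn from the parameters above that are not marked optional; the specification does not define validation rules.

```python
from datetime import datetime, timezone

# Frequencies listed in the parameter description above.
ALLOWED_FREQUENCIES = {"hourly", "daily", "weekly", "monthly"}

# Timestamp layout used in the examples, e.g. "2012-03-28T22:00Z".
_TS_FORMAT = "%Y-%m-%dT%H:%MZ"


def parse_time_window(value):
    """Split '<start>/<end>' into two timezone-aware UTC datetimes."""
    start_text, sep, end_text = value.partition("/")
    if not sep:
        raise ValueError("time_window must look like '<start>/<end>'")
    start = datetime.strptime(start_text, _TS_FORMAT).replace(tzinfo=timezone.utc)
    end = datetime.strptime(end_text, _TS_FORMAT).replace(tzinfo=timezone.utc)
    if end <= start:
        raise ValueError("time_window end must be after its start")
    return start, end


def validate_create_body(body):
    """Check a create request body and return the inner task dictionary.

    Assumption: every parameter not marked optional is required.
    """
    task = body["scheduledtask"]
    for field in ("instance_id", "type", "frequency", "time_window"):
        if field not in task:
            raise ValueError("missing required field: %s" % field)
    if task["frequency"] not in ALLOWED_FREQUENCIES:
        raise ValueError("unsupported frequency: %s" % task["frequency"])
    parse_time_window(task["time_window"])
    return task


if __name__ == "__main__":
    start, end = parse_time_window("2012-03-28T22:00Z/2012-03-28T23:00Z")
    print(end - start)
```

Parsing the window into two datetimes keeps date math available even though the API carries it as one field. Note that this sketch rejects an hour value of 24, so a window ending at `T24:00Z` would need to be written as `T00:00Z` of the following day.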

Update a Scheduled Task
User shall be able to update an existing scheduled task on an instance.

PUT /scheduledtasks/{id} Updates an existing scheduled task.

Parameters

name - optional - if a name is not specified one will be autopopulated

enabled - optional - True if the task is enabled, False if not

frequency - optional - Frequency of the task. (hourly|daily|weekly|monthly)

time_window - optional - Time window of when task can occur

description - optional - description of the schedule

metadata - optional - metadata used by the task that runs

Request:

    {
        "scheduledtask": {
            "name": "My Auto Backup",
            "enabled": "True|False",
            "type": "1",
            "frequency": "hourly|daily|weekly|monthly",
            "time_window": "2012-03-28T22:00Z/2012-03-28T23:00Z",
            "description": "Auto Backup for my production database.",
            "metadata": {
                "retentionPeriod": "7|14|21|28"
            }
        }
    }

Response:

    {
        "scheduledtask": {
            "id": "56b80958-cf34-4e9e-b5b0-b84fbe5e4ecc",
            "name": "My Auto Backup",
            "enabled": "True|False",
            "type": "1",
            "frequency": "hourly|daily|weekly|monthly",
            "time_window": "2012-03-28T22:00Z/2012-03-28T23:00Z",
            "description": "Auto Backup for my production database.",
            "metadata": {
                "retentionPeriod": "7|14|21|28"
            }
        }
    }

Get a Scheduled Task
User shall be able to get details of a scheduled task.

GET /scheduledtasks/{id} Shows the scheduled task.

Response:

    {
        "scheduledtask": {
            "id": "56b80958-cf34-4e9e-b5b0-b84fbe5e4ecc",
            "name": "My Auto Backup",
            "enabled": "True|False",
            "type": "1",
            "frequency": "hourly|daily|weekly|monthly",
            "time_window": "2012-03-28T22:00Z/2012-03-28T23:00Z",
            "description": "Auto Backup for my production database.",
            "metadata": {
                "retentionPeriod": "7|14|21|28"
            }
        }
    }

Get list of Scheduled Tasks
User shall be able to list all of the scheduled tasks on an instance.

GET /instance/{id}/scheduledtasks Shows a list of scheduled tasks for an instance.

Response:

    {
        "scheduledtasks": [
            {
                "id": "11111111-cf34-4e9e-b5b0-b84fbe5e4ecc",
                "name": "My Auto Backup",
                "enabled": "True|False",
                "type": "1",
                "frequency": "hourly|daily|weekly|monthly",
                "time_window": "2012-03-28T22:00Z/2012-03-28T23:00Z",
                "description": "Auto Backup for my production database.",
                "metadata": {
                    "retentionPeriod": "7|14|21|28"
                }
            },
            {
                "id": "22222222-cf34-4e9e-b5b0-b84fbe5e4ecc",
                "name": "My Update Period",
                "enabled": "True|False",
                "type": "3",
                "frequency": "hourly|daily|weekly|monthly",
                "time_window": "2012-03-28T23:00Z/2012-03-28T24:00Z",
                "description": "Updates can be applied to my production database.",
                "metadata": {
                    "retentionPeriod": "7|14|21|28"
                }
            }
        ]
    }

Delete a Scheduled Task
User shall be able to delete a scheduled task.

DELETE /scheduledtasks/{id} Deletes a scheduled task.

Response: 202 Accepted

RPC Message Communication
This is the contract between the scheduler and the guest for running tasks and for receiving replies from the guest about the tasks being run.

Run a Backup
This is the message payload that will be sent to the guest to execute a backup job.

    {
        "backup": {
            "id": "11111111-cf34-4e9e-b5b0-b84fbe5e4ecc",
            "auth_token": "0a9sdf809sdu09sjfaisduf098ublahblahblah",
            "destination_url": "customerbackupdir",
            "type": "xtrabackup|mysqldump|other"
        }
    }

Guest Update on Backup
    {
        "backup": {
            "id": "11111111-cf34-4e9e-b5b0-b84fbe5e4ecc",
            "status": "(Success|Failure)",
            "reason": "Some explanation of the failure or success",
            "size_mb": 123,
            "md5": "some-md5-check-sum"
        }
    }
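To make the two backup payloads concrete, here is a small sketch of how the scheduler side might build the request and interpret the guest's reply. The helper names and the returned dictionary shape are hypothetical; only the message field names come from the payloads above, and no particular messaging library is assumed.

```python
def build_backup_message(backup_id, auth_token, destination_url, backup_type):
    """Build the payload asking the guest to run a backup."""
    return {
        "backup": {
            "id": backup_id,
            "auth_token": auth_token,
            "destination_url": destination_url,
            "type": backup_type,
        }
    }


def handle_backup_update(message):
    """Interpret the guest's status reply for a backup.

    Returns a plain dictionary (hypothetical shape) that a scheduler
    could use to update its own record of the backup.
    """
    update = message["backup"]
    status = update["status"]
    if status not in ("Success", "Failure"):
        raise ValueError("unexpected backup status: %r" % status)
    return {
        "id": update["id"],
        "succeeded": status == "Success",
        "reason": update.get("reason", ""),
        # Size and checksum may be absent when the backup failed.
        "size_mb": update.get("size_mb"),
        "md5": update.get("md5"),
    }


if __name__ == "__main__":
    reply = {
        "backup": {
            "id": "11111111-cf34-4e9e-b5b0-b84fbe5e4ecc",
            "status": "Success",
            "reason": "Backup completed",
            "size_mb": 123,
            "md5": "some-md5-check-sum",
        }
    }
    print(handle_backup_update(reply))
```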

Run an Update
This is the message payload that will be sent to the guest to execute an update job.

    {
        "update": {
            "type": "packages"
        }
    }

Guest Update on Update
    {
        "update": {
            "status": "(Success|Failure)",
            "reason": "Some explanation of the failure or success"
        }
    }

Scheduled Task Schema
One new entity will be created in the trove database: scheduled_tasks. This entity will store the scheduled task data that is supplied by the user.

Scheduled Task (scheduled_tasks)

This table will contain the task's id, name, description, task type, and enabled flag, along with the tenant id and instance id the task belongs to.
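One possible shape for this table is sketched below using SQLite from the Python standard library, purely for illustration. The column types, constraints, and helper functions are assumptions. Only the columns named above are included; the specification does not say where frequency, time window, or metadata are stored, so they are left out here.

```python
import sqlite3

# Columns follow the list above; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE scheduled_tasks (
    id          TEXT PRIMARY KEY,
    name        TEXT,
    description TEXT,
    tenant_id   TEXT NOT NULL,
    instance_id TEXT NOT NULL,
    type        TEXT NOT NULL,
    enabled     INTEGER NOT NULL DEFAULT 1
)
"""


def create_schema(conn):
    """Create the scheduled_tasks table on the given connection."""
    conn.execute(SCHEMA)


def add_task(conn, task_id, name, description, tenant_id, instance_id,
             task_type, enabled=True):
    """Insert one scheduled task row."""
    conn.execute(
        "INSERT INTO scheduled_tasks "
        "(id, name, description, tenant_id, instance_id, type, enabled) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (task_id, name, description, tenant_id, instance_id, task_type,
         int(enabled)),
    )


def list_tasks_for_instance(conn, instance_id):
    """Return the tasks of one instance, ordered by name."""
    rows = conn.execute(
        "SELECT id, name, type, enabled FROM scheduled_tasks "
        "WHERE instance_id = ? ORDER BY name",
        (instance_id,),
    )
    return [
        {"id": r[0], "name": r[1], "type": r[2], "enabled": bool(r[3])}
        for r in rows
    ]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    add_task(conn, "11111111-cf34-4e9e-b5b0-b84fbe5e4ecc", "My Auto Backup",
             "Auto Backup for my production database.", "tenant-1",
             "instance-1", "1")
    print(list_tasks_for_instance(conn, "instance-1"))
```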

API Spec - Scheduled Task Creation

 * drewford: Time Window value is shown as a timestamp. Does the granularity of the "time_window" value change depending on your choice for frequency?