Trove/Specs/Trove-v1-MySQL-Replication

Description
Providing support for the various replication use cases is critical for use of Trove in production. For the first-phase implementation of replication in Trove, we will implement the functionality laid out in the Trove V1 Replication Blueprint.

Use Case Summary
The following use cases will be addressed by this V1 implementation:

A. Read Replicas (Slaves)
 * 1) The master can exist before the slave such that the master already contains data
 * 2) N slaves can be created for one master. (slicknik: To clarify, the v1 implementation will allow for this but will require N separate create calls. We may optimize this in a later implementation.)
 * 3) Slaves can be marked read-only (read-only will be default)
 * 4) A slave can be detached from its master to act as an independent site
 * 5) A pre-existing non-replication site can become the master of a new slave
 * 6) The health of a slave will be monitor-able by third party apps.

Trove API
The REST API will be extended to support:
 * creating a new instance as a replication slave of an existing instance
 * detaching a slave from its master such that it becomes a stand-alone instance

Create Instance (Master)
There is no explicit action to create a master: any existing instance can be used as the replication source when creating a new slave.

For reference, here is a sample call to create a MySQL instance.

Request:
POST /instances
{
  "instance": {
    "name": "products",
    "datastore": { "type": "mysql", "version": "5.5" },
    "configuration": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
    "flavorRef": "7",
    "volume": { "size": 1 }
  }
}

Response:
{
  "instance": {
    "status": "BUILD",
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "name": "products",
    "created": "...",
    "updated": "...",
    "links": [{...}],
    "datastore": { "type": "mysql", "version": "5.5" },
    "configuration": {
      "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
      "links": [{...}]
    },
    "flavor": { "id": "7", "links": [{...}] },
    "volume": { "size": 1 }
  }
}

Create Slave
A replication slave is created as a new instance with a 'slaveOf' reference to an existing instance (which will become the master).

Request:
POST /instances
{
  "instance": {
    "name": "products-s1",
    "datastore": { "type": "mysql", "version": "5.5" },
    "slaveOf": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "configuration": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
    "flavorRef": "7",
    "volume": { "size": 1 }
  }
}

Response:
{
  "instance": {
    "status": "BUILD",
    "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
    "name": "products-s1",
    "created": "...",
    "updated": "...",
    "links": [{...}],
    "datastore": { "type": "mysql", "version": "5.5" },
    "slaveOf": {
      "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
      "links": [{...}]
    },
    "configuration": {
      "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
      "links": [{...}]
    },
    "flavor": { "id": "7", "links": [{...}] },
    "volume": { "size": 1 }
  }
}
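
As an illustration only, the create-slave request body could be assembled as in the following Python sketch. The helper name build_create_slave_request and its parameters are hypothetical and not part of Trove; only the field names ('slaveOf', 'flavorRef', and so on) come from the request shown in this section.

```python
import json


def build_create_slave_request(name, master_id, flavor_ref, volume_size,
                               datastore_type="mysql", datastore_version="5.5",
                               configuration=None):
    """Return the body for POST /instances that creates a replication slave.

    Hypothetical helper: the field names mirror the sample request in this
    spec; the function itself is not an existing Trove API.
    """
    instance = {
        "name": name,
        "datastore": {"type": datastore_type, "version": datastore_version},
        "slaveOf": master_id,  # id of the existing instance that becomes the master
        "flavorRef": flavor_ref,
        "volume": {"size": volume_size},
    }
    if configuration is not None:
        instance["configuration"] = configuration
    return {"instance": instance}


body = build_create_slave_request(
    name="products-s1",
    master_id="dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    flavor_ref="7",
    volume_size=1,
    configuration="b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
)
print(json.dumps(body, indent=2))
```

Omitting master_id's field entirely would yield the ordinary create request from the previous section, which is why no separate "create master" call is needed.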

Stop Replication
POST /instances/{id}/action
{
  "detach_replication": {}
}

Notes:
 * {id} in the resource URI is the id of a replication slave instance
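
A minimal sketch of how a client might form this call is shown below. The function build_detach_request is a hypothetical name used for illustration; the path and the 'detach_replication' action body are taken from the request above.

```python
import json


def build_detach_request(slave_id):
    """Return (method, path, body) for detaching the given slave instance.

    Hypothetical helper; it only formats the request described in this spec
    and does not send anything.
    """
    path = "/instances/%s/action" % slave_id
    body = json.dumps({"detach_replication": {}})
    return "POST", path, body


method, path, body = build_detach_request("061aaf4c-3a57-411e-9df9-2d0f813db859")
print(method, path, body)
```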

New Commands

 * Detach

No additional arguments are required for a 'detach' operation.

Updated Commands

 * Create

An optional argument (corresponding to the 'slaveOf' field in the API request) is used to indicate that the new instance should be configured as a slave of the specified master instance.


 * Show

The 'show' command will be updated to indicate whether the specified instance is a replication master or slave.

trove show

+-------------------+--------------------------------------+
| Property          | Value                                |
+-------------------+--------------------------------------+
| created           | 2014-05-27T18:21:57                  |
| datastore         | mysql                                |
| datastore_version | mysql-5.5                            |
| flavor            | 100                                  |
| id                | 93832783-0993-48e0-a0ab-7b996818b7cc |
| name              | test1                                |
| slaves            | 061aaf4c-3a57-411e-9df9-2d0f813db859 |
| status            | ACTIVE                               |
| updated           | 2014-05-27T18:48:05                  |
| volume            | {u'used': 0.11, u'size': 3}          |
+-------------------+--------------------------------------+

trove show

+-------------------+--------------------------------------+
| Property          | Value                                |
+-------------------+--------------------------------------+
| created           | 2014-05-27T18:21:57                  |
| datastore         | mysql                                |
| datastore_version | mysql-5.5                            |
| flavor            | 100                                  |
| id                | 93832783-0993-48e0-a0ab-7b996818b7cc |
| name              | test1                                |
| slaveOf           | dfbbd9ca-b5e1-4028-adb7-f78643e17998 |
| status            | ACTIVE                               |
| updated           | 2014-05-27T18:48:05                  |
| volume            | {u'used': 0.11, u'size': 3}          |
+-------------------+--------------------------------------+

Notes:
 * Only immediate links will be included in the 'show' output. (In future iterations it may be necessary to add new commands to view more complex topologies.)
 * Exact rendering of show output is subject to change; content is intended to be representative.

Taskmanager
The taskmanager will implement 2 API calls:


 * create_instance will be updated to support the additional 'slaveOf' argument
 * detach_replication(slave_instance)

taskmanager.create_instance
The create instance task will be updated to handle creating a slave. When a master instance is specified (via the slave_of parameter):


 * 1) execute guestagent.get_replication_snapshot on the master site, receiving the master snapshot metadata
 * 2) use the master snapshot to create a new instance with a copy of the master's data (via restore functionality)
 * 3) execute guestagent.attach_replication_slave on the new instance
 * 4) delete the replication snapshot from Swift
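
The four steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the taskmanager implementation: FakeGuest, create_slave, provision_from_snapshot, and delete_snapshot are hypothetical names, the guest methods follow the guestagent method names listed in this spec, and cleaning up the snapshot even when a step fails is a design choice of the sketch rather than something the spec states.

```python
class FakeGuest:
    """Hypothetical stand-in for a guestagent client; records calls made to it."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def get_replication_snapshot(self):
        self.log.append((self.name, "get_replication_snapshot"))
        # Illustrative metadata only; the real content is datastore-specific.
        return {
            "master": {"host": "192.168.0.1", "port": 3306},
            "dataset": {"snapshot_href": "swift://example/snapshot"},
        }

    def attach_replication_slave(self, snapshot):
        self.log.append((self.name, "attach_replication_slave"))


def create_slave(master_guest, provision_from_snapshot, delete_snapshot):
    """Create a slave from a master, following the four steps in the spec."""
    snapshot = master_guest.get_replication_snapshot()         # step 1
    try:
        slave_guest = provision_from_snapshot(snapshot)        # step 2 (restore)
        slave_guest.attach_replication_slave(snapshot)         # step 3
    finally:
        # Step 4. Assumption: the snapshot is removed even if step 2 or 3 fails.
        delete_snapshot(snapshot["dataset"]["snapshot_href"])
    return slave_guest


log = []
deleted = []
master = FakeGuest("master", log)
slave = create_slave(master, lambda snap: FakeGuest("slave", log), deleted.append)
print(log)
print(deleted)
```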

taskmanager.detach_replication
Executes guestagent.detach_replication_slave for the selected instance, then removes the slaveOf reference from the instance record.

Trove GuestAgent
There will be 4 new methods added to the guestagent API:


 * get_replication_snapshot
 * attach_replication_slave
 * detach_replication_slave
 * demote_replication_master

Replication will be focused around a replication snapshot. This snapshot will contain the data necessary to set up a slave to replicate from the site which created the snapshot, typically a URI to the user's data set stored in Swift plus the metadata required to coordinate replication.

Each datastore implementation will need to implement these methods. The content of the image uploaded to Swift is opaque to the taskmanager and higher components, so the guest agent is free to store whatever data it chooses, in whichever format is most appropriate. The content of the metadata is specific to the datastore, but will be represented as a JSON object.

Notes:
 * In future iterations, trove capabilities may be used to indicate whether a particular data store supports the replicate / detach actions.

get_replication_snapshot
The MySQL guestagent will use xtrabackup to create a backup of the user's data and upload it to Swift. The metadata will include a URI of the uploaded backup data, along with the site's binlog position and network information required to set up replication.

{
  "master": { "host": "192.168.0.1", "port": 3306 },
  "dataset": {
    "datastore": "mysql",
    "datastore_version": "mysql-5.5",
    "dataset_size": 2,
    "snapshot_href": "http://..."
  },
  "binlog_position": "..."
}

attach_replication_slave
Configures the site to receive replicated updates from the master site.

detach_replication_slave
Stops the slave from replicating from the master. After the instance has been detached from the master, it is an independent copy of the master's data, and is a fully functional site on its own.

After a slave is detached the topology for the master will no longer contain the detached slave:

{
  "topology": {
    "members": [
      { "id": "{master-id}", "name": "master" },
      {
        "id": "{slave2-id}",
        "name": "slave2",
        "mysql": {
          "slave_of": [{"id": "{master-id}"}],
          "read_only": true
        }
      }
    ]
  }
}

The detached slave (slave1 in this example) will have no topology, as it is now a stand-alone instance.

demote_replication_master
Returns the site to its pre-replication state. For MySQL, this will involve turning off binary logging and removing the associated logs.
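
The interaction between detach_replication_slave and demote_replication_master can be sketched as below. It assumes the updated design recorded in the review comments of this spec (the master is demoted when its last slave is detached). RecordingGuest, detach_replication, and the in-memory instances and guests dictionaries are hypothetical constructs for illustration; only the two guest method names and the 'slaveOf' reference come from the spec.

```python
class RecordingGuest:
    """Hypothetical guestagent client that records which methods were called."""

    def __init__(self):
        self.calls = []

    def detach_replication_slave(self):
        self.calls.append("detach_replication_slave")

    def demote_replication_master(self):
        self.calls.append("demote_replication_master")


def detach_replication(instances, guests, slave_id):
    """Detach slave_id and demote its master if no slaves remain.

    instances maps instance id -> record dict (slaves carry a "slaveOf" key).
    guests maps instance id -> guestagent client.
    Returns the ids of the slaves still attached to the master.
    """
    record = instances[slave_id]
    if "slaveOf" not in record:
        raise ValueError("instance %s is not a replication slave" % slave_id)
    guests[slave_id].detach_replication_slave()
    master_id = record.pop("slaveOf")  # drop the reference only after the guest call
    remaining = [iid for iid, rec in instances.items()
                 if rec.get("slaveOf") == master_id]
    if not remaining:
        guests[master_id].demote_replication_master()
    return remaining


instances = {"m": {}, "s1": {"slaveOf": "m"}, "s2": {"slaveOf": "m"}}
guests = {iid: RecordingGuest() for iid in instances}
first = detach_replication(instances, guests, "s1")
second = detach_replication(instances, guests, "s2")
print(first, second, guests["m"].calls)
```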

Trove Guestagent - Replication Status
The trove guest-agent will reflect the state of replication via the guest heartbeat. In the event that replication is not functional at a site, that site's heartbeat status will be ERROR and the database service will be disabled.

 * Master Instance - for MySQL, master state is indeterminate.
 * Slave Instance - in the event of a replication-related issue which prevents replication from continuing, the guest status will be updated to ERROR, which will be reflected in the guest heartbeat. For MySQL, an ERROR state will be flagged when the IO and SQL slave threads are not running.
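
As a rough sketch of the slave check, the function below inspects one row of MySQL's SHOW SLAVE STATUS output, where the Slave_IO_Running and Slave_SQL_Running columns report the state of the two replication threads. The function names are hypothetical, and two points are assumptions of the sketch rather than statements of the spec: either thread being stopped is treated as an error, and "ACTIVE" is used as the healthy status value.

```python
def replication_healthy(slave_status):
    """Return True only if both replication threads report "Yes".

    slave_status is one row of SHOW SLAVE STATUS as a dict, or None when the
    server returned no row (replication is not configured).
    """
    if not slave_status:
        return False
    return (slave_status.get("Slave_IO_Running") == "Yes"
            and slave_status.get("Slave_SQL_Running") == "Yes")


def heartbeat_status(slave_status, healthy_status="ACTIVE"):
    """Map the replication state to a guest status string.

    "ERROR" is the status named in the spec; healthy_status is an assumption.
    """
    return healthy_status if replication_healthy(slave_status) else "ERROR"


print(heartbeat_status({"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes"}))
print(heartbeat_status({"Slave_IO_Running": "Connecting", "Slave_SQL_Running": "Yes"}))
```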

Server ID
MySQL replication requires a unique server id for each slave of a given master. By default Trove already generates a unique id during instance creation, so this requirement is satisfied. It is currently possible to override the generated server id using a configuration group. This could cause issues with replication, so server id will be removed as a 'settable' field in MySQL configuration groups.

Read-Only
By default new slave instances will be created as read-only. This option will be added as a supported field for MySQL configuration groups so that it is possible to override the default and create a read-write slave.
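
The default-plus-override behaviour can be illustrated with a small sketch. The function effective_slave_settings and the plain-dict representation of a configuration group are assumptions made for the example; read_only is the MySQL system variable that makes a server reject writes from ordinary clients.

```python
def effective_slave_settings(config_group_values=None):
    """Return the settings applied to a new slave.

    Starts from the spec's default (read-only) and lets values from a
    configuration group, modelled here as a plain dict, override it.
    """
    settings = {"read_only": True}
    settings.update(config_group_values or {})
    return settings


print(effective_slave_settings())
print(effective_slave_settings({"read_only": False}))
```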

In a future iteration we will consider adding read-only as a field on the instance itself rather than exposing this via configuration groups.

Use Case Summary
1. The master can exist before the slave such that the master already contains data
 * esp: Once an instance becomes a master can it be downgraded in the same way that a slave can be detached?
 * mwj: Updated design - when last slave is detached, master site will be "demoted".

3. Slaves can be marked read-only (read-only will be default)
 * esp: If a read-only slave is detached is there an option to make it read_write?
 * mwj: I don't think this is necessary for V1.

6. The health of a slave will be monitor-able
 * esp: We'll probably want to monitor the health of the master too.


 * esp: Will the mechanism of monitoring be anything more than the heart beat message sent by the agent?

Create Slaves

 * esp: If a user chooses to create slave(s) with smaller flavor(s) and volume we should allow it as long as it fits. This is similar to how backup/restore currently works.  It would be good to provide sufficient logging and return an error response for when the data doesn't fit though.

Stop Replication
PUT /instances/{id}/topology/action {      "instance": { "detach": {}, "read_only": false } }
 * esp: I think maybe this 'POST /instances/{id}/topology/action' could be PUT but I don't care that much :)
 * esp: suggested HTTP method

Updated Commands

 * esp: I think only showing the direct association between nodes is a good way to go. Trying to show more than that will get messy quick.
 * esp: It wouldn't hurt to add these calls above in the Trove API section but not critical.

taskmanager.create_replication
4. delete replication snapshot from Swift
 * esp: I'm guessing the snapshot will only be created 1x for a set of given replicas and deleted when the last slave is created.
 * mwj: Yes, that's why we changed the proposed API to have a slave count.


 * esp: One day creating replicas could be done in parallel but I'm probably dreaming :)
 * mwj: Yes, we thought of that, but decided not to do so for V1.

TBD: Handling security groups
Proposal: Slaves should be added to the security group of the master rather than getting their own group each. (This may not be addressed in v1.)
 * glucas: When security group support is enabled, each instance created via a 'trove create' call gets a new security group. What should we do with slaves?

detach_replication_slave

 * esp: After a slave is detached can it be re-attached? Or do we only allow attaching slaves that do not contain data?
 * mwj: For this version, the guestagent will assume an empty db. In this version, there will be no API call to re-attach a slave, or to attach any pre-existing site as a slave.  The only operation taskmanager will know is creating a new set of slaves from a specified master.