Difference between revisions of "Trove/Specs/Trove-v1-MySQL-Replication"
Revision as of 16:22, 28 April 2014
Description
Providing support for the various replication use cases is critical for use of Trove in production. For the first-phase implementation of replication in Trove, we will implement the functionality laid out in the Trove V1 Replication Blueprint.
Use Case Summary
The following use cases will be addressed by this V1 implementation:
A. Read Replicas (Slaves)
- The master can exist before the slave such that the master already contains data
- N Slaves for one master
- Slaves can be marked read-only (read-only will be default)
- A slave can be detached from its "replication set" to act as an independent site
- A pre-existing non-replication site can become the master of a new "replication set"
- The health of a slave will be monitor-able
Design
Trove API
Create Slaves
POST /instances/{id}/action
{
    "replicate": {
        "count": 2,
        "instance": {
            "availability_zone": "us-west-2",
            "flavorRef": "7",
            "volume": {
                "size": 1
            }
        },
        "topology": {
            "slave_of": [{"id": "{id}"}],
            "read_only": true
        }
    }
}
Notes:
- id in the resource URI is the id of the master instance to be replicated
- count allows multiple slaves to be created from a single snapshot of the master
- instance defines the template used for each slave. Certain instance properties (datastore_version, databases, users) will not be supported here
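As a rough illustration, the request body for Create Slaves could be assembled as shown here. This is only a sketch: build_replicate_request and UNSUPPORTED_KEYS are hypothetical names, not part of Trove, and placing "topology" inside "replicate" is an assumption about the intended layout.

```python
import json

# Properties that the notes say are not supported in the slave template.
UNSUPPORTED_KEYS = {"datastore_version", "databases", "users"}


def build_replicate_request(master_id, count, instance_template, read_only=True):
    """Return the body for POST /instances/{master_id}/action (hypothetical helper)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    rejected = UNSUPPORTED_KEYS.intersection(instance_template)
    if rejected:
        raise ValueError("unsupported instance properties: %s" % sorted(rejected))
    return {
        "replicate": {
            "count": count,
            "instance": dict(instance_template),
            "topology": {
                "slave_of": [{"id": master_id}],
                "read_only": read_only,
            },
        }
    }


template = {"availability_zone": "us-west-2", "flavorRef": "7", "volume": {"size": 1}}
body = build_replicate_request("master-id", 2, template)
print(json.dumps(body, indent=2))
```

Rejecting the unsupported properties up front, rather than silently dropping them, is one possible design choice; the spec itself does not say which behaviour is intended.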
Stop Replication
POST /instances/{id}/topology/action
{ "detach": {} }
Notes:
- id in the resource URI is the id of the slave to be detached from its current topology.
Python-Troveclient
trove replicate <master instance> <slave count> --read-only=<boolean>
trove detach_replication <slave instance>
Taskmanager
The taskmanager will implement 2 API calls:
- create_replicated_instances(master_id, slave_count, flavor, topology, volume_size, availability_zone, nics)
- detach_replication(slave_instance)
taskmanager.create_replication
The Create Replication task will be performed with the following steps:
- Execute get_replication_snapshot() on the master site, receiving "master snapshot results metadata"
- N times:
  - Create a Trove instance of the given flavor, volume size, and any optional instance parameters
  - Generate a unique server_id for the slave
  - Execute guestagent.create_replication_slave() on the new instance
  - Update the instance metadata to add a "topology" section
- Delete the replication snapshot from Swift
After the Create Replication task has completed, showing the topology of the master will list the newly created slave instances:
{
    "topology": {
        "members": [
            {
                "id": "{master-id}",
                "name": "master"
            },
            {
                "id": "{slave1-id}",
                "name": "slave1",
                "mysql": {
                    "slave_of": [{"id": "{master-id}"}],
                    "read_only": true
                }
            },
            {
                "id": "{slave2-id}",
                "name": "slave2",
                "mysql": {
                    "slave_of": [{"id": "{master-id}"}],
                    "read_only": true
                }
            }
        ]
    }
}
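The Create Replication steps above can be sketched roughly as follows. The four callables are hypothetical stand-ins for the taskmanager and guestagent calls, not real Trove APIs, and the server_id numbering (master is 1, slaves start at 2) is an assumption made only for illustration.

```python
def create_replication(master_id, slave_count, take_snapshot, create_instance,
                       attach_slave, delete_snapshot, read_only=True):
    """Run the Create Replication steps and return the master's topology."""
    snapshot = take_snapshot(master_id)  # step 1: snapshot the master
    members = [{"id": master_id, "name": "master"}]
    try:
        for n in range(1, slave_count + 1):  # step 2: repeat N times
            slave_id = create_instance()
            # Assumption: the master uses server_id 1, so slaves start at 2.
            server_id = n + 1
            attach_slave(slave_id, snapshot, server_id)
            members.append({
                "id": slave_id,
                "name": "slave%d" % n,
                "mysql": {"slave_of": [{"id": master_id}], "read_only": read_only},
            })
    finally:
        delete_snapshot(snapshot)  # step 3: remove the snapshot from storage
    return {"topology": {"members": members}}


# In-memory fakes so the sketch runs without a Trove deployment.
calls = []
new_ids = iter(["slave1-id", "slave2-id"])
topology = create_replication(
    "master-id", 2,
    take_snapshot=lambda master: {"snapshot_href": "swift://example"},
    create_instance=lambda: next(new_ids),
    attach_slave=lambda slave, snap, server_id: calls.append((slave, server_id)),
    delete_snapshot=lambda snap: calls.append(("deleted", snap["snapshot_href"])),
)
```

Deleting the snapshot in a finally block, so that it is cleaned up even when a slave fails to build, is a choice made in this sketch; the spec only states that the snapshot is deleted after the slaves are created.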
taskmanager.detach_replication
Executes guestagent.detach_replication_slave() for the selected instance.
Trove GuestAgent
There will be 3 new methods added to the guestagent API:
- get_replication_snapshot()
- attach_replication_slave()
- detach_replication_slave()
Replication will be focused around a replication snapshot. This snapshot will contain the data necessary to set up a slave to replicate from the site which created the snapshot, typically a URI to the user's data set stored in Swift plus the metadata required to coordinate replication.
Each datastore implementation will need to implement these methods. The content of the image uploaded to Swift is opaque to the taskmanager and higher components, so the guest agent is free to store whatever data it chooses, in whichever format is most appropriate. The content of the metadata is specific to the datastore, but will be represented as a JSON object.
Trove Guestagent - MySQL Datastore Implementation
get_replication_snapshot()
The MySQL guestagent will use xtrabackup to create a backup of the user's data and upload it to Swift. The metadata will include a URI of the uploaded backup data, along with the site's binlog position and network information required to set up replication.
{
    "master": {
        "host": "192.168.0.1",
        "port": 3306
    },
    "dataset": {
        "datastore": "mysql",
        "datastore_version": "mysql-5.5",
        "dataset_size": 2,
        "snapshot_href": "http://..."
    },
    "binlog_position": <binlog position>
}
attach_replication_slave()
Injects the copy of the master's data into the selected site, then configures the site to receive replicated updates from the master site.
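To make the "configures the site to receive replicated updates" step more concrete, here is a minimal sketch of turning the snapshot metadata into a MySQL CHANGE MASTER TO statement. The spec leaves the shape of binlog_position open, so the "log_file" and "position" keys are assumptions, as are the helper names; replication credentials (MASTER_USER, MASTER_PASSWORD) are deliberately left out.

```python
def quote_sql(value):
    """Quote a string literal for MySQL by escaping backslashes and single quotes."""
    return "'%s'" % str(value).replace("\\", "\\\\").replace("'", "\\'")


def build_change_master_sql(snapshot):
    """Build a CHANGE MASTER TO statement from snapshot metadata (hypothetical helper)."""
    master = snapshot["master"]
    # Assumption: binlog_position carries the binlog file name and byte offset.
    binlog = snapshot["binlog_position"]
    return (
        "CHANGE MASTER TO MASTER_HOST=%s, MASTER_PORT=%d, "
        "MASTER_LOG_FILE=%s, MASTER_LOG_POS=%d" % (
            quote_sql(master["host"]),
            int(master["port"]),
            quote_sql(binlog["log_file"]),
            int(binlog["position"]),
        )
    )


snapshot = {
    "master": {"host": "192.168.0.1", "port": 3306},
    "binlog_position": {"log_file": "mysql-bin.000003", "position": 107},
}
sql = build_change_master_sql(snapshot)
print(sql)
```

In a real guest agent the statement would presumably be followed by START SLAVE once the backup has been restored; that ordering is not spelled out in the spec.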
detach_replication_slave()
Stops the slave from replicating from the master. After the instance has been detached from the master, it is an independent copy of the master's data, and is a fully functional site on its own.
After a slave is detached, the topology for the master will no longer contain the detached slave:
{
    "topology": {
        "members": [
            {
                "id": "{master-id}",
                "name": "master"
            },
            {
                "id": "{slave2-id}",
                "name": "slave2",
                "mysql": {
                    "slave_of": [{"id": "{master-id}"}],
                    "read_only": true
                }
            }
        ]
    }
}
The detached slave will have no topology, as it is now a stand-alone instance.
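The effect of a detach on the master's topology document could be modelled as below. This is a sketch under assumptions: detach_member is a hypothetical helper, and the topology is treated as a plain dictionary in the shape used throughout this page.

```python
def detach_member(topology, slave_id):
    """Return a new topology document without the detached slave."""
    members = topology["topology"]["members"]
    remaining = [m for m in members if m["id"] != slave_id]
    if len(remaining) == len(members):
        raise KeyError("instance %s is not in this topology" % slave_id)
    return {"topology": {"members": remaining}}


slave_settings = {"slave_of": [{"id": "master-id"}], "read_only": True}
before = {
    "topology": {
        "members": [
            {"id": "master-id", "name": "master"},
            {"id": "slave1-id", "name": "slave1", "mysql": dict(slave_settings)},
            {"id": "slave2-id", "name": "slave2", "mysql": dict(slave_settings)},
        ]
    }
}
after = detach_member(before, "slave1-id")
print([m["name"] for m in after["topology"]["members"]])
```

The function returns a new document instead of mutating its input, so the caller can still compare the topology before and after the detach.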