Trove-Replication-And-Clustering-API-Single

Revision as of 02:33, 1 March 2014 by Amcrn (talk | contribs) (MongoDB)

MySQL


MySQL Master/Slave


For Master/Slave, the server_id of each instance must differ. Optionally, the slave can set read_only to avoid accidental writes.

Create Configuration-Group for Master

Request:

POST /configurations
{
  "configuration": {
    "name": "config-a",
    "datastore": {
      "type": "mysql",
      "version": "mysql-5.5"
    },
    "values": {
      "server_id": 1
    }
  }
}

Response:

{
  "configuration": {
    "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
    "name": "config-a",
    "description": null,
    "datastore_version_id": "c5ad9638-b7a1-464b-8cef-721d8c29dbf9",
    "values": {
      "server_id": 1
    }
  }
}


Create Configuration-Group for Slave

Request:

POST /configurations
{
  "configuration": {
    "name": "config-b",
    "datastore": {
      "type": "mysql",
      "version": "mysql-5.5"
    },
    "values": {
      "server_id": 2,
      "read_only": true
    }
  }
}

Response:

{
  "configuration": {
    "id": "fc318e00-3a6f-4f93-af99-146b44912188",
    "name": "config-b",
    "description": null,
    "datastore_version_id": "c5ad9638-b7a1-464b-8cef-721d8c29dbf9",
    "values": {
      "server_id": 2,
      "read_only": true
    }
  }
}

Notes:

  • read_only here is used for illustrative purposes; it can be omitted or set to false on the slave/read-replica as desired.


Create Master

Request:

POST /instances
{
  "instance": {
    "availability_zone": "us-west-1",
    "name": "product-a",
    "datastore": {
      "type": "mysql",
      "version": "mysql-5.5"
    },
    "topology": {
      "mysql": {
        "type": "master"
      }
    },
    "configuration": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
    "flavorRef": "7",
    "volume": {
      "size": 1
    }
  }
}

Response:

{
  "instance": {
    "status": "BUILD",
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "name": "product-a",
    "configuration": {
      "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
      "name": "config-a",
      "links": [{...}]
    },
    ...
  }
}


Create Slave

Request:

POST /instances
{
  "instance": {
    "availability_zone": "us-west-2",
    "name": "product-b",
    "datastore": {
      "type": "mysql",
      "version": "mysql-5.5"
    },
    "topology": {
      "mysql": {
        "type": "slave",
        "replicates_from": [{"id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998"}]
      }
    },
    "configuration": "fc318e00-3a6f-4f93-af99-146b44912188",
    "flavorRef": "7",
    "volume": {
      "size": 1
    }
  }
}

Response:

{
  "instance": {
    "status": "BUILD",
    "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
    "name": "product-b",
    "configuration": {
      "id": "fc318e00-3a6f-4f93-af99-146b44912188",
      "name": "config-b",
      "links": [{...}]
    },
    ...
  }
}

Notes:

  • type and replicates_from will be the only supported input fields, despite replicates_to being returned in a GET /instances/<id> response
  • replicates_from and replicates_to should be arrays to properly represent star + fan-in/multi-source + all-masters replication patterns (Tungsten, MySQL 5.7)
  • For now, replicates_from takes a trove instance uuid, but it will eventually need to be prefixed with resource-names to support multiple datacenters (dcs) and sources
  • Format of trove:dc:tenant:source:id/name, where source is instance, backup, etc.
  • *or* additional fields like datacenter, tenant, etc. will need to be added to each dict in replicates_from
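
One way to read the proposed resource-name format is as five colon-separated fields. The Python sketch below parses such a string. The ResourceName tuple, the parse_resource_name helper, and the sample datacenter and tenant values are hypothetical, since the format itself is still only a proposal.

```python
from collections import namedtuple

# Hypothetical container for the five proposed fields.
ResourceName = namedtuple(
    "ResourceName", "service datacenter tenant source identifier")


def parse_resource_name(value):
    """Split 'trove:dc:tenant:source:id-or-name' into its five parts."""
    parts = value.split(":", 4)
    if len(parts) != 5 or parts[0] != "trove":
        raise ValueError("expected trove:dc:tenant:source:id, got %r" % value)
    return ResourceName(*parts)


# 'dc-1' and 'tenant-a' are made-up values for illustration.
ref = parse_resource_name(
    "trove:dc-1:tenant-a:instance:dfbbd9ca-b5e1-4028-adb7-f78643e17998")
print(ref.source, ref.identifier)
```

A bare uuid, as accepted today, would be rejected by this parser, so a real implementation would likely need to accept both forms during a transition.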


Show Instance

Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998

Response:

{
  "instance": {
    "status": "ACTIVE",
    "updated": "2014-02-16T03:38:49",
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "name": "product-a",
    "datastore": {
      "version": "mysql-5.5",
      "type": "mysql"
    },
    "topology": {
      "mysql": {
        "type": "master",
        "replicates_to": [{"id": "061aaf4c-3a57-411e-9df9-2d0f813db859"}]
      }
    },
    "flavor": {
      "id": "7",
      "links": [{...}]
    },
    "configuration": {
      "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
      "name": "config-a",
      "links": [{...}]
    }
  }
}


Show Topology

Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/topology

Response:

{
  "instance": {
    "status": "ACTIVE",
    "updated": "2014-02-16T03:38:49",
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "name": "product-a",
    "datastore": {
      "version": "mysql-5.5",
      "type": "mysql"
    },
    "topology": {
      "members": [
        {
          "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
          "name": "product-a",
          "mysql": {
            "type": "master",
            "replicates_to": [{"id": "061aaf4c-3a57-411e-9df9-2d0f813db859"}]
          }
        },
        {
          "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
          "name": "product-b",
          "mysql": {
            "type": "slave",
            "replicates_from": [{"id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998"}]
          }
        }
      ]
    },
    "flavor": {
      "id": "7",
      "links": [{...}]
    },
    "configuration": {
      "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
      "name": "config-a",
      "links": [{...}]
    }
  }
}


Remove Replication (aka "Promote" to Standalone)

Request:

POST /instances/061aaf4c-3a57-411e-9df9-2d0f813db859/topology
{
  "promote": {}
}

Response:

TBD


Notes:

  • The PUT /instances/<id> approach requires bloat to prevent multiple unrelated fields from changing in-tandem.
  • configuration-groups uses PUT, yet things like resize use /action; no ubiquitous approach.


MySQL Master/Master


For Master/Master, the server_id of each instance must differ, and the auto-increment settings must be offset so that generated keys do not collide.

Create Configuration-Group for Master A

Request:

POST /configurations
{
  "configuration": {
    ...
    "values": {
      "server_id": 1,
      "auto_increment_increment": 2,
      "auto_increment_offset": 1
    },
    ...
  }
}

Response:

{
  "configuration": {
    ...
    "values": {
      "server_id": 1,
      "auto_increment_increment": 2,
      "auto_increment_offset": 1
    },
    ...
  }
}


Create Configuration-Group for Master B

Request:

POST /configurations
{
  "configuration": {
    ...
    "values": {
      "server_id": 2,
      "auto_increment_increment": 2,
      "auto_increment_offset": 2
    },
    ...
  }
}

Response:

{
  "configuration": {
    ...
    "values": {
      "server_id": 2,
      "auto_increment_increment": 2,
      "auto_increment_offset": 2
    },
    ...
  }
}
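
To see why these values avoid collisions, the following Python sketch models the ids each master would hand out on an empty table. It assumes MySQL generates values of the form auto_increment_offset + N * auto_increment_increment; the helper name is made up for illustration.

```python
from itertools import islice


def auto_increment_ids(offset, increment):
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    value = offset
    while True:
        yield value
        value += increment


# Settings taken from the two configuration groups above.
master_a = list(islice(auto_increment_ids(offset=1, increment=2), 5))
master_b = list(islice(auto_increment_ids(offset=2, increment=2), 5))

print(master_a)  # [1, 3, 5, 7, 9]
print(master_b)  # [2, 4, 6, 8, 10]

# The two masters never generate the same key.
assert set(master_a).isdisjoint(master_b)
```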


Create Master A

Request:

  • Same as seen in Master/Slave scenario


Response:

  • Same as seen in Master/Slave scenario


Create Master B

Request:

POST /instances
{
  "instance": {
    ...
    "topology": {
      "mysql": {
        "type": "master",
        "join": [{"id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998"}]
      }
    },
    ...
  }
}

Response:

{
  "instance": {
    "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
    ...
  }
}


Show Instance

Request:

  • Same as in Master/Slave scenario


Response:

{
  "instance": {
    ...
    "topology": {
      "mysql": {
        "type": "master",
        "join": [{"id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998"}]
      }
    },
    ...
  }
}


Show Topology

Request:

  • Same as in Master/Slave scenario


Response:

{
  "topology": {
    "id": "377d54bb-9e89-4ac3-bf29-f78c2fd4faca",
    ...
    "members": [
      {
        "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
        ...
        "topology": {
          "mysql": {
            "type": "master",
            "join": [{"id": "061aaf4c-3a57-411e-9df9-2d0f813db859"}]
          }
        }
      },
      {
        "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
        ...
        "topology": {
          "mysql": {
            "type": "master",
            "join": [{"id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998"}]
          }
        }
      }
    ]
  }
}


Remove Replication (aka "Promote")

Request:

POST /instances/061aaf4c-3a57-411e-9df9-2d0f813db859/action
{
  "update_topology": {
    "topology": {
      "mysql": {
        "type": "master",
        "join": []
      }
    }
  }
}

*or*

PUT /instances/061aaf4c-3a57-411e-9df9-2d0f813db859
{
  "instance": {
    "topology": {
      "mysql": {
        "type": "master",
        "join": []
      }
    }
  }
}

*or*

POST /instances/061aaf4c-3a57-411e-9df9-2d0f813db859/topology
{
  "promote": {}
}

Response:

TBD

MongoDB


Create Replica-Set



Create Initial Replica-Set

Request:

POST /instances
{
  "instance": {
    "name": "product-a",
    ...
    "datastore": {
      "type": "mongodb",
      "version": "mongodb-2.0.4"
    },
    "topology": {
      "mongodb": {
        "type": "member",
        "replica_set": "products"
      }
    },
    ...
  }
}

Response:

{
  "instance": {
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    ...
  }
}


Notes:

  • Require the 'replica_set' field for MongoDB, even in the case of a standalone/single instance. See http://www.mongodb.com/blog/post/dont-let-your-standalone-mongodb-server-stand-alone for reasoning.
  • The lack of a 'join' field indicates that the intention is to create a new replica-set. If an active replica-set by that name already exists, the request will fail.
  • 'type' is 'member' vs. 'primary' because in a replica-set, the primary is dynamic and can change in an election.
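
The create-versus-join rules can be sketched as a small Python check. This is only an illustration: the function name and the existing_sets argument (the names of the tenant's active replica-sets) are assumptions, and the 'join' branch anticipates the rules listed under "Add Member to Replica-Set".

```python
def resolve_replica_set_request(mongodb, existing_sets):
    """Classify a topology.mongodb{} payload as a 'create' or a 'join'."""
    has_name = "replica_set" in mongodb
    has_join = "join" in mongodb
    # Exactly one of the two fields must be present.
    if has_name == has_join:
        raise ValueError("provide exactly one of 'replica_set' or 'join'")
    if has_join:
        if mongodb["join"] not in existing_sets:
            raise ValueError("no replica-set named %r to join" % mongodb["join"])
        return ("join", mongodb["join"])
    if mongodb["replica_set"] in existing_sets:
        raise ValueError("replica-set %r already exists" % mongodb["replica_set"])
    return ("create", mongodb["replica_set"])


first = resolve_replica_set_request(
    {"type": "member", "replica_set": "products"}, existing_sets=set())
second = resolve_replica_set_request(
    {"type": "member", "join": "products"}, existing_sets={"products"})
print(first, second)
```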


Add Member to Replica-Set



Request:

POST /instances
{
  "instance": {
    "name": "product-b",
    ...
    "topology": {
      "mongodb": {
        "type": "member",
        "join": "products"
      }
    },
    ...
  }
}


Response:

{
  "instance": {
    "status": "BUILD",
    "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
    ...
  }
}


Notes:

  • If the 'replica_set' field is included in addition to the 'join' field, the request will fail.
  • If there is no existing replica-set for the tenant by the 'join' value, the request will fail.
  • Will have to use 'db.isMaster()' to determine the current primary to execute replica-set commands against.
  • Will use http://docs.mongodb.org/manual/tutorial/expand-replica-set/#configure-and-add-a-member
  • Should protect against adding more than 12 members to a replica-set
  • Should protect against adding more than 7 voting members to a replica-set
  • Should return warning when number of voting members is even and there is no arbiter
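
The three protections above could be sketched as a single pre-flight check. The function below is a hypothetical illustration in Python: it treats each member as a dict with a 'type' and an optional 'votes' field (assumed to default to 1), and it counts an arbiter as a voting member.

```python
MAX_MEMBERS = 12         # limit stated in the notes above
MAX_VOTING_MEMBERS = 7   # limit stated in the notes above


def check_add_member(members, new_member):
    """Raise ValueError if a limit is exceeded; return a list of warnings."""
    proposed = members + [new_member]
    if len(proposed) > MAX_MEMBERS:
        raise ValueError("a replica-set cannot exceed %d members" % MAX_MEMBERS)
    voting = [m for m in proposed if m.get("votes", 1) > 0]
    if len(voting) > MAX_VOTING_MEMBERS:
        raise ValueError(
            "a replica-set cannot exceed %d voting members" % MAX_VOTING_MEMBERS)
    warnings = []
    has_arbiter = any(m["type"] == "arbiter" for m in proposed)
    if len(voting) % 2 == 0 and not has_arbiter:
        warnings.append("even number of voting members and no arbiter")
    return warnings


# Adding product-b to a set that only contains product-a leaves two voters.
print(check_add_member([{"type": "member"}], {"type": "member"}))
```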


Add Another Member to Replica-Set



Request:

POST /instances
{
  "instance": {
    "name": "product-c",
    ...
    "topology": {
      "mongodb": {
        "type": "member",
        "join": "products"
      }
    },
    ...
  }
}


Response:

{
  "instance": {
    "status": "BUILD",
    "id": "3a72ee87-cf3e-40f1-a1e1-fe8c7263a782",
    ...
  }
}


Show Instance



Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998

Response:

{
  "instance": {
    ...
    "topology": {
      "mongodb": {
        "type": "member",
        "replica_set": "products"
      }
    },
    ...
  }
}


Show Topology



Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/topology

Response:

{
  "instance": {
    ...
    "topology": {
      "members": [
        {
          "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
          "name": "product-a",
          ...
          "mongodb": {
            "type": "member",
            "replica_set": "products"
          }
        },
        {
          "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
          "name": "product-b",
          ...
          "mongodb": {
            "type": "member",
            "replica_set": "products"
          }
        },
        {
          "id": "3a72ee87-cf3e-40f1-a1e1-fe8c7263a782",
          "name": "product-c",
          ...
          "mongodb": {
            "type": "member",
            "replica_set": "products"
          }
        }
      ]
    }
    ...
  }
}


Add Arbiter



Request:

POST /instances
{
  "instance": {
    "name": "product-arbiter",
    ...
    "topology": {
      "mongodb": {
        "type": "arbiter",
        "join": "products"
      }
    },
    ...
  }
}

Response:

{
  "instance": {
    "status": "BUILD",
    "id": "a1b62aaa-7863-4384-8250-59024141c1f8",
    ...
  }
}


Add a Delayed Member



Request:

POST /instances
{
  "instance": {
    "name": "product-delayed",
    ...
    "topology": {
      "mongodb": {
        "type": "member",
        "join": "products",
        "priority": 0,
        "hidden": true,
        "slaveDelay": 3600
      }
    },
    ...
  }
}

Response:

{
  "instance": {
    "status": "BUILD",
    "id": "7d8eb019-931b-4b2a-88d2-4c9f0ca1b29e",
    ...
  }
}


Notes:

  • 'type', 'replica_set', 'join', 'priority', 'hidden', and 'slaveDelay' are the only fields supported in topology.mongodb{}. All other configuration values must be set via a Configuration Group. After more thought, consider supporting 'hostname' and 'votes' as well.
  • Why aren't 'priority', 'hidden', and 'slaveDelay' in Configuration Groups, you ask? This is explained in "Modifying a Replica-Set" below.


Modifying a Replica-Set


Thus far we've been able to model building a replica-set, adding an arbiter, adding a delayed secondary member, etc. Let's continue with how to modify a replica-set.

Example:

# from http://docs.mongodb.org/manual/tutorial/configure-secondary-only-replica-set-member/#example
cfg = rs.conf()
cfg.members[0].priority = 2
cfg.members[1].priority = 1
cfg.members[2].priority = 0.5
cfg.members[3].priority = 0
rs.reconfig(cfg)

Executing these priority changes one at a time can have catastrophic results, so they must be applied as a single transaction (with rs.reconfig() committing). However, without the ability to address the cluster (i.e. multiple members at once), this is impossible. The only workaround would be to guarantee that the MongoDB users presented to the cloud tenant all have the clusterAdmin role, which would allow them to connect to the primary and execute such transactions themselves via the native client. Granting clusterAdmin to every DBaaS user, however, is unacceptable in most deployments.
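
As a rough Python model of the batching described here, the sketch below applies every priority change to a copy of a configuration document and bumps its version once, so that a single reconfig call would carry all of the changes. The dict only approximates the shape of rs.conf() output, the host names are invented, and build_reconfig is a hypothetical helper, not part of any driver.

```python
import copy


def build_reconfig(current, priorities):
    """Return a new config with all priority changes applied together.

    current: dict shaped like the output of rs.conf()
    priorities: mapping of member index -> new priority
    """
    new_config = copy.deepcopy(current)
    for index, priority in priorities.items():
        new_config["members"][index]["priority"] = priority
    # Assumption: a reconfig must carry a higher version than the current one.
    new_config["version"] = current["version"] + 1
    return new_config


current = {
    "_id": "products",
    "version": 1,
    "members": [
        {"_id": 0, "host": "product-a.example:27017"},
        {"_id": 1, "host": "product-b.example:27017"},
        {"_id": 2, "host": "product-c.example:27017"},
        {"_id": 3, "host": "product-d.example:27017"},
    ],
}

proposed = build_reconfig(current, {0: 2, 1: 1, 2: 0.5, 3: 0})
print([m["priority"] for m in proposed["members"]])
```

Because the current document is never mutated, a failed submission leaves nothing half-applied on the caller's side.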

Option #1: PATCH /instances/:id/topology

Request:

PATCH /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/topology

{
  "instance": {
    "topology": {
      "members": [
        {
          "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
          "mongodb": {
            "priority": 2
          }
        },
        {
          "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
          "mongodb": {
            "priority": 1
          }
        },
        {
          "id": "3a72ee87-cf3e-40f1-a1e1-fe8c7263a782",
          "mongodb": {
            "priority": 0.5
          }
        }
      ]
    }
  }
}

Response:

{
  "instance": {
    ...
    "topology": {
      "members": [
        {
          "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
          "name": "product-a",
          ...
          "mongodb": {
            "type": "member",
            "replica_set": "products"
          }
        },
        {
          "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
          "name": "product-b",
          ...
          "mongodb": {
            "type": "member",
            "replica_set": "products"
          }
        },
        {
          "id": "3a72ee87-cf3e-40f1-a1e1-fe8c7263a782",
          "name": "product-c",
          ...
          "mongodb": {
            "type": "member",
            "replica_set": "products"
          }
        }
      ]
    }
    ...
  }
}


Notes:

  • An HTTP PATCH vs. PUT because the omission of a field or structure should not be an indication to drop/delete it.
  • All modified fields in a request will be changed transactionally in a single rs.reconfig().
  • It should now be clear why 'priority', 'hidden' and 'slaveDelay' are in topology.mongodb{} vs. a configuration-group: when a configuration-group is changed, an event is immediately triggered to update any attached trove instances. Therefore, if you have a heterogeneous mixture of configuration-groups in a replica-set, there is no way to coordinate a consolidated rs.reconfig().
  • Downside: topology.mongodb{} may return fields on a GET that cannot be changed in a PATCH/PUT; in other words, checking and validating what a PATCH is permitted to change becomes complicated.
  • TBD on what should be returned in the mongodb{} on a GET /instances/:id and GET /instances/:id/topology. It's a question of whether we should persist anything beyond the 'type' and 'replica_set'. If, say, the 'priority' is stored, you introduce the possibility of drift from the truth, but can easily return it on a GET; if it's not stored, do we prompt MongoDB for the truth on a GET, or is that too computationally expensive?



Option #2: POST /instances/:id/topology/action

Request:

POST /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/topology/action

{
  "mongodb": {
    "update_members": [
      {
        "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
        "priority": 2
      },
      {
        "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
        "priority": 1
      },
      {
        "id": "3a72ee87-cf3e-40f1-a1e1-fe8c7263a782",
        "priority": 0.5
      }
    ]
  }
}


Notes:

  • 'update_members' will only permit 'priority', 'hidden', and 'slaveDelay' (possibly 'votes' and 'hostname' as mentioned earlier).
  • Due to the limited field-set, this approach is much more fine-grained than the PATCH approach in Option #1.
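
Because the field-set is small, validating an 'update_members' payload is straightforward. The Python sketch below is a hypothetical validator: it accepts only the three fields named above and returns the changes grouped by member id, ready to be applied in one rs.reconfig().

```python
ALLOWED_FIELDS = {"priority", "hidden", "slaveDelay"}


def validate_update_members(payload):
    """Return {member_id: changes} or raise ValueError on a bad entry."""
    changes = {}
    for entry in payload["mongodb"]["update_members"]:
        fields = dict(entry)
        if "id" not in fields:
            raise ValueError("every entry needs an 'id'")
        member_id = fields.pop("id")
        unknown = set(fields) - ALLOWED_FIELDS
        if unknown:
            raise ValueError("unsupported fields: %s" % sorted(unknown))
        if not fields:
            raise ValueError("no changes given for member %s" % member_id)
        changes[member_id] = fields
    return changes


payload = {
    "mongodb": {
        "update_members": [
            {"id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998", "priority": 2},
            {"id": "061aaf4c-3a57-411e-9df9-2d0f813db859", "priority": 1},
            {"id": "3a72ee87-cf3e-40f1-a1e1-fe8c7263a782", "priority": 0.5},
        ]
    }
}
changes = validate_update_members(payload)
print(len(changes))  # 3
```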



Decision: Option #2 is more fine-grained, easier to reason about, and less error-prone. As you'll see in later operations (like Remove a Member), the action vs. PUT/PATCH approach is also more appropriate.

Remove a Member


Removing a member is not the same as deleting one, therefore DELETE /instances/:id is not appropriate.

Option #1: PUT /instances/:id/topology

Request:

PUT /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/topology

{
  "instance": {
    "topology": {
      "members": [
        {
          "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998"
        },
        {
          "id": "061aaf4c-3a57-411e-9df9-2d0f813db859"
        }
      ]
    }
  }
}


Notes:

  • By omitting a member{} for id=3a72ee87-cf3e-40f1-a1e1-fe8c7263a782 in a PUT operation, this indicates the member should be removed from the replica-set.
  • It's possible that one might want to modify the 'priority', 'hidden', 'votes', etc. fields of the remaining members while dropping a member. So although the example above does not show it, mongodb{} can be included in a member to indicate other changes, *BUT*, since it's a PUT the expectation of what happens to omitted fields in mongodb{} becomes unclear.


Summary: Not very clean, mildly confusing, and very error-prone (nowhere is a "remove" action ever explicitly implied).

Option #2: POST /instances/:id/topology/action

Request:

POST /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/topology/action

{
  "mongodb": {
    "remove_member": {
      "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998"
    }
  }
}


Notes:

  • mongodb{} wrapper isn't necessary, but provides the benefit of schema validation + declaration of intention/understanding.
  • The 'remove_member' action is explicit here, vs. implicit as seen in the PUT option.
  • 'remove_member' has a strict set of fields that are supported, so there is no question as to what can be provided and what will be honored (as compared to the PUT).


Summary: Fairly clean, with no real drawbacks.

Option #3: POST /instances/:id/topology/remove

POST /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/topology/remove

{
  "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998"
}


Notes:

  • Differs from Option #2 in that the action is in the URI vs. the payload.
  • One drawback of this approach is that not every action will be supported across all datastores. So for example, a POST /instances/:id/topology/changeoplogsize (http://docs.mongodb.org/manual/tutorial/change-oplog-size/) makes absolutely no sense to any datastore other than MongoDB.


Summary: At first glance is cleaner than Option #2 from a payload-perspective, but the URI discoverability and expansion is awful.

Option #4: POST /instances/:id/action

POST /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/action

{
  "join": ""
}


Notes:

  • Executed against the instance that should be removed from the cluster, so providing the 'id' in the payload is unnecessary.
  • Drawback: There are actions that are replica-set-wide (or against a subset of the replica-set), meaning Option #1 or #2 or #3 would have to co-exist with this option anyway.
  • Drawback: Increases the number of ways to accomplish the same thing (could unjoin against /instances/:id/action, or against /instances/:id/topology)


Summary: For this very specific example it looks great, but isn't expressive enough for other actions.

Option #5: POST /instances/:id/topology/:id/action

POST /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/topology/dfbbd9ca-b5e1-4028-adb7-f78643e17998/action

{
  "mongodb": {
    "remove": {}
  }
}


Notes:

  • The /instances/:id is an arbitrary member in the replica-set, it doesn't matter which one; the topology/:id is then a member of said replica-set that this action will be applied to.
  • Executed against the member that should be removed from the cluster, so providing the 'id' in the payload is unnecessary.
  • Needs More Thought: Could conceivably allow only specific operations here (like remove/unjoin), but not others that could be accomplished in a PATCH against /instances/:id/topology (like 'priority', 'hidden', etc.)


Summary: Fairly clean with no real drawbacks.

Decision: Option #2; it's extremely fine-grained, easy to reason about, and matches the approach taken in Modifying a Replica-Set.


MongoDB TokuMX

  • TokuMX will require a new datastore-version and *possibly* a new manager class (same reasoning as why Tungsten/Galera will have their own datastore-version for MySQL)