Trove/Replication-And-Clustering-With-Nodes

=This Wiki Is Outdated=

11/2014: This proposal has been superseded and is kept around for historical purposes. To see the true Clustering specification, visit https://blueprints.launchpad.net/trove/+spec/clustering. To see the true Replication specification, see https://blueprints.launchpad.net/trove/+spec/replication-v1

Summary of Route Changes
New Route: GET /instances//node/ { "node": { "status": " ", "id": "", "name": " ", "created": " ", "updated": " ", "links": [{...}], "ip": [""], //or hostname, depending on conf "volume": { "size": , "used": } } } New Route: POST /instances//node//action { " ": {     } }

Summary

 * the term "instance" now represents either a single standalone instance or a cluster of nodes.
 * each node in a cluster must initially be homogenous in respect to flavor, volume-size, configuration-group, datastore/datastore-version, security-groups, etc.
 * in this approach, a slave is not considered a node, and therefore the two terms and concepts are exclusive.
 * the instance.name now acts as the cluster name, both in trove as well as the datastore (e.g. cluster_name/replSet)
 * node.name is the instance.name with "- " as a suffix (counter for the cluster, starting at 1)
 * the semantic meaning of instance.status, and the possible states, will change
 * the node.status will likely mirror the state values and transitions seen in instance.status prior to the clustering implementation.
 * the cluster.nodes[].ids are generated uuids

Features for First Iteration
Supported: Not Supported:
 * Create Multi-Node Cluster
 * Restart Multi-Node Cluster
 * Terminate Multi-Node Cluster
 * Increase Cluster Size
 * No 'POST /instances//node//action' implementations will land in the first iteration. This means no restart-node, delete-node, reboot-node, promote-node, repair-node, etc.
 * Decreasing Cluster Size
 * Resizing Flavor/Volume
 * Backups
 * Altering/Adding/Dropping Security Groups

Example: Cassandra
To illustrate the approach, Cassandra is used in the examples below. The eccentricities of each Datastore will be explained in their own sections.

Create Instance
Request:

POST /instances { "instance": { "name": "products", "datastore": { "type": "cassandra", "version": "2.0.6" },   "configuration": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6", "flavorRef": "7", "volume": { "size": 1 },   "cluster": { "size": 3 } } }

Response:

{ "instance": { "status": "BUILD", "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998", "name": "products", "created": "2014-04-25T20:19:23", "updated": "2014-04-25T20:19:23", "links": [{...}], "datastore": { "type": "cassandra", "version": "2.0.6" },   "configuration": { "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6", "links": [{...}], },   "flavor": { "id": "7", "links": [{...}], },   "volume": { "size": 1 },   "cluster": { "size": 3, "nodes": [ {"id": "416b0b16-ba55-4302-bbd3-ff566032e1c1"}, {"id": "7f52e4f9-3fa6-4238-ac08-1ce15197329a"}, {"id": "ff9d680c-fde3-49c6-a84e-76173b6df39d"} ]   }  } } Notes:
 * Cassandra-specific fields that are required to construct the initial cluster (num_tokens, endpoint_snitch, seed ip-list, etc.) are to be determined/calculated based on configuration file values and common-sense.

Show Instance
Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998

Response:

{ "instance": { "status": "ACTIVE", "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998", "name": "products", "created": "2014-04-25T20:19:23", "updated": "2014-04-25T20:19:23", "links": [{...}], "datastore": { "type": "cassandra", "version": "2.0.6" },   "configuration": { "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6", "links": [{...}], },   "flavor": { "id": "7", "links": [{...}], },   "volume": { "size": 1 },   "cluster": { "size": 3, "nodes": [ {"id": "416b0b16-ba55-4302-bbd3-ff566032e1c1"}, {"id": "7f52e4f9-3fa6-4238-ac08-1ce15197329a"}, {"id": "ff9d680c-fde3-49c6-a84e-76173b6df39d"} ]   }  } } Notes:
 * Change: instance.volume.used, instance.ip[], and instance.hostname will never be returned
 * It's possible that instance.ip[] can remain if it only returns the seed ips.
 * It's possible that instance.hostname can remain if it's converted to an array and only contains the seed hostnames.

Show Node
Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/node/416b0b16-ba55-4302-bbd3-ff566032e1c1

Response:

{ "node": { "status": "ACTIVE", "id": "416b0b16-ba55-4302-bbd3-ff566032e1c1", "name": "products-1", "created": "2014-04-25T20:19:23", "updated": "2014-04-25T20:19:23", "links": [{...}], "ip": ["10.0.0.1"], "volume": { "size": 2, "used": 0.17 } } }

Resize Cluster
Request:

POST /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/action

{ "resize_cluster": { "size": 6 } } Response: HTTP 202 (Empty Body)

All Other Operations
Notes:
 * As mentioned in the Summary, other operations like: removing a node, replacing a running node, replacing a dead node, repairing a node, etc. are not supported in the first iteration.

Create Instance
Request:

POST /instances { "instance": { "name": "products", "datastore": { "type": "mysql", "version": "5.5" },   "configuration": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6", "flavorRef": "7", "volume": { "size": 1 } } }

Response:

{ "instance": { "status": "BUILD", "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998", "name": "products", "created": "2014-04-25T20:19:23", "updated": "2014-04-25T20:19:23", "links": [{...}], "datastore": { "type": "mysql", "version": "5.5" },   "configuration": { "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6", "links": [{...}], },   "flavor": { "id": "7", "links": [{...}], },   "volume": { "size": 1 } } }

Create Slave
Request:

POST /instances { "instance": { "name": "products-slave", "datastore": { "type": "mysql", "version": "5.5" },   "configuration": "fc318e00-3a6f-4f93-af99-146b44912188", "flavorRef": "7", "volume": { "size": 1 },   "cluster": { "slave": { "of": "dfbbd9ca-b5e1-4028-adb7-f78643e17998", "read_only": true }   }  } }

Response:

{ "instance": { "status": "BUILD", "id": "061aaf4c-3a57-411e-9df9-2d0f813db859", "name": "products", "created": "2014-04-25T20:19:23", "updated": "2014-04-25T20:19:23", "links": [{...}], "datastore": { "type": "mysql", "version": "5.5" },   "configuration": { "id": "fc318e00-3a6f-4f93-af99-146b44912188", "links": [{...}], },   "flavor": { "id": "7", "links": [{...}], },   "volume": { "size": 1 },   "cluster": { "slave": { "of": "dfbbd9ca-b5e1-4028-adb7-f78643e17998", "read_only": true }   }  } }

Show Master
Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998

Response:

{ "instance": { "status": "ACTIVE", "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998", "name": "products", "created": "2014-04-25T20:19:23", "updated": "2014-04-25T20:19:23", "links": [{...}], "datastore": { "type": "mysql", "version": "5.5" },   "configuration": { "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6", "links": [{...}], },   "flavor": { "id": "7", "links": [{...}], },   "volume": { "size": 1 },   "cluster": { "slaves": [ {"id": "061aaf4c-3a57-411e-9df9-2d0f813db859"} ]   }  } }

Show Slave
Request:

GET /instances/061aaf4c-3a57-411e-9df9-2d0f813db859

Response:

{ "instance": { "status": "ACTIVE", "id": "061aaf4c-3a57-411e-9df9-2d0f813db859", "name": "products", "created": "2014-04-25T20:19:23", "updated": "2014-04-25T20:19:23", "links": [{...}], "datastore": { "type": "mysql", "version": "5.5" },   "configuration": { "id": "fc318e00-3a6f-4f93-af99-146b44912188", "links": [{...}], },   "flavor": { "id": "7", "links": [{...}], },   "volume": { "size": 1 },   "cluster": { "slave": { "of": "dfbbd9ca-b5e1-4028-adb7-f78643e17998", "read_only": true }   }  } }

MongoDB
Notes:
 * instance.name will function as the replica-set name
 * The smallest viable deployment pattern for a replica-set is 3 nodes. A three member replica-set can either be (a) a primary and two secondaries or (b) a primary, a secondary, and an arbiter. option (b) provides less redundancy and fault tolerance, so it will not be supported in this first iteration. Therefore, size=3 by default creates a three member replica-set with one primary and two secondaries.
 * instance.cluster.size will only support '3', '5', and '7' as values for MongoDB in this first iteration. Why? (a) an odd number of members ensures the replica-set is always able to elect a primary (b) only 7 members can vote at a time.
 * Scaling the Nodes from 3 to 5, or 5 to 7 will be supported.
 * A replica-set can have up to 12 members, but only 7 voting members. To avoid having to implement and handle 'priority'/'hidden' in the first iteration, the maximum number of nodes is capped at 7.
 * Changing the priority, hidden, slaveDelay, or adding an Arbiter will not be supported in this first iteration.

Couchbase
Notes:
 * instance.cluster.size can be any positive number
 * Resizing the Cluster by increasing the number will be supported.
 * Removing a Node will not be supported in the first iteration.

Option #1: Instances Table
Notes: Example Database Rows (Represented in JSON)
 * on db_sync backfill existing rows with instance.type as 'standalone'
 * type could be an int instead of a varchar, backed by a class enum, but that doesn't seem to be the "OpenStack Way".
 * once a slave is created, the master's type is changed from 'standalone' to 'master'
 * for clusters, an additional row is inserted (the "cluster row"), with type=cluster.
 * parent_id is either the master's uuid (if the row is type=slave), or the cluster's shell row uuid (if the row is type=node)
 * node rows will not infer data (like flavor_id) from the cluster row; instead it will be "redundantly" included.

Standalone Before: { "id": "035dc57c-659c-4cab-83e8-29ec99437655", "created": "2014-04-20 19:38:59", "updated": "2014-04-20 19:39:13", "name": "products", "hostname": null, "compute_instance_id": "1bfa857e-0119-49f5-9f41-d4b5308b69d8", "task_id": 1, "task_description": "No tasks for the instance.", "task_start_time": null, "volume_id": "cdfabe94-6952-4279-b208-9917ac7d30ed", "flavor_id": 10, "volume_size": 200, "tenant_id": "19fb5380ec8444358ad3a9d26ced6b8f", "server_status": null, "deleted": 0, "deleted_at": null, "datastore_version_id": "4141cc0b-6300-4706-a9f2-3ce757fec859", "configuration_id": null } Standalone After: { "id": "035dc57c-659c-4cab-83e8-29ec99437655", "created": "2014-04-20 19:38:59", "updated": "2014-04-20 19:39:13", "name": "products", "hostname": null, "compute_instance_id": "1bfa857e-0119-49f5-9f41-d4b5308b69d8", "task_id": 1, "task_description": "No tasks for the instance.", "task_start_time": null, "volume_id": "cdfabe94-6952-4279-b208-9917ac7d30ed", "flavor_id": 10, "volume_size": 200, "tenant_id": "19fb5380ec8444358ad3a9d26ced6b8f", "server_status": null, "deleted": 0, "deleted_at": null, "datastore_version_id": "4141cc0b-6300-4706-a9f2-3ce757fec859", "configuration_id": null, "type": "standalone", "parent_id": null } Cluster of 3 Nodes: { "rows": [   {      "id": "035dc57c-659c-4cab-83e8-29ec99437655", "created": "2014-04-20 19:38:59", "updated": "2014-04-20 19:44:00", "name": "products", "hostname": null, "compute_instance_id": null, "task_id": 1, "task_description": "No tasks for the cluster.", "task_start_time": null, "volume_id": null, "flavor_id": 10, "volume_size": 200, "tenant_id": "19fb5380ec8444358ad3a9d26ced6b8f", "server_status": null, "deleted": 0, "deleted_at": null, "datastore_version_id": "4141cc0b-6300-4706-a9f2-3ce757fec859", "configuration_id": null, "type": "cluster", "parent_id": null },   {      "id": "ad16a607-bed4-49c7-957e-7d3f869e364f", "created": "2014-04-20 19:38:59", "updated": "2014-04-20 19:39:13", "name": "products-1", "hostname": null, "compute_instance_id": "7b146333-97b0-48cf-b731-a7e4ad61d58f", "task_id": 1, "task_description": "No tasks for the instance.", "task_start_time": null, "volume_id": "801c31b4-d2cd-413c-be7a-1ebdf89f53bb", "flavor_id": 10, "volume_size": 200, "tenant_id": "19fb5380ec8444358ad3a9d26ced6b8f", "server_status": null, "deleted": 0, "deleted_at": null, "datastore_version_id": "4141cc0b-6300-4706-a9f2-3ce757fec859", "configuration_id": null, "type": "node", "parent_id": "035dc57c-659c-4cab-83e8-29ec99437655" },   {      "id": "1f0c10fb-9bb6-45b5-b27d-ff0c32eb3ab3", "created": "2014-04-20 19:39:15", "updated": "2014-04-20 19:40:22", "name": "products-2", "hostname": null, "compute_instance_id": "fa6a6d74-f1bd-4ee8-bd7a-d54cf526bf77", "task_id": 1, "task_description": "No tasks for the instance.", "task_start_time": null, "volume_id": "932225fd-b613-4c62-b3e1-243aae10cc45", "flavor_id": 10, "volume_size": 200, "tenant_id": "19fb5380ec8444358ad3a9d26ced6b8f", "server_status": null, "deleted": 0, "deleted_at": null, "datastore_version_id": "4141cc0b-6300-4706-a9f2-3ce757fec859", "configuration_id": null, "type": "node", "parent_id": "035dc57c-659c-4cab-83e8-29ec99437655" },   {      "id": "b38f0356-6796-434c-acc6-953c081476d0", "created": "2014-04-20 19:41:11", "updated": "2014-04-20 19:42:54", "name": "products-3", "hostname": null, "compute_instance_id": "d120ed57-3f43-4200-b222-802f292f9d56", "task_id": 1, "task_description": "No tasks for the instance.", "task_start_time": null, "volume_id": "5d02d7a1-b7e8-410a-a1bd-eb0e460668fa", "flavor_id": 10, "volume_size": 200, "tenant_id": "19fb5380ec8444358ad3a9d26ced6b8f", "server_status": null, "deleted": 0, "deleted_at": null, "datastore_version_id": "4141cc0b-6300-4706-a9f2-3ce757fec859", "configuration_id": null, "type": "node", "parent_id": "035dc57c-659c-4cab-83e8-29ec99437655" } ] } Master/Slave: { "rows": [   {      "id": "1f0c10fb-9bb6-45b5-b27d-ff0c32eb3ab3", "created": "2014-04-20 19:39:15", "updated": "2014-04-20 19:40:22", "name": "products", "hostname": null, "compute_instance_id": "fa6a6d74-f1bd-4ee8-bd7a-d54cf526bf77", "task_id": 1, "task_description": "No tasks for the instance.", "task_start_time": null, "volume_id": "932225fd-b613-4c62-b3e1-243aae10cc45", "flavor_id": 10, "volume_size": 200, "tenant_id": "19fb5380ec8444358ad3a9d26ced6b8f", "server_status": null, "deleted": 0, "deleted_at": null, "datastore_version_id": "4141cc0b-6300-4706-a9f2-3ce757fec859", "configuration_id": null, "type": "master", "parent_id": null },   {      "id": "b38f0356-6796-434c-acc6-953c081476d0", "created": "2014-04-20 19:41:11", "updated": "2014-04-20 19:42:54", "name": "products-slave", "hostname": null, "compute_instance_id": "d120ed57-3f43-4200-b222-802f292f9d56", "task_id": 1, "task_description": "No tasks for the instance.", "task_start_time": null, "volume_id": "5d02d7a1-b7e8-410a-a1bd-eb0e460668fa", "flavor_id": 12, "volume_size": 500, "tenant_id": "19fb5380ec8444358ad3a9d26ced6b8f", "server_status": null, "deleted": 0, "deleted_at": null, "datastore_version_id": "4141cc0b-6300-4706-a9f2-3ce757fec859", "configuration_id": "9cc334ee-dca6-4002-a13e-b6a26fd4e47f", "type": "slave", "parent_id": "1f0c10fb-9bb6-45b5-b27d-ff0c32eb3ab3" } ] } Notes:
 * GET /instances/id will need to be amended to: where id= and type!=node
 * GET /instances will need to be amended to understand the type column.
 * GET /instances/id/node/node-id will simply be: where id= and type==node (instances table)
 * Pros
 * Single query lookups for instance and node
 * No additional tables
 * Deep copy of data from cluster row to all node rows:
 * Makes business intelligence and analytics easy
 * Makes it easier to support heterogenous clusters (if it becomes a requirement)
 * Cons
 * If a column needs to be added to instances that is nonsensical from a cluster-level, but sensical on a node-level, then that column cannot be NOT NULL (due to the cluster row). Re-worded, data integrity becomes more difficult if the field applicability diverges.
 * Permutations of Approach:
 * Don't deep copy data into node rows
 * Don't include 'standalone' and 'master' as types, because it's inferable (if row's type=null, and there's another row with parent_id our row, we're a master not a standalone).

Option #2: Nodes/Cluster Relationship Table
Notes:
 * Approach: Same as Option #1, except type/parent_id are moved to a separate table
 * would remove the need for the cluster row in Option #1, but node rows would still exist in instances table
 * would remove the need for any added columns to instances table.
 * Cons
 * GET /instances would need to start with relationship table (rows where type!=node), then select data from instances table.
 * GET /instances/id would start with relationship table (is type!=node), then select data from instances table.
 * No single query lookups for instance and node

Option #3: Nodes Table
CREATE TABLE "nodes" ( "id" varchar(36) NOT NULL,  "instance_id" varchar(36) NOT NULL,  "created" datetime DEFAULT NULL,  "updated" datetime DEFAULT NULL,  "name" varchar(255) DEFAULT NULL,  "hostname" varchar(255) DEFAULT NULL,  "compute_instance_id" varchar(36) DEFAULT NULL,  "task_id" int(11) DEFAULT NULL,  "task_description" varchar(32) DEFAULT NULL,  "task_start_time" datetime DEFAULT NULL,  "volume_id" varchar(36) DEFAULT NULL,  "flavor_id" int(11) DEFAULT NULL,  "volume_size" int(11) DEFAULT NULL,  "tenant_id" varchar(36) DEFAULT NULL,  "server_status" varchar(64) DEFAULT NULL,  "deleted" tinyint(1) DEFAULT NULL,  "deleted_at" datetime DEFAULT NULL,  "datastore_version_id" varchar(36) NOT NULL,  "configuration_id" varchar(36) DEFAULT NULL,  PRIMARY KEY ("id"),  KEY "instance_id" ("instance_id"),  KEY "instances_tenant_id" ("tenant_id"),  KEY "instances_deleted" ("deleted"),  CONSTRAINT "instances_ibfk_2" FOREIGN KEY ("configuration_id") REFERENCES "configurations" ("id"), CONSTRAINT "instances_ibfk_1" FOREIGN KEY ("datastore_version_id") REFERENCES "datastore_versions" ("id") ) ENGINE=InnoDB DEFAULT CHARSET=utf8; aka the same table as instances, except: the addition of the instance_id foreign-key, an index on instance_id, and the removal of a few indexes.