Trove/Replication-And-Clustering-With-Nodes-3

Revision as of 17:33, 8 May 2014 by Amcrn (talk | contribs) (Configuration Groups)

Example: Cassandra


To illustrate the approach, Cassandra is used in the first set of examples below. The eccentricities of each datastore are explained in its own section.

Create Cluster


Request:

POST /instances
{
  "instance": {
    "name": "products",
    "datastore": {
      "type": "cassandra",
      "version": "2.0.6"
    },
    "configuration": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
    "flavorRef": "7",
    "volume": {
      "size": 1
    },
    "cluster": {
      "size": 3,
      "nodes": [
        {"region": "phx"},
        {"region": "slc"},
        {"region": "lvs"}
      ]
    }
  }
}

Response:

{
  "instance": {
    "status": "BUILD",
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "name": "products",
    "created": "2014-04-25T20:19:23",
    "updated": "2014-04-25T20:19:23",
    "links": [{...}],
    "datastore": {
      "type": "cassandra",
      "version": "2.0.6"
    },
    "cluster": {
      "size": 3,
      "nodes": [
        {"id": "416b0b16-ba55-4302-bbd3-ff566032e1c1", "region": "phx"},
        {"id": "7f52e4f9-3fa6-4238-ac08-1ce15197329a", "region": "slc"},
        {"id": "ff9d680c-fde3-49c6-a84e-76173b6df39d", "region": "lvs"}
      ]
    }
  }
}


Notes:

  • For Phase One:
    • cluster.nodes[] will not be supported.
    • cluster.nodes[].region will not be returned.
  • For Phase Two:
    • if cluster.nodes[] is not provided, the current region is assumed.
    • cluster.nodes[].region will always be returned.
  • Cassandra-specific fields that are required to construct the initial cluster (num_tokens, endpoint_snitch, seed ip-list, etc.) are to be determined/calculated from configuration file values and common sense.
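
A minimal Python sketch of how a client might assemble the Create Cluster request body shown above. The helper name build_create_cluster_body is hypothetical, and deriving cluster.size from the node list is an assumption made here so the two fields cannot disagree; the spec does not say which one wins on a mismatch.

```python
import json


def build_create_cluster_body(name, datastore_type, datastore_version,
                              flavor_ref, volume_size, regions,
                              configuration_id=None):
    """Assemble the POST /instances body for a cluster (illustrative only)."""
    if not regions:
        raise ValueError("a cluster needs at least one node")
    instance = {
        "name": name,
        "datastore": {"type": datastore_type, "version": datastore_version},
        "flavorRef": flavor_ref,
        "volume": {"size": volume_size},
        "cluster": {
            # Assumption: size is derived from the node list.
            "size": len(regions),
            "nodes": [{"region": region} for region in regions],
        },
    }
    if configuration_id is not None:
        instance["configuration"] = configuration_id
    return {"instance": instance}


body = build_create_cluster_body(
    "products", "cassandra", "2.0.6", "7", 1, ["phx", "slc", "lvs"],
    configuration_id="b9c8a3f8-7ace-4aea-9908-7b555586d7b6")
print(json.dumps(body, indent=2))
```

In Phase One the "nodes" list would be dropped from the body, since only cluster.size is supported there.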


Show Cluster


Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998

Response:

{
  "instance": {
    "status": "ACTIVE",
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "name": "products",
    "created": "2014-04-25T20:19:23",
    "updated": "2014-04-25T20:19:23",
    "links": [{...}],
    "datastore": {
      "type": "cassandra",
      "version": "2.0.6"
    },
    "cluster": {
      "size": 3,
      "nodes": [
        {"id": "416b0b16-ba55-4302-bbd3-ff566032e1c1", "region": "phx"},
        {"id": "7f52e4f9-3fa6-4238-ac08-1ce15197329a", "region": "slc"},
        {"id": "ff9d680c-fde3-49c6-a84e-76173b6df39d", "region": "lvs"}
      ]
    }
  }
}


Notes:

  • In Phase One:
    • cluster.nodes[].region will not be returned.
  • Change: instance.volume.used, instance.ip[], and instance.hostname will never be returned
    • It's possible that instance.ip[] can remain if it only returns the seed ips.
    • It's possible that instance.hostname can remain if it's converted to an array and only contains the seed hostnames.


Show Node


Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/nodes/416b0b16-ba55-4302-bbd3-ff566032e1c1

Response:

{
  "node": {
    "status": "ACTIVE",
    "id": "416b0b16-ba55-4302-bbd3-ff566032e1c1",
    "name": "products-1",
    "created": "2014-04-25T20:19:23",
    "updated": "2014-04-25T20:19:23",
    "links": [{...}],
    "ip": ["10.0.0.1"],
    "configuration": {
      "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
      "links": [{...}]
    },
    "flavor": {
      "id": "7",
      "links": [{...}]
    },
    "volume": {
      "size": 2,
      "used": 0.17
    }
  }
}


Add Node(s)


Request:

POST /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/nodes
{
  "nodes": {
    "num": 2,
    "allocations": [
      {"region": "phx"},
      {"region": "phx"}
    ]
  }
}


Response:

HTTP 202 (Empty Body)


Notes:

  • For Phase One:
    • nodes.num will be the only supported field (nodes.allocations will not)
  • For Phase Two:
    • if nodes.allocations[] is not provided, every existing node must be in the same region; otherwise the request fails.
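
The Phase Two rule can be sketched in Python as follows. The function name resolve_new_node_regions is hypothetical. Two points are assumptions not stated in the notes: that new nodes inherit the single region shared by the existing nodes, and that nodes.allocations[] must contain exactly nodes.num entries.

```python
def resolve_new_node_regions(existing_regions, num, allocations=None):
    """Return the region for each node to be added (illustrative only)."""
    if num < 1:
        raise ValueError("nodes.num must be at least 1")
    if allocations is not None:
        # Assumption: one allocation entry per requested node.
        if len(allocations) != num:
            raise ValueError("nodes.allocations must have nodes.num entries")
        return [alloc["region"] for alloc in allocations]
    distinct = set(existing_regions)
    if len(distinct) != 1:
        # Regions differ (or there are no nodes): the request is failed.
        raise ValueError("existing nodes do not share one region; "
                         "nodes.allocations is required")
    # Assumption: the new nodes go to the region the cluster already uses.
    return [distinct.pop()] * num


print(resolve_new_node_regions(["phx", "phx", "phx"], 2))
```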


Replace Node


Request:

POST /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/action

{
  "replace_node": {
    "id": "7f52e4f9-3fa6-4238-ac08-1ce15197329a"
  }
}


Response:

HTTP 202 (Empty Body)


Notes:


Remove Node


Request:

DELETE /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/nodes/7f52e4f9-3fa6-4238-ac08-1ce15197329a


Response:

HTTP 202 (Empty Body)


Notes:


Example: MongoDB


Create Cluster


Request:

POST /instances
{
  "instance": {
    "name": "products",
    "datastore": {
      "type": "mongodb",
      "version": "2.4.10"
    },
    "configuration": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
    "flavorRef": "7",
    "volume": {
      "size": 1
    },
    "cluster": {
      "size": 5,
      "nodes": [
        {"region": "phx"},
        {"region": "phx"},
        {"region": "phx"},
        {"region": "slc"},
        {"region": "slc"}
      ]
    }
  }
}

Response:

{
  "instance": {
    "status": "BUILD",
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "name": "products",
    "created": "2014-04-25T20:19:23",
    "updated": "2014-04-25T20:19:23",
    "links": [{...}],
    "datastore": {
      "type": "mongodb",
      "version": "2.4.10"
    },
    "cluster": {
      "size": 5,
      "nodes": [
        {"id": "416b0b16-ba55-4302-bbd3-ff566032e1c1", "region": "phx"},
        {"id": "965ef811-7c1d-47fc-89f2-a89dfdd23ef2", "region": "phx"},
        {"id": "3642f41c-e8ad-4164-a089-3891bf7f2d2b", "region": "phx"},
        {"id": "7f52e4f9-3fa6-4238-ac08-1ce15197329a", "region": "slc"},
        {"id": "ff9d680c-fde3-49c6-a84e-76173b6df39d", "region": "slc"}
      ]
    }
  }
}


Show Cluster


Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998

Response:

{
  "instance": {
    "status": "ACTIVE",
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "name": "products",
    "created": "2014-04-25T20:19:23",
    "updated": "2014-04-25T20:19:23",
    "links": [{...}],
    "datastore": {
      "type": "mongodb",
      "version": "2.4.10"
    },
    "cluster": {
      "size": 5,
      "nodes": [
        {"id": "416b0b16-ba55-4302-bbd3-ff566032e1c1", "region": "phx"},
        {"id": "965ef811-7c1d-47fc-89f2-a89dfdd23ef2", "region": "phx"},
        {"id": "3642f41c-e8ad-4164-a089-3891bf7f2d2b", "region": "phx"},
        {"id": "7f52e4f9-3fa6-4238-ac08-1ce15197329a", "region": "slc"},
        {"id": "ff9d680c-fde3-49c6-a84e-76173b6df39d", "region": "slc"}
      ]
    }
  }
}


Show Node


Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/nodes/416b0b16-ba55-4302-bbd3-ff566032e1c1

Response:

{
  "node": {
    "status": "ACTIVE",
    "id": "416b0b16-ba55-4302-bbd3-ff566032e1c1",
    "name": "products-1",
    "created": "2014-04-25T20:19:23",
    "updated": "2014-04-25T20:19:23",
    "links": [{...}],
    "ip": ["10.0.0.1"],
    "configuration": {
      "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
      "links": [{...}]
    },
    "flavor": {
      "id": "7",
      "links": [{...}]
    },
    "volume": {
      "size": 2,
      "used": 0.17
    }
  }
}


Create Arbiter(s)


Request:

POST /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998/nodes

{
  "nodes": {
    "num": 2,
    "allocations": [
      {"region": "lvs", "type": "arbiter"},
      {"region": "lvs", "type": "arbiter"}
    ]
  }
}

Response:

HTTP 202 (Empty Body)


Show Cluster (After Arbiters)


Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998

Response:

{
  "instance": {
    "status": "ACTIVE",
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "name": "products",
    "created": "2014-04-25T20:19:23",
    "updated": "2014-04-25T20:19:23",
    "links": [{...}],
    "datastore": {
      "type": "mongodb",
      "version": "2.4.10"
    },
    "cluster": {
      "size": 7,
      "nodes": [
        {"id": "416b0b16-ba55-4302-bbd3-ff566032e1c1", "region": "phx"},
        {"id": "965ef811-7c1d-47fc-89f2-a89dfdd23ef2", "region": "phx"},
        {"id": "3642f41c-e8ad-4164-a089-3891bf7f2d2b", "region": "phx"},
        {"id": "7f52e4f9-3fa6-4238-ac08-1ce15197329a", "region": "slc"},
        {"id": "ff9d680c-fde3-49c6-a84e-76173b6df39d", "region": "slc"},
        {"id": "77032c55-4496-4e35-8c0d-6cd1c18e1a9c", "region": "lvs", "type": "arbiter"},
        {"id": "1fd054ed-221f-4c99-8d17-570bcff4c1d2", "region": "lvs", "type": "arbiter"}
      ]
    }
  }
}


Example: MySQL


Create Master


Request:

POST /instances
{
  "instance": {
    "name": "products",
    "datastore": {
      "type": "mysql",
      "version": "5.5"
    },
    "configuration": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
    "flavorRef": "7",
    "volume": {
      "size": 1
    }
  }
}

Response:

{
  "instance": {
    "status": "BUILD",
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "name": "products",
    "created": "2014-04-25T20:19:23",
    "updated": "2014-04-25T20:19:23",
    "links": [{...}],
    "datastore": {
      "type": "mysql",
      "version": "5.5"
    },
    "configuration": {
      "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
      "links": [{...}]
    },
    "flavor": {
      "id": "7",
      "links": [{...}]
    },
    "volume": {
      "size": 1
    }
  }
}


Create Slave


Request:

POST /instances
{
  "instance": {
    "name": "products-slave",
    "datastore": {
      "type": "mysql",
      "version": "5.5"
    },
    "configuration": "fc318e00-3a6f-4f93-af99-146b44912188",
    "flavorRef": "7",
    "volume": {
      "size": 1
    },
    "slave": {
      "of": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
      "read_only": true
    }
  }
}

Response:

{
  "instance": {
    "status": "BUILD",
    "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
    "name": "products-slave",
    "created": "2014-04-25T20:19:23",
    "updated": "2014-04-25T20:19:23",
    "links": [{...}],
    "datastore": {
      "type": "mysql",
      "version": "5.5"
    },
    "configuration": {
      "id": "fc318e00-3a6f-4f93-af99-146b44912188",
      "links": [{...}]
    },
    "flavor": {
      "id": "7",
      "links": [{...}]
    },
    "volume": {
      "size": 1
    },
    "slave": {
      "of": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
      "read_only": true
    }
  }
}


Show Master


Request:

GET /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998

Response:

{
  "instance": {
    "status": "ACTIVE",
    "id": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
    "name": "products",
    "created": "2014-04-25T20:19:23",
    "updated": "2014-04-25T20:19:23",
    "links": [{...}],
    "datastore": {
      "type": "mysql",
      "version": "5.5"
    },
    "configuration": {
      "id": "b9c8a3f8-7ace-4aea-9908-7b555586d7b6",
      "links": [{...}]
    },
    "flavor": {
      "id": "7",
      "links": [{...}]
    },
    "volume": {
      "size": 1
    },
    "slave": {
      "list": [
        {"id": "061aaf4c-3a57-411e-9df9-2d0f813db859"}
      ]
    }
  }
}


Show Slave


Request:

GET /instances/061aaf4c-3a57-411e-9df9-2d0f813db859

Response:

{
  "instance": {
    "status": "ACTIVE",
    "id": "061aaf4c-3a57-411e-9df9-2d0f813db859",
    "name": "products-slave",
    "created": "2014-04-25T20:19:23",
    "updated": "2014-04-25T20:19:23",
    "links": [{...}],
    "datastore": {
      "type": "mysql",
      "version": "5.5"
    },
    "configuration": {
      "id": "fc318e00-3a6f-4f93-af99-146b44912188",
      "links": [{...}]
    },
    "flavor": {
      "id": "7",
      "links": [{...}]
    },
    "volume": {
      "size": 1
    },
    "slave": {
      "of": "dfbbd9ca-b5e1-4028-adb7-f78643e17998",
      "read_only": true
    }
  }
}


Detach Slave


Request:

POST /instances/061aaf4c-3a57-411e-9df9-2d0f813db859/action

{
  "detach": {}
}

Response:

HTTP 202 (Empty Body)


Delete Master


Request:

DELETE /instances/dfbbd9ca-b5e1-4028-adb7-f78643e17998


Response:

HTTP 202 (Empty Body)


Notes:

  • How to handle situation in which a slave is attached to a master, and the user attempts to delete the master?


Delete Slave


Request:

DELETE /instances/061aaf4c-3a57-411e-9df9-2d0f813db859


Response:

HTTP 202 (Empty Body)


Data Model Changes

Nodes Table

Create a new 'nodes' Table:

CREATE TABLE "nodes" (
  "id" varchar(36) NOT NULL,
  "instance_id" varchar(36) NOT NULL,
  "created" datetime DEFAULT NULL,
  "updated" datetime DEFAULT NULL,
  "name" varchar(255) DEFAULT NULL,
  "hostname" varchar(255) DEFAULT NULL,
  "compute_instance_id" varchar(36) DEFAULT NULL,
  "task_id" int(11) DEFAULT NULL,
  "task_description" varchar(32) DEFAULT NULL,
  "task_start_time" datetime DEFAULT NULL,
  "volume_id" varchar(36) DEFAULT NULL,
  "flavor_id" int(11) DEFAULT NULL,
  "volume_size" int(11) DEFAULT NULL,
  "tenant_id" varchar(36) DEFAULT NULL,
  "server_status" varchar(64) DEFAULT NULL,
  "deleted" tinyint(1) DEFAULT NULL,
  "deleted_at" datetime DEFAULT NULL,
  "datastore_version_id" varchar(36) NOT NULL,
  "configuration_id" varchar(36) DEFAULT NULL,
  PRIMARY KEY ("id"),
  KEY "instance_id" ("instance_id"),
  KEY "datastore_version_id" ("datastore_version_id"),
  KEY "configuration_id" ("configuration_id"),
  KEY "instances_tenant_id" ("tenant_id"),
  KEY "instances_deleted" ("deleted"),
  CONSTRAINT "nodes_ibfk_3" FOREIGN KEY ("instance_id") REFERENCES "instances" ("id"),
  CONSTRAINT "nodes_ibfk_2" FOREIGN KEY ("configuration_id") REFERENCES "configurations" ("id"),
  CONSTRAINT "nodes_ibfk_1" FOREIGN KEY ("datastore_version_id") REFERENCES "datastore_versions" ("id")
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


aka the same table as instances, except:

  • addition of: "instance_id" varchar(36) NOT NULL
  • addition of: KEY "instance_id" ("instance_id")
  • addition of: CONSTRAINT "nodes_ibfk_3" FOREIGN KEY ("instance_id") REFERENCES "instances" ("id")
  • TODO: change 'DEFAULT NULL' to 'NOT NULL' wherever possible (e.g. "created" should never be NULL)
  • TODO: addition or removal of indexes as deemed necessary


Alter Instances Table

Add slave_of Column to Instances Table (+ Constraint + Index):

ALTER TABLE instances
  ADD COLUMN slave_of VARCHAR(36) DEFAULT NULL,
  ADD KEY slave_of (slave_of),
  ADD CONSTRAINT instances_ibfk_3 FOREIGN KEY (slave_of) REFERENCES instances (id);


Alter Other Tables

Add node_id column to the following tables:

  • agent_heartbeats
  • backups
  • conductor_lastseen
  • root_enabled_history
  • security_group_instance_associations
  • service_statuses
  • usage_events


ALTER TABLE <table>
  ADD COLUMN node_id VARCHAR(36) DEFAULT NULL,
  ADD KEY node_id (node_id),
  ADD CONSTRAINT <table>_ibfk_<num> FOREIGN KEY (node_id) REFERENCES nodes (id);


TaskManager

  • add node_id to /etc/guest_info (if it's a node in a cluster). guest_id remains as-is.
  • create each node in a loop.
  • poll until all nodes are active.
  • for each node: use trove/nova to get its ip/hostname.
  • for couchbase:
    • send the ip/hostname list via rpc cast to the guest.
  • for cassandra:
    • send the seed ip list via rpc cast to the guest seed nodes, one by one (polling on REBOOT => ACTIVE), then to the rest of the nodes.
  • for mongodb:
    • send the ip/hostname list via rpc cast to the guest that db.isMaster() identifies as the primary.
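
A rough Python sketch of the generic flow above: create, poll, collect addresses, distribute. It follows the simplest (couchbase-style) path of sending the full list to every guest; the cassandra seed ordering and the mongodb primary lookup are not modelled. All callables (create_node, get_status, get_address, send_addresses) are hypothetical stand-ins for the trove/nova and rpc-cast calls, and the timeout handling is an assumption.

```python
import time


def provision_cluster(node_ids, create_node, get_status, get_address,
                      send_addresses, poll_interval=1.0, timeout=600.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Create nodes, wait for ACTIVE, then share the address list."""
    for node_id in node_ids:
        create_node(node_id)

    # Poll until every node reports ACTIVE or the deadline passes.
    deadline = clock() + timeout
    pending = set(node_ids)
    while pending:
        pending = {n for n in pending if get_status(n) != "ACTIVE"}
        if not pending:
            break
        if clock() >= deadline:
            raise TimeoutError("nodes not ACTIVE: %s" % sorted(pending))
        sleep(poll_interval)

    # Collect ip/hostname per node, then hand the list to each guest.
    addresses = [get_address(n) for n in node_ids]
    for node_id in node_ids:
        send_addresses(node_id, addresses)
    return addresses
```

Injecting sleep and clock keeps the loop testable without real waiting; a real TaskManager would presumably use its existing polling utilities instead.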


Guest

  • update heartbeat payload ( heartbeat(guest_id, payload, sent) ) from {"service_status": "<status>"} to {"service_status": "<status>", "node_id": "<node-id>"}
  • add method to each datastore guest manager for handling ip/hostname list


Conductor

  • update heartbeat logic to update the nodes table (node status)
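
The Guest and Conductor changes could fit together roughly as sketched below in Python. The function names and the in-memory dicts standing in for the service_statuses and nodes tables are hypothetical. Omitting node_id for standalone instances, and updating both tables on a clustered heartbeat, are assumptions.

```python
def build_heartbeat_payload(service_status, node_id=None):
    """Guest side: build the payload passed to heartbeat(guest_id, payload, sent)."""
    payload = {"service_status": service_status}
    if node_id is not None:
        # Assumption: only guests that are cluster nodes include node_id.
        payload["node_id"] = node_id
    return payload


def apply_heartbeat(service_statuses, node_statuses, guest_id, payload):
    """Conductor side: record the status, and the node status when present."""
    service_statuses[guest_id] = payload["service_status"]
    node_id = payload.get("node_id")
    if node_id is not None:
        node_statuses[node_id] = payload["service_status"]


service_statuses, node_statuses = {}, {}
payload = build_heartbeat_payload(
    "running", node_id="416b0b16-ba55-4302-bbd3-ff566032e1c1")
apply_heartbeat(service_statuses, node_statuses,
                "dfbbd9ca-b5e1-4028-adb7-f78643e17998", payload)
print(node_statuses)
```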


Capabilities


A capability might be supported for a datastore-version for standalone instances, but not for clusters. Therefore, the capability tables must be amended to include a cluster-enabled flag.

ALTER TABLE capabilities ADD COLUMN enabled_cluster TINYINT(1) DEFAULT NULL;
ALTER TABLE capability_overrides ADD COLUMN enabled_cluster TINYINT(1) DEFAULT NULL;

The following capabilities should have enabled_cluster set to false for the first iteration of clusters:

  • backup-create + list-instance
  • configuration-attach + detach + instances
  • resize-<all>
  • database-<all>
  • root-<all>
  • secgroup-<all>
  • user-<all>
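
One way the enabled_cluster flag could be consulted is sketched below in Python. Rows are plain dicts for illustration. The precedence (a non-NULL value in capability_overrides wins over capabilities) and treating NULL as disabled are assumptions; this page does not define them.

```python
def capability_enabled_for_cluster(capability_row, override_row=None):
    """Decide whether a capability may be used on a cluster.

    Assumed rules (not defined by this spec): a non-NULL override wins,
    and NULL everywhere means "not enabled for clusters".
    """
    if override_row is not None:
        override = override_row.get("enabled_cluster")
        if override is not None:
            return bool(override)
    return bool(capability_row.get("enabled_cluster"))


backup_create = {"name": "backup-create", "enabled_cluster": 0}
print(capability_enabled_for_cluster(backup_create))                          # False
print(capability_enabled_for_cluster(backup_create, {"enabled_cluster": 1}))  # True
```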


Configuration Groups


Introduce read_only and hidden Parameters

Need to introduce two additional attributes for configuration group parameters: read_only and hidden.

  • read_only fields include cluster_name, num_tokens, seed_provider, seeds, endpoint_snitch (cassandra) + replSet (mongodb) + server_id, log_bin (mysql).
  • depending on the provider, some of the read_only fields should also be hidden from the user on a configuration-show.
  • once read_only + hidden are available, a parallel effort should move configuration-default to configurations-show if a configuration-group is attached.
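
A small Python sketch of how the two attributes might be enforced. The rule table is illustrative only: cluster_name and seeds are taken from the read_only list above, while the choice of which one is hidden, the writable concurrent_reads entry, and both function names are assumptions.

```python
# Hypothetical rule set; which parameters are hidden is provider-specific.
CASSANDRA_RULES = {
    "cluster_name": {"read_only": True, "hidden": False},
    "seeds": {"read_only": True, "hidden": True},
    "concurrent_reads": {"read_only": False, "hidden": False},
}


def check_user_update(rules, requested):
    """Raise if the user tries to set a read_only parameter."""
    blocked = sorted(k for k in requested if rules.get(k, {}).get("read_only"))
    if blocked:
        raise ValueError("read_only parameters: " + ", ".join(blocked))


def values_for_show(rules, values):
    """Drop hidden parameters from a configuration-show response."""
    return {k: v for k, v in values.items()
            if not rules.get(k, {}).get("hidden")}


stored = {"cluster_name": "products", "seeds": "10.0.0.1", "concurrent_reads": 32}
print(values_for_show(CASSANDRA_RULES, stored))
```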


Auto-Create and Attach

  • cassandra & mongodb need to have configuration-groups automatically created and attached to each node (for cluster_name, replset, etc.) during provisioning.
  • unique configuration-group per node.
  • auto-created+attached configuration-groups must not be detachable from the instance.
  • dependency: configuration-group support for mongodb + cassandra


Feedback

Create Slave

  • glucas: Replication will require capturing a snapshot of the master's state and passing that to the slave. It would be preferable to create multiple slaves from a single snapshot and then clean up, rather than repeating the snapshot process multiple times. For that reason we propose a 'replicate' action that can create N slaves from an existing master in one call. I believe this approach works with the schema changes proposed here, i.e. adding the slave_of reference to the instance table.
  • mwj: I thought the point of the topology design was to not introduce properties to the instance table that would not be relevant to all (i.e., non-replication) instances. I take it this is no longer an issue?

Capabilities

  • glucas: We should use capabilities to indicate whether a datastore supports replication (and potentially read-only vs. read-write replication). In the first iteration, replication will not be supported for clusters.
  • mwj: Is it really necessary to have a capability for replication? We are designing replication to use backup/restore functionality, so the backup/restore capability should be enough, no?

Configuration Groups

  • dougshelley66: What if the end user wants to specify some configuration parameters for nodes or replicas? It appears (from the description above) that each node will have an auto generated group attached. Given an instance can only have one config group, this would allow the user to specify one?
  • amcrn (talk) 17:33, 8 May 2014 (UTC): here's how it'd work: you want a master/slave setup, with 2 slaves. all three nodes would get their own configuration-group automatically created and attached (3 unique ids). if the user wishes to configure some parameters, they can do so. if the user provides a configuration-id on provisioning, then we'd add the read_only/hidden parameters to their configuration-group automatically.

Promote Slave

  • mwj: I would rather call this "detach slave" as I expect we may want to have "promote slave" be used in future fail-over designs.
  • amcrn (talk) 17:30, 8 May 2014 (UTC): agreed, just went with the industry standard as a default. changed it to "detach" because i agree with you.

Delete Slave

  • mwj: Shouldn't "delete instance" just work, even for slaves?
  • amcrn (talk) 17:29, 8 May 2014 (UTC): didn't have the example above, but yes it should. added an example.