Zaqar/bp/placement-service

<gallery>
File:Placement-service.jpg|Placement Service Draft v0.1
</gallery>
== Overview ==
'''Rationale''': Marconi has a storage bottleneck.

'''Proposal goal''': Remove that bottleneck.

The placement service aims to address this by handling storage transparently and dynamically.
=== Transparency ===
* User transparency: availability and use of the Marconi service must not be interrupted while a migration is taking place.
* Implementation transparency: the storage driver is handed a location/connection and cares only about serializing/deserializing data to that storage location (see the sketch after this list).
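To make implementation transparency concrete, here is a minimal sketch of a driver that is handed a ready-made connection and deals only with (de)serialization. The class and the connection interface are illustrative, not Marconi's actual driver API:

<pre><nowiki>
import json


class IllustrativeDriver(object):
    """Hypothetical driver: it never picks a storage location itself."""

    def __init__(self, connection):
        # The placement service decides where 'connection' points.
        self._conn = connection

    def post(self, queue, message):
        # Only serialization happens here; routing already happened upstream.
        self._conn.write(queue, json.dumps(message))

    def get(self, queue):
        return json.loads(self._conn.read(queue))
</nowiki></pre>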
=== Terminology ===
* '''Marconi partition''': one Marconi master, a set of Marconi workers, and a storage deployment. This is the minimum abstraction: one adds a Marconi partition, not a storage node or a Marconi worker.
* '''Marconi master''': receives requests and forwards them round-robin to Marconi workers.
* '''Marconi workers''': process requests and communicate with storage.
* '''Storage deployment''': a set of storage nodes - one or many, as long as they are addressable through a single client connection.
== Reference Deployment: Smart Proxy and Partition as a Unit ==
This approach is emerging as the leading reference implementation for scaling the Marconi service. The primary components are:
* A load balancer that can redirect tenant requests to a cluster URL
* Operating Marconi at the partition level
=== Partitions ===
A partition consists of:

* One master to round-robin tasks to workers
* N Marconi web servers
* A storage deployment
Operators can optimize N to match their storage configuration and persistence needs.
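For reference, the master's round-robin dispatch amounts to cycling over the N workers. A minimal sketch, with illustrative worker URLs:

<pre><nowiki>
import itertools

# Illustrative worker pool; a real master would take these from its config.
WORKERS = ['http://worker1:8888', 'http://worker2:8888', 'http://worker3:8888']
_cycle = itertools.cycle(WORKERS)


def next_worker():
    """Return the worker that should receive the next request."""
    return next(_cycle)

# Successive calls yield worker1, worker2, worker3, worker1, ...
</nowiki></pre>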
=== Smart Proxy ===
The smart proxy maintains a mapping from tenant/project IDs to partition URLs.
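A minimal sketch of that lookup, assuming the "{project_id}.{queue_name}" key convention from the catalogue example further below (the in-memory dict and the sample entry are made up):

<pre><nowiki>
# Keys follow the "{project_id}.{queue_name}" convention used by the
# catalogue; the entry below is illustrative.
CATALOGUE = {
    'some-project.my-queue': 'http://localhost:8889',
}


def route(project_id, queue_name):
    """Return the partition URL that owns this project's queue."""
    key = '%s.%s' % (project_id, queue_name)
    if key not in CATALOGUE:
        raise LookupError('queue has not been placed on a partition yet')
    return CATALOGUE[key]
</nowiki></pre>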
=== Migration Strategy ===
'''Freezing export''': a migration service runs on each Marconi partition. When given a queue and a destination partition, the service launches an export worker. The export worker then communicates the desired data to the new partition's migration service, which in turn launches an import worker to bring in the data. In summary:
- "Freeze" the source queue
- Export the queue from the source
- Import the queue to the destination
- "Thaw" the queue
Freeze: set a particular queue as read-only at the proxy layer.

Thaw: restore a particular queue to normal status at the proxy layer.
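Put together, the flow looks roughly like the sketch below. Every object and method here is a hypothetical stand-in for the proxy and the per-partition migration services, not an existing API:

<pre><nowiki>
def migrate_queue(queue, source, destination, proxy):
    proxy.freeze(queue)                        # queue becomes read-only
    try:
        data = source.export_queue(queue)      # export worker on the source
        destination.import_queue(queue, data)  # import worker on the dest
        proxy.remap(queue, destination)        # catalogue points at the dest
    finally:
        proxy.thaw(queue)                      # restore normal status
</nowiki></pre>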
=== Advantages ===
* Easier to implement
* No changes to Marconi
* Scalable
* Transparent
=== Disadvantages ===
* Requires the implementation of a smart proxy - this includes routing requests, partition management, catalogue management, regeneration, and synchronization
* Would benefit from having access to raw_read and raw_write functions in the storage layer
== Current State ==
=== Concepts ===
==== Partitions ====
Partitions have: 1) a name, 2) a weight, and 3) a list of node URIs. For example:
{ "default": { "weight": 100, "nodes": [ "http://localhost:8889", "http://localhost:8888", "http://localhost:8887", "http://localhost:8886" ] } }
==== Catalogue ====
Catalogue entries have: 1) a key, 2) a node URI, and 3) metadata. For example:
{ "{project_id}.{queue_name}": { "href": "http://localhost:8889", "metadata": { "awesome": "sauce" } } }
=== API ===
<pre><nowiki>
GET /v1/partitions           # list all registered partitions
GET /v1/partitions/{name}    # fetch details for a single partition
PUT /v1/partitions/{name}    # register a new partition
DELETE /v1/partitions/{name} # remove a registered partition

# the catalogue is updated by operations routed through /v1/queues/{name}
GET /v1/catalogue            # list all entries in the catalogue for the given project ID
GET /v1/catalogue/{name}     # fetch info for the given catalogue entry
</nowiki></pre>
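As a usage sketch, registering and then listing partitions with the Python requests library might look like this. The proxy host/port are assumptions, and the request body mirrors the partition example above:

<pre><nowiki>
import requests

PROXY = 'http://localhost:8000'  # assumed proxy endpoint

partition = {
    'weight': 100,
    'nodes': ['http://localhost:8889', 'http://localhost:8888'],
}

# Register a partition named 'default'.
resp = requests.put(PROXY + '/v1/partitions/default', json=partition)
resp.raise_for_status()

# List all registered partitions.
print(requests.get(PROXY + '/v1/partitions').json())
</nowiki></pre>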
=== Implementation ===
==== Needs Review ====
* Proxy (partition, catalogue, queues handling): https://review.openstack.org/#/c/43909/
* Proxy (v1, health): https://review.openstack.org/#/c/44356/
* Proxy (forward the rest of the routes): https://review.openstack.org/#/c/44364/
==== To Do ====
* Hierarchical caching: store data in the authoritative store (a MongoDB replica set) on write operations and cache it locally in a Redis instance, hitting the authoritative store only on failed lookups (see the sketch after this list)
* Benchmarking
* Unit tests
* Functional tests
* Configuration
* Catalogue and partition registry regeneration
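A sketch of that hierarchical lookup, written against redis-py and pymongo client objects; the key layout and document shape are assumptions:

<pre><nowiki>
def cached_lookup(key, redis_client, mongo_collection):
    """Try the local Redis cache first; fall back to the replica set."""
    value = redis_client.get(key)
    if value is not None:
        return value  # cache hit: no round trip to the authoritative store

    # Cache miss: consult the authoritative MongoDB replica set.
    doc = mongo_collection.find_one({'_id': key})
    if doc is None:
        return None

    redis_client.set(key, doc['value'])  # repopulate the local cache
    return doc['value']
</nowiki></pre>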
=== Deployment ===
* Bring up the authoritative MongoDB replica set
* Bring up a redis-server on each box
* Launch marconi.proxy.app:app using a WSGI/HTTP server (example below)
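For example, with gunicorn (any WSGI server will do; the bind address is illustrative):

<pre><nowiki>
gunicorn -b 0.0.0.0:8888 marconi.proxy.app:app
</nowiki></pre>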
== More Ideas/Deprecated ==

[[Deprecated]]