Difference between revisions of "Zaqar/specs/havana"
(Updated features list based on meeting 1) |
m (Malini moved page Marconi/specs/havana to Zaqar/specs/havana: Project Rename) |
||
(34 intermediate revisions by 4 users not shown) | |||
Line 1: | Line 1: | ||
− | + | ||
+ | '''NOTE: This page is OUT OF DATE. Please see the latest info on the Marconi project here: [[Marconi]]''' | ||
+ | |||
<!-- ## page was renamed from marconi-grizzly-spec --> | <!-- ## page was renamed from marconi-grizzly-spec --> | ||
− | = Marconi: | + | = Marconi: Cloud Message Queuing for OpenStack = |
− | This specification formalizes the requirements and design considerations captured during one of the Grizzly Summit working sessions to initiate a message bus project for OpenStack. As the project evolves, so too will its requirements, so this specification is only meant as a starting point. | + | '''NOTE: This page is OUT OF DATE. Please see the latest info on the Marconi project here: [Marconi]''' |
+ | |||
+ | This specification formalizes the requirements and design considerations captured during one of the Grizzly Summit working sessions to initiate a message bus project for [[OpenStack]]. As the project evolves, so too will its requirements, so this specification is only meant as a starting point. | ||
Here's a brief summary of how Marconi works: | Here's a brief summary of how Marconi works: | ||
Line 9: | Line 13: | ||
# Clients post messages via HTTP to Marconi. The URL contains a tenant ID. | # Clients post messages via HTTP to Marconi. The URL contains a tenant ID. | ||
# Marconi persists messages according to either a default TTL, or one specified by the client. | # Marconi persists messages according to either a default TTL, or one specified by the client. | ||
− | # Clients poll Marconi for messages | + | # Clients poll Marconi for messages. |
− | # Clients may optionally | + | # Clients may optionally claim a batch of messages, hiding them from other clients. Once the client has processed each message, it can delete it from the server. In this way, Marconi provides a mechanism for ensuring each message is processed once and only once. |
== Rationale == | == Rationale == | ||
Line 16: | Line 20: | ||
The lack of an integrated cloud message bus service is a major inhibitor to [[OpenStack]] adoption. While Amazon has SQS and SNS, OpenStack currently provides no alternatives. | The lack of an integrated cloud message bus service is a major inhibitor to [[OpenStack]] adoption. While Amazon has SQS and SNS, OpenStack currently provides no alternatives. | ||
− | OpenStack needs a multi-tenant message bus that is fast, efficient, durable, horizontally-scalable and reliable | + | OpenStack needs a multi-tenant message bus that is fast, efficient, durable, horizontally-scalable and reliable. |
The Marconi project will address these needs, acting as a compliment to the existing RPC infrastructure within OpenStack, while providing multi-tenant services that can be exposed to applications running on public and private clouds. | The Marconi project will address these needs, acting as a compliment to the existing RPC infrastructure within OpenStack, while providing multi-tenant services that can be exposed to applications running on public and private clouds. | ||
− | + | == Use Cases == | |
− | + | '''NOTE: This data is OUT OF DATE. Please see the latest use case info is here: [[Use Cases (Marconi)]]''' | |
1. [[marconi/specs/use-cases/1|Distribute tasks among multiple workers]] (transactional job queues) | 1. [[marconi/specs/use-cases/1|Distribute tasks among multiple workers]] (transactional job queues) | ||
Line 55: | Line 59: | ||
== Major Features == | == Major Features == | ||
+ | |||
+ | '''NOTE: This page is OUT OF DATE. Please see the latest info on the Marconi project here: [[Marconi]]''' | ||
Non-Functional | Non-Functional | ||
Line 61: | Line 67: | ||
* Multi-tenant | * Multi-tenant | ||
* Implemented in Python, following PEP 8 and pythonic idioms | * Implemented in Python, following PEP 8 and pythonic idioms | ||
− | * Modular, | + | * Modular, driver-based architecture |
* Async I/O | * Async I/O | ||
− | |||
− | |||
− | |||
* Client-agnostic | * Client-agnostic | ||
− | * Low response time, turning around requests in 50ms or | + | * Low response time, turning around requests in 20-50ms (or better), even under load |
* High throughput, serving millions of reqs/min with a small cluster | * High throughput, serving millions of reqs/min with a small cluster | ||
− | ** | + | * Thousands of req/sec per queue (?) |
+ | * 100's of thousands of queues per tenant | ||
* Horizontal scaling of ''both'' reads and writes | * Horizontal scaling of ''both'' reads and writes | ||
* Support for HA deployments | * Support for HA deployments | ||
* Guaranteed delivery | * Guaranteed delivery | ||
* Best-effort message ordering | * Best-effort message ordering | ||
− | * Server generates all IDs | + | * Server generates all IDs |
− | * Gzip'd | + | * Gzip'd HTTP bodies |
− | * Secure (audited code, end-to-end HTTPS support, | + | * Secure (audited code, end-to-end HTTPS support, penetration testing, etc.) |
+ | * Schema validation | ||
* Auth caching | * Auth caching | ||
Functional | Functional | ||
− | * JSON | + | * Eventing and work queuing semantics |
− | * Opaque payload ( | + | * JSON |
− | * Max payload size of | + | * Opaque payload (arbitrary JSON, or base64-encoded binary) |
+ | * Max payload size of 4K | ||
* Batch message posting and querying | * Batch message posting and querying | ||
− | |||
− | |||
* Keystone auth driver (service catalog may return endpoints for different regions and/or different characteristics) | * Keystone auth driver (service catalog may return endpoints for different regions and/or different characteristics) | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | == Future Features == | + | == Future Features (Brainstorming) == |
+ | |||
+ | TODO: Create blueprints for these, prioritize | ||
− | + | Brainstormed features, listed in no particular order: | |
+ | * LZ4 compression for messages at rest | ||
+ | * Multi-Transport (Http, ZMQ) | ||
+ | * SQLAlchemy driver | ||
+ | * REPL for debugging, testing, diagnostics | ||
+ | * Client libraries for Python, PHP, Java, and C# | ||
+ | * Auto-generated audit river (read-only queue) for actions and state changes | ||
+ | * Delayed delivery | ||
+ | * Hot-reconfigure | ||
+ | * PATCH support for updating queue metadata | ||
+ | * Set/get arbitrary queue metadata | ||
+ | * Kombu Integration | ||
+ | * API tokens tied to a specific app and a specific queue, OAuth? | ||
+ | * Message signing | ||
* Standalone control panel or at least a simple admin/dashboard app for ops | * Standalone control panel or at least a simple admin/dashboard app for ops | ||
− | * JSON-P support | + | * JSON-P or CORS support (may need to use the while(1); trick to prevent XSS attacks) |
+ | * Multi-get (specify a list of queues to query in a single request) | ||
+ | * Tag-based filtering | ||
+ | ** Includes a way to return in one call, everything with or without the tag (OR semantics) to afford fanout. | ||
+ | * XML support | ||
+ | * LZ4 or snappy body compression (at rest, and in WSGI server as well as client libs) | ||
* Response caching | * Response caching | ||
* Authorization (based on tags and/or queues) | * Authorization (based on tags and/or queues) | ||
Line 117: | Line 131: | ||
* PyPy support | * PyPy support | ||
* HTTP 2.0 support | * HTTP 2.0 support | ||
− | |||
− | |||
− | |||
− | |||
− | |||
* Long-polling | * Long-polling | ||
− | * | + | * Web Socket transport driver |
+ | * Web hooks | ||
== Non-Features == | == Non-Features == | ||
Line 138: | Line 148: | ||
== Architecture == | == Architecture == | ||
− | Marconi will use a micro-kernel architecture. Auth, | + | '''NOTE: This page is OUT OF DATE. Please see the latest info on the Marconi project here: [[Marconi]]''' |
+ | |||
+ | Marconi will use a micro-kernel architecture. Auth, transport, storage, cache, logging, monitoring, etc. will all be implemented as drivers or exposed with standard protocols, allowing vendors to customize Marconi to suit. | ||
+ | |||
+ | Endpoint controllers define the interface between storage and transport. [https://wiki.openstack.org/wiki/Marconi/specs/endpoint More info]. | ||
Possible frameworks that can help realize a highly modular design: | Possible frameworks that can help realize a highly modular design: | ||
* pkg_resources | * pkg_resources | ||
− | * stevedore | + | * [https://github.com/dreamhost/stevedore stevedore] |
− | + | Reference drivers | |
− | + | ||
− | * WSGI- | + | * Transport: HTTP(S) via WSGI using [http://falconframework.org Falcon] |
+ | * Auth: Keystone middleware | ||
+ | * Storage: MongoDB | ||
+ | * Logging: Standard library logging | ||
+ | * Monitoring: TBD - Statsd, as well as HTTP stats page? | ||
− | + | == Deployment Options == | |
− | * | + | * Self-host via gevent.http or ZMQ |
− | + | * Host with a WSGI server. | |
− | * | + | :* Requires writing a small bootstrap script to load the kernel and export the app callable. |
− | * | + | :* Bootstrap script also allows full programmatic customization of logging |
− | |||
− | * | ||
== API == | == API == |
Latest revision as of 18:42, 7 August 2014
NOTE: This page is OUT OF DATE. Please see the latest info on the Marconi project here: Marconi
Marconi: Cloud Message Queuing for OpenStack
This specification formalizes the requirements and design considerations captured during one of the Grizzly Summit working sessions to initiate a message bus project for OpenStack. As the project evolves, so too will its requirements, so this specification is only meant as a starting point.
Here's a brief summary of how Marconi works:
- Clients post messages via HTTP to Marconi. The URL contains a tenant ID.
- Marconi persists messages according to either a default TTL, or one specified by the client.
- Clients poll Marconi for messages.
- Clients may optionally claim a batch of messages, hiding them from other clients. Once the client has processed each message, it can delete it from the server. In this way, Marconi provides a mechanism for ensuring each message is processed once and only once.
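The post, poll, claim, and delete steps above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not Marconi's storage code: the `InMemoryQueue` class, its method names, the `now` parameter (used to make time explicit), and the default TTL values are all hypothetical.

```python
import time
import uuid


class InMemoryQueue:
    """Toy model of one tenant's queue (hypothetical, for illustration only)."""

    def __init__(self, default_ttl=60):
        self.default_ttl = default_ttl
        self._messages = {}  # message ID -> record, kept in posting order

    def post(self, body, ttl=None, now=None):
        """Store a message with the client's TTL, or the default one."""
        now = time.time() if now is None else now
        msg_id = uuid.uuid4().hex  # the server, not the client, picks the ID
        self._messages[msg_id] = {
            "body": body,
            "expires": now + (self.default_ttl if ttl is None else ttl),
            "claimed_until": 0.0,
        }
        return msg_id

    def _visible(self, now):
        # Drop expired messages, then hide the ones under an active claim.
        self._messages = {
            k: m for k, m in self._messages.items() if m["expires"] > now
        }
        return [(k, m) for k, m in self._messages.items()
                if m["claimed_until"] <= now]

    def poll(self, now=None):
        """Return (id, body) pairs for unexpired, unclaimed messages."""
        now = time.time() if now is None else now
        return [(k, m["body"]) for k, m in self._visible(now)]

    def claim(self, limit, claim_ttl=30, now=None):
        """Claim up to `limit` messages, hiding them from other clients."""
        now = time.time() if now is None else now
        batch = self._visible(now)[:limit]
        for _, record in batch:
            record["claimed_until"] = now + claim_ttl
        return [(k, m["body"]) for k, m in batch]

    def delete(self, msg_id):
        """Remove a message once the client has processed it."""
        self._messages.pop(msg_id, None)
```

In this sketch a claimed message becomes visible again if the claim lapses before the client deletes it; that behavior is an assumption of the model rather than something this page specifies.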
Rationale
The lack of an integrated cloud message bus service is a major inhibitor to OpenStack adoption. While Amazon has SQS and SNS, OpenStack currently provides no alternatives.
OpenStack needs a multi-tenant message bus that is fast, efficient, durable, horizontally-scalable and reliable.
The Marconi project will address these needs, acting as a complement to the existing RPC infrastructure within OpenStack, while providing multi-tenant services that can be exposed to applications running on public and private clouds.
Use Cases
NOTE: This data is OUT OF DATE. Please see the latest use case info here: Use Cases (Marconi)
1. Distribute tasks among multiple workers (transactional job queues)
2. Forward events to data collectors (transactional event queues)
3. Publish events to any number of subscribers (pub-sub)
4. Send commands to one or more agents (RPC via point-to-point or pub-sub)
5. Request information from an agent (RPC via point-to-point)
6. Monitor a Marconi deployment (DevOps)
Design Goals
Marconi's design philosophy is derived from Donald A. Norman's work regarding The Design of Everyday Things:
The value of a well-designed object is when it has such a rich set of affordances that the people who use it can do things with it that the designer never imagined.
Goals related to the above:
- Emergent functionality, utility
- Modular, pluggable code base
- REST architectural style
Principles to live by:
- DRY
- YAGNI
- KISS
Major Features
NOTE: This page is OUT OF DATE. Please see the latest info on the Marconi project here: Marconi
Non-Functional
- Versioned API
- Multi-tenant
- Implemented in Python, following PEP 8 and pythonic idioms
- Modular, driver-based architecture
- Async I/O
- Client-agnostic
- Low response time, turning around requests in 20-50ms (or better), even under load
- High throughput, serving millions of reqs/min with a small cluster
- Thousands of req/sec per queue (?)
- Hundreds of thousands of queues per tenant
- Horizontal scaling of both reads and writes
- Support for HA deployments
- Guaranteed delivery
- Best-effort message ordering
- Server generates all IDs
- Gzip'd HTTP bodies
- Secure (audited code, end-to-end HTTPS support, penetration testing, etc.)
- Schema validation
- Auth caching
Functional
- Eventing and work queuing semantics
- JSON
- Opaque payload (arbitrary JSON, or base64-encoded binary)
- Max payload size of 4K
- Batch message posting and querying
- Keystone auth driver (service catalog may return endpoints for different regions and/or different characteristics)
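The opaque-payload and size-limit items above could be enforced roughly as follows. This is a minimal sketch under assumptions: the helper names `encode_body` and `check_size` are hypothetical, "4K" is taken to mean 4096 bytes, and the limit is measured against the compact JSON serialization of the body, none of which this page pins down.

```python
import base64
import json

MAX_PAYLOAD_BYTES = 4 * 1024  # assumption: "4K" means 4096 bytes


def encode_body(payload):
    """Pass JSON values through; turn raw bytes into base64 text."""
    if isinstance(payload, (bytes, bytearray)):
        return base64.b64encode(bytes(payload)).decode("ascii")
    return payload


def check_size(body):
    """Serialize the body compactly and reject it if it is too large."""
    serialized = json.dumps(body, separators=(",", ":")).encode("utf-8")
    if len(serialized) > MAX_PAYLOAD_BYTES:
        raise ValueError(
            "payload is %d bytes; limit is %d"
            % (len(serialized), MAX_PAYLOAD_BYTES)
        )
    return serialized
```

Because the server treats the payload as opaque, the sketch never inspects the JSON beyond measuring its serialized length.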
Future Features (Brainstorming)
TODO: Create blueprints for these, prioritize
Brainstormed features, listed in no particular order:
- LZ4 compression for messages at rest
- Multi-transport (HTTP, ZMQ)
- SQLAlchemy driver
- REPL for debugging, testing, diagnostics
- Client libraries for Python, PHP, Java, and C#
- Auto-generated audit river (read-only queue) for actions and state changes
- Delayed delivery
- Hot-reconfigure
- PATCH support for updating queue metadata
- Set/get arbitrary queue metadata
- Kombu Integration
- API tokens tied to a specific app and a specific queue, OAuth?
- Message signing
- Standalone control panel or at least a simple admin/dashboard app for ops
- JSON-P or CORS support (may need to use the while(1); trick to prevent JSON hijacking)
- Multi-get (specify a list of queues to query in a single request)
- Tag-based filtering
  - Includes a way to return, in one call, everything with or without the tag (OR semantics) to afford fanout.
- XML support
- LZ4 or snappy body compression (at rest, and in WSGI server as well as client libs)
- Response caching
- Authorization (based on tags and/or queues)
- Cross-tenant sharing (need to define business case)
- Temporal queries
- JavaScript client library (browser and Node.js)
- Ruby client library
- PHP client library
- Cross-regional replication
- Horizon plug-in
- Ceilometer data provider
- PyPy support
- HTTP 2.0 support
- Long-polling
- Web Socket transport driver
- Web hooks
Non-Features
Marconi may be used to support other services that provide the following functionality, but will not embed these abilities directly within its code base.
- Any kind of push notifications over persistent connections (leads to complicated state management and poor hardware utilization)
- Forwarding notifications to email, SMS, Twitter, etc. (à la SNS)
- Forwarding notifications to web hooks
- Forwarding notifications to APNS, GCM, etc.
- Scheduling-as-a-service (à la IronWorker)
- Metering and monitoring solutions
Architecture
NOTE: This page is OUT OF DATE. Please see the latest info on the Marconi project here: Marconi
Marconi will use a micro-kernel architecture. Auth, transport, storage, cache, logging, monitoring, etc. will all be implemented as drivers or exposed with standard protocols, allowing vendors to customize Marconi to suit.
Endpoint controllers define the interface between storage and transport. More info.
Possible frameworks that can help realize a highly modular design:
- pkg_resources
- stevedore
Reference drivers
- Transport: HTTP(S) via WSGI using Falcon
- Auth: Keystone middleware
- Storage: MongoDB
- Logging: Standard library logging
- Monitoring: TBD - Statsd, as well as HTTP stats page?
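One way to picture the driver-based design described in this section is an abstract interface per driver kind plus a loader that resolves a driver by name. The sketch below is an assumption-laden illustration: `StorageDriver`, `MemoryStorage`, `load_driver`, and the plain dictionary registry are hypothetical names, and a real deployment would presumably discover drivers through pkg_resources entry points or stevedore instead of a hard-coded table.

```python
import abc


class StorageDriver(abc.ABC):
    """Hypothetical contract that every storage driver would satisfy."""

    @abc.abstractmethod
    def post(self, queue, body):
        """Persist one message body on the named queue."""

    @abc.abstractmethod
    def messages(self, queue):
        """Return the message bodies currently held for the named queue."""


class MemoryStorage(StorageDriver):
    """Trivial driver that keeps everything in a dict (illustration only)."""

    def __init__(self):
        self._queues = {}

    def post(self, queue, body):
        self._queues.setdefault(queue, []).append(body)

    def messages(self, queue):
        return list(self._queues.get(queue, []))


# Stand-in for entry-point discovery: driver kind -> driver name -> class.
_DRIVERS = {"storage": {"memory": MemoryStorage}}


def load_driver(kind, name):
    """Instantiate the driver registered under (kind, name)."""
    try:
        driver_cls = _DRIVERS[kind][name]
    except KeyError:
        raise LookupError("no %s driver named %r" % (kind, name)) from None
    return driver_cls()
```

The kernel would depend only on the abstract class, so swapping MongoDB for another backend becomes a configuration choice rather than a code change.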
Deployment Options
- Self-host via gevent.http or ZMQ
- Host with a WSGI server.
  - Requires writing a small bootstrap script to load the kernel and export the app callable.
  - Bootstrap script also allows full programmatic customization of logging
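A bootstrap script of the kind mentioned above might look like the following sketch. Everything here is an assumption for illustration: `make_app` stands in for loading the Marconi kernel, the logger name `marconi.bootstrap` is hypothetical, and the handler simply returns a fixed JSON document instead of routing to real endpoint controllers.

```python
import json
import logging


def make_app(logger):
    """Build a WSGI callable; a stand-in for loading the real kernel."""

    def app(environ, start_response):
        # Log each request through the logger the bootstrap script configured.
        logger.info("%s %s", environ.get("REQUEST_METHOD"),
                    environ.get("PATH_INFO"))
        body = json.dumps({"status": "ok"}).encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return app


# Logging is customized in code here, before the app is built.
logging.basicConfig(level=logging.INFO)

# WSGI servers import this module-level name.
app = make_app(logging.getLogger("marconi.bootstrap"))  # hypothetical name
```

Any WSGI server can then be pointed at the module-level `app` callable, which keeps server choice separate from the application code.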
API
See the Marconi API spec. [ROUGH DRAFT]
Test Plan
All development will be done TDD-style using nose and testtools. Pair programming may happen by accident (or even on purpose). Eventually we'll add integration, performance, and security tests, and get everything automated in a nice and tidy CI pipeline.