Jump to: navigation, search

Zaqar/bp/placement-service

< Zaqar
Revision as of 15:21, 5 August 2013 by Alejandro Cabrera (talk | contribs) (Added whiteboard screenshot)

Overview

Marconi depends on storage for maintaining state on messages. However, as it is designed now, this is a scaling bottleneck because at most one storage node can be specified per Marconi instance. This can be partially offset by taking advantage of built-in sharding and replication for storage providers like MongoDB, but many other storage engines do not provide such a system. Furthermore, the migration of queues from one storage node to another is not supported cleanly by most storage providers. More intelligent data migration can be performed by taking into account that Marconi operates at the level of queues.

The placement service aims to address this by providing a means to specify dynamically where storage for Marconi resources should be performed. It is responsible specifically for:

  • Maintaining a catalogue, a set of mapping of (Project + Queue) => storage node locations
  • Handling queue creation transparently
  • Handling migration transparently
  • Handling deletion of queues transparently

Here, transparency must exist at two levels: to the user and to the implementation of the storage driver. Transparency to the user presents itself in the form of availability and use of the Marconi service must not be interrupted when a migration is taking place. Transparency to the storage drivers means that the storage driver is handed a location/connection and only cares about the serialization/deserialization of data to that storage location.

Approach

A placement service consists of the following components:

  • A catalogue of mappings: {queue => [read locations], queue => [write locations], queue => status}
   * where status is one of ACTIVE or MIGRATING
  • A cache maintained in each Marconi instance of the mappings
  • A migration service

Catalogue

The catalog, as described above, maintains these mappings from queues to locations.

It is strictly an optimization. Marconi can continue to function in a degraded state if the placement service goes down. The data contained in the catalogue can be regenerated by querying each existing queue while disabling data migration and caching.

The catalogue contains a series of structures that look as follows:

{
    "{project}.{queue}" : {
        "r": ["mongo://192.168.1.105:7777", "mongo://192.168.1.106:7777"],
        "w": "mongo://192.168.1.105:7777",
        "s": 0
    }
}

An entry in the queue is found by concatenating a project with a queue name. "r" is the collection of locations where data for a particular queue can be gathered from. "w" is the location where new data written to this queue is stored. "s" is the state of the queue: active or migrating.

The catalogue is filled with many such documents. The local cache for a given Marconi node is populated with the entries from this cache, to avoid lookups on each request.

Marconi Cache

This is one of the tricky aspects of the placement service. In the face of data migrations and queue deletions, the cache must be invalidated and updated appropriately.

The current proposed caching scheme is as follows:

  1. Populate the cache when a Marconi instance is brought up
  2. Process a request from the cache whenever possible
  3. If a cache lookup fails, refer to the placement service catalogue - update the cache

Furthermore, to reduce inconsistency, there's another method to handle cache updates - periodic refresh.

Periodic Refresh

On a separate Marconi "thread", poll the catalogue service periodically (say, every 10 seconds). This actor is responsible for updating the cache. It queries the catalogue service and looks for changes.

To enable this, migrations are only allowed a granularity of 5 minutes. This helps avoid race conditions on a catalogue resource, since the migration itself triggers a state change from active to migrating for a particular queue.

Push Refresh

This approach does away with time and puts the responsibility of invalidating caches on the placement service. All Marconi nodes must maintain a listen port connected to the placement service, and whenever a migration occurs, Marconi nodes receive updates on queues that are being affected.

Deletions

Deletions take priority over migrations. If a migration is in progress for Q1, and a request to delete Q1 is made, then all messages for Q1 are deleted both from the initial storage location and the destination storage location. The migration is cancelled.

Further Details

It seems likely that some form of administrative API will need to be exposed for Marconi to handle communication between the placement service and Marconi needs. It might be necessary for cache handling operations and for triggering migrations.