Difference between revisions of "StructuredWorkflowLocks"
Revision as of 19:20, 25 May 2013
Rationale
Locks (and semaphores) are a critical component of most applications. They are especially important when structuring workflows, because they let the entity applying a workflow ensure that it is the only entity working on that workflow and its associated resources (for example, multiple servers that modify shared resources concurrently may cause data inconsistency). Getting locking and lock ordering right is typically one of the hardest parts of building reliable, fault-tolerant distributed workflows, yet it is necessary for consistent workflow operations. This matters most in large-scale distributed systems such as OpenStack, where many services (nova, cinder, quantum...) process many workflows concurrently.
Requirements
Different workflows need different ['mutex', 'semaphore'] types, so the providing solution needs built-in flexibility: developers should be able to supply a set of desired requirements and get back a ['mutex', 'semaphore'] that matches (or closely matches) their desired requirements.
Oslo (WIP)
- Intra-process (thread) lock using eventlet
- Across-process lock using the local filesystem
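As a rough illustration of the across-process case, the sketch below takes an advisory lock on a lock file with `fcntl.flock`. It assumes a POSIX system and a local filesystem. The helper names `interprocess_lock` and `try_lock` and the lock path are hypothetical and are not taken from Oslo.

```python
import contextlib
import fcntl
import os
import tempfile


@contextlib.contextmanager
def interprocess_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of the block."""
    # Append mode creates the file if needed without truncating it.
    with open(path, "a") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # Blocks until the lock is free.
        try:
            yield handle
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def try_lock(path):
    """Return an open, locked file object, or None if someone else holds it."""
    handle = open(path, "a")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle  # Closing the file releases the lock.


if __name__ == "__main__":
    # Hypothetical lock location, used only for this demonstration.
    demo_path = os.path.join(tempfile.mkdtemp(), "workflow.lock")
    with interprocess_lock(demo_path):
        print("holding", demo_path)
```

The operating system drops the lock when the holding process exits, but a hung holder keeps it indefinitely, so a real provider would still need some timeout policy.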
Ironic (WIP)
- Exclusive when distributed across hosts (only one host can get it)
- Shared and exclusive between threads (only one gets exclusive, other threads may take shared lock)
- Reference from the lock to all holders (to allow for manual lock destruction)
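The thread-level requirements above (shared versus exclusive access, plus a reference to all holders) could be sketched with a condition variable as follows. The class name `ReaderWriterLock` and its methods are hypothetical, the lock is not reentrant, and this simple version can starve writers if readers keep arriving.

```python
import threading


class ReaderWriterLock:
    """Many concurrent shared holders, or exactly one exclusive holder."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = set()  # Idents of threads holding a shared lock.
        self._writer = None    # Ident of the thread holding the exclusive lock.

    def acquire_shared(self):
        with self._cond:
            while self._writer is not None:
                self._cond.wait()
            self._readers.add(threading.get_ident())

    def acquire_exclusive(self):
        with self._cond:
            while self._writer is not None or self._readers:
                self._cond.wait()
            self._writer = threading.get_ident()

    def release(self):
        with self._cond:
            me = threading.get_ident()
            if self._writer == me:
                self._writer = None
            else:
                self._readers.discard(me)
            self._cond.notify_all()

    def holders(self):
        """Return the idents of all threads that currently hold the lock."""
        with self._cond:
            found = set(self._readers)
            if self._writer is not None:
                found.add(self._writer)
            return found
```

Exposing `holders()` is what would let an operator see who owns a lock before destroying it manually.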
Solution (WIP)
To accommodate the varying requirements for different ['mutex', 'semaphore'] types, this wiki proposes an API that takes in a set of ['mutex', 'semaphore'] requirements and returns objects that attempt to satisfy them (or raises an exception if the requirements are not satisfiable). The providing API would be backed by varying, configurable implementations. Each backing implementation could be queried about which requirements it can satisfy, and the providing API would retrieve ['mutex', 'semaphore'] objects from the most compatible backend provider. Deployers can then configure the backends they feel comfortable with (memcache, redis, zookeeper, ... for example), while developers only need to care that some backend matches the requirements they desire (if a requirement is not satisfiable, the application should not work, or should fall back to less strict requirements).
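One way the backend query and selection step might look is sketched below. Everything here is an assumption made for illustration: the `Backend` class, the `choose_backend` function, the capability sets assigned to each backend, and the reading of "most compatible" as "satisfies every requirement with the fewest surplus capabilities".

```python
# Requirement flags, using the names and values from the API section.
HIGHLY_AVAILABLE = 2
REFERENCEABLE = 4
AUTOMATIC_RELEASE = 8
ALWAYS_CONSISTENT = 32


class UnsatisfiableRequirements(Exception):
    pass


class Backend:
    """A configured lock provider and the requirements it claims to meet."""

    def __init__(self, name, capabilities):
        self.name = name
        self.capabilities = capabilities  # Bitmask of requirement flags.

    def satisfies(self, requirements):
        # Every requested bit must be present in the capability mask.
        return (self.capabilities & requirements) == requirements


def choose_backend(backends, requirements):
    """Pick the backend that meets all requirements with the least surplus."""
    candidates = [b for b in backends if b.satisfies(requirements)]
    if not candidates:
        raise UnsatisfiableRequirements(
            "No configured backend satisfies %s" % bin(requirements))
    # min() keeps the first of equally good candidates, so the deployer's
    # configured order breaks ties.
    return min(candidates, key=lambda b: bin(b.capabilities).count("1"))


# Illustrative capability assignments only, not authoritative.
CONFIGURED = [
    Backend("filesystem", ALWAYS_CONSISTENT),
    Backend("zookeeper",
            HIGHLY_AVAILABLE | AUTOMATIC_RELEASE | ALWAYS_CONSISTENT),
]
```

A caller that cannot be satisfied could catch the exception and retry with a smaller requirement mask, which corresponds to the "less strict requirements" fallback described above.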
API
# Lock types that can be requested.
DISTRIBUTED = 1
INTER_PROCESS = 2
INTRA_PROCESS = 4

# Lock properties that can be requested.
MULTI_READER_SINGLE_WRITER = 1
HIGHLY_AVAILABLE = 2
REFERENCEABLE = 4
AUTOMATIC_RELEASE = 8

# These two can not be both requested at the same time.
ALWAYS_CONSISTENT = 32
USUALLY_CONSISTENT = 64


class InvalidLockSpecification(Exception):
    pass


def provide(type, requirements, resource_identifier):
    """Provides a lock object on the given resource identifier that attempts
    to meet the given requirement or combination of requirements.

    The requirements should be a bitwise 'or' of the different requirements
    that you want your lock to have.
    """
    if ((type & DISTRIBUTED) and
            (type & INTER_PROCESS or type & INTRA_PROCESS)):
        raise InvalidLockSpecification("A lock can not be distributed and "
                                       "inter-process or intra-process at the "
                                       "same time.")
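To show how a caller might combine the flags above and how both documented restrictions could be checked up front, here is a small self-contained sketch. The `validate` and `describe` helpers are hypothetical additions and are not part of the proposed API; the constants are repeated so the example runs on its own.

```python
# Lock types.
DISTRIBUTED = 1
INTER_PROCESS = 2
INTRA_PROCESS = 4

# Lock properties.
MULTI_READER_SINGLE_WRITER = 1
HIGHLY_AVAILABLE = 2
REFERENCEABLE = 4
AUTOMATIC_RELEASE = 8
ALWAYS_CONSISTENT = 32
USUALLY_CONSISTENT = 64

PROPERTY_NAMES = {
    MULTI_READER_SINGLE_WRITER: "MULTI_READER_SINGLE_WRITER",
    HIGHLY_AVAILABLE: "HIGHLY_AVAILABLE",
    REFERENCEABLE: "REFERENCEABLE",
    AUTOMATIC_RELEASE: "AUTOMATIC_RELEASE",
    ALWAYS_CONSISTENT: "ALWAYS_CONSISTENT",
    USUALLY_CONSISTENT: "USUALLY_CONSISTENT",
}


class InvalidLockSpecification(Exception):
    pass


def validate(lock_type, requirements):
    """Reject combinations that the flag comments describe as invalid."""
    if lock_type & DISTRIBUTED and lock_type & (INTER_PROCESS | INTRA_PROCESS):
        raise InvalidLockSpecification(
            "A lock can not be distributed and local at the same time.")
    if requirements & ALWAYS_CONSISTENT and requirements & USUALLY_CONSISTENT:
        raise InvalidLockSpecification(
            "ALWAYS_CONSISTENT and USUALLY_CONSISTENT are mutually exclusive.")


def describe(requirements):
    """Return the names of the property flags set in `requirements`."""
    return [name for flag, name in sorted(PROPERTY_NAMES.items())
            if requirements & flag]


# A caller asking for a highly available, self-releasing, consistent lock.
wanted = HIGHLY_AVAILABLE | AUTOMATIC_RELEASE | ALWAYS_CONSISTENT
validate(DISTRIBUTED, wanted)
```

Because each flag is a distinct power of two, a single integer can carry any combination and individual properties can be tested with `&`.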
Providers
Filesystem
Benefits:
- Simple
- Consistent for processes on the local system.
Drawbacks:
- Does not release automatically on lock holder failure (but timeouts on locks are possible).
- Not possible on all filesystems (NFS, for example).
- Not partition tolerant (when the machine hosting the filesystem crashes, the locks are gone).
- Not highly available.
Relevant Python libraries:
Built-in language: N/A
Zookeeper
Benefits:
- Distributed.
- Tolerant to backend failure (availability).
- Automatic lock release on lock holder failure (liveness).
- Replicated via quorums.
- Prefers consistency over partition tolerance.
- Capacity can easily be scaled up and down.
- Strong durability guarantees via an on-disk transaction log and snapshots.
- Mature and battle-hardened.
Drawbacks:
- Complex to deploy (it's Java).
Relevant Python libraries:
- https://pypi.python.org/pypi/kazoo
- https://github.com/python-zk/kazoo/blob/master/kazoo/recipe/lock.py
Built-in language: Java
Doozer
Benefits:
- Distributed.
- Tolerant to backend failure (availability).
- Prefers consistency over partition tolerance.
Drawbacks:
- Unknown maturity.
- Lacks built-in durability.
Relevant Python libraries:
Built-in language: Go
Memcached
Benefits:
- Distributed via key hashing to different servers (availability).
- Simple to deploy.
Drawbacks:
- Does not release automatically on lock holder failure (but timeouts on locks are possible).
- Does not handle network partitions (keys will be rehashed to a different server on failure).
- Prefers partition tolerance over consistency.
- Inconsistencies possible due to the server flip-flopping problem.
  - Ex: server A comes up, gets the lock, and dies; server B takes over A's key range, then server B dies; server A now has its original key range back but with inconsistent values.
- Capacity can not easily be scaled up and down (consistent hashing helps).
- Not durable.
Relevant Python libraries:
Built-in language: C
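The rehashing drawback can be made concrete with a small simulation. The sketch below is not a memcached client: `FakeServer` is an in-memory stand-in whose `add` method mimics the store-only-if-absent behaviour of the memcached `add` command, and `pick_server` uses plain modulo hashing as an assumed key distribution scheme.

```python
import hashlib


class FakeServer:
    """In-memory stand-in for one cache server."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def add(self, key, value):
        """Store `value` only if `key` is absent, returning True on success."""
        if key in self.data:
            return False
        self.data[key] = value
        return True


def pick_server(key, servers):
    """Map a key onto one of the servers the client believes are alive."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


def acquire(key, owner, live_servers):
    """Try to take the lock on whichever server the key hashes to."""
    return pick_server(key, live_servers).add(key, owner)


servers = [FakeServer("a"), FakeServer("b")]
home = pick_server("workflow-lock", servers)
survivors = [s for s in servers if s is not home]

# Client 1 takes the lock while both servers are reachable.
first = acquire("workflow-lock", "client-1", servers)
# Client 2 loses contact with the home server and rehashes the key to the
# remaining server, which has never seen the lock.
second = acquire("workflow-lock", "client-2", survivors)
```

Both `first` and `second` end up true, so two clients believe they hold the same lock. This is the inconsistency that the partition and flip-flopping bullets describe.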
Redis
Benefits:
- Distributed when partitioning via hashing keys to different servers (availability).
- Can be set up to persist stored information using an AOF (durable).
- Can be set up to replicate stored information.
- Consistent (when not using partitioning via hashing keys to different servers).
- Simple to deploy.
Drawbacks:
- Locks may be inconsistent due to backend failure (even when set up with replication).
- Does not release automatically on lock holder failure (but timeouts on locks are possible).
- Does not handle network partitions (keys will be rehashed to a different server on failure).
- Built-in replication is non-blocking and cannot be relied upon to be consistent.
- Prefers partition tolerance over consistency.
- Capacity can not easily be scaled up and down (consistent hashing helps).
- Inconsistencies possible due to the server flip-flopping problem.
  - Ex: server A comes up, gets the lock, and dies; server B takes over A's key range, then server B dies; server A now has its original key range back but with inconsistent values.
Relevant Python libraries:
Built-in language: C
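Since locks here are not released automatically when a holder fails, a timeout is the usual mitigation. The sketch below models that idea with an in-memory stand-in rather than a real Redis client: `ExpiringStore` loosely imitates a set-if-absent operation with an expiry (in the spirit of Redis `SET` with the `NX` and `PX` options), and `TimedLock` is a hypothetical wrapper that uses a random token so one holder cannot release a lock that has since passed to another.

```python
import time
import uuid


class ExpiringStore:
    """In-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def set_if_absent(self, key, value, ttl):
        """Store the value unless a non-expired entry already exists."""
        now = self._clock()
        current = self._items.get(key)
        if current is not None and current[1] > now:
            return False
        self._items[key] = (value, now + ttl)
        return True

    def delete_if_equal(self, key, value):
        """Delete the key only if it is live and still holds `value`."""
        now = self._clock()
        current = self._items.get(key)
        if current is not None and current[1] > now and current[0] == value:
            del self._items[key]
            return True
        return False


class TimedLock:
    """A lock that expires on its own if the holder never releases it."""

    def __init__(self, store, name, ttl):
        self._store = store
        self._name = name
        self._ttl = ttl
        self._token = uuid.uuid4().hex  # Identifies this holder.

    def acquire(self):
        return self._store.set_if_absent(self._name, self._token, self._ttl)

    def release(self):
        return self._store.delete_if_equal(self._name, self._token)
```

A timeout trades one problem for another: if the holder is merely slow rather than dead, the lock can expire while the work is still in progress, so the time-to-live has to be chosen with the workflow's worst-case duration in mind.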
Databases
Not provided due to MVCC. Could be provided with limited semantics and limited capabilities if absolutely required.
Relevant Links
- The CAP theorem provides upper bounds on the capabilities that solutions can provide.
- What are lock hierarchies.
- Neat research paper on lock hierarchy middleware.
- Distributed lock managers