Difference between revisions of "Manila/Replication API Design"

Revision as of 02:22, 22 July 2015

Design

Intro

The Manila DR API will be implemented as an extension to the Manila API (not part of core) initially, because we want to prove the concept without investing heavily in support in reference implementation (generic driver) or automated testing/CI.

Replication styles

There are 3 styles of DR that we would like to support in the long run:

  1. writable - Amazon EFS-style synchronously replicated shares where all replicas are writable. Failover is neither supported nor needed.
  2. readable - Mirror-style replication with a primary (writable) copy and one or more secondary (read-only) copies which can become writable after a failover.
  3. dr - Generalized replication with secondary copies that are inaccessible until after a failover.
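The three styles differ mainly in how secondary copies can be accessed before a failover. A minimal sketch of that distinction follows; the names `ReplicationStyle`, `SECONDARY_ACCESS` and `supports_failover` are illustrative assumptions and are not part of Manila.

```python
from enum import Enum


class ReplicationStyle(Enum):
    """The three replication styles named in this design."""
    WRITABLE = "writable"
    READABLE = "readable"
    DR = "dr"


# Access mode of secondary copies *before* any failover (hypothetical labels):
# "rw" = writable, "ro" = read-only, None = inaccessible.
SECONDARY_ACCESS = {
    ReplicationStyle.WRITABLE: "rw",
    ReplicationStyle.READABLE: "ro",
    ReplicationStyle.DR: None,
}


def supports_failover(style: ReplicationStyle) -> bool:
    """Writable shares have no failover; the other two styles rely on it."""
    return style is not ReplicationStyle.WRITABLE
```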

Replica states

Each replica has a state which has 3 possible values:

  1. active - All writable replicas are active
  2. in_sync - Passive replica which is up to date with the active replica, and can be promoted to active
  3. out_of_sync - Passive replica which has gone out of date, or new replica that is not yet up to date
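As a rough illustration, the three states and the promotion rule (only an in_sync replica can be promoted) could be modelled as below. `ReplicaState` and `can_promote` are hypothetical names used only for this sketch.

```python
from enum import Enum


class ReplicaState(Enum):
    """Replica states as listed above."""
    ACTIVE = "active"
    IN_SYNC = "in_sync"
    OUT_OF_SYNC = "out_of_sync"


def can_promote(state: ReplicaState) -> bool:
    """Only a passive replica that is up to date may be promoted to active."""
    return state is ReplicaState.IN_SYNC
```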

New APIs

We will implement a Manila extension that includes several new APIs needed to support replicated shares.

  • List share replicas - Takes a share ID. Returns a table of replicas with details. Must be a replicated share. Details include AZ, replica state (active, in_sync, out_of_sync) and export locations.
  • Add share replica - Takes a share ID and an AZ. The share must not already have a replica in the specified AZ. Returns the replica UUID (actually the share instance UUID).
  • Remove share replica - Takes a replica UUID. Deletes the replica, regardless of state. Must not be the only active replica.
  • Set active replica - Takes a replica UUID. Makes that replica active. The replica must be in the in_sync state.
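The preconditions of the four operations can be sketched with a small in-memory model. This is not the Manila implementation: the class and method names are assumptions made for illustration. The initial out_of_sync state of a new replica follows the writable example in this document, and because the design does not say what happens to the previously active replica on promotion, the sketch leaves it unchanged.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Replica:
    az: str
    state: str  # "active", "in_sync" or "out_of_sync"
    export_locations: list = field(default_factory=list)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


class ReplicatedShare:
    """Hypothetical in-memory stand-in for a replicated share."""

    def __init__(self, first_az):
        first = Replica(az=first_az, state="active")
        self.replicas = {first.id: first}

    def list_replicas(self):
        """Return replica details: AZ, state and export locations."""
        return [
            {"id": r.id, "az": r.az, "state": r.state,
             "export_locations": list(r.export_locations)}
            for r in self.replicas.values()
        ]

    def add_replica(self, az):
        """Add a replica in an AZ that does not already hold one."""
        if any(r.az == az for r in self.replicas.values()):
            raise ValueError(f"share already has a replica in AZ {az!r}")
        replica = Replica(az=az, state="out_of_sync")
        self.replicas[replica.id] = replica
        return replica.id

    def remove_replica(self, replica_id):
        """Delete a replica in any state, unless it is the only active one."""
        replica = self.replicas[replica_id]
        active = [r for r in self.replicas.values() if r.state == "active"]
        if replica.state == "active" and len(active) == 1:
            raise ValueError("cannot remove the only active replica")
        del self.replicas[replica_id]

    def set_active_replica(self, replica_id):
        """Promote a replica; it must currently be in_sync."""
        replica = self.replicas[replica_id]
        if replica.state != "in_sync":
            raise ValueError("replica must be in_sync to become active")
        # Assumption: the previously active replica is left untouched here.
        replica.state = "active"
```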

New share states

  • replication_change - New transient state triggered by a change of the active replica. Access to the share is cut off while in this state.

Changes to existing APIs

  • Share type APIs will have a new user-visible extra spec - replication=writable/readable/dr. The absence of this extra spec indicates non-replicated shares and the presence of the extra spec will indicate that the share is replicated with the given style.
  • Share create will create a replicated share if the share type has the replication extra spec. The style of replication is determined by the share type's replication value.
  • Share list/details APIs will return the replication style (writable, readable, dr) and a flag indicating whether more than one replica exists.
  • Create snapshot - creates snapshots of all the replicas
  • Delete share/snapshot - deletes ALL replicas of the share/snapshot
  • Migrate/retype/etc - only the primary replica is considered as the source
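A minimal sketch of how the replication extra spec might be interpreted, assuming a share type's extra specs are available as a plain dictionary. The function names and the `has_replicas` field name are hypothetical; the design only says that the style and a "more than one replica" flag are returned.

```python
VALID_STYLES = {"writable", "readable", "dr"}


def replication_style(extra_specs):
    """Return the replication style of a share type, or None if not replicated.

    The absence of the 'replication' extra spec means a non-replicated share.
    """
    style = extra_specs.get("replication")
    if style is None:
        return None
    if style not in VALID_STYLES:
        raise ValueError(f"unknown replication style: {style!r}")
    return style


def share_details(extra_specs, replica_count):
    """Build the replication-related part of a share list/details response."""
    return {
        "replication": replication_style(extra_specs),
        # Flag that is true when more than one replica exists.
        "has_replicas": replica_count > 1,
    }
```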

Network issues with multi-SVM and replication

!!OPTIONAL!!

If we choose to make replication a single-SVM-only feature, the share-network API doesn't need to change. In order to support replication with share-networks, we also need to modify the share-network create API so that it allows creation of share networks with a table of AZ-to-subnet mappings. This approach allows us to keep a single share-network per share (with associated security service) while allowing the tenant to specify enough information that each share instance can be attached to the appropriate network in each AZ. Multi-AZ share networks would also be useful for non-replicated use cases.
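The AZ-to-subnet table could look roughly like the sketch below. The `ShareNetwork` class, its field names and the subnet identifiers are assumptions for illustration, not the actual share-network API.

```python
from dataclasses import dataclass, field


@dataclass
class ShareNetwork:
    """Hypothetical share network holding one subnet per availability zone."""
    name: str
    security_service: str
    az_subnets: dict = field(default_factory=dict)  # AZ name -> subnet ID

    def subnet_for(self, az):
        """Pick the subnet a share instance in the given AZ should attach to."""
        if az not in self.az_subnets:
            raise LookupError(
                f"share network {self.name!r} has no subnet for AZ {az!r}"
            )
        return self.az_subnets[az]
```

With such a mapping, a single share network (and its security service) can serve every replica of a share, while each share instance is still attached to the subnet of its own AZ.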

Examples

Writable replication example

  1. Administrator sets up backends in AZs b1, b2, and b3 that have the capability replication=writable
  2. Administrator creates a new share_type called foo
  3. Administrator sets replication=writable extra spec on share type foo
  4. User creates new share of type foo in AZ b1
  5. Share is created with replication=writable, and 1 active replica in AZ b1
  6. User grants access on share to client1 in AZ b1, obtains the export location of the replica, mounts the share on a client, and starts to write data
  7. User adds a new replica of the share in AZ b2
  8. A second replica is created in AZ b2 which initially has state out_of_sync
  9. Shortly afterwards, the replica state changes to active (after the replica finishes syncing with the original copy)
  10. The user grants access on the share to client2 in AZ b2, obtains the export location of the new replica, mounts the share, and sees the same data that client1 wrote
  11. Client2 writes some data to the share, which is immediately visible to client1
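The replica state changes in steps 4 to 9 can be traced with a small simulation. It uses plain dictionaries rather than any real Manila client, and the helper names are hypothetical.

```python
def create_share(az):
    """Steps 4-5: a writable share starts with one active replica."""
    return {"replication": "writable",
            "replicas": [{"az": az, "state": "active"}]}


def add_replica(share, az):
    """Steps 7-8: a new replica starts out as out_of_sync."""
    replica = {"az": az, "state": "out_of_sync"}
    share["replicas"].append(replica)
    return replica


def finish_sync(replica):
    """Step 9: in the writable style, a synced replica becomes active."""
    replica["state"] = "active"


share = create_share("b1")
second = add_replica(share, "b2")
states_before = [r["state"] for r in share["replicas"]]
finish_sync(second)
states_after = [r["state"] for r in share["replicas"]]
```

After the sync completes, both replicas are active, which is what lets client1 and client2 write through their own export locations in steps 10 and 11.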

Readable replication example