= Manila generic groups =

Executive summary
Manila needs a grouping construct that, like shares, is a first-class atomic data type. Our experience with consistency groups (CGs) has demonstrated the complexity of adding a grouping capability, yet there are other use cases, such as migration, replication, and backup, in which some storage controllers can offer such features only on share groups. CGs also highlighted the poor optics of an advanced feature with comparatively little potential for vendor support. And adding a new grouping construct for each new feature is not technically feasible. All of the above may be addressed by generic groups, which we think are a clean extension of the original architecture of Manila.

An Intractable Matrix
Manila is at an exciting time in its development, where all the simple features are available and the community is focused on adding high-value features such as migration, replication, and consistency groups. Experimental code is available for each of these, but little consideration has been given to the obvious need for these complex features to interact not only with each other but also with whatever additional features are added later. And looking a little deeper, it is apparent that each of these has limitations that could be solved by a single architectural enhancement.

Consistency groups
Consistency groups (CGs) are the only construct in Manila that operates on a group of shares. For some, a CG implies a guarantee of ordered writes by a storage controller, while for others a CG is a mechanism for taking consistent point-in-time snapshots of a set of shares. A group with either attribute may have value by itself, but the CG implementation doesn’t distinguish between them.

CGs are a highly specialized construct with dedicated CLI commands, REST APIs, database tables, scheduler filters, and driver APIs that totaled over 9900 lines. None of these are reusable for anything else, and none of the rest of Manila has any awareness of CGs. Even worse, despite all the complexity of the CG feature, only a small minority of storage backends can support them, so the code-to-value ratio is very low and the limited availability of CGs ensures the user experience is inconsistent between clouds.

Replication
Replication is another high-value feature, and the Manila implementation is arguably clean and flexible, but as constituted there is no core ability to replicate groups of shares. It seems reasonable to implement the feature iteratively, beginning with replication of single shares. However, there has been a tacit acknowledgement that some backends would not be able to replicate individual shares, even though those same backends could potentially replicate a group of shares if the group were constituted in a certain way.

Following the precedent established with CGs, one approach would be to introduce ‘replication groups’ whereby multiple shares could be replicated together. But this would introduce yet another set of APIs, tables, etc. that are too specialized to use for anything else.

Another approach might be to define CGs as the grouping unit of replication, but this makes little sense because consistency groups (however we define them) and replication are distinct features that shouldn’t be artificially bundled for the sake of expediency of definition or implementation.

Migration
Just as some backends may only be able to replicate shares in groups, it follows that those backends may also need to migrate shares in groups. And in the case of CGs, it doesn’t make sense to migrate CG members individually; the whole group must be moved, requiring the migration engine to be CG aware.

Consider the following reasonable use cases:
 * Replicate a consistency group
 * Snapshot a replication group
 * Migrate a replication group
 * Retype shares in a consistency group
 * Retype shares in a replication group
 * Backup a consistency group
 * Backup a replication group
 * Apply a common policy to a set of shares

On the current path, the support matrix has a dimension dedicated solely to features, so each feature must be coded to be interoperable with every other feature. This quickly becomes a support and testing nightmare, and every feature added later multiplies the number of combinations that must be made to work.
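
To make that growth concrete, here is a minimal sketch that counts pairwise feature interactions only. The feature list is an illustrative assumption drawn from the use cases above, not an official inventory, and real interactions may involve more than two features at once.

```python
from itertools import combinations

# Illustrative feature list drawn from the use cases above (an assumption,
# not an official inventory of Manila features).
FEATURES = ["consistency", "replication", "migration", "retype", "backup", "snapshot"]


def pairwise_interactions(features):
    """Return every unordered pair of features that would need joint support."""
    return list(combinations(features, 2))


pairs = pairwise_interactions(FEATURES)
print(len(pairs))  # 15 pairs for 6 features

# Adding a seventh feature adds one new pair per existing feature.
print(len(pairwise_interactions(FEATURES + ["policy"])))  # 21
```

Even when only pairs are counted, each new feature must be checked against every feature that already exists.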

Fundamentally, the problem is that we are adding features without an underlying architectural framework on which to hang them. To escape the matrix, we must step back and rethink a few things.

A simple solution
So to arrive at a solution, let’s enumerate what we have:
 * Primitive objects (shares) with supported operations controlled by share types that vary by backend
 * A very specific group object (CGs) with limited support potential by backends
 * A number of APIs (CGs, etc.) that many users can’t use at all and that appear grafted onto the project

And what we would prefer:
 * A uniform API and user experience that varies as little as possible by backend
 * A clean way to group shares so we can do CGs, multi-object replication, migration, etc.

How do we get there?
 * Universally available groups of primitive objects with supported operations that vary by backend

Idea #1: Introduce a generic Share Group object
Just as Manila has ‘shares’, it should also have ‘share groups’. In its simplest form, unlike CGs, a share group should not guarantee any specialized operation. Instead, it should merely constitute an atomic Manila data type on which nearly any Manila action is available. For example, given a share group, the user should be able to select the group in the Manila UI and invoke features such as snapshot, clone, backup, migrate, replicate, retype, etc.

In the general case, group actions are handled by the share manager layer (or potentially the common share driver superclass). For example, invoking ‘snapshot’ on a group causes the share manager to take a snapshot of each group member individually. The resulting group snapshot object may be used to create a new group, not unlike how CG snapshots were implemented.

There are numerous advantages to this approach. Every driver, no matter how unsophisticated, can leverage the manageability goodness of groups. The user experience is uniform, since groups are always available regardless of which backends are present and because nearly every Manila action available on shares is also available on groups.

Of course, feature-rich backends would like to add their secret sauce, whether it be CGs, group-based replication, or whatever else comes along that might be easier/cheaper/faster to do to groups of shares. That brings us to a related idea.

Idea #2: Introduce Share Group types
Just as Manila has ‘share types’, it should also have ‘share group types’. Any driver that can perform a group operation in an advantaged way may report that as a group capability, such as:
 * Ordered writes
 * Consistent snapshots
 * Group replication
 * Group backup
 * Group modification

As with share types, the cloud administrator predefines share group types that may contain extra specs corresponding to the group capabilities reported by the backends. The admin also specifies which share type(s) a given group type may contain. When creating a group, the user specifies the group type and share type, and the scheduler then creates the group on one of the backends that match the specified share and share group type.
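
The matching step might be sketched as follows. This is a simplified illustration under stated assumptions, not the Manila scheduler: the class names, the `group_specs` field, and the `consistent_snapshots` capability name are hypothetical, while `snapshot_support` is the share type spec mentioned elsewhere in this proposal.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    share_capabilities: dict   # capabilities reported for individual shares
    group_capabilities: dict   # capabilities reported for share groups


@dataclass
class ShareType:
    name: str
    extra_specs: dict


@dataclass
class ShareGroupType:
    name: str
    group_specs: dict          # hypothetical name for group-level extra specs
    share_types: list          # names of share types this group type may contain


def specs_match(required, reported):
    """Every required spec must be reported with the same value (AND semantics)."""
    return all(reported.get(key) == value for key, value in required.items())


def candidate_backends(backends, group_type, share_type):
    """Return names of backends that satisfy both the share type and the group type."""
    if share_type.name not in group_type.share_types:
        raise ValueError(
            f"share type {share_type.name!r} is not allowed in group type {group_type.name!r}"
        )
    return [
        backend.name
        for backend in backends
        if specs_match(share_type.extra_specs, backend.share_capabilities)
        and specs_match(group_type.group_specs, backend.group_capabilities)
    ]


backends = [
    Backend("alpha", {"snapshot_support": True}, {"consistent_snapshots": True}),
    Backend("beta", {"snapshot_support": True}, {}),
]
gold = ShareType("gold", {"snapshot_support": True})
cg_type = ShareGroupType("cg", {"consistent_snapshots": True}, ["gold"])
plain_type = ShareGroupType("plain", {}, ["gold"])

print(candidate_backends(backends, cg_type, gold))     # ['alpha']
print(candidate_backends(backends, plain_type, gold))  # ['alpha', 'beta']
```

A group type with no extra specs matches every backend that satisfies the share type, which is what keeps plain groups universally available.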

Anytime a group action, such as ‘snapshot’, comes into the share manager, the manager checks whether its driver offers an advantaged implementation of that operation. If not, the manager handles the workflow itself as described above. But if so, the manager routes the workflow to its driver for fulfillment.
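
One way to picture this routing is the sketch below. All class and method names are hypothetical and chosen only for illustration; signalling "no advantaged implementation" by raising `NotImplementedError` is an assumption of this sketch, not a statement about how Manila drivers are written.

```python
class ShareDriver:
    """Hypothetical base driver with no advantaged group operations."""

    def create_share_snapshot(self, share_id):
        return f"snap-of-{share_id}"

    def create_group_snapshot(self, share_ids):
        # Assumption: the base class signals "not supported" this way.
        raise NotImplementedError


class CGCapableDriver(ShareDriver):
    """Hypothetical driver that can snapshot a whole group consistently."""

    def create_group_snapshot(self, share_ids):
        return {share_id: f"consistent-snap-of-{share_id}" for share_id in share_ids}


class ShareManager:
    def __init__(self, driver):
        self.driver = driver

    def create_group_snapshot(self, share_ids):
        try:
            # Prefer the driver's advantaged implementation when it exists.
            return self.driver.create_group_snapshot(share_ids)
        except NotImplementedError:
            # Generic fallback: snapshot each member individually.
            return {
                share_id: self.driver.create_share_snapshot(share_id)
                for share_id in share_ids
            }


print(ShareManager(ShareDriver()).create_group_snapshot(["s1", "s2"]))
print(ShareManager(CGCapableDriver()).create_group_snapshot(["s1", "s2"]))
```

Either way the caller receives one member snapshot per share, so the group snapshot object looks the same regardless of which path produced it.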

The advantages of this approach should be obvious. Development and testing are simplified because there isn’t a need to define and test a different set of group management APIs for each feature, or to test every combination of every feature. Instead of becoming an N-by-N matrix of interacting features, Manila largely becomes an N-by-2 matrix of actions that may be invoked on either individual shares or share groups. Users and admins are already familiar with share types, so introducing share group types would seem a natural and consistent evolution of the same foundational concept.

The matrix would look like:

Manila implementation plan
Implementing generic groups in Manila should be a straightforward series of steps:
 * 1) Implement generic groups.  Because we did CGs, we already know all parts of the codebase that must change to support any kind of group.  So the simplest approach is to modify the CG code to morph it into the generic groups feature.
 * 2) Add the share group type feature by duplicating and customizing the share type code.
 * 3) Enhance the scheduler to place groups according to share group types.  Like #1, this is already informed by the CG project.
 * 4) Implement group snapshots in the share manager to demonstrate group snapshots in any driver.
 * 5) Plumb group snapshots to a CG-capable driver to demonstrate CG functionality in the new framework.
 * 6) Update Tempest to cover all of the above.
 * 7) Update Manila client to change CGs to share groups.
 * 8) Enhance Manila client with share group types.
 * 9) Add share group support to Horizon.  We never built CG support into Horizon, so this is all-new work that now has much broader appeal and applicability.
 * 10) Add additional group actions over time (migrate, replicate, retype, clone, …).  At this point, new group capabilities become vertical slices that are simple to add incrementally.

Because we implemented CGs just recently as an experimental feature, we have the freedom to replace that code without deprecation or upgrade considerations. Steps 1-8 would get Manila to parity with the CG feature added in Liberty, would require only a few person-weeks of effort, and would better position Manila for long-term evolution and supportability.

Implementation details
There are a few things to note, several of which were already solved during the CG work.

Groups are first-class objects in Manila, and operations on groups are treated as atomic. To enable support by as many backends as possible, Manila will still maintain DB objects for both group and member snapshots, just as was done with CGs.

The capabilities of a group will all be public extra specs, similar to snapshot_support in share types. Users will need to know what a group can do.

A few actions, such as extend and shrink, are inherently applicable only to individual shares. One could theoretically apply extend to a group, increasing the size of each member, but that seems like an unlikely use case. Any actions in this category must remain available to group members. Other actions, such as taking snapshots of group members, could also be allowed, but actions such as migration or replication would be available only at the group level and not on individual members.
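
These rules could be captured in a small lookup like the following. It is a sketch only: the action names and the function are hypothetical, and the choice to allow every other action on group members is an assumption, since the text above leaves that open.

```python
# Actions that only make sense on an individual share.
MEMBER_ONLY_ACTIONS = {"extend", "shrink"}
# Actions that, for a share inside a group, are offered only on the whole group.
GROUP_LEVEL_ONLY_ACTIONS = {"migrate", "replicate"}


def is_action_allowed(action, target, share_in_group=False):
    """Decide whether an action may be invoked on a share or a share group.

    target is "share" or "group"; share_in_group matters only for shares.
    """
    if target == "group":
        return action not in MEMBER_ONLY_ACTIONS
    if target == "share":
        if share_in_group:
            return action not in GROUP_LEVEL_ONLY_ACTIONS
        return True
    raise ValueError(f"unknown target: {target!r}")


print(is_action_allowed("extend", "share", share_in_group=True))   # True
print(is_action_allowed("migrate", "share", share_in_group=True))  # False
print(is_action_allowed("migrate", "group"))                       # True
print(is_action_allowed("extend", "group"))                        # False
```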

A group is limited to a single backend. Allowing groups that span backends is theoretically possible, but that would require fanout of operations from the API layer to multiple share managers across the asynchronous event bus, which would lead to complicated synchronization and state management for little operational benefit.

As was done with Manila CGs, a driver may optionally limit a group to either the confines of a pool or an entire backend. It is known that pools are the unit of data motion (i.e. replication or migration) for some backends, so we think drivers need this flexibility.

In a departure from the CG implementation, a group type may support multiple share types, but a group may only contain shares of a single type. This restriction could be revisited in a later release, but it avoids numerous challenges:
 * With multiple share types in a group, the number of potential share/group combinations becomes much larger and it becomes more likely that a group is requested that cannot be scheduled.
 * Operators would have to test and support all share/group combinations.
 * Migration becomes more complicated, especially if a group spans pools, since the destination would have to support the same combination of group & share types.
 * Replication is similarly affected.
 * Retype gets weird if some but not all group members must be migrated.
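
The single-share-type restriction described above could be enforced at group creation and at membership changes, roughly as in this sketch. The `ShareGroup` class and its fields are hypothetical and exist only to illustrate the rule.

```python
class ShareGroup:
    """Hypothetical group object: its group type allows several share types,
    but the group itself is pinned to exactly one of them."""

    def __init__(self, name, allowed_share_types, share_type):
        if share_type not in allowed_share_types:
            raise ValueError(f"share type {share_type!r} is not allowed by the group type")
        self.name = name
        self.share_type = share_type
        self.members = []

    def add_share(self, share_id, share_type):
        # Every member must use the one share type chosen at creation.
        if share_type != self.share_type:
            raise ValueError(
                f"group {self.name!r} only accepts shares of type {self.share_type!r}"
            )
        self.members.append(share_id)


group = ShareGroup("g1", {"gold", "silver"}, "gold")
group.add_share("s1", "gold")
try:
    group.add_share("s2", "silver")
except ValueError as exc:
    print(exc)
print(group.members)  # ['s1']
```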

We considered reusing share types for groups as well. But share types include a set of public extra specs that may not map well to groups. And by adding group types as a separate object, there can be little confusion about their purpose and use.

Some have noted that extra specs tend to be treated as an AND operation, where all features must be available for a backend to be chosen. It is conceivable that even if a backend can replicate a group or take consistent snapshots of a group, it might not be able to perform both operations on the same group. But this problem already exists with share types and hasn’t been a serious issue. Creating types is an admin-only operation and the burden remains on the admin to understand the capabilities of the backends in use and to create the share types and share group types appropriately.

Note that this proposal explicitly does not address pool or backend replication, which is fundamentally different. Actions on shares or share groups are intended for tenants, whereas a pool or backend can contain data from multiple tenants. So pool or backend operations, while serving potentially valuable use cases, are inherently admin-only workflows that would be designed and exercised differently should Manila support them.

Share Group APIs
It is possible to design a REST API that seamlessly handles both shares and share groups with little duplication of APIs. But at this point in Manila's development, it is arguably too late to radically redesign the API.

It may be possible to overload some of the existing APIs to handle both shares and share groups. For example, POST /shares/{id}/action could accept the ID of a share or share group and just do the right thing. But other APIs, such as GET /shares, are less practical to overload, since a single endpoint would be returning objects of different types.

It seems better to merely duplicate a few APIs with group versions as needed. For example:
 * POST /shares --> POST /groups
 * POST /shares/{share_id}/action --> POST /groups/{group_id}/action
 * POST /snapshots --> POST /group_snapshots

Acknowledgments
Most of the analysis and recommendations contained herein apply equally well to Cinder, which is arguably much further down the path of crippling complexity.

This proposal was developed by cknight with input from Manila community members bswartz, ameade, tbarron, and cfouts. The Cinder conversations about grouping, as well as the groundbreaking contributions of CGs to Cinder, all of which inspired this proposal, were led by xyang.