Swift/ContainerACLWithKeystoneV3

ARCHIVE

Status
Done. Merged patch https://review.openstack.org/#/c/86430/ addressed the "preferred solution" discussed on this wiki page. This wiki page is an archive of background thinking that informed the patch but *the code is normative* - this wiki page should not be interpreted as documentation of merged code.

This page is intended to capture formative thoughts on how swift can handle name-based ACLs in the context of keystone v3 domains.

Problem statement
Swift keystoneauth middleware allows container ACLs to specify 'cross-tenant' access in the form tenant:user where tenant and user can be a UUID, a name or a wildcard *. With the introduction of domains with the keystone v3 API, names (of both tenants and users) are no longer globally unique and are only unique within a domain. Consequently, cross-tenant ACLs specified using unqualified names will be ambiguous.

Summary / preferred solution
(Based on discussions at Atlanta summit. Implemented here https://review.openstack.org/#/c/86430/)


 * In future, container cross-tenant ACLs SHOULD use only ids or wildcards for tenant and user - names cannot be unambiguously expressed and names are mutable.
 * When validating an ACL, keystoneauth MUST NOT match names unless (a) the requesting user is in the default domain and (b) the tenant is in the default domain.
 * When accounts are created using a v3 token, record the tenant domain id as account sysmeta to enable determination of tenant domain when validating ACL.
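
The rules above can be sketched as follows. This is only an illustration of the stated rules, not the merged code: the function name, the argument layout and the `DEFAULT_DOMAIN_ID` value are assumptions made for the example, and "tenant in the default domain" is read here as the domain of the account being accessed.

```python
DEFAULT_DOMAIN_ID = "default"  # assumption: id of the keystone default domain


def acl_entry_matches(entry, token_tenant, token_user,
                      user_domain_id, account_domain_id):
    """Return True if a 'tenant:user' ACL entry matches the requester.

    token_tenant and token_user are (id, name) tuples from the validated
    token (hypothetical layout chosen for this sketch).
    """
    acl_tenant, sep, acl_user = entry.partition(":")
    if not sep:
        return False  # not a cross-tenant 'tenant:user' entry

    # Names may be matched only when both the requesting user and the
    # tenant account are in the default domain.
    allow_names = (user_domain_id == DEFAULT_DOMAIN_ID
                   and account_domain_id == DEFAULT_DOMAIN_ID)

    def matches(acl_value, ident, name):
        if acl_value == "*" or acl_value == ident:
            return True  # wildcards and ids always match
        return allow_names and acl_value == name

    return (matches(acl_tenant, *token_tenant)
            and matches(acl_user, *token_user))
```

With these assumptions, an entry such as `acme:alice` is honored only when both domain ids equal `DEFAULT_DOMAIN_ID`, while id-based and wildcard entries are honored in any domain.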

Discussion of forward looking options
1. (PREFERRED) Allow only UUIDs (or wildcards) in X-Container-[Read|Write] ACLs when using keystone v3. This is enforced by patch https://review.openstack.org/#/c/86430/

2. Require domain-qualified names in X-Container-[Read|Write] ACLs. An obstacle to this is that keystone currently has no reserved characters in names, and therefore no character that could be used as a separator in a qualified name. (This may change IF keystone adopts and enforces a normative way to express domain-qualified names (or hierarchical names). E.g. if keystone were to reserve the '@@' string, it could be used as a separator between user/tenant names and domain names, and swift could then accept ACLs of the form tenant_name@@domain_name:user_name@@domain_name.) Furthermore, names in keystone are not guaranteed to be immutable, meaning that any ACL expressed using names could become stale (or even insecure if a name is reassigned to another user or tenant).

3. If name-based ACL support is required, provide a new JSON-encoded structured ACL specification in which names and domains are explicitly called out. This may require new header key(s) to discriminate from the legacy X-Container-[Read|Write] ACL format. This addresses the inability to express domain-qualified names using the current tenant:user form of ACL, but does not address the risk of names being mutable. Persisted ACLs should be expressed using ONLY ids. User-specified ACLs could safely use names IF those names were resolved to ids as the ACL is ingested by swift, i.e. swift keystoneauth would query the keystone service each time an ACL header is received with names, populate the ACL with the ids returned by keystone and then persist the fully populated ACL. For completeness, keystoneauth should perform the reverse process whenever the container headers are returned in response to a GET/HEAD, i.e. query keystone for the current name mapped to each id, insert the names into the ACL and return it to the user. That way, if a name has changed then the ACL returned to users remains up to date. (This clearly introduces an overhead of requests to keystone.)
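
A minimal sketch of the resolve-on-ingest idea in option 3, under stated assumptions: the JSON layout (a list of grants with `tenant` and `user` objects carrying either `id` or `name` plus `domain`) is invented for this example, and `resolve_id` is a hypothetical callable standing in for the lookup against keystone.

```python
import json


def ingest_acl(raw_acl, resolve_id):
    """Resolve names to ids in a structured ACL before it is persisted.

    raw_acl:    JSON text in the hypothetical structured format.
    resolve_id: hypothetical callable (kind, domain, name) -> id, standing
                in for a keystone query.
    """
    persisted = []
    for grant in json.loads(raw_acl):
        entry = {}
        for kind in ("tenant", "user"):
            spec = grant[kind]
            if "id" in spec:
                # ids (and the '*' wildcard) are stored unchanged
                entry[kind] = spec["id"]
            else:
                # names are only meaningful together with their domain
                entry[kind] = resolve_id(kind, spec["domain"], spec["name"])
        persisted.append(entry)
    # Only ids are persisted, so a later rename cannot change who is granted.
    return json.dumps(persisted, sort_keys=True)
```

The reverse mapping on GET/HEAD would follow the same shape with an id-to-name lookup; both directions cost one keystone query per name or id unless results are cached.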

Discussion of backwards compatibility issues
Existing systems have unqualified names in ACLs. We need to continue to honor those ACLs as legacy users and tenants migrate to keystone v3.

For simplicity let's assume that all legacy users and tenants migrate to a single v3 domain which we will refer to as the 'legacy domain'. (keystone has the notion of a default domain, which may be named 'Default' and may have id 'default', but the default domain name and id are configurable during migration. Using the term 'legacy domain' may avoid confusion in this discussion.)

We assume that the legacy domain id is available to keystoneauth middleware via config.

(We will return later to discuss the implications of migrating legacy users and tenants to multiple domains. Update: apparently this cannot happen; legacy users all move to one default domain.)

We can state several goals:

Goal1: Existing unqualified-name ACLs ('legacy ACLs') continue to be honored for users and tenants migrated to the legacy domain.

Goal2: Users in the legacy domain can continue to specify ACLs using unqualified-names, with the restriction that they will only be granted to users also in the legacy domain.

Goal3: Users in non-legacy domains can specify ACLs using unqualified-names, with the restriction that they will only be granted to users also in the same domain.

To achieve Goal1 we need to verify that (a) a user is in the legacy domain and (b) that the tenant account being accessed is also in the legacy domain.

Achieving (a) is straightforward since the user's domain id will be included in the token info for the validated request token. Some caution is needed because keystone's authtoken middleware sets the user domain id to 'default' when no domain info is available (e.g. when processing a v2 token), and as discussed above this may not in fact be the id of the legacy domain.
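
The caution in (a) can be illustrated with a small sketch. It assumes the middleware can tell whether the token was a v2 token and that the legacy domain id comes from config; the function and parameter names are hypothetical.

```python
def user_in_legacy_domain(user_domain_id, legacy_domain_id, is_v2_token):
    """Decide whether the requesting user belongs to the legacy domain."""
    if is_v2_token:
        # A v2 token carries no domain info, so the reported domain id is
        # only a placeholder ('default'). Assumption for this sketch: v2
        # users are always legacy users.
        return True
    # For v3 tokens compare against the configured legacy domain id, which
    # may differ from the literal string 'default'.
    return user_domain_id == legacy_domain_id
```

Comparing the placeholder `'default'` directly against a configured legacy domain id such as `'abc123'` would wrongly reject a v2 user, which is why the token version is checked first.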

Achieving (b) is harder since swift has no a priori knowledge of the tenant/account domain membership, and determining tenant domain membership is not straightforward...

1. Fetch from keystone
Request account/tenant domain id from keystone on-demand when legacy ACLs need to be applied. Note that swift's keystoneauth middleware does not currently make any requests to keystone (that is handled by the keystone authtoken middleware). Once retrieved from keystone the tenant domain id could be cached or persisted as account sysmeta.

2. Infer from ACL format
We could infer from the existence of a legacy ACL format that the tenant is in the legacy domain. This is only safe if we enforce that all ACLs created in non-legacy domains are domain-qualified (assuming such a thing is even possible). To enforce this we either need to know the domain membership of tenants when ACLs are set (i.e. we have moved the problem somewhere else) or we enforce that ALL new ACLs must use domain-qualified names, including new ACLs in the legacy domain, which prevents us achieving Goal2 and Goal3.

Discriminating permitted ACL format when ACLs are set
'''[On reflection, I'm not sure we can enforce that an ACL uses domain-qualified names when the ACL is being set, since the ACL value may actually be expressed in terms of ids, and I'm not sure we can discriminate between an id and a name. So this section may be irrelevant.]'''

Can we continue to allow legacy ACLs to be set in the legacy domain by discriminating when handling the container POST with an X-Container-[Read|Write] header?

Options considered:

A. Allow legacy ACL format to be set only when request is scoped on a legacy tenant. When an ACL is set by a user with admin role the request token will be scoped to the tenant, so keystoneauth can easily determine if the domain is legacy domain or not. Unfortunately this is not the case when a user with reseller-admin role sets the ACL - those tokens do not need to be scoped to the tenant. REJECT.

B. Allow legacy ACL to be set only when user (granter) is in legacy domain. The user's domain id is available in the request's token info, including users with reseller-admin role. But this would allow legacy users to set legacy ACLs on non-legacy tenants if they are given the admin role. REJECT.

3. Discover from ACL POST request
When a container POST with an X-Container-[Read|Write] header is received, inspect the token info for tenant domain info.


 * v3 token, admin user role: token is scoped on tenant so tenant domain id is available. Store this as sysmeta on container.
 * v3 token, reseller admin role: token may not be scoped on tenant. Store tenant domain id sysmeta = unknown
 * v2 token - tenant must be in legacy domain, store no tenant domain sysmeta

When validating ACL:
 * if tenant domain sysmeta does not exist -> ACL was set using a v2 token -> tenant is legacy -> allow unqualified names in ACL
 * if tenant domain sysmeta = id_x -> tenant domain id is known -> only allow unqualified ACL if user domain == tenant domain
 * if tenant domain sysmeta = unknown -> tenant may be in a non-legacy domain -> do not allow unqualified names in ACL
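
The three validation cases can be written as a small decision function. This is a sketch only: the `UNKNOWN` marker value and the use of `None` for "no sysmeta stored" are assumptions for the example.

```python
UNKNOWN = "unknown"  # assumed marker for "tenant domain could not be learned"


def unqualified_names_allowed(tenant_domain_sysmeta, user_domain_id):
    """Apply the three validation cases to one request.

    tenant_domain_sysmeta is None when no sysmeta was stored.
    """
    if tenant_domain_sysmeta is None:
        # ACL was set using a v2 token, so the tenant is legacy.
        return True
    if tenant_domain_sysmeta == UNKNOWN:
        # Tenant may be in a non-legacy domain: the only safe answer is no.
        return False
    # Tenant domain is known: names are honored only within that domain.
    return tenant_domain_sysmeta == user_domain_id
```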

This approach looks hopeful. The final case is the wrinkle, i.e. when a reseller admin with a v3 token sets an ACL using unqualified names, keystoneauth cannot learn the tenant domain id when the ACL is set, so the only safe option is to deny the ACL. Is it acceptable to require that reseller admins should know better?

4. (PREFERRED) Discover when account is created
This is similar to (3). When an account is created, inspect the token info for tenant domain info.


 * v3 token, admin user role: token is scoped on tenant so tenant domain id is available. Store this as sysmeta on account.
 * v3 token, reseller admin role: token may not be scoped on tenant. Store tenant domain id sysmeta = unknown
 * v2 token - tenant must be in legacy domain, store no tenant domain sysmeta (Note: we may not know the id of the default domain)

When validating ACL:
 * if tenant domain sysmeta does not exist -> account was created using a v2 token -> tenant is legacy -> allow unqualified names in ACL
 * if tenant domain sysmeta = id_x -> tenant domain id is known -> only allow unqualified ACL if user domain == tenant domain
 * if tenant domain sysmeta = unknown -> tenant may be in a non-legacy domain -> do not allow unqualified names in ACL
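
The recording step at account creation might look like the sketch below. The parameter names and the `"unknown"` marker are assumptions; "scoped on tenant" is modelled as the token's project id being equal to the project id of the account being created.

```python
def tenant_domain_sysmeta_for_new_account(token_version, token_project_id,
                                          token_project_domain_id,
                                          account_project_id):
    """Return the tenant domain value to store as account sysmeta.

    Returns None when nothing should be stored.
    """
    if token_version == 2:
        # v2 token: tenant must be in the legacy domain, store nothing.
        return None
    if token_project_id == account_project_id:
        # v3 token scoped on the tenant (admin role): domain id is known.
        return token_project_domain_id
    # v3 token not scoped on the tenant (e.g. reseller admin).
    return "unknown"
```

Because the value is written once at account creation, every later ACL check on the account can read it without contacting keystone.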

This approach has a similar wrinkle to (3): if the account is created by a reseller admin with a v3 token, we don't know its domain id. We could fail the account create, or assume that if the account was created under v3, even in the default domain, its users should be aware of the deprecation of name-based ACLs.

Alternative interpretation of goals
How about if we allow legacy ACLs to be set in any domain, but grant a legacy ACL only when the granter and the user requesting access are in the same domain?

To enable this we would persist the domain id of the granter when an ACL is set, and then compare this to the domain id of the user requesting access (both are always available in the token info). The granter's domain id could be persisted as an item of container metadata e.g. X-Container-Sysmeta-[Read|Write]-Granter.

This results in a slightly different semantic: a user in domainX could set a legacy ACL in a tenant in domainY but the ACL would only be granted to other users in domainX. Perhaps that is acceptable?

One potentially confusing consequence would be that a user in domainZ could not update the ACL without inadvertently revoking access to users in domainX, because the granter's domain would be switched from domainX to domainZ.
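
The alternative semantics, including the revocation side effect, can be sketched as follows. The dict-based `container_meta` and its keys are hypothetical stand-ins for the persisted ACL and granter metadata.

```python
def set_legacy_acl(container_meta, acl, granter_domain_id):
    """Persist a legacy ACL together with the domain id of the granter."""
    container_meta["acl"] = acl
    container_meta["granter_domain"] = granter_domain_id


def legacy_acl_grants(container_meta, tenant_name, user_name, user_domain_id):
    """Grant only if the requester is in the same domain as the granter."""
    if user_domain_id != container_meta.get("granter_domain"):
        return False
    entries = container_meta.get("acl", "").split(",")
    return "%s:%s" % (tenant_name, user_name) in entries


meta = {}
set_legacy_acl(meta, "acme:alice", "domainX")
granted_before = legacy_acl_grants(meta, "acme", "alice", "domainX")

# A user in domainZ extends the ACL; the granter domain switches to domainZ
# and alice in domainX silently loses access.
set_legacy_acl(meta, "acme:alice,acme:bob", "domainZ")
granted_after = legacy_acl_grants(meta, "acme", "alice", "domainX")
```

Here `granted_before` is `True` and `granted_after` is `False`, even though `acme:alice` is still present in the ACL text.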

Miscellaneous
1. In some instances global name uniqueness will be enforced independently of keystone. In this case an override config flag could be used to permit unqualified names to continue to be used in ACLs. This is provided by the current patch.

2. We could re-write all existing ACLs to a new domain qualified format. Unlikely to be a popular choice.