This page summarizes the Vancouver Forum discussions about Keystone edge architectures. Full notes of the discussion are available here. The features and requirements for edge cloud infrastructure are described in OpenStack_Edge_Discussions_Dublin_PTG.

Concerns to be addressed

Usability

Some data may be modified locally and must persist when changed

Functionality

There may be long periods with no connectivity, during which all functions (e.g. autoscaling) must continue to work

Security

Some data should NOT be synchronized to some sites; if a site is compromised, it should only hold relevant local data
A centralized "view" of the sync status of edge clouds would be needed for audit / compliance
Centralized Management (of some sort) required.

Scalability

Edge sites may have very limited hardware (e.g. a single-node infrastructure)

Architecture options

Identity Provider (IdP) Master with shadow users

Challenge: Synchronizing user, project, and role assignments from a central source of truth to all clusters as part of managing identity across numerous deployments of OpenStack.

Solution: Federation with an Identity Provider master.

Oath's current implementation:

https://github.com/yahoo/openstack-collab/tree/master/keystone-federation-ocata
Thoughts on upstreaming items to Keystone: https://etherpad.openstack.org/p/keystone-shadow-mapping-athenz-delta

Berlin OpenStack Summit recap with relevant information:

https://www.lbragstad.com/blog/openstack-summit-berlin-recap

Details: The mapping between users, roles, and projects is managed in an Identity Provider (IdP), which will present a signed token to the user asserting their username, project, and keystone role mapping. This signed token is passed to Keystone as an authentication credential, and Keystone will then create the user, project, and role mapping if they do not already exist.

Open Questions:

Is there an IdP that supports the above scenario and returns a token with the information needed by Keystone?
How should sites with different configurations be handled, e.g. Jane is allowed to perform certain operations on Site A, but not on Site B?
How does this case handle connection loss between the IdP and the edge site, e.g. expired Keystone tokens and/or CLIs?

The diagram below shows how the 'Admin' creates 'Jane' as a user who is a 'member' of project 'Foo'. After authenticating with the IdP, Jane receives a signed token asserting her identity and authorization.

The second diagram shows the User taking the signed token from the IdP and passing it to Keystone when requesting a Keystone token. Keystone validates the request, adds the user 'Jane' to the DB if necessary, and returns the token. Jane can now call other OpenStack services with that token.

This style of federation eliminates the need for the deployer to proactively synchronize users, projects, and role assignments to the Keystone instances in other clusters, dramatically reducing operational complexity.
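The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `EdgeKeystone` class, the `idp_issue_assertion` helper, and the shared `IDP_KEY` are hypothetical names, and an HMAC signature from the standard library stands in for the signed assertion a real IdP would produce. It is not the Keystone API.

```python
import hashlib
import hmac
import json

# Hypothetical secret shared by the IdP and the Keystone stand-in (assumption).
IDP_KEY = b"demo-idp-key"


def idp_issue_assertion(user, project, role, key=IDP_KEY):
    """IdP side: sign a claim about the user's project and role."""
    payload = json.dumps({"user": user, "project": project, "role": role},
                         sort_keys=True).encode()
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, signature


class EdgeKeystone:
    """Toy stand-in for one edge Keystone; not the real Keystone API."""

    def __init__(self, idp_key=IDP_KEY):
        self.idp_key = idp_key
        self.users = set()
        self.projects = set()
        self.assignments = set()  # (user, project, role) triples

    def authenticate(self, payload, signature):
        expected = hmac.new(self.idp_key, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, signature):
            raise PermissionError("assertion signature is not valid")
        claim = json.loads(payload)
        # Create the user, project, and role assignment only if missing.
        self.users.add(claim["user"])
        self.projects.add(claim["project"])
        self.assignments.add((claim["user"], claim["project"], claim["role"]))
        return {"user": claim["user"], "project": claim["project"],
                "roles": [claim["role"]]}


payload, signature = idp_issue_assertion("Jane", "Foo", "member")
site_a = EdgeKeystone()
token = site_a.authenticate(payload, signature)
```

Nothing is pushed to `site_a` in advance: the user, project, and role assignment appear only when Jane first authenticates there.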

Several keystone instances with federation and API synchronisation

Every edge cloud instance runs its own keystone instances. These keystone instances are federated where each keystone node is a "service provider" accepting and validating SAML assertions from a trusted identity provider (this is not the same as k2k federation). Each keystone maintains a mapping to control access depending on who needs what (this is going to be a lot of mappings, since there can be multiple for each deployment).
Basic flow:

A user presents a SAML assertion to prove their identity
The mapping processes their attributes, creates a shadow user, etc.
From there the user creates an application credential with their shadow user
The user generates tokens with their application credential to do things with that specific keystone deployment
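As a rough illustration of how static, per-site mappings could give the same user different access on different sites, consider the sketch below. It is a deliberately simplified model, not Keystone's mapping engine or its rule format: `SITE_MAPPINGS`, `map_assertion`, and the `groups` attribute are all assumptions made for this example.

```python
# Hypothetical per-site mapping rules: IdP group -> (project, role).
SITE_MAPPINGS = {
    "site-a": {"operators": ("Foo", "admin"), "developers": ("Foo", "member")},
    "site-b": {"operators": ("Foo", "admin")},
}


def map_assertion(site, attributes):
    """Return the shadow user and role assignments a site would derive."""
    rules = SITE_MAPPINGS[site]
    assignments = sorted(
        rules[group] for group in attributes["groups"] if group in rules
    )
    if not assignments:
        raise PermissionError(f"{attributes['user']} has no mapping on {site}")
    return {"shadow_user": attributes["user"], "assignments": assignments}


jane = {"user": "Jane", "groups": ["developers"]}
on_site_a = map_assertion("site-a", jane)
```

Because the rules live on each site, they have to be maintained per deployment, but they do not change when users come and go at the IdP.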

More info

Analysis

Pros
- Federation is already supported by Keystone
Cons
- Connectivity loss between the client and the IdP or the client and the edge cloud instance leads to authentication problems
- Lots of mapping rules need to be maintained, but they can be static

Questions

Can a Keystone in VIO act as an Identity provider for K2K federation?
Do we need further synchronisation of data on top of what we have in Keystone federation?
- There are some data, like the users or projects, that need to be distributed
  - This can be done using the mapping rules or with logic in Keystone that explicitly creates the missing data (in the latter case, logic is also needed to remove the data that is no longer needed)
How to handle the situation when the IdP is isolated?
- Our clusters almost never actually need to communicate directly with the IdP. A user calls the IdP for the auth token, and passes that auth token to the cluster in question. Think of this type of federation as a "hidden master" where each keystone is capable of operating completely independently when called by a user. Service users within a cluster do not need to call the IdP, because they can be authenticated locally.

Keystone database replication with a distributed database

Every edge cloud instance runs its own keystone instances. The databases of these instances are synchronised between the edge cloud instances by the standard replication mechanism of the database.

Related materials

Enhancing Edge Computing with Database Replication from 2007
Galera Multi-master replication: Region Support for Keystone with TripleO Ansible / TripleO proof of concept
StarlingX DRAFT Design Doc for Distributed DB-Sync'd Keystone Edge Architecture - DRAFT - open to any comments
- https://www.dropbox.com/s/653tjwnyvl3q544/dc_keystone_fernet_key_sync_and_db_sync_Jul24_2018.pptx?dl=0
Galera/CockroachDB evaluation (performed within the FEMDC SIG)
- http://beyondtheclouds.github.io/blog/openstack/cockroachdb/2018/06/04/evaluation-of-openstack-multi-region-keystone-deployments.html

Analysis

Opinion: This alternative should not be used
Pros
Cons
- Distributed databases have limitations, for example Galera is able to synchronise only 16 databases
- Rolling upgrade of edge cloud instances is not supported

Keystone database replication with a synch service

Every edge cloud instance runs its own Keystone. There is a synchronisation agent on every edge cloud instance which can read and write the Keystone database. The synchronisation agent reads selected data from the database of a master Keystone and synchronises it to the slaves. Fernet keys are synchronised to achieve "generate anywhere, use anywhere" operation. After a partition ends, the Fernet keys should be deleted and resynchronised.

Note: the clusters that get resynchronised will automatically have all their tokens "revoked". Tokens are not persisted in the database, and updating the key repository could result in premature token invalidation (because the key used to encrypt the token payload disappeared due to the update after the partition). There is a proposed specification that uses asymmetric signing instead of symmetric encryption, which could have ramifications for key management (since you are only syncing public keys instead of private keys or shared secrets).
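The invalidation effect can be illustrated with a small model of a key repository. This is a sketch under assumptions: `KeyRepository` is a hypothetical class, and HMAC signing from the standard library stands in for Fernet encryption, so it shows only how token validity depends on which keys a site holds, not the actual token format.

```python
import hashlib
import hmac
import os


class KeyRepository:
    """Toy key repository; the first key is the one used to issue tokens."""

    def __init__(self, keys):
        self.keys = list(keys)

    def issue(self, payload: bytes) -> bytes:
        tag = hmac.new(self.keys[0], payload, hashlib.sha256).hexdigest()
        return payload + b"." + tag.encode()

    def validate(self, token: bytes) -> bool:
        payload, _, tag = token.rpartition(b".")
        return any(
            hmac.compare_digest(
                hmac.new(key, payload, hashlib.sha256).hexdigest().encode(),
                tag,
            )
            for key in self.keys
        )


shared_key = os.urandom(32)
central = KeyRepository([shared_key])
edge = KeyRepository([shared_key])

# Synchronised keys: a token issued centrally is valid at the edge.
central_token = central.issue(b"user=jane;project=foo")
valid_before_partition = edge.validate(central_token)

# Assumed scenario: during a partition the edge starts using a local key.
edge.keys.insert(0, os.urandom(32))
local_token = edge.issue(b"user=jane;project=foo")

# After the partition, the edge repository is replaced by the central one.
edge.keys = list(central.keys)
valid_after_resync = edge.validate(local_token)
```

Here `valid_before_partition` is true and `valid_after_resync` is false: the token issued during the partition can no longer be validated once its key has disappeared from the repository.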

Related materials

StarlingX solution: https://www.dropbox.com/s/653tjwnyvl3q544/dc_keystone_fernet_key_sync_and_db_sync_Jul24_2018.pptx?dl=0

Analysis

Pros
Cons
- The synchronisation agent needs to understand the details of the Keystone database structure
- Writing the data and keeping consistency might not be trivial

Distributed LDAP database as Keystone backend

Keystones in the edge cloud instances use an LDAP database as a backend, and the LDAP is configured to synchronize the data. LDAP can be set up only as the auth realm, with the Keystone relational database providing the identity service database. But Keystone can also use LDAP for both authentication and the identity service, which would imply that no Keystone relational database is needed in this scenario.

Related materials

Keystone documentation about LDAP integration

Analysis

Pros
- LDAP synchronisation is a solved problem
Cons
- This is not the right tool to synchronise among a high number of sites

Questions

Is it possible to store and synchronize all Keystone related data in this way?

Isolated Domains Per Edge and Localized Authority to Change data within isolated domain(s)

"Spoke/Hub Model"-ish
Local DB for local "data" and pending writes
Local data is sent up to the central hub once connectivity is restored
Sites are authoritative for their own domain(s) and for no other "remote" domains
Central Hub is authoritative to write to any domain
"Code/Service" written to handle bundling local changes and shipping them to central for distribution/synchronization down when/if connectivity is restored
- This must be allowed to do things that normal Keystone-API work cannot do (create project in the database with a specific UUID)
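The hub-and-spoke behaviour listed above might look roughly like the following sketch. The `Hub` and `EdgeSite` classes and their methods are hypothetical and exist only for illustration. The point is that a site accepts writes only for its own domains, queues them while disconnected, and ships them to the hub with the project ID it generated locally.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Hub:
    """Central hub: authoritative for every domain."""
    projects: dict = field(default_factory=dict)  # id -> (domain, name)

    def apply(self, change):
        # The hub stores the project under the ID chosen by the edge site.
        self.projects[change["id"]] = (change["domain"], change["name"])


@dataclass
class EdgeSite:
    """Edge site: authoritative only for its own domain(s)."""
    domains: set
    projects: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)

    def create_project(self, domain, name):
        if domain not in self.domains:
            raise PermissionError(f"site is not authoritative for {domain}")
        change = {"id": uuid.uuid4().hex, "domain": domain, "name": name}
        self.projects[change["id"]] = (domain, name)
        self.pending.append(change)  # kept until the hub is reachable
        return change["id"]

    def flush(self, hub):
        """Ship queued changes to the hub once connectivity is restored."""
        while self.pending:
            hub.apply(self.pending.pop(0))


hub = Hub()
site = EdgeSite(domains={"edge-1"})
project_id = site.create_project("edge-1", "Foo")  # works while offline
site.flush(hub)                                    # connectivity restored
```

Keeping the locally generated ID when the hub applies the change is exactly the kind of operation the normal Keystone API does not offer, which is why dedicated code or a service is called for.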

Analysis

Pros
Cons

Keystone API Synchronization & Fernet Key Synchronization

Every Edge Cloud instance runs its own keystone instance.
Keystone resources are replicated from the central site to the edge clouds using API-based synchronization.
- i.e. projects, users, groups, domains, ...
Fernet Key synchronization and management across Edge Clouds is also supported, so that tokens created at any Edge / Central cloud can be used (and authenticated) in any other cloud.
( NOTE THIS OPTION IS ONLY POSSIBLE IF KEYSTONE API CAN BE CHANGED TO SYNCH USERID AND PROJECT ID )
- Fernet Tokens contain userId and projectId, so these MUST be synchronized across all clouds,
- Previous attempts to get this upstreamed in Keystone have failed ==> which likely RULES THIS OPTION OUT.
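A minimal sketch of what API-based synchronization would have to compute is shown below. The `plan_sync` function is hypothetical and works on plain dictionaries rather than on the Keystone API. Resources are compared by ID, which makes the constraint above visible: every entry in `to_create` has to be created on the edge with the ID it already has on the central site.

```python
def plan_sync(central, edge):
    """Compare resources keyed by ID and return what the edge must change.

    Both arguments map resource ID -> resource name.
    """
    to_create = {rid: central[rid] for rid in central.keys() - edge.keys()}
    to_update = {rid: central[rid] for rid in central.keys() & edge.keys()
                 if central[rid] != edge[rid]}
    to_delete = sorted(edge.keys() - central.keys())
    return to_create, to_update, to_delete


central_projects = {"p1": "Foo", "p2": "Bar"}
edge_projects = {"p2": "Baz", "p3": "Old"}
to_create, to_update, to_delete = plan_sync(central_projects, edge_projects)
```

If the edge were to assign its own ID for `p1`, a token carrying the central project ID would not resolve to any project on that edge cloud.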

Analysis

Pros
Cons

Replicated data

This is the list of data that is synchronised by StarlingX:

Keystone
- Users
- Projects
- Roles
- Assignments
- Groups (not yet implemented)
- Domains (not yet implemented)
- Fernet keys (not yet implemented)
Nova
- Flavors
- Flavor extra specs
- Keypairs
- Quotas (should be managed dynamically at the edge cloud infrastructure level, i.e. a project that has a quota of 10 instances can only create 10 instances across ALL Edge Clouds, NOT 10 instances per Edge Cloud)
Neutron
- Security Groups
- Security Group Rules
Cinder
- Quotas
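The quota rule noted under Nova can be made concrete with a short sketch. The function name and the per-site usage dictionary are assumptions for illustration. The only point is that the check sums usage over all edge clouds instead of looking at one site in isolation.

```python
def can_create_instances(quota, usage_per_site, requested):
    """Check a project-wide quota against usage summed over all edge clouds."""
    total_used = sum(usage_per_site.values())
    return total_used + requested <= quota


usage = {"edge-1": 4, "edge-2": 5}  # instances already running per site
one_more = can_create_instances(10, usage, 1)
two_more = can_create_instances(10, usage, 2)
```

With a quota of 10 and 9 instances already running across both sites, one more instance fits and two do not, regardless of which edge cloud receives the request.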

Keystone edge architectures
