Key Manager

The Key Manager effort became Barbican. This documentation is here for historical purposes.

malini.k.bhandaru "at" intel.com

https://etherpad.openstack.org/havana-key-manager

History

March 06, 2013: Initial version

April 14, 2013: Added reference to Rackspace session at OpenStack Summit

April 18, 2013: Added pointer to etherpad from Key Manager design session at OpenStack Summit

April 23, 2013: Added section on "Post summit discussion" changes/clarifications


Server-side encryption with key management would make data protection more readily available, enable harnessing of any special hardware encryption support on the servers, make available a larger set of encryption algorithms, and reduce client maintenance effort. Amazon and Google’s object storage systems provide transparent data encryption. Encryption is no longer prohibitively expensive: newer chips carry hardware support for AES (AES-NI), and implementations harness the parallelism available in the data and the processor architecture. The popular wisdom today is to increase security by encrypting everything, everywhere [1]

Recently, interest has grown in OpenStack to provide server-side encryption in Cinder (Volume) [2], Swift (Object) [3], and Glance (Snapshot).

Protecting data involves not only encryption support but also key management: creating, storing, protecting, and providing ready access to the encryption keys. The keys would need to be stored on a device separate from the one housing the data they protect. Key management could be a separate OpenStack service or a sub-service of Keystone, OpenStack's identity service.

Ideally, the keys themselves would be random, of the desired length, carry associated metadata such as ownership, and be encrypted before being stored.

Security Model

  • Protection of data at rest: the encrypted data and the encryption keys are held in separate locations. Stealing the data disk still leaves the data protected.
  • Keys opaque: The keys themselves are encrypted using master keys.
  • Master Key Protection: Master keys are protected in hardware using Trusted Platform Module (TPM) technology. Keys are released only to trusted host machines (those whose BIOS and initial boot sequences measure as good).
  • Secure Master Key Transmission: TPM technology is used to transfer master keys between cooperating services and between sibling services (in the case of horizontal scaling).
  • Support Dual Locking: High value data could be protected with a user/project/domain specific key and a service key. This is akin to using two keys such as with bank safe-deposit boxes, a bank key and a customer key.
  • Limited Knowledge: The Key Manager will not maintain a mapping from keys to the entities they encrypt. Each encrypted entity carries as metadata the key-id, a pointer to the key used to unlock it. With dual keys, the customer key is not referenced; it is implicit as part of authentication.
  • Limited Access: Authorization and access control mechanisms limit access to keys.
  • Protection from denial of service: multiple replicas of key manager.
  • Data Isolation: Should some audit or law enforcement authority demand access for a certain customer, data belonging to other customers is not exposed, because they use different keys.

Design Considerations

High Availability

Think of the key manager as a dictionary mapping <key-id> to <key-string>. The keys have to be as accessible as the objects they encrypt. The Key Manager backing store could be something along the lines of Swift, which provides high availability and redundancy by way of Swift proxies and multiple replication sites; alternately, the backing store could be mirrored databases. Ideally, the mirrors or replication sites should be in different geographical zones.

For security, keys and the data they seek to protect should not be co-resident on the same physical device. Given this constraint, should one take a Swift backing store solution approach, it may be simple to introduce a separate Swift cluster to store the keys. The storage needs just for the keys would be less than the typical storage needs of Swift for object and snapshot/image storage.

Other high availability solutions that are typically used instead of Swift in a production environment also meet the needs of the Key Manager storage.

Opaque Keys

Keys while in storage will be encrypted for security. This calls for master keys to encrypt the key strings.
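
A minimal sketch of that wrapping step follows, assuming a service-side master key and the Python cryptography package's Fernet recipe; the library choice is an illustration, not something the design mandates:

  # Sketch only: wrap a freshly generated key-string with a master key before
  # handing it to the Key Manager. Fernet is an assumed recipe; the design only
  # requires that stored keys be encrypted.
  import os
  from cryptography.fernet import Fernet

  master_key = Fernet.generate_key()          # in practice loaded from TPM/keyring storage
  wrapper = Fernet(master_key)

  key_string = os.urandom(32)                 # a fresh 256-bit data-encryption key
  opaque_key = wrapper.encrypt(key_string)    # this, not the plaintext key, is stored

  assert wrapper.decrypt(opaque_key) == key_string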

Protecting Master Keys

Master keys are long-lived, are used to encrypt a large number of keys, and require strong protection. These criteria recommend that master keys be readily accessible, stored locally, and held as securely as possible. Trusted Platform Module (TPM) storage meets these requirements. [4] [5]

Restricting Service Access

The Key Management service will be available only to the OpenStack services, excluding the compute hosts, which are the least trusted of the hosts (and the reason the no-compute-db feature was developed). Note that it shall not be available to end users; the user will never be given direct access to the keys. At the time of account creation, they may request the creation of keys for users/projects/domains.

Restricting Key Access

Keys are owned by the service that creates them, and access to such keys is limited to the service introducing them.

The exceptions to the above are the wider-scope keys used in dual locking, that is, the User/Project/Domain keys, which belong to the Identity Service, part of Keystone. Keystone's Trust feature will be used to delegate access to such keys to the encryption service needing them. Delegation comes with an expiry period. Delegation also brings with it a need for services to access the Keystone Identity service master key. Transfer of the Keystone Identity master key from one service to another can be performed securely using TPM symmetric key sharing protocols. [6]


Key Attributes / meta-data

Keys could have attributes such as no-cache and number-of-uses. KMIP has the notion of a usage mask: who can use a key, and for what purposes (encryption/decryption, etc.).
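
Purely as an illustration, key metadata might be modeled along the following lines; every field name here is hypothetical, using only the attributes mentioned above:

  # Hypothetical shape for key metadata; not a fixed schema.
  from dataclasses import dataclass, field
  from typing import Optional, Set

  @dataclass
  class KeyAttributes:
      owner: str                              # service or user/project/domain owning the key
      scope: str = "entity"                   # entity | user | project | domain
      algorithm: str = "aes-cbc"
      no_cache: bool = False                  # forbid caching at the service endpoint
      number_of_uses: Optional[int] = None    # optional usage limit
      usage_mask: Set[str] = field(default_factory=lambda: {"encrypt", "decrypt"})  # KMIP-style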

Key Caching

Key Manager’s keys need to be accessible at the same level as the objects they encrypt, to ensure ready access. The keys themselves could be cached at the service endpoint using them with an expiration equal to or less than that of the access token lifetime used to obtain them. Caching reduces network traffic and the load on the key manager. With dual keys, where the wider scope key is obtained through access delegation, the lifetime would be that of the delegation period.
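
A sketch of such endpoint-side caching, assuming a hypothetical get_key() Key Manager client call and a time-to-live bounded by the token (or, with dual keys, the delegation) lifetime:

  # Sketch of key caching at a service endpoint. get_key() is a hypothetical
  # Key Manager client call; ttl_seconds must not exceed the token or
  # delegation lifetime.
  import time

  class KeyCache:
      def __init__(self, ttl_seconds):
          self.ttl = ttl_seconds
          self._entries = {}                  # key-id -> (key-string, expiry timestamp)

      def get(self, key_id, token):
          entry = self._entries.get(key_id)
          if entry and entry[1] > time.time():
              return entry[0]                 # cache hit: no round trip to the Key Manager
          key_string = get_key(key_id, token) # hypothetical client call
          self._entries[key_id] = (key_string, time.time() + self.ttl)
          return key_string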

Logging

All access to keys needs to be logged: <who, what, when>. CRUD actions (create, read, update, delete) should be logged as important events. This would meet regulation/audit needs such as HIPAA, Sarbanes-Oxley, etc.

Invalid/unauthorized attempts should also be logged, including IP address, time, etc., as they may indicate hacking attempts and the need for action.
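
A small sketch of what such a <who, what, when> audit record might look like, using Python's standard logging module; the field layout is illustrative only:

  # Illustrative audit record for key access; the exact fields and format are
  # not mandated by the design.
  import logging
  import time

  audit = logging.getLogger("keymanager.audit")

  def log_key_event(who, action, key_id, source_ip, success=True):
      # action: one of create, read, update, delete
      audit.info("who=%s action=%s key_id=%s ip=%s success=%s when=%s",
                 who, action, key_id, source_ip, success,
                 time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()))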

Life Cycle Management: Background Tasks

  1. archiving, re-keying
  2. API for life cycle management
  3. Plug-in solutions/implementations (open source and proprietary)

Side Benefits

  1. Communication between the service and the key manager does not need to be further encrypted using SSL or HTTPS, because the keys flying between them are at all times encrypted. The decrypted key string would at any time reside only on the service that seeks to save or use it.
  2. Keys used by different OpenStack services could reside in a single storage system, yet if one service were compromised, the keys of the other services would still be safe.
  3. Further, should there be a desire to change a master key, only keys stored by that service need to be re-encrypted. The actual data they were used to encrypt does not need to be re-encrypted.

Key Manager in OpenStack

Key API

  create <authorization-token>
        The key manager will create a random key, save it, and return a tuple <key-id, key-string>.
        The communication between requester and key manager should be secure to ensure that the key is not compromised.

  get <key-id> <authorization-token>

  put <key-id> <encrypted-key-string> <authorization-token>
  delete <key-id> <authorization-token>
  update <key-id> <authorization-token>

By supporting key delete, we essentially render any stored data associated with the key inaccessible. It will not be necessary to "wipe" clean or shred, for instance, a block device; it is only necessary to record that the area is free and can be re-used.
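
The toy, in-memory stub below illustrates the API above as a dictionary of key-id to key-string. Authorization is reduced to a placeholder, and the semantics of update (whose payload the listing above does not specify) are assumed to be replacement of the stored string:

  # Toy in-memory stub of the Key API. A real service would validate the token
  # with Keystone and store only master-key-encrypted key-strings.
  import os
  import uuid

  class StubKeyManager:
      def __init__(self):
          self._store = {}                              # key-id -> key-string

      def _authorize(self, token):
          if not token:
              raise PermissionError("invalid authorization token")

      def create(self, token):
          self._authorize(token)
          key_id, key_string = str(uuid.uuid4()), os.urandom(32)
          self._store[key_id] = key_string
          return key_id, key_string                     # tuple <key-id, key-string>

      def get(self, key_id, token):
          self._authorize(token)
          return self._store[key_id]

      def put(self, key_id, encrypted_key_string, token):
          self._authorize(token)
          self._store[key_id] = encrypted_key_string

      def update(self, key_id, encrypted_key_string, token):
          self._authorize(token)                        # assumed semantics: replace the stored string
          self._store[key_id] = encrypted_key_string

      def delete(self, key_id, token):
          self._authorize(token)
          self._store.pop(key_id, None)                 # data encrypted under this key becomes unreadable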

Key Scope:

  • Per entity (entity could be a volume, an object, a VM image/snapshot)
  • Per user
  • Per project (within a domain)
  • Per domain

For strong encryption, typically a key is used in conjunction with an initialization vector (IV). The per-entity key would serve as an IV. It could be used alone or in conjunction with a wider scoped key, such as a domain scope key.
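
For illustration, this is how a wider-scoped key and a per-entity IV might be paired with AES-CBC; the use of the Python cryptography package is an assumed library choice, not a design requirement:

  # Pairing a wider-scoped key with a per-entity IV using AES-CBC.
  import os
  from cryptography.hazmat.backends import default_backend
  from cryptography.hazmat.primitives import padding
  from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

  key = os.urandom(32)        # e.g. a domain-scope key (256 bits)
  iv = os.urandom(16)         # per-entity value playing the IV role described above

  padder = padding.PKCS7(algorithms.AES.block_size).padder()
  padded = padder.update(b"object contents") + padder.finalize()

  encryptor = Cipher(algorithms.AES(key), modes.CBC(iv), backend=default_backend()).encryptor()
  ciphertext = encryptor.update(padded) + encryptor.finalize()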

Key Size

  • 128, 192, 256, .. 2048 .. longer or shorter (possibly used with padding).

Some algorithms require longer keys, so we support a wide range.


Figure: Key Manager in OpenStack

The master keys would be held in TPM storage.

Figure: TPM Close-Up

Figure: OpenStack Service and Key Manager Internals

Encryption

Available encryption algorithms would be obtained by the OpenStack services directly querying the libraries used to provide the encryption support. The algorithms would also be offered as options during user/project/domain creation, to set defaults, and possibly with each entity creation (which could get too chatty for high-volume data such as objects). Typical options would be RSA, AES, etc.


Swift (object storage) example: assume an object X is stored in encrypted form on the Swift object store. Let enc-object-x be the encrypted representation of object X. Then the Swift file system would contain:

Swift

enc-object-x, meta_data: <enc:true, algorithm:aes-cbc, key-id: 1234567899 >

Similarly, an encrypted Cinder volume might be represented as

Cinder

Volume<id>, meta_data: <enc:true, algorithm:aes-xts, key-id:abcdefghijklmnopqrstuvxyz>

Key Flow

The figures below illustrate how the Key Manager fits into the regular flow of putting and getting an object in Swift. For simplicity, caching of keys and secondary key handling (for dual locking) are omitted.

Figure: Encrypted saving of an object

Figure: Decrypted retrieval of an object
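
In pseudo-code form the two flows reduce to roughly the following; km is a Key Manager client (for example, the stub from the Key API section), encrypt/decrypt stand in for the AES routine sketched earlier, and dual locking and caching are omitted:

  # Pseudo-flow of the figures above; "store" is the Swift object store seen as a dict.
  def put_object_encrypted(km, token, name, data, store):
      key_id, key_string = km.create(token)               # 1. obtain a fresh key
      store[name] = {
          "data": encrypt(key_string, data),               # 2. encrypt the object
          "meta": {"enc": True, "algorithm": "aes-cbc", "key-id": key_id},  # 3. record metadata
      }

  def get_object_decrypted(km, token, name, store):
      entry = store[name]
      key_string = km.get(entry["meta"]["key-id"], token)  # 1. fetch the key by key-id
      return decrypt(key_string, entry["data"])            # 2. decrypt and return plaintext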

Concerns/Questions

  1. Another failure point: With another service, the Key Manager, in the picture, we have another component that could fail. But encryption needs keys, maintained by either the end user or the server; this is a cost of the feature. Caching keys mitigates some of the problems that arise from network latency and server failure. Using the TPM to protect the encryption master keys makes the cache less of a security hole.
  2. KMIP: Do we need to support KMIP [7] in OpenStack? If the keys are not for end user direct consumption, KMIP is not mandated. However if we desire to use the key manager to save private and public keys of the OpenStack services, then KMIP would be useful to exchange information across cloud boundaries.
  3. Encryption data transfer overhead: Keys typically are not updated, except on master key re-keying. Swift uses Rsync for replication, and for objects the size of keys this is not a performance concern. However, strong encryption requires initialization vectors (IVs), salts, and cipher chaining, so a small change near the beginning of a document generates a totally different encrypted object. That is what we desire, but in the context of Rsync it implies a full data payload needs to be transferred. Data protection overrides all else here, and use cases may establish that this is an unwarranted concern if there are typically few updates to an object.
  4. Unauthorized key deletion: If we use a Swift-based system for the Key Manager backend store, a hijacked server inserting spurious tombstone records into the Swift storage nodes to mimic a legitimate deletion would result in key loss, by way of a background reaper task periodically deleting such objects. This would not be a new security hazard and has to be handled as it is today. Perhaps key deletion could be turned off to prevent such havoc.
  5. Fear of Key Loss: Key Manager back end storage should ideally be distributed in geographically disparate locations.
  6. Salts/IV: A key per object/entity behaves like a salt/IV, especially when used in conjunction with a user/project/domain (wider-scope) key.

Phased Implementation

Key Manager implementation could proceed in phases along the lines below. Dual locking could even be part of Phase I.

Phase I

  1. Stub Key Manager: could pull out the JHU-APL or Mirantis Key Manager implementation, or Rackspace's (*) key-manager-as-a-service solution (added April 14, 2013), and float it as either a new OpenStack service or a sub-service of Keystone. Essentially, establish all the plumbing flows. Define a KeyManager client that the other services use via the KeyManager API. The key manager back end could initially be a file or a MySQL or SQLite database. The default could be a MySQL backend for devstack-like single-machine deployments for development/testing.
  2. Master keys stored on a Python keyring, or using mechanisms similar to private key protection on the various OpenStack service host machines.
  3. Encryption algorithm and parameters could initially be defaults in a nova conf file (the JHU-APL approach for volume encryption) or defaults per user/project/domain, going up the generalization chain until something specific is found, else using a domain-level default.

Phase II

Make Key Manager a separate Swift instance, with multiple zones for storage. This would support true HA and fault tolerance.

Phase III

  1. Support multiple encryption algorithms via encryption library querying. Provide user interface support to select preferences and store as part of user/project/domain profile.
  2. Reaper routine to change a master key for a service, aka re-keying.
  3. Support dual locking, a feature that uses KeyStone Identity V3 API's trust feature.

Phase IV

Introduce true TPM support for master keys. For instance, volume encryption may prefer XTS, an encryption mode that uses the sector address. We have TPM expertise within Intel, and attestation service support already exists in OpenStack, where it is currently used to verify the goodness of compute nodes; this would extend to the OpenStack service nodes.

Phase V

Chef/Puppet support for transferring symmetric keys to the various service host machines, particularly to scale horizontally in as automated a fashion as possible.

Glossary

Key-string: A string of bits used to encrypt data. Ideally auto-generated using a random number generator that exploits entropy. Intel's hardware random number generator is a high speed source of quality randomness.

Key-id: a unique ID used to index a key-string in the system. The key-id will be attached as metadata to the encrypted object/volume/etc.

Master-key: a key-string used to encrypt the keys (key-strings) before saving in the key manager, saved in trusted storage at the service end-point.

TPM: Trusted Platform Module

Post Summit Discussion/Revisions/Clarifications

  • Where the key is generated The original design had each agent (Swift, Cinder, etc.) have its own master key, generate the keys for object/volume (etc.) encryption, and use the key manager just to store these along with additional attributes. An advantage of this approach is that a key is never transferred in the clear between any of the cloud service endpoints, and it buys time until we routinely encrypt the communication channels between the endpoints. While the design reduces the amount of damage/exposure should an agent be compromised, it adds a layer of deployment complexity: the per-agent master key has to be shared/transferred to sibling agents for high availability. Such master keys can of course be transferred securely, using PKI and encrypting the master key with the public key of the receiving sibling agent, or using the TPM transfer protocol, which internally also uses PKI public-key encryption of the payload.

Instead, having all keys created at the key manager and encrypted with a master key, possibly a different master per key-requesting agent, would isolate all key management related activity to the key manager. It would then be a modular plug-in. One of the comments regarding dispersed master keys and key generation concerned how to support high availability.

With this approach, the "put object" (encryption) example sequence diagram would change: Swift would invoke "Create_key" instead of creating the key itself and invoking "put_key".

Keys could still be cached at the agents using a per agent master key that was protected in its TPM or other ways.

  • Encrypted Communication Users such as the NSA need/want source and destination authenticated and encrypted communication. Adam Young suggested exploring NSS http://www.mozilla.org/projects/security/pki/nss/ to meet this requirement and in so doing also possibly avoid all things eventlet/blocking/performance draining.
  • API Red Hat folks suggested examining the DogTag API to determine if it met our needs and act as a point of reference.
  • Life cycle tasks

Adam Young and Guang-Yee suggested exploring and leveraging FreeIPA.

  • Logging for Compliance Rackspace's demo for logging is rich.
  • Additional security via time-limited delegated authorization The original design allowed access to an encrypted object string only to the original owner who did the "put". Swift was then delegated access to the encryption key for a limited time. But Swift could access all the keys it inserted in the original design, because they belonged to Swift.

The keys could instead belong to the original user, and the delegation would provide access to a key if and only if its owner-id matched the user-id (or project-id or domain-id based on key-scope).

  • Supporting KMIP where necessary Both the Intel and Rackspace designs support formatters, KMIP being a formatter. (The KMIP spec mentions that either dynamic formatting or saving data in KMIP format is acceptable.)
  • Separate Service We were all in agreement about a separate key manager service: to keep its functionality separate, enable isolated hardening, allow easy plugin should a cloud vendor want a hardware-based solution, and avoid exposing it externally (if necessary, offering private and public URLs for access and limiting access along the lines of security groups), etc.

References

  1. http://blog.dustinkirkland.com/2012/10/encrypt-everything-everywhere-in.html
  2. https://blueprints.launchpad.net/nova/+spec/encrypt-cinder-volumes
  3. http://www.mirantis.com/blog/openstack-swift-encryption-architecture
  4. http://opensecuritytraining.info/IntroToTrustedComputing_files/Day1-7-tpm-keys.pdf
  5. http://en.wikipedia.org/wiki/Trusted_Platform_Module
  6. http://shazkhan.files.wordpress.com/2010/10/http__www-trust-rub-de_media_ei_lehrmaterialien_trusted-computing_keyreplication_.pdf
  7. http://en.wikipedia.org/wiki/Key_Management_Interoperability_Protocol