Difference between revisions of "MessageSecurity"

Revision as of 19:31, 25 April 2013

Message Security

Message Security in OpenStack is currently not implemented. Recently there have been a couple of proposals to implement signatures and eventually encryption for RPC messages.

Implementing this kind of security feature is a delicate task: there is the usual trade-off between security and performance, as well as some issues peculiar to the distributed nature of the OpenStack environment.

Public Crypto versus Shared Keys

One of the assumptions in the present proposals is that Public Key crypto will be used to provide integrity for messages. However, a simple Shared Key crypto model is also well suited to handle message queues.

One reason why Public Key crypto is being proposed is the perceived lower overhead of the public key trust model. However let's analyze what is required to use either model.

Public Key Infrastructure

The first point is that each service that needs to send messages will need to own a Public/Private Key pair, and this means some form of secure storage for the Private Key.

Next, the Public Key must also be made available. There are two strategies to do so: a PKI model, where a Public Key is signed by a Central Authority, or a Trusted Repository, where all Public Keys are deposited and guaranteed to be good either by cross signatures or by a well-known party that can authoritatively assert whether a key is good or bad. This in turn requires that all clients regularly check that their peers' keys are still valid and not revoked. This is true both of a PKI style system, where CRLs or OCSP responders are queried, and of a central trust authority, where Public Keys are checked for validity.

There is also the non-trivial task of deciding where keys are generated, as virtual machine based systems tend to have poor entropy and sourcing enough of it to generate key pairs can be a problem at installation time. Following that, there is the problem of how to communicate the public key to the CA for signing or to the trusted repository for depositing it.

Shared Key Infrastructure

The first point is that each service that needs to send messages will need to own its own secret key, and this means some form of secure storage for the Secret Key.

In a shared key model each actor would ideally have a different shared key with each and every other peer. However, this very quickly becomes impractical as the number of peers scales up, both in terms of storage and of the exchanges necessary. A Key Server approach is the only reasonable way.

With a Key Server each service only needs one Secret Key which it shares with the Key Server. The actual key used to communicate between any 2 peers is provided on the fly by the key server, which needs to be contacted by at least one of the peers before they can actually send messages between themselves. The Key server will provide Signing and Encryption Keys (SEK) that are bound to a specific peer-pair and allow them to communicate safely.

Once keys are obtained, the two peers become independent of the Key Server and can send as many messages as required for as long as the keys are valid. In such a system the Secret Keys shared between Services and the Key Server have long term validity, while the signing and encryption keys can have a relatively short validity period, so that a brute force attack on the messages will not expose any long term secret and will be of limited value.

Security considerations for the trusted server

With either model the central server will have to store some keys and guarantee their validity.

In the Public Key model the central server needs to be able to provide proof of validity of a key and mark as revoked any keys that are considered compromised. In order to do that, a signing key needs to be handled by this server, and signatures are used to mark public keys as valid or revoked via CRLs or OCSP responses. In the case of short lived Public Keys, an authentication system needs to be provided instead, so that services can authenticate and store their public keys. In both cases the overall system will have to rely on some Public Key that identifies the trusted authority, be it in the form of the Public Key and Certificate representing a PKI or in the form of a Public Key and x509 Certificate used to secure the connection with the Trusted Repository. In either case, a compromise of this key will compromise the whole system.

In the shared key model the Key Server will hold a master key used to encrypt the Service keys in its storage. Authentication between services and the Key Server is based on a shared Secret that can be easily rotated, unless it has been compromised. Revocation is not necessary, as all that is needed is to remove the compromised Secret Key from the Key Server. However, generating a new shared Secret Key will require the entire enrollment process to be repeated, and if a Secret Key is compromised in the middle of a server session with other servers, all these sessions will need to be terminated, since it won't be obvious which sessions are genuine and which are bogus.

With either system a stateless service will have to contact the trusted authority, be it a PKI, Trusted Repository or Key Server in order to either get a session key or to check the revocation status. With both systems services that can keep state can cache the session key or the revocation checks until expiration of the keys or until the next validation interval expires so the required communication overhead between services and a central system is similar.

Shared Keys and Key Server Proposal

Public Key crypto is generally slow and complex. Given that the overhead of both systems looks similar, in terms of both the security of the trusted server and the communication requirements, we put forward the proposal of using a Shared Key system based on a Key Server.

Note that the service long term key stored in Key Server is needed for key derivation and may be used for authentication, however authentication can be deferred to existing components, for example password based authentication over an HTTPS connection would be sufficient to authenticate a Service to the Key Server. Other methods that would work as well would be Kerberos keytabs and a KDC for authentication, x509 User Certificates and so on. Basically, the authentication part can be abstracted away if needed. If authentication is performed through external means the long term key does not need to be shared with the service and can be maintained exclusively in the Key Server.

That said we will proceed to describe also an authentication method based on shared keys that will work for communication over pure HTTP (non encrypted) transport for completeness. We'll defer to the deployment strategy which method to use in preference.

Message Integrity and Confidentiality

Securing the message queue requires two distinct components:

  • Integrity or Signing and authentication of messages
  • Confidentiality or Encryption of the messages

In order to reduce the exposure to cryptanalysis we will play safe and propose to use separate keys for encryption and authentication, even though we will not use mechanisms susceptible to known attacks. For the same reason, and to limit replay attacks, we propose a scheme that uses different keys depending on the direction of the communication, i.e. the SEK pair for Svc.A -> Svc.B will not be the same as the one for Svc.B -> Svc.A.

Standards

The standard for providing message integrity is HMAC. For encryption the most respected algorithm is currently AES, a block cipher with a fixed 128-bit block size.

Because the current feeling is that encryption may not be necessary we will consider it optional. In order to avoid changing message formats this means it is more convenient to use an "encryption first, authentication later" approach, whereby the authentication step does not differ based on whether encryption is performed or not, rather the message being authenticated can be either plain text or encrypted.

The next step is sketching out how to apply encryption and authentication to the message keeping in mind the Horton Principle.

Message Format

The data interchange format en vogue in the project is JSON, so we will create a message format based on JSON syntax. The first thing we want to ensure is that authentication covers the whole message as well as the metadata tied to the message. This is important to avoid substitution attacks, where the metadata may be swapped out and replaced without affecting the signature. This means that Message and Metadata will be serialized objects contained in a simpler container.

Pseudo JSON notation:

MetaData = jsonutils.dumps({
    'source': <sender>,
    'destination': <receiver>,
    'timestamp': <python time.time()>,
    'encryption': <true | false>
})
Message = jsonutils.dumps(raw_msg)

_METADATA_KEY = 'oslo.secure.metadata'
_SIGNATURE_KEY = 'oslo.secure.hmac'

RPC_Message = {
    _VERSION_KEY: _RPC_ENVELOPE_VERSION,
    _METADATA_KEY: MetaData,
    _MESSAGE_KEY: Message,
    _SIGNATURE_KEY: Signature
}

Message Signature

The Signature is calculated over the concatenation of the version string and the buffers.

Version = null terminated string containing the version number
MetaData = serialized JSON Metadata
Message = serialized JSON Message

Signature = HMAC(SignKey, (Version || MetaData || Message))

We propose to use HMAC-SHA-256 by default as the authentication function as per RFC 6234.

NOTE: Particular care needs to be taken to make sure the RPC_Message obtained in input cannot be abused and that the rest of the pipeline uses exclusively what has been authenticated. For this reason the output of the validation function should be a separate structure that provides the deserialized Metadata and Message, and further components should not have access to the original RPC_Message. If the same format needs to be maintained, a new RPC_Message containing only the version and serialized message will be provided in output, rebuilt from the verified values.

Python's hmac and hashlib modules have all the code needed to implement this.

Message Encryption

Optionally the message may be encrypted, in this case the MetaData field 'encryption' will be set to True.

Nonces are particularly difficult to get right, message queues may involve multiple parties using the same keys when they act in a cluster, and there is a desire to allow stateless services as much as possible. For these reasons we propose to use AES-128-CBC with a random IV by default to encrypt the content. This requires the availability of a pseudo-random generator on the sender side; we do not expect this to be an issue in practice on the machines used in a typical OpenStack deployment.

Encryption:

Plain-Text = P1 || P2 || ... || PN
C0 = Random IV (128 bits)
for i in range(1, N + 1):
    Ci = ENC(EncKey, Pi ^ C(i-1))
Encrypted-Message = C0 || C1 || C2 || ... || CN

Decryption:

IV = C0
Cipher-Text = C1 || C2 || ... || CN
for i in range(1, N + 1):
    Pi = DEC(EncKey, Ci) ^ C(i-1)
Plain-Text = P1 || P2 || ... || PN

Various python crypto modules have all the code needed to implement this.

Client Authentication and Key Derivation

Although not mandatory to implement or use we propose an authentication and key retrieval scheme to request and transfer SEKs.

Authentication scheme

A simple authentication scheme is used to request a SEK. The request does not need to be encrypted, because none of the data sent is sensitive and all of it can be deduced from the activity that is going to be performed. Moreover, some of this data needs to be in the clear in order to identify the requesting service and look up the correct key to use to check the authentication.

We want to reduce the ability to mount replay attacks against the Key Server, so we will embed a timestamp that restricts the validity period of any given message.

In addition to timestamp and counter, the request needs to contain 3 names:

  • The name of the service making the request, which will be used to look up the Shared Key and authenticate the request.
  • The name of the sending service.
  • The name of the receiving service.

When receiving the request, the first operation must be authentication of the request; no other field should be considered until the request is authenticated with the Shared Key. Once the HMAC validates the request, the timestamp and counter MUST be checked for validity.

Usually either the sending or the receiving service name will match the name of the service making the request; if this is the case the server will simply proceed without further checks. However, if the requesting Service name matches neither the sending nor the receiving name, further Access Control needs to be performed. A list of services allowed to impersonate a service role will need to be provided to allow release of a SEK to a service that does not match either receiving or sending names. This may be legitimate for high availability cases when multiple copies of the service may impersonate the same identity. This kind of delegation is out of scope for our first implementation and will not be further discussed.

Pseudo JSON notation:

MetaData = jsonutils.dumps({
    'requestor': <requestor>,
    'sender': <sender>,
    'receiver': <receiver>,
    'timestamp': <timestamp>
})

KeyEx_Request = {
    'meta': MetaData,
    'hmac': Signature = HMAC(Key, MetaData)
}

NOTE: we do not use random values here as replies will be identical given identical input (or denied if the timestamp is too old), this means that a replay attack will give an attacker no advantage and it will allow stateless services to request the same key multiple times if needed to process multiple messages from the same sender. (A sender will probably never send exactly the same request as the timestamp will likely vary between them).

NOTE: If external authentication is used the Signature will be omitted.
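A minimal sketch of how a service could build such a request and how the Key Server could check it is shown below. The function names, the MAX_SKEW validity window, the lookup_key callback and the hex encoding of the signature are assumptions made for illustration; the field names follow the notation above.

```python
import hashlib
import hmac
import json
import time

MAX_SKEW = 300  # seconds a request stays valid; hypothetical value


def build_request(key, requestor, sender, receiver, now=None):
    """Serialize the metadata and sign it with the requestor's shared key."""
    metadata = json.dumps({
        'requestor': requestor,
        'sender': sender,
        'receiver': receiver,
        'timestamp': time.time() if now is None else now,
    })
    sig = hmac.new(key, metadata.encode(), hashlib.sha256).hexdigest()
    return {'meta': metadata, 'hmac': sig}


def check_request(lookup_key, request, now=None):
    """Authenticate first; only then look at timestamp and names.

    lookup_key is a hypothetical callback mapping a service name to its
    long term shared key.
    """
    metadata = request['meta']
    meta = json.loads(metadata)
    # The requestor name is read only to find the key to verify with.
    key = lookup_key(meta['requestor'])
    expected = hmac.new(key, metadata.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, request['hmac']):
        raise PermissionError('authentication failed')
    now = time.time() if now is None else now
    if abs(now - meta['timestamp']) > MAX_SKEW:
        raise PermissionError('timestamp outside validity window')
    # Delegation is out of scope: the requestor must be one of the parties.
    if meta['requestor'] not in (meta['sender'], meta['receiver']):
        raise PermissionError('requestor is neither sender nor receiver')
    return meta
```

Since the request carries no random values, sending the same request twice yields the same signature, which is what lets a stateless service ask for the same key repeatedly.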

Key Derivation

In order to avoid easy attacks on Keys and in order to be able to quickly expire keys, a key derivation scheme is used to generate the SEK.

In all cases, whether a shared key is used for client authentication or authentication is performed by external means (for example via x509 certificates over HTTPS), the Key Server will maintain (or create on the fly if missing) a long term Service Key (which is also the shared key in our authentication scheme) that is used to perform Key Derivation on the server's behalf. These Keys are stored reversibly encrypted with a Key Server master key.

In addition to per service keys, the Key Server generates a new random key every X minutes, where X is also the TTL of SEKs. The key server stores 2 or more previous random values to allow a service to retrieve older SEK values if needed; this allows the Key Server to operate in a stateless fashion without disrupting SEK distribution at random key change time. Note that the Random Key could be generated by using a pseudo random function primed with a key derived from the master key. This would allow scaling of the Key Server to multiple machines without the need for interaction in order to exchange Random Keys. The only requirement to allow this deployment is that clocks be kept reasonably in sync. This is already a requirement in general, as we want to quickly expire keys and messages to reduce replay attacks, therefore we do not see it as an obstacle in a typical OpenStack scenario.
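One possible reading of the "pseudo random function primed with a key derived from the master key" is sketched below. Everything specific here is an assumption: HMAC-SHA-256 as the pseudo random function, the derivation label, the TTL value, and the idea of indexing Random Keys by the time interval the timestamp falls in.

```python
import hashlib
import hmac
import struct

TTL = 15 * 60  # "X minutes" expressed in seconds; hypothetical value


def random_key_for(master_key, timestamp, ttl=TTL):
    """Return the Random Key valid at the given time.

    Every Key Server holding the same master key computes the same value,
    so no exchange between servers is needed, only synchronized clocks.
    """
    # Derive a dedicated key so the master key is not used directly.
    prf_key = hmac.new(master_key, b'random-key-generation',
                       hashlib.sha256).digest()
    interval = int(timestamp // ttl)   # index of the TTL interval
    return hmac.new(prf_key, struct.pack('>Q', interval),
                    hashlib.sha256).digest()
```

With this approach older Random Keys do not need to be stored at all, since random_key_for(master_key, t - TTL) recomputes the previous one on demand.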

Key derivation is performed using a standard Hash based Key Derivation Function (HKDF) as described in RFC 5869.

The extract function will be used with the Key Server Random Key in order to change the output of the key derivation function at regular intervals, thereby causing effective expiration of previously released keys as the Random Key changes over time.

The expand function is also given input parameters that make it generate different keys depending on which pair of services is involved in the process.

Key Derivation inputs:

Time.T = The time in the request
Svc.A = the sender service name
Svc.B = the receiver service name
Key.A = the sender long term key
Rnd.K = The Key Server Random Key valid at Time.T (might be an historic key if the Random Key has just been rotated)
Le = Length of Encryption Key (128 bits)
Ls = Length of Signing Key (128 bits)

Key derivation:

Pseudo-Random Key (PRK) = HKDF-Extract(Rnd.K, Key.A)
SEK = HKDF-Expand(PRK, Svc.A+'\x00'+Svc.B, Le+Ls)

The output of the expand function is an array of bytes of length 256 bits (Le+Ls), the first half will be used as the Encryption Key, the second half as the Signing Key.
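The derivation above can be sketched using only Python's hmac and hashlib modules, following the Extract and Expand steps of RFC 5869 with SHA-256. The choice of SHA-256 as the HKDF hash and the function names are assumptions; the text does not state which hash the HKDF uses.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_extract(salt, ikm):
    # PRK = HMAC-Hash(salt, IKM)
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk, info, length):
    # T(i) = HMAC-Hash(PRK, T(i-1) || info || i), output is T(1) || T(2) ...
    if length > 255 * HASH_LEN:
        raise ValueError('requested length too large')
    okm, block, counter = b'', b'', 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_sek(rnd_k, key_a, svc_a, svc_b, le=16, ls=16):
    """Return (EncKey, SignKey) for messages sent from svc_a to svc_b."""
    prk = hkdf_extract(rnd_k, key_a)
    info = svc_a.encode() + b'\x00' + svc_b.encode()
    okm = hkdf_expand(prk, info, le + ls)
    return okm[:le], okm[le:]   # first half encrypts, second half signs
```

The '\x00' separator keeps the info string unambiguous: without it the pairs ('ab', 'c') and ('a', 'bc') would produce the same keys. Swapping the two names also yields a different pair, which gives the direction-dependent keys described earlier.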

Key Exchange

The keys obtained by the Key Derivation step need to be sent back to the requester.

If the communication is happening over a secure transport like verified HTTPS, then it is possible to simply return the keys directly in the clear, however in case the above authentication scheme is used over a clear-text protocol like HTTP the keys need to be protected with encryption. The reply must also be authenticated in order to avoid substitution attacks.

We'll reuse an encryption and authentication scheme similar to the one described previously for securing the messages exchanged between the 2 parties.

Reply Format

Pseudo JSON notation:

MetaData = jsonutils.dumps({
    'source': <sender>,
    'destination': <receiver>,
    'expiration': <calculated as timestamp sent in the request + TTL>,
    'encryption': <true | false>
})

KeyEx_Reply = {
    'meta': MetaData,
    'sek': SEK,
    'hmac': Signature
}

Reply Signature

The Signature is calculated over all the data:

MetaData = serialized JSON Metadata
SEKStore = optionally encrypted buffer containing the Encryption and Signature pair as returned by the HKDF.

Signature = HMAC(Key, (MetaData || SEKStore))

We propose again to use HMAC-SHA-256 as the default authentication function as per RFC 6234.
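The reply construction and its verification on the requesting side might look as follows. This is a sketch under assumptions: base64 is used here only to carry the binary SEKStore inside JSON, the signature is hex encoded, and the function names are hypothetical. Encryption of the SEKStore is left out; the functions treat it as an opaque buffer.

```python
import base64
import hashlib
import hmac
import json


def build_reply(key, sender, receiver, request_timestamp, ttl, sek_store,
                encrypted=False):
    """Key Server side: sign MetaData || SEKStore with the requestor's key."""
    metadata = json.dumps({
        'source': sender,
        'destination': receiver,
        'expiration': request_timestamp + ttl,
        'encryption': encrypted,
    })
    sig = hmac.new(key, metadata.encode() + sek_store,
                   hashlib.sha256).hexdigest()
    return {
        'meta': metadata,
        'sek': base64.b64encode(sek_store).decode('ascii'),  # assumed encoding
        'hmac': sig,
    }


def open_reply(key, reply, now):
    """Requestor side: verify the signature before trusting any field."""
    sek_store = base64.b64decode(reply['sek'])
    expected = hmac.new(key, reply['meta'].encode() + sek_store,
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, reply['hmac']):
        raise ValueError('invalid reply signature')
    meta = json.loads(reply['meta'])
    if now >= meta['expiration']:
        raise ValueError('SEK has expired')
    return meta, sek_store
```

Covering the metadata with the signature prevents an attacker from pairing a valid SEKStore with a different source, destination or expiration.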

Reply Encryption

On untrusted transport the SEK will be encrypted; in this case the MetaData field 'encryption' will be set to True. We'll reuse the exact same scheme used for Message Encryption, with AES-128-CBC and a random IV.

RESTful API

As is customary, a RESTful API will be proposed to access the Key Server. A GET call will be used to obtain a SEK.

Request:

GET /keyserver/sek

{
    'meta': MetaData,
    'hmac': Signature
}

Reply:

200 OK

{
    'meta': MetaData,
    'sek': SEKStore,
    'hmac': Signature
}

Error codes

  • 200 OK - This status code is returned in response to a successful GET operation
  • 401 Unauthorized - This status code is returned when either authentication has not been performed, or the authentication fails.
  • 403 Forbidden - This status code is returned when the requester field does not match either the sender or the receiver fields.
  • 500 Internal Server Error - This status code is returned when an unexpected error has occurred in the server implementation.
  • 501 Not Implemented - This status code is returned when the implementation is unable to fulfill the request because it is incapable of implementing the entire API as specified.
  • 503 Service Unavailable - This status code is returned when the server is unable to communicate with a backend service (database, memcache, ...)

Operation Considerations

Key Server lookups

Worst case 2 lookups per message (1 for sender and 1 for receiver) TODO: Expand

Fanout messages

Signing fanout messages with symmetric keys is problematic, however only 3 cases use fanout so far:

  • nova network, but I have been assured this case will go away
  • nova compute to all schedulers, but we can use a 'scheduler' key that all schedulers have access to
  • nova scheduler to all compute, this is problematic, but the message is a broadcast request, not a command, so we could simply not sign in this case

If leaving these messages unsigned is not acceptable, we might need to use a onetime-only key scheme, but the lookups to the key server would be quite numerous (one per compute node).

Implementation

Implementation would be done in phases

Phase 1

Add basic crypto functions and new message envelope building functions. Change code to use new envelope by default, with signing optional.

Phase 2

Add support for fetching keys, but fall back to unsigned messages if the lookup fails.

Phase 3

Add Key Server with basic per-host keys only

Phase 4

Add support for shared per service-type keys. Add Access Control checks to limit access to these keys.

Phase 5

Turn signing on as required by default