Difference between revisions of "Security/Guidelines/logging guidelines"

Latest revision as of 19:45, 13 June 2014

Overview

OpenStack needs logging and notification security guidelines and best practices to prevent accidental leakage of confidential information to unauthorized users. This wiki page is an attempt to build OpenStack community consensus on what those standards should be and how they should be implemented.

Status

This is currently in review by the OpenStack Security Group (OSSG).

Problem Definition

Difficulty Identifying Confidential Data

No standard, structured logging and notification data format exists across OpenStack projects that would let operators unambiguously identify and filter out confidential data that should not be exposed to certain users. A simple architectural diagram illustrates the problem:

[Figure: openstack logging security.png]

Note: For brevity, this document will use "logs" in place of "logs and notifications".

This diagram represents multiple OpenStack services generating logs which may be formatted differently and may hold different types of confidential data (data that an operator would not want a user to access). There may be an optional operator-created filtering and aggregation system and some method of exposing the sanitized logs to users or operators. The delivery method isn't strictly relevant to this discussion but the ability to unambiguously filter confidential data out of logs is very important.

Some non-exhaustive examples of accidental credential disclosure to unauthorized users within OpenStack:

(Ceilometer) Log contains DB password in plain text (CVE-2013-6384) [OSSA 2013-031]
(Keystone) Plaintext passwords are logged
(Nova) Clear text password has been print in log by some API call

The problem is exacerbated by the CI/CD nature of OpenStack. Code is merged daily in many projects, and some of it may introduce new log entries that operators running the service need to examine and filter. Operators who update frequently may spend more time and effort on this process. Currently, many operators are in a reactive mode: they must actively monitor changes in log data and act quickly to head off potential leaks. This does not scale for operators, because the effort grows with the number of OpenStack services they run.

Use of Log Level for Security

In some cases, OpenStack logging security issues stem from relying on log levels to keep confidential data out of logs. For example, setting the log level to DEBUG or INFO has caused plain text credentials to be logged (sometimes in a location accessible to all users). This forces operators to choose between risking confidential data leaks and having better performance/debug data in their logs. Again, this is not an optimal design for operators.

Solution Space

Unambiguously Identifying Confidential Log Data

Proposed rules:

Identify confidential data in the OpenStack code to provide administrators a single "tag" to sanitize data in back end log filtering systems
Purge all log data, in OpenStack code, identified as confidential or sensitive by OpenStack

OpenStack community feedback to date explicitly rejects the concept of an Oslo Config-like setting to disable security features (to allow some sensitive data to be logged).

Discussion point: One piece of feedback is to enable an AUDIT level log setting which would enable sensitive data logging. Community thoughts on this?

OpenStack Sensitive/Confidential Data

(should not be exposed to users)

Data Type	Status	Description
Raw Exception Data	Review In Progress	Do not log unfiltered exceptions. There have been cases where database exceptions logged credential information in plaintext. Different exception types may log different data so this is generally a practice to avoid.
Credentials	Review In Progress	Never log credential information such as login, password, auth tokens, etc.
Personally Identifiable Information	Review In Progress	Do not log PII such as customer address, phone number, etc.
Keys	Review In Progress	Never log private or symmetric keys.
Server/Service State	Review In Progress	Do not allow users to access non-relevant service state that could be used as an attack vector. Examples: PRNG state/seed, file paths, code file names, service account names, etc.
URI and Request Object	Review In Progress	Do not allow logging of URIs or full request objects which could potentially expose credential or data location information.

Note: This list is meant to be a living list that adapts to new OpenStack issues/architecture.

Reminder: Always ensure that users may only access data that is associated with their tenant/project.
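
To illustrate the "Raw Exception Data" row above, here is a minimal sketch that uses only Python's standard logging module rather than Oslo Log. The DatabaseError class, the connect and safe_connect functions, and the logger name are hypothetical and exist only for this example. The idea is to log an abstracted message (here, just the exception type) instead of the raw exception text, which may embed credentials.

```python
import logging

LOG = logging.getLogger("example.service")


class DatabaseError(Exception):
    """Hypothetical driver exception whose text may embed credentials."""


def connect(url):
    # Simulates a driver that echoes the full connection URL on failure.
    raise DatabaseError("could not connect to %s" % url)


def safe_connect(url):
    """Log an abstracted message instead of the raw exception text."""
    try:
        connect(url)
    except DatabaseError as exc:
        # Record only the exception type; str(exc) is never logged.
        LOG.error("Database connection failed (%s)", type(exc).__name__)
        return False
    return True
```

A real service would likely map specific exception types to specific safe messages; the sketch only shows that the unfiltered exception text never reaches the log call.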

Possible Implementation Options

There is one area of interest where the author of this wiki will admit lack of knowledge: The Cloud Auditing Data Federation (CADF) seems to be working in a similar space. There is a PyCADF project in OpenStack that may be of interest for further investigation.

Trace Class

In Project Solum, a TraceData class was created to work with Oslo Log to perform two main tasks:

Enable identification of confidential/sensitive data in code
Allow trace data to be persistent and potentially built up and used in each following log call

Code: https://github.com/stackforge/solum/blob/master/solum/common/trace_data.py

Unit Test/Usage example: https://github.com/stackforge/solum/blob/master/solum/tests/common/test_trace_data.py

This code will accept an Oslo Context class and fill itself in with the data to prevent unexpected interactions with that common library. Oslo Log may consume this trace data in the same fashion that a context class can be used.

One potential concern with this approach is that the Trace class duplicates some of the Oslo Context class data. Other approaches are offered below.

Native Oslo Log Support

This is the preferred implementation path based on OpenStack community feedback to date.

Extend Oslo Log itself to incorporate the capability to flag data as sensitive (first class citizen feature in Oslo Log). This may take a little more work from the Oslo team to architect this type of solution.

One idea is a mechanism of the form LOG.debug("my log message", confidential=True) for identifying sensitive data. This would require splitting log entries into "public" and "private" log calls (i.e. back-to-back logging calls in some cases).
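
The confidential=True idea could be prototyped outside Oslo Log roughly as follows. This is only a sketch built on Python's standard logging module, not a proposed Oslo Log API: the ConfidentialAdapter and DropConfidential classes, the build_user_handler helper, and the logger name are all hypothetical. The adapter moves the flag into the record's extra data, and a filter on the user-facing handler drops flagged records regardless of log level.

```python
import logging


class ConfidentialAdapter(logging.LoggerAdapter):
    """Hypothetical adapter that accepts a confidential=True keyword."""

    def process(self, msg, kwargs):
        # Remove the custom keyword before the call reaches the stdlib
        # logger, which would reject an unknown argument.
        confidential = kwargs.pop("confidential", False)
        extra = dict(kwargs.get("extra") or {})
        extra["confidential"] = confidential
        kwargs["extra"] = extra
        return msg, kwargs


class DropConfidential(logging.Filter):
    """Drop every record flagged as confidential, at any log level."""

    def filter(self, record):
        return not getattr(record, "confidential", False)


def build_user_handler(stream):
    """Return a handler for user-visible logs that hides flagged records."""
    handler = logging.StreamHandler(stream)
    handler.addFilter(DropConfidential())
    return handler


LOG = ConfidentialAdapter(logging.getLogger("example.native"), {})
```

With this in place, a call such as LOG.debug("New key value: %s", key_value, confidential=True) would still reach an operator-only handler that has no such filter attached, while the user-facing handler would discard it.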

Oslo Log Extra Structure

Oslo Log provides an "extra" field which may hold arbitrary data. One option would be to manually create a "private" key which holds a dictionary or list of confidential data that the operator may use to filter log data.

Example: LOG.debug("Random non-confidential log data", extra={"private": {"value": confidential_data}})

The benefit of this path is that no Oslo code changes are needed. The drawback is that structuring every Oslo Log call correctly is tedious and error-prone.
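
The filtering side of the "extra"/"private" convention might look like the following sketch. It relies only on the standard logging module's behaviour of copying keys from extra onto the log record; the JsonFormatter class and its include_private parameter are hypothetical names chosen for this example and are not part of Oslo Log. An operator-facing handler would include the "private" dictionary, while a user-facing handler would omit it.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render a record as JSON, optionally including the "private" dict."""

    def __init__(self, include_private):
        super().__init__()
        self.include_private = include_private

    def format(self, record):
        entry = {
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Keys passed via extra={...} become attributes of the record.
        private = getattr(record, "private", None)
        if self.include_private and private is not None:
            entry["private"] = private
        return json.dumps(entry, sort_keys=True)
```

Because all confidential values sit under one key, a back end filter has a single name to look for, which is the point of the convention; the per-call structuring burden described above remains.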

Log Level Usage Recommendations

Proposed rule: Do not use log level to filter out confidential data (such as passwords).

This post has a great definition of log level usage for consideration by the community: When To Use Log Level Warn vs Error (http://stackoverflow.com/questions/2031163/when-to-use-log-level-warn-vs-error)

In order to make this more specific for OpenStack and to obtain community input, here is a table with recommendations for log level:

Log Level	Intended Usage
Critical	No automated way for a service to recover and the service must be shut down to prevent (further) data loss or corruption.
Error	No automated way to recover and requires administrator/operator or user intervention to recover.
Warning	Some minor error happened but it is likely a recoverable problem with time or retries. Usually no manual intervention or admin paging is needed.
Info	Shows service version, start/stop and other indications that give a deeper understanding of service operation. Typically the lowest log level used in normally functioning systems.
Debug	Dumps all log data that does not compromise security or cause confidential data leaks. This is typically not used in normal operation as the logs will be enormous.
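
To make the proposed rule concrete, the sketch below applies masking independently of log level, so that raising or lowering the level never changes whether a secret is exposed. It uses only Python's standard logging and re modules; the MaskSecrets class and the _SECRET pattern (which covers just password=... and token=... pairs) are hypothetical and deliberately simplistic.

```python
import logging
import re

# Hypothetical pattern: matches password=<value> or token=<value> pairs.
_SECRET = re.compile(r"(?i)\b(password|token)=(\S+)")


class MaskSecrets(logging.Filter):
    """Mask secret values in every record, whatever its level."""

    def filter(self, record):
        # Render the message once, mask it, and drop the now-unused args.
        record.msg = _SECRET.sub(r"\1=***", record.getMessage())
        record.args = ()
        return True
```

Attaching the filter to a handler (handler.addFilter(MaskSecrets())) treats DEBUG and ERROR records identically. Pattern matching of this kind can only catch formats it already knows about, which is why the sections above focus on identifying confidential data at the point where it is logged.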

Socializing Recommendations

Potential ways to socialize security recommendations:

Style Guidelines: Update OpenStack Style Guidelines (http://docs.openstack.org/developer/hacking/) with the recommendations agreed upon by the OpenStack community
Code Reviews: Use code reviews to link to OSSG recommendations: https://wiki.openstack.org/wiki/Security/Guidelines. Use the anchors to specify the exact topic.
Convince Leadership: Socialize ideology and benefits to PTLs and TC. Challenge PTLs to become early adopters and set a good example for others.
Engage Oslo Team: Put recommendations into common OpenStack libraries such as Oslo

@@ Line 1: / Line 1: @@
-=== paulmo (#openstack-security IRC channel) is currently editing this page; please check back later for a finished wiki. --- 2014/05/29 ===
 == Overview ==
-OpenStack needs logging and notification security guidelines and best practices to prevent accidental leakage of confidential information to unauthorized users.  This wiki is an in progress attempt to gain the OpenStack community consensus on what those standards should be and how they should be implemented.
+OpenStack needs logging and notification security guidelines and best practices to prevent accidental leakage of confidential information to unauthorized users.  This wiki is an attempt to gain the OpenStack community consensus on what those standards should be and how they should be implemented.
-Disclaimer: This is not OpenStack Security Group (OSSG) approved at this point.
+== Status ==
+This is currently in review by the OpenStack Security Group (OSSG).
 == Problem Definition ==
-There is no standard/structured logging and notification data format across OpenStack projects which would enable OpenStack operators to unambiguously identify and filter out confidential data which should not be exposed to certain users.  Let's take a look at a simple architectural diagram:
+=== Difficulty Identifying Confidential Data ===
+There is no standard/structured logging and notification data format across OpenStack projects which would enable OpenStack operators to unambiguously identify and filter out confidential data which should not be exposed to certain users.  Simple architectural diagram example:
 [[File:openstack logging security.png]]
+''Note: For brevity, this document will use "logs" in place of "logs and notifications".''
-There are many examples of accidental credential disclosure to unauthorized users within OpenStack.  Some non-exhaustive examples:
+This diagram represents multiple OpenStack services generating logs which may be formatted differently and may hold different types of confidential data (data that an operator would not want a user to access).  There may be an optional operator-created filtering and aggregation system and some method of exposing the sanitized logs to users or operators.  The delivery method isn't strictly relevant to this discussion but the ability to unambiguously filter confidential data out of logs is very important.
+Some non-exhaustive examples of accidental credential disclosure to unauthorized users within OpenStack:
 * [https://bugs.launchpad.net/ceilometer/+bug/1244476 (Ceilometer) Log contains DB password in plain text] (CVE-2013-6384) [OSSA 2013-031]
 * [https://bugs.launchpad.net/horizon/+bug/1004114 (Keystone) Plaintext passwords are logged]
 * [https://bugs.launchpad.net/nova/+bug/1231263 (Nova) Clear text password has been print in log by some API call]
+The problem is exacerbated by the CI/CD nature of OpenStack in general.  Code is being merged daily in many projects and some of this code may introduce new logging entries which need to be examined and filtered by operators running the service.  Operators who update frequently may spend more time and effort on this process.  Currently, many operators are in a reactive mode when addressing log leaks as they must actively monitor log data changes and act quickly to head off potential data leaks as soon as possible.  This is obviously not an optimal solution for OpenStack operators especially when the difficulty increases with the number of OpenStack services run.
+=== Use of Log Level for Security ===
+In some cases, OpenStack security issues around logging are due to the use of log levels to filter out confidential data.  To provide a more specific example, setting log level to DEBUG or INFO has caused plain text credentials to be logged (sometimes in a user globally accessible location).  This causes operators to make a choice between potential confidential data leaks and better performance/debug data in logs.  Again, this is not an optimal design for operators.
+== Solution Space ==
+=== Unambiguously Identifying Confidential Log Data ===
+'''Proposed rules''':
+* Identify confidential data in the OpenStack code to provide administrators a single "tag" to sanitize data in back end log filtering systems
+* Purge all log data, in OpenStack code, identified as confidential or sensitive by OpenStack
+OpenStack community feedback to date explicitly rejects the concept of an Oslo Config-like setting to disable security features (to allow some sensitive data to be logged).
----------------Old Wiki Below ----------------------
+Discussion point: One piece of feedback is to enable an AUDIT level log setting which would enable sensitive data logging.  Community thoughts on this?
-In order to prevent accidental leakage of confidential information to unauthorized users, there are some guidelines to assist in isolating this confidential data for easy/accurate filtering on the back end log management tools.  Why is this important?  There have been several OpenStack security issues around logging in the past:
+=== OpenStack Sensitive/Confidential Data ===
-* [https://bugs.launchpad.net/ceilometer/+bug/1244476 (Ceilometer) Log contains DB password in plain text] (CVE-2013-6384) [OSSA 2013-031]
+''(should not be exposed to users)''
-* [https://bugs.launchpad.net/horizon/+bug/1004114 (Keystone) Plaintext passwords are logged]
+{| class="wikitable sortable"
-* [https://bugs.launchpad.net/nova/+bug/1231263 (Nova) Clear text password has been print in log by some API call]
+|-
-etc...
+! Data Type !! Status !! Description
+|-
+| Raw Exception Data || Review In Progress || Do not log unfiltered exceptions.  There have been cases where database exceptions logged credential information in plaintext.  Different exception types may log different data so this is generally a practice to avoid.
+|-
+| Credentials || Review In Progress || Never log credential information such as login, password, auth tokens, etc.
+|-
+| Personally Identifiable Information || Review In Progress || Do not log PII information such as customer address, phone number, etc.
+|-
+| Keys || Review In Progress || Never log private or symmetric keys.
+|-
+| Server/Service State || Review In Progress || Do not allow users to access non-relevant service state that could be used as an attack vector.  Examples: PRNG state/seed, file paths, code file names, service account names, etc.
+|-
+| URI and Request Object || Review In Progress || Do not allow logging of URIs or full request objects which could potentially expose credential or data location information.
+|}
+''Note: This list is meant to be a living list that adapts to new OpenStack issues/architecture.''
+Reminder: Always ensure that users may only access data that is associated with their tenant/project.
+=== Possible Implementation Options ===
+There is one area of interest where the author of this wiki will admit lack of knowledge: The [http://www.dmtf.org/standards/cadf Cloud Auditing Data Federation (CADF)] seems to be working in a similar space.  There is a [http://docs.openstack.org/developer/pycadf/ PyCADF] project in OpenStack that may be of interest for further investigation.
+==== Trace Class ====
+In Project Solum, a TraceData class was created to work with Oslo Log to perform two main tasks:
+* Enable identification of confidential/sensitive data in code
+* Allow trace data to be persistent and potentially built up and used in each following log call
+Code: https://github.com/stackforge/solum/blob/master/solum/common/trace_data.py
+Unit Test/Usage example: https://github.com/stackforge/solum/blob/master/solum/tests/common/test_trace_data.py
+This code will accept an Oslo Context class and fill itself in with the data to prevent unexpected interactions with that common library.  Oslo Log may consume this trace data in the same fashion that a context class can be used.
+One potential concern with this approach is that the Trace class duplicates some of the Oslo Context class data. Other approaches are offered below.
+==== Native Oslo Log Support ====
+This is the preferred implementation path based on OpenStack community feedback to date.
-Logs should have a format that enables grouping of confidential data especially when logging data such as:
+Extend Oslo Log itself to incorporate the capability to flag data as sensitive (first class citizen feature in Oslo Log).  This may take a little more work from the Oslo team to architect this type of solution.
-* '''Exceptions:''' Unless the developer is sure that an exception will never contain confidential information, exceptions should be identified as confidential.  This has historically been especially problematic with database exceptions which may contain real field data.
+One thought is that there might be a: LOG.debug("my log message", confidential=True) type mechanism for identifying sensitive data.  This would mandate breaking up log entries into "public" and "private" log calls (i.e. back to back logging calls in some cases).
-** Recommend parsing the specific exception or error and providing an abstracted/safe version back to the user
-* '''Passwords:''' Never log plain text passwords
-* '''Private Keys''': Never log plain text private keys
-* '''PII:''' Minimize Personally Identifiable Information (PII) logging where possible
-* '''Local Server State:''' Avoid logging local server state which may provide hints to attackers (examples: file paths, code file names, user account names, PRNG state)
-* '''Tenant/Project ID Checking:''' If a user identifier (tenant/project ID) is not present in the log record or does not match the current authenticated user, do not show this log data to the user
-* '''Log Insecure Configurations:''' If a configuration option causes the system to enter a potentially less secure state, log a message to this effect for operators to see
-OpenStack's Oslo Log is capable of creating formatted logs with a section for confidential data.  The following example contains two pieces of variable data: key_name which is not confidential data (and will equal 'ssh') and key_value which is a confidential key that should not be visible to anyone but admins/operators.
+==== Oslo Log Extra Structure ====
+Oslo log provides an "extra" field which may hold arbitrary data.  One option would be to manually create a "private" key which holds a JSON dictionary/list of confidential data that the operator may use to filter log data.
-''Note: This is a contrived example for simplicity.  If the key_value is a public ssh key, it probably isn't critical to hide it in the logs from the authorized user that it belongs to.  If the key_value is a private ssh key, it shouldn't be logged to begin with.''
+Example:
+LOG.debug("Random non-confidential log data", extra={"private": {"value": confidential_data}})
-'''Bad Example:'''
+The benefit to this path is that there are no Oslo code changes needed.  The problem is that this is a very tedious and error prone process to properly structure each Oslo Log call.
-LOG.debug("User set %s key to value %s" % [key_name, key_value])
+=== Log Level Usage Recommendations ===
+'''Proposed rule''': Do not use log level to filter out confidential data (such as passwords, etc).
-'''Revised/Good Example:'''
+This post has a great definition of log level usage for consideration by the community:
+[http://stackoverflow.com/questions/2031163/when-to-use-log-level-warn-vs-error When To Use Log Level Warn vs Error]
-LOG.debug("User set %s key" % [key_name], extra={private={value=key_value}})
+In order to make this more specific for OpenStack and to obtain community input, here is a table with recommendations for log level:
+{| class="wikitable sortable"
+|-
+! Log Level !! Intended Usage
+|-
+| Critical || No automated way for a service to recover and the service must be shut down to prevent (further) data loss or corruption.
+|-
+| Error || No automated way to recover and requires administrator/operator or user intervention to recover.
+|-
+| Warning || Some minor error happened but it is likely a recoverable problem with time or retries.  Usually no manual intervention or admin paging is needed.
+|-
+| Info || Shows service version, start/stop and other indications that give a deeper understanding of service operation.  Typically the lowest log level used in normally functioning systems.
+|-
+| Debug || Dumps all log data that does not compromise security or cause confidential data leaks.  This is typically not used in normal operation as the logs will be enormous.
+|}
-Note that the extra->private structure is used to hold all confidential data within logs so that it may be filtered out later before a user views logs.  In this example, the key value is moved to a 'private' dictionary which makes filtering out confidential data from logs easier as there will be a single keyword to locate in log entries if these guidelines are followed.  An authenticated user may see that an ssh key has been changed but an operator may see the actual ssh key value in the logs.
-'''Log level definitions:'''
+== Socializing Recommendations ==
-http://stackoverflow.com/questions/2031163/when-to-use-log-level-warn-vs-error
+Potential ways to socialize security recommendations:
-This is a nice writeup about when to use each log level.  Here is a brief description:
+* '''Style Guidelines''': Update OpenStack Style Guidelines (http://docs.openstack.org/developer/hacking/) with the OpenStack agreed upon recommendations
-* '''Debug''': Shows everything and is likely not suitable for normal production operation due to the sheer size of logs generated
+* '''Code Reviews''': Use code reviews to link to OSSG recommendations: https://wiki.openstack.org/wiki/Security/Guidelines.  Use the anchors to specify the exact topic.
-* '''Info''': Usually indicates successful service start/stop, versions and such non-error related data
+* '''Convince Leadership''': Socialize ideology and benefits to PTLs and TC.  Challenge PTLs to become early adopters and set a good example for others.
-* '''Warning''': Indicates that there might be a systemic issue; potential predictive failure notice
+* '''Engage Oslo Team''': Put recommendations into common OpenStack libraries such as Oslo
-* '''Error''': An error has occurred and an administrator should research the event
-* '''Critical''': An error has occurred and the system might be unstable; immediately get administrator assistance