Security/Guidelines/logging guidelines

1 paulmo (#openstack-security IRC channel) is currently editing this page; please check back later for a finished wiki. --- 2014/05/29
2 Overview
3 Status
4 Problem Definition
- 4.1 Difficulty Identifying Confidential Data
- 4.2 Use of Log Level for Security
5 Solution Space
- 5.1 Unambiguously Identifying Confidential Log Data
- 5.2 Log Level Usage Recommendations
6 Socializing Recommendations

Overview

OpenStack needs logging and notification security guidelines and best practices to prevent accidental leakage of confidential information to unauthorized users. This wiki page is an attempt to build OpenStack community consensus on what those standards should be and how they should be implemented.

Status

This page has not yet been approved by the OpenStack Security Group (OSSG).

Problem Definition

Difficulty Identifying Confidential Data

There is no standard, structured logging and notification data format across OpenStack projects that would enable operators to unambiguously identify and filter out confidential data that should not be exposed to certain users. Simple architectural diagram example:

Note: For brevity, this document will use "logs" in place of "logs and notifications".

This diagram represents multiple OpenStack services generating logs which may be formatted differently and may hold different types of confidential data (data that an operator would not want a user to access). There may be an optional operator-created filtering and aggregation system and some method of exposing the sanitized logs to users or operators. The delivery method isn't strictly relevant to this discussion but the ability to unambiguously filter confidential data out of logs is very important.

Some non-exhaustive examples of accidental credential disclosure to unauthorized users within OpenStack:

(Ceilometer) Log contains DB password in plain text (CVE-2013-6384) [OSSA 2013-031]
(Keystone) Plaintext passwords are logged
(Nova) Clear text password has been print in log by some API call

The problem is exacerbated by the CI/CD nature of OpenStack. Code is merged daily in many projects, and some of it may introduce new log entries that operators running the service must examine and filter. Operators who update frequently may spend more time and effort on this process. Currently, many operators are in a reactive mode: they must actively monitor changes in log data and act quickly to head off potential leaks. This does not scale well for operators, because the difficulty grows with the number of OpenStack services they run.

Use of Log Level for Security

In some cases, OpenStack security issues around logging stem from relying on log levels to filter out confidential data. For example, setting the log level to DEBUG or INFO has caused plain text credentials to be logged (sometimes in a location accessible to all users). This forces operators to choose between risking confidential data leaks and having better performance and debug data in their logs. Again, this is not an optimal design for operators.

Solution Space

Unambiguously Identifying Confidential Log Data

TODO

Log Level Usage Recommendations

Proposed rule: Do not use log level to filter out confidential data (such as passwords). Confidential data must be kept out of the readable log message at every level, including DEBUG.
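One way to follow this rule is to redact secrets before a record is emitted, independent of its level. The following is a minimal sketch using only the Python standard library; the names mask_secrets and RedactingFilter and the list of secret-looking keys are assumptions made for illustration, not an existing OpenStack or Oslo API.

```python
import logging
import re

# Hypothetical pattern: key names whose values are treated as secrets.
_SECRET_RE = re.compile(r"(?i)\b(password|passwd|secret|token)(\s*[=:]\s*)(\S+)")


def mask_secrets(text):
    """Replace the value of any secret-looking key=value pair with '***'."""
    return _SECRET_RE.sub(r"\1\2***", text)


class RedactingFilter(logging.Filter):
    """Mask secrets in every record, whatever its log level."""

    def filter(self, record):
        # Format the message first so secrets passed as arguments are caught too.
        record.msg = mask_secrets(record.getMessage())
        record.args = ()
        return True
```

Attaching the filter to a handler (handler.addFilter(RedactingFilter())) applies the same redaction at DEBUG as at ERROR, so lowering the log level does not change what confidential data is exposed. Pattern matching of this kind is a safety net and may miss secrets that are not labeled with a recognizable key.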

This post has a good definition of log level usage for consideration by the community: When To Use Log Level Warn vs Error (http://stackoverflow.com/questions/2031163/when-to-use-log-level-warn-vs-error)

In order to make this more specific for OpenStack and to obtain community input, here is a table with recommendations for log level:

Log Level	Intended Usage
Critical	No automated way for a service to recover and the service must be shut down to prevent further data loss.
Error	No automated way to recover and requires administrator/operator or user intervention to recover.
Warning	Some minor error happened but it is likely a recoverable problem with retries. Usually no manual intervention or admin paging is needed.
Info	Shows service version, start/stop and other indications that give a deeper understanding of service operation. Typically the lowest log level used in normally functioning systems.
Debug	Dumps all log data that does not compromise security or cause confidential data leaks. This is typically not used in normal operation as the logs will be enormous.
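As a rough illustration of the table, the sketch below logs a retried operation at Warning while retries remain, at Error once they are exhausted, and at Debug on success. The function call_with_retries and the logger name are hypothetical, and only the exception type is logged because exception messages may carry confidential data.

```python
import logging

LOG = logging.getLogger("example.service")  # hypothetical logger name


def call_with_retries(operation, attempts=3):
    """Run operation, choosing a log level for each outcome per the table."""
    for attempt in range(1, attempts + 1):
        try:
            result = operation()
        except OSError as exc:
            if attempt < attempts:
                # Likely recoverable with a retry: Warning.
                LOG.warning("Attempt %d/%d failed (%s); retrying",
                            attempt, attempts, type(exc).__name__)
            else:
                # Retries exhausted, operator intervention needed: Error.
                LOG.error("Operation failed after %d attempts (%s)",
                          attempts, type(exc).__name__)
                raise
        else:
            # Detail that is only useful when troubleshooting: Debug.
            LOG.debug("Operation succeeded on attempt %d", attempt)
            return result
```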

Socializing Recommendations

Recommended ways to socialize security recommendations:

Style Guidelines: Update OpenStack Style Guidelines (http://docs.openstack.org/developer/hacking/) with the OpenStack agreed upon recommendations
Code Reviews: Use code reviews to link to OSSG recommendations: https://wiki.openstack.org/wiki/Security/Guidelines. Use the anchors to specify the exact topic.
Convince Leadership: Socialize ideology and benefits to PTLs and TC.
Engage Oslo Team: Put recommendations into common OpenStack libraries such as Oslo

Old Wiki Below ----------------------

To prevent accidental leakage of confidential information to unauthorized users, the following guidelines help isolate confidential data so that back-end log management tools can filter it easily and accurately. Why is this important? There have been several OpenStack security issues around logging in the past:

(Ceilometer) Log contains DB password in plain text (CVE-2013-6384) [OSSA 2013-031]
(Keystone) Plaintext passwords are logged
(Nova) Clear text password has been print in log by some API call

etc...

Logs should have a format that enables grouping of confidential data especially when logging data such as:

Exceptions: Unless the developer is sure that an exception will never contain confidential information, exceptions should be identified as confidential. This has historically been especially problematic with database exceptions which may contain real field data.
- Recommend parsing the specific exception or error and providing an abstracted/safe version back to the user
Passwords: Never log plain text passwords
Private Keys: Never log plain text private keys
PII: Minimize Personally Identifiable Information (PII) logging where possible
Local Server State: Avoid logging local server state which may provide hints to attackers (examples: file paths, code file names, user account names, PRNG state)
Tenant/Project ID Checking: If a user identifier (tenant/project ID) is not present in the log record or does not match the current authenticated user, do not show this log data to the user
Log Insecure Configurations: If a configuration option causes the system to enter a potentially less secure state, log a message to this effect for operators to see
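The tenant/project ID check above can be sketched as a simple filter over structured log records. This is a minimal illustration that assumes each record is a dictionary with an optional "project_id" key; the function name visible_to_user and the record layout are hypothetical.

```python
def visible_to_user(records, user_project_id):
    """Return only the records that belong to the authenticated user's project.

    A record is hidden when it has no project_id or when its project_id
    differs from the user's.
    """
    return [
        record for record in records
        if record.get("project_id") is not None
        and record.get("project_id") == user_project_id
    ]
```

Treating a missing identifier as "do not show" keeps the default safe: a record is exposed only when ownership can be positively established.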

OpenStack's Oslo Log is capable of creating formatted logs with a section for confidential data. The following example contains two pieces of variable data: key_name which is not confidential data (and will equal 'ssh') and key_value which is a confidential key that should not be visible to anyone but admins/operators.

Note: This is a contrived example for simplicity. If the key_value is a public ssh key, it probably isn't critical to hide it in the logs from the authorized user that it belongs to. If the key_value is a private ssh key, it shouldn't be logged to begin with.

Bad Example:

LOG.debug("User set %s key to value %s", key_name, key_value)

Revised/Good Example:

LOG.debug("User set %s key", key_name, extra={"private": {"value": key_value}})

Note that the extra->private structure is used to hold all confidential data within logs so that it may be filtered out later before a user views logs. In this example, the key value is moved to a 'private' dictionary which makes filtering out confidential data from logs easier as there will be a single keyword to locate in log entries if these guidelines are followed. An authenticated user may see that an ssh key has been changed but an operator may see the actual ssh key value in the logs.
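To show how a single 'private' keyword simplifies filtering, here is a minimal sketch built on the standard library logging module rather than on Oslo Log's actual output format. JsonFormatter and user_view are hypothetical names: the formatter writes confidential data under a "private" key, and user_view drops that key before a log line is shown to a user.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Serialize a record as JSON, keeping confidential data under 'private'."""

    def format(self, record):
        entry = {"level": record.levelname, "message": record.getMessage()}
        # logging copies the keys of extra={...} onto the record as attributes.
        private = getattr(record, "private", None)
        if private is not None:
            entry["private"] = private
        return json.dumps(entry, sort_keys=True)


def user_view(json_line):
    """Drop the 'private' section before a log line is shown to a user."""
    entry = json.loads(json_line)
    entry.pop("private", None)
    return json.dumps(entry, sort_keys=True)
```

Operators would read the full line, while the user-facing path passes every line through user_view, so the filtering logic does not need to know which individual fields are confidential.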

Log level definitions: http://stackoverflow.com/questions/2031163/when-to-use-log-level-warn-vs-error is a good writeup about when to use each log level. Here is a brief description:

Debug: Shows everything and is likely not suitable for normal production operation due to the sheer size of logs generated
Info: Usually indicates successful service start/stop, versions and such non-error related data
Warning: Indicates that there might be a systemic issue; potential predictive failure notice
Error: An error has occurred and an administrator should research the event
Critical: An error has occurred and the system might be unstable; immediately get administrator assistance