Jump to: navigation, search

LoggingStandards

Revision as of 12:44, 27 January 2014 by Sdague (talk | contribs)

Basic Principles

The point of this document is to bring together a set of common guidelines that we can all agree upon, to make logging across all the OpenStack services as consistent as possible.

Some common principles:

INFO

Every Inbound WSGI request should be logged Exactly Once

Every inbound WSGI should get logged exactly once. It should be logged with enough information that the operation the user was attempting can be reconstructed.

Current Issues:

  • Keystone is logging all requests twice (once from keystone, once from eventlet.wsgi)
  • Nova is logging all ec2 requests twice (once from nova, once from eventlet.wsgi)

INFO log level is for Operators

Everything at INFO level or above should be relevant to Operators, which means it should describe in abstract terms what is going on on. Referencing internal function names is not useful:

2014-01-26 15:35:07.158 INFO nova.virt.libvirt.firewall [req-88cce8e9-dd2a-4f87-adc7-4ce2f527ba6e AggregatesAdminTestXML-1384929043 AggregatesAdminTestXML-1314829414] [instance: 6e017b40-d5bc-442b-ac0a-6784efe8a70b] Called setup_basic_filtering in nwfilter

Non-OpenStack libraries should not be logging at INFO

Third party libraries which use the Info level should have their default logging levels set to WARN. It is not useful to an Operator to know that a new item was added to a connection pool (thanks request lib!).

Messages at INFO should be a unit of work

Good:

 2014-01-26 15:36:10.597 28297 INFO nova.virt.libvirt.driver [-] [instance: b1b8e5c7-12f0-4092-84f6-297fe7642070] Instance spawned successfully.
 2014-01-26 15:36:14.307 28297 INFO nova.virt.libvirt.driver [-] [instance: b1b8e5c7-12f0-4092-84f6-297fe7642070] Instance destroyed successfully.

Bad:

 2014-01-26 15:36:11.198 INFO nova.virt.libvirt.driver [req-ded67509-1e5d-4fb2-a0e2-92932bba9271 FixedIPsNegativeTestXml-1426989627 FixedIPsNegativeTestXml-38506689] [instance: fd027464-6e15-4f5d-8b1f-c389bdb8772a] Creating image
 2014-01-26 15:36:11.525 INFO nova.virt.libvirt.driver [req-ded67509-1e5d-4fb2-a0e2-92932bba9271 FixedIPsNegativeTestXml-1426989627 FixedIPsNegativeTestXml-38506689] [instance: fd027464-6e15-4f5d-8b1f-c389bdb8772a] Using config drive
 2014-01-26 15:36:12.326 AUDIT nova.compute.manager [req-714315e2-6318-4005-8f8f-05d7796ff45d FixedIPsTestXml-911165017 FixedIPsTestXml-1315774890] [instance: b1b8e5c7-12f0-4092-84f6-297fe7642070] Terminating instance 2014-01-26
 15:36:12.570 INFO nova.virt.libvirt.driver [req-ded67509-1e5d-4fb2-a0e2-92932bba9271 FixedIPsNegativeTestXml-1426989627 FixedIPsNegativeTestXml-38506689] [instance: fd027464-6e15-4f5d-8b1f-c389bdb8772a] Creating config drive at /opt/stack/data/nova/instances/fd027464-6e15-4f5d-8b1f-c389bdb8772a/disk.config

This is incredible overshare, in normal operating it means all these messages just need to be thrown away.

INFO messages shouldn't need a secret decoder ring

Bad:

 2014-01-26 15:36:14.256 28297 INFO nova.compute.manager [-] Lifecycle event 1 on VM b1b8e5c7-12f0-4092-84f6-297fe7642070 

DEBUG

Log level definitions:

http://stackoverflow.com/questions/2031163/when-to-use-log-level-warn-vs-error This is a nice writeup about when to use each log level. Here is a brief description:

  • Debug: Shows everything and is likely not suitable for normal production operation due to the sheer size of logs generated
  • Info: Usually indicates successful service start/stop, versions and such non-error related data
  • Audit: (An OpenStack invented, level, still not sure what we should be doing with it)
  • Warning: Indicates that there might be a systemic issue; potential predictive failure notice
  • Error: An error has occurred and an administrator should research the event
  • Critical: An error has occurred and the system might be unstable; immediately get administrator assistance