Revision as of 17:49, 18 February 2013

Guru Meditation Reports

When things go wrong in (production) deployments of OpenStack, collecting debug data is a key first step in triaging and ultimately resolving the problem. Nova makes extensive use of logging, which produces a vast amount of data. This does not, however, give an admin an accurate view of the current live state of the system: for example, which threads are running and which config parameters are in effect. The eventlet backdoor facility provides an interactive shell interface for any eventlet-based process, allowing an admin to telnet to a pre-defined port and execute a variety of commands. This can be used to collect the necessary state information, but it has a number of limitations.

  • Every service running on a host needs to have the backdoor listening on a different TCP port, and the admin has to remember which process is listening where. Get this wrong and very bad things can happen.
  • The backdoor needs to have been enabled when the process was started. If this was not done before the problem arose, the admin is out of luck, because restarting the service to enable the backdoor will lose the critical state that was desired.
  • The backdoor shell is too powerful. By presenting an interactive Python shell, it places too much burden on the admin to find the right data without causing problems.
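
To make the "live state" mentioned above more concrete, here is a minimal sketch of a read-only snapshot listing the running threads and the config values in effect. It uses only the Python standard library and is not taken from Nova or eventlet: the function name snapshot_state and the config dict are hypothetical, chosen for illustration. It reports OS-level threads only, so an eventlet-based service would need separate handling for its green threads.

```python
import sys
import threading
import traceback


def snapshot_state(config):
    """Return a text snapshot of running threads and config values.

    `config` is a hypothetical dict standing in for a service's
    effective configuration options.
    """
    # Map thread identifiers to names so the stacks are readable.
    names = {t.ident: t.name for t in threading.enumerate()}

    lines = ["== Threads =="]
    # sys._current_frames() maps each thread id to its current frame.
    for ident, frame in sys._current_frames().items():
        lines.append("Thread %s (%s)" % (ident, names.get(ident, "unknown")))
        lines.extend(entry.rstrip() for entry in traceback.format_stack(frame))

    lines.append("== Config ==")
    for key in sorted(config):
        lines.append("%s = %s" % (key, config[key]))

    return "\n".join(lines)


if __name__ == "__main__":
    print(snapshot_state({"debug": True, "host": "example"}))
```

Unlike an interactive shell, a function like this only reads state, so the admin does not need to know where to look and cannot alter the process by accident.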