
Difference between revisions of "TroubleshootingNova"

(Added euca-authorize commands to troubleshooting section)
(Removed the list of common errors, moved those to Ask and linked that here.)
 
__NOTOC__
 
 
= Troubleshooting Tips for [[OpenStack]] Compute (Nova) =
 
 
  
 
== Starting Points ==
 
 
Most of the installation instructions are written for Ubuntu, but there is also information about CentOS and Fedora.
  
You can also check the [https://ask.openstack.org Ask OpenStack site] or the [https://answers.launchpad.net/nova Answers area of the Launchpad site for Nova] to find asked and answered questions.
  
 
== Log Files ==
 
 
== Common Errors ==
 
The Launchpad Answers site offers a place to ask and answer questions, and you can also mark questions as frequently asked questions. This section describes some errors people have posted to Launchpad Answers and IRC.

The [https://ask.openstack.org/questions/scope:all/sort:activity-desc/tags:nova,faq/page:1/ Nova FAQs on Ask OpenStack] page offers a place to ask and answer questions about common errors. There you'll find some errors people have posted, along with common solutions. We are constantly fixing bugs, so online resources are a great way to get the most up-to-date errors and fixes.
 
 
=== Credential errors, 401, 403 forbidden errors ===
 
 
 
A 403 forbidden error is caused by missing credentials. With current installation methods, there are basically two ways to get the novarc file. The manual method requires extracting it from a project zip file, and the scripted method generates novarc from the project zip file and sources it for you. If you use the manual method and then rely on the novarc file alone, you lose the credentials tied to the user you created with nova-manage in the earlier steps.
 
 
 
When you run nova-api the first time, it generates the certificate authority information, including openssl.cnf. If it gets started out of order, you may not be able to create your zip file. Once your CA information is available, you should be able to go back to nova-manage to create your zipfile.
 
 
 
You may also need to check your proxy settings to see if they are causing problems with the novarc creation.
 
 
 
=== Cannot Ping or SSH to an Instance ===
 
 
 
Sometimes a particular instance shows "pending" or you cannot SSH to it. Sometimes networking settings are the problem. Sometimes the image itself is the problem.
 
 
 
 
 
<pre><nowiki>#!rst
 
 
 
One of the most commonly missed configuration areas is not allowing the proper access to VMs. Use the 'euca-authorize' command to enable access.  Below, you will find the commands to allow 'ping' and 'ssh' to your VMs::
 
 
 
    euca-authorize -P icmp -t -1:-1 default
 
    euca-authorize -P tcp -p 22 default
 
 
 
Another common issue is that you cannot ping or SSH to your instances after issuing the 'euca-authorize' commands.  Check the number of 'dnsmasq' processes that are running.  If you have a running instance, check that TWO 'dnsmasq' processes are running.  If not, perform the following::
 
 
 
    killall dnsmasq
 
    service nova-network restart
 
 
 
</nowiki></pre>
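
As a rough sketch of the dnsmasq check described above, the helper below counts how many entries in a process listing have a given command name. The name count_procs is hypothetical (it is not part of Nova), and it reads the listing from standard input so that it can be fed from ps.

```shell
# Count lines on stdin that exactly match the given command name.
# count_procs is a hypothetical helper, not part of Nova.
count_procs() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when the count is 0, so "|| true" keeps the function's status clean.
    grep -c -x -F -- "$1" || true
}
```

One way to use it, assuming a ps that supports the POSIX -o option: ps -e -o comm= | count_procs dnsmasq. With a running instance, the text above expects the result to be 2.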
 
 
 
 
 
With recent builds of Nova, IPv6 configuration is allowed, but if you cannot SSH to an image, add --use_ipv6=false to your nova.conf.
 
 
 
For example, when using flat manager networking, you do not have a DHCP server, and an ami-tiny image doesn't support interface injection, so you cannot connect to it. The fix for this type of problem is to use an Ubuntu image, which should obtain an IP address correctly with [[FlatManager]] network settings.

To troubleshoot other possible problems with an instance, such as one that stays in a spawning state, first check your instances directory for the instance's own directory (i-ze0bnh1q in this example) to make sure it has the following files:
 
 
 
* libvirt.xml
 
* disk
 
* disk-raw
 
* kernel
 
* ramdisk
 
* console.log (Once the instance actually starts you should see a console.log.)
 
 
 
Check the file sizes to see if they are reasonable. If any are missing, zero-length, or very small, then nova-compute has somehow not completed the download of the images from objectstore.
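
The file check above could be scripted along these lines. This is only a sketch: check_instance_files is a hypothetical helper name, the file list is copied from the list above (console.log is left out because it only appears once the instance starts), and "very small" files are not detected, only missing or empty ones.

```shell
# Report expected instance files that are missing or empty.
# check_instance_files is a hypothetical helper, not part of Nova.
# Argument: path to one instance directory.
check_instance_files() {
    dir=$1
    status=0
    for name in libvirt.xml disk disk-raw kernel ramdisk; do
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $name"
            status=1
        elif [ ! -s "$dir/$name" ]; then
            # -s is true only for files with a size greater than zero
            echo "empty: $name"
            status=1
        fi
    done
    return $status
}
```

Call it with the path to the instance directory; it prints one line per problem file and returns a non-zero status if any were found.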
 
 
 
Also check nova-compute.log for exceptions. Sometimes they don't show up in the console output.
 
 
 
Next, check the /var/log/libvirt/qemu/i-ze0bnh1q.log file to see if it exists and has any useful error messages in it.
 
 
 
Finally, from the instances/i-ze0bnh1q directory, try virsh create libvirt.xml and see if you get an error there.
 
 
 
Also, when setting up nodes in [[FlatManager]], be sure to enable IP forwarding, or none of your node instances will be able to ping out.
 
 
 
 
 
<pre><nowiki>
 
# /etc/sysctl.conf

net.ipv4.ip_forward = 1
 
</nowiki></pre>
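
To confirm that the setting is really present, a small check like the following may help. It is a sketch only: ip_forward_configured is a hypothetical helper name, and it looks for the standard sysctl key net.ipv4.ip_forward on an uncommented line of whatever file you pass in.

```shell
# Succeed if the given sysctl file has an uncommented line
# that sets net.ipv4.ip_forward to 1.
# ip_forward_configured is a hypothetical helper, not part of Nova.
ip_forward_configured() {
    grep -Eq '^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}
```

For example: ip_forward_configured /etc/sysctl.conf && echo "forwarding configured". After editing the file, sysctl -p reloads it so the change takes effect without a reboot.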
 
 
 
 
 
=== Slow Running VMs ===
 
 
 
Why is my VM running slow as molasses?  One cause is a permissions issue where /dev/kvm is not writable, so the VMs drop into qemu mode.  This causes the slowdown, and can be addressed as follows:
 
 
 
Verify with:
 
 
 
 
 
<pre><nowiki>
 
grep -i kvm /var/log/libvirt/qemu/(internal_id).log</nowiki></pre>
 
 
 
 
 
In the results, look for:
 
 
 
 
 
<pre><nowiki>
 
open /dev/kvm: Permission denied
 
Could not initialize KVM, will disable KVM support
 
</nowiki></pre>
 
 
 
 
 
Fix with:
 
 
 
 
 
<pre><nowiki>
 
chgrp kvm /dev/kvm
 
chmod g+rwx /dev/kvm
 
</nowiki></pre>
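
If you need to check several instances, the verification step can be wrapped in a helper such as the one sketched below. The name kvm_permission_denied is hypothetical; it only looks for the "Permission denied" message quoted above in the log file you give it.

```shell
# Succeed if the given libvirt/qemu log shows that /dev/kvm
# could not be opened because of permissions.
# kvm_permission_denied is a hypothetical helper, not part of Nova.
kvm_permission_denied() {
    grep -q 'open /dev/kvm: Permission denied' "$1"
}
```

A loop such as for f in /var/log/libvirt/qemu/*.log; do kvm_permission_denied "$f" && echo "$f"; done would then list the logs of affected instances, assuming the log location used above.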
 
 
 
 
 
=== Filesystem Not Found on Custom-created Images ===
 
 
 
When creating custom images, I get a / (root) filesystem not found error.
 
 
 
This is likely because your image has / (root) on /dev/sda1 or a similar device; it needs to be /dev/vda1 for the image to boot properly.
 
 
 
Why /dev/vda1?
 
 
 
The libvirt XML templates specify that the virtual drives are created as "vdX" only.
 
 
 
So vda1 is the first partition generated if you don't have advanced partitioning. If the image expects its root filesystem on /dev/sda1 instead, OpenStack can't boot the image.
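
One place to look for the root device is the image's fstab, assuming you have mounted the image somewhere and that it declares its root filesystem there (some images set it on the kernel command line instead). The helper below is a sketch under those assumptions; root_device is a hypothetical name.

```shell
# Print the device mounted at / according to an fstab-style file.
# root_device is a hypothetical helper, not part of Nova.
# Lines whose first field starts with # are treated as comments.
root_device() {
    awk '$1 !~ /^#/ && $2 == "/" { print $1 }' "$1"
}
```

If the output is something like /dev/sda1, change it to /dev/vda1 before uploading the image.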
 
 
 
=== Nova services (nova-api) not starting ===
 
 
 
Nova-api is not starting, so I can't start nova-compute.
 
 
 
When I try a restart of the nova-api service, such as:
 
 
 
<pre><nowiki>
 
root@ubuntu2:~# service nova-api restart
 
</nowiki></pre>
 
 
 
 
 
I get a response like so:
 
 
 
 
 
<pre><nowiki>
 
pidfile /var/run/nova/nova-api.pid does not exist. Daemon not running?
 
Initialized with method overriding = True, and path info altering = True
 
DEBUG:routes.middleware:Initialized with method overriding = True, and path info altering = True
 
Initialized with method overriding = True, and path info altering = True
 
DEBUG:routes.middleware:Initialized with method overriding = True, and path info altering = True
 
Initialized with method overriding = True, and path info altering = True
 
DEBUG:routes.middleware:Initialized with method overriding = True, and path info altering = True
 
Initialized with method overriding = True, and path info altering = True
 
DEBUG:routes.middleware:Initialized with method overriding = True, and path info altering = True
 
(7030) wsgi starting up on http://0.0.0.0:8774/
 
(7030) wsgi starting up on http://0.0.0.0:8773/
 
</nowiki></pre>
 
 
 
 
 
In this case, you should set the daemonize flag to 1 (true) in /etc/nova/nova.conf, like so:
 
--daemonize=1
 
 
 
Restart the services with this setting enabled and you should be able to start all the services again.
 
 
 
=== Errors with IP addresses for VMs ===
 
 
 
When nova-network or nova-compute shows errors with the IP addresses assigned to the virtual machines, you see something like this:
 
 
 
 
 
<pre><nowiki>
 
2010-12-05 21:48:40-0600 [-] nova.exception.ProcessExecutionError: Unexpected error while running command.
 
</nowiki></pre>
 
 
 
 
 
You can verify with this command:
 
 
 
 
 
<pre><nowiki>
 
sudo -E dnsmasq --strict-order --bind-interfaces --conf-file= --pid-file=/var/lib/nova/networks/nova-br100.pid --listen-address=192.168.0.1 --except-interface=lo --dhcp-range=192.168.0.3,static,120s --dhcp-hostsfile=/var/lib/nova/networks/nova-br100.conf --dhcp-script=/usr/bin/nova-dhcpbridge --leasefile-ro
 
 
 
</nowiki></pre>
 
 
 
 
 
where listen-address is the address that produces errors when you send commands to it.
 
 
 
To verify that this is your error, you should see this type of response after sending the dnsmasq command:
 
 
 
 
 
<pre><nowiki>
 
2010-12-05 21:48:40-0600 [-] Exit code: 2
 
2010-12-05 21:48:40-0600 [-] Stdout: ''
 
2010-12-05 21:48:40-0600 [-] Stderr: '\ndnsmasq: failed to bind listening socket for 192.168.0.1: Address already in use\n'
 
</nowiki></pre>
 
 
 
 
 
Fix with these commands:
 
 
 
<pre><nowiki>

killall dnsmasq

service nova-network restart

</nowiki></pre>
 
 
 
 
 
Reboot instances if problem persists.
 
 
 
=== Database Locks ===
 
 
 
If you are not running Nova as a root user, you may get errors that the database is locked, first from nova-scheduler, next from nova-compute. Here is an example:
 
 
 
 
 
<pre><nowiki>
 
OperationalError: (OperationalError) database is locked
 
</nowiki></pre>
 
 
 
 
 
Check your permissions: if you installed as root but are trying to run Nova as another user, these errors may appear.
 

Latest revision as of 19:05, 11 April 2013

Troubleshooting Tips for OpenStack Compute (Nova)

Starting Points

Most of the installation instructions are written for Ubuntu, but there is also information about CentOS and Fedora.

You can also check the Ask OpenStack site to find asked and answered questions.

Log Files

Log files are stored in /var/log/nova and there is a log file for each service, for example nova-compute.log. You can format the log strings using flags for the nova.log module. The flags used to set format strings are: logging_context_format_string and logging_default_format_string. If the log level is set to debug, you can also specify logging_debug_format_suffix to append extra formatting. For information about what variables are available for the formatter see: http://docs.python.org/library/logging.html#formatter

You have two options for logging for OpenStack Compute based on configuration settings. In nova.conf, include the --logfile flag to enable logging. Alternatively you can set --use_syslog=1, and then the nova daemon logs to syslog.
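
To see which of the two logging options a given configuration uses, something like the following sketch may be handy. logging_mode is a hypothetical helper name, and it assumes the one-flag-per-line --flag=value style used for nova.conf elsewhere on this page; it is not a Nova tool.

```shell
# Print how a flag-style nova.conf configures logging:
# "syslog", "logfile", or "none".
# logging_mode is a hypothetical helper, not part of Nova.
logging_mode() {
    if grep -q '^--use_syslog=1$' "$1"; then
        echo syslog
    elif grep -q '^--logfile' "$1"; then
        echo logfile
    else
        echo none
    fi
}
```

For example, logging_mode /etc/nova/nova.conf prints syslog when --use_syslog=1 is set, which tells you to look in syslog for the daemon's messages.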

Common Errors

The Nova FAQs page on the Ask OpenStack site offers a place to ask and answer questions about common errors. There you'll find some errors people have posted, along with common solutions. We are constantly fixing bugs, so online resources are a great way to get the most up-to-date errors and fixes.