OpsGuide/Maintenance, Failures, and Debugging
Latest revision as of 02:45, 14 November 2017
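- Cloud Controller and Storage Proxy Failures and Maintenance
  - Planned Maintenance
  - Rebooting a Cloud Controller or Storage Proxy
  - Total Cloud Controller Failure
- Compute Node Failures and Maintenance
  - Planned Maintenance
  - After a Compute Node Reboots
  - Instances
  - Inspecting and Recovering Data from Failed Instances
  - Managing floating IP addresses between instances
  - Volumes
  - Total Compute Node Failure
  - /var/lib/nova/instances
- Storage Node Failures and Maintenance
  - Rebooting a Storage Node
  - Shutting Down a Storage Node
  - Replacing a Swift Disk
- Handling a Complete Failure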
- Configuration Management
- Working with Hardware
- Databases
- RabbitMQ troubleshooting
- HDWMY
- Determining Which Component Is Broken
- What to do when things are running slowly
- Uninstalling
Downtime, whether planned or unscheduled, is a certainty when running a cloud. This chapter provides information for dealing with these occurrences, both proactively and reactively.