
Difference between revisions of "OpsGuide/Maintenance, Failures, and Debugging"

m (David.desrosiers moved page Maintenance, Failures, and Debugging to OpsGuide/Maintenance, Failures, and Debugging without leaving a redirect)
(3 intermediate revisions by the same user not shown)
Line 1:
- * [[Cloud Controller and Storage Proxy Failures and Maintenance]]
+ * [[OpsGuide/Cloud Controller and Storage Proxy Failures and Maintenance|Cloud Controller and Storage Proxy Failures and Maintenance]]
- ** [[Cloud Controller and Storage Proxy Failures and Maintenance#planned-maintenance|Planned Maintenance]]
+ ** [[OpsGuide/Cloud Controller and Storage Proxy Failures and Maintenance#planned-maintenance|Planned Maintenance]]
- ** [[Cloud Controller and Storage Proxy Failures and Maintenance#rebooting-a-cloud-controller-or-storage-proxy|Rebooting a Cloud Controller or Storage Proxy]]
+ ** [[OpsGuide/Cloud Controller and Storage Proxy Failures and Maintenance#rebooting-a-cloud-controller-or-storage-proxy|Rebooting a Cloud Controller or Storage Proxy]]
- ** [[Cloud Controller and Storage Proxy Failures and Maintenance#total-cloud-controller-failure|Total Cloud Controller Failure]]
+ ** [[OpsGuide/Cloud Controller and Storage Proxy Failures and Maintenance#total-cloud-controller-failure|Total Cloud Controller Failure]]
- * [[Compute Node Failures and Maintenance]]
+ * [[OpsGuide/Compute Node Failures and Maintenance|Compute Node Failures and Maintenance]]
- ** [[Compute Node Failures and Maintenance#planned-maintenance|Planned Maintenance]]
+ ** [[OpsGuide/Compute Node Failures and Maintenance#planned-maintenance|Planned Maintenance]]
- ** [[Compute Node Failures and Maintenance#after-a-compute-node-reboots|After a Compute Node Reboots]]
+ ** [[OpsGuide/Compute Node Failures and Maintenance#after-a-compute-node-reboots|After a Compute Node Reboots]]
- ** [[Compute Node Failures and Maintenance#instances|Instances]]
+ ** [[OpsGuide/Compute Node Failures and Maintenance#instances|Instances]]
- ** [[Compute Node Failures and Maintenance#inspecting-and-recovering-data-from-failed-instances|Inspecting and Recovering Data from Failed Instances]]
+ ** [[OpsGuide/Compute Node Failures and Maintenance#inspecting-and-recovering-data-from-failed-instances|Inspecting and Recovering Data from Failed Instances]]
- ** [[Compute Node Failures and Maintenance#managing-floating-ip-addresses-between-instances|Managing floating IP addresses between instances]]
+ ** [[OpsGuide/Compute Node Failures and Maintenance#managing-floating-ip-addresses-between-instances|Managing floating IP addresses between instances]]
- ** [[Compute Node Failures and Maintenance#volumes|Volumes]]
+ ** [[OpsGuide/Compute Node Failures and Maintenance#volumes|Volumes]]
- ** [[Compute Node Failures and Maintenance#total-compute-node-failure|Total Compute Node Failure]]
+ ** [[OpsGuide/Compute Node Failures and Maintenance#total-compute-node-failure|Total Compute Node Failure]]
- ** [[Compute Node Failures and Maintenance#var-lib-nova-instances|/var/lib/nova/instances]]
+ ** [[OpsGuide/Compute Node Failures and Maintenance#var-lib-nova-instances|/var/lib/nova/instances]]
- * [[Storage Node Failures and Maintenance]]
+ * [[OpsGuide/Storage Node Failures and Maintenance|Storage Node Failures and Maintenance]]
- ** [[Storage Node Failures and Maintenance#rebooting-a-storage-node|Rebooting a Storage Node]]
+ ** [[OpsGuide/Storage Node Failures and Maintenance#rebooting-a-storage-node|Rebooting a Storage Node]]
- ** [[Storage Node Failures and Maintenance#shutting-down-a-storage-node|Shutting Down a Storage Node]]
+ ** [[OpsGuide/Storage Node Failures and Maintenance#shutting-down-a-storage-node|Shutting Down a Storage Node]]
- ** [[Storage Node Failures and Maintenance#replacing-a-swift-disk|Replacing a Swift Disk]]
+ ** [[OpsGuide/Storage Node Failures and Maintenance#replacing-a-swift-disk|Replacing a Swift Disk]]
- * [[Handling a Complete Failure]]
+ * [[OpsGuide/Handling a Complete Failure|Handling a Complete Failure]]
  * [[Configuration Management]]
  * [[Working with Hardware]]

Latest revision as of 02:45, 14 November 2017

Downtime, whether planned or unscheduled, is a certainty when running a cloud. This chapter provides information for handling these occurrences, whether proactively or reactively.