Fencing Instances of an Unreachable Host


Abstract

When an OpenStack controller determines that a connection to a physical host is broken, it is possible to restart some or all of its instances on other hosts. The new instance (the instance restarted on another host) takes over the identity of the obsolete instance (the instance on the unreachable host), so it has the same attached volumes and the same IP and MAC addresses. OpenStack supports this remote restart operation through a Nova API command called "evacuate" (the Nova "evacuate" API is referred to in this document as remote restart).

It is important to note that the remote restart may be done whenever the OpenStack controller decides the host's connectivity is broken. This implies neither that the host is certainly disconnected from its entire environment nor that it is disconnected forever. When the perceived disconnection is due to some transient or partial failure, the OpenStack remote restart might lead to two identical instances running together and creating a dangerous conflict. For example, the obsolete instance may access the application storage and corrupt data, create an IP address conflict, or communicate with other nodes in a way that disrupts the new instance's communications or creates inconsistent states.

In order to safely perform a remote restart, the obsolete instance must first be fenced, i.e. shut down or isolated.

The following table shows three fencing approaches. These methods address the case in which not only the instances but also their host is unreachable.

Approach         | Initiated by                      | Method
Power fencing    | OpenStack Controller              | Shut down the instances by a power off or a hard/cold reboot of the host
Resource fencing | OpenStack Controller              | Isolate the instances from the application storage and from the network
Self fencing     | Nova Compute service on the host  | Shut down the instances
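
As an illustration of the power fencing row above, the following is a minimal sketch, assuming an IPMI-capable BMC and the ipmitool CLI, of powering off an unreachable host out-of-band and confirming the power state. The BMC address and credentials are placeholders; a real deployment would use whatever power interface (IPMI, iLO, a managed PDU, etc.) the hardware provides.

  # Hedged sketch: power-fence a host through its BMC using ipmitool; the BMC
  # address and credentials are illustrative placeholders.
  import subprocess

  def power_fence(bmc_ip, user, password):
      """Power off the host via its BMC and confirm it is really off."""
      base = ["ipmitool", "-I", "lanplus", "-H", bmc_ip, "-U", user, "-P", password]
      subprocess.check_call(base + ["chassis", "power", "off"])
      status = subprocess.check_output(base + ["chassis", "power", "status"])
      # "Chassis Power is off" is the expected ipmitool output once fencing succeeded.
      return b"off" in status.lower()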

For each of these methods, the document elaborates how it can be implemented in OpenStack, the requirements and recommendations for the underlying infrastructure (e.g. recommendations for the power system), and an analysis of the method's advantages and drawbacks.
Due to infrastructure limitations, it may be that only some of the three fencing methods can be used for a given system. Yet, it is recommended to combine all of them, for the following reasons (a rough sketch of such a combination follows the list):

  1. Reducing the probability that fencing fails (in particular, fencing may fail due to the same failure that caused the host disconnection).
  2. Reducing the fencing time.
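
In the sketch below, the callables passed in (e.g. power fencing, resource fencing and self fencing implementations) are hypothetical placeholders for the mechanisms described in this document, not existing OpenStack APIs; running them in parallel rather than in order would further reduce the fencing time.

  # Hedged sketch: try several fencing methods so that a single failure (for
  # example, in the power system) does not leave the obsolete instances running.
  def fence_host(host, methods):
      """`methods` is an ordered list of callables, e.g. power fencing,
      resource fencing and self fencing; each returns True once it has
      confirmed that the instances on `host` are down."""
      for method in methods:
          try:
              if method(host):
                  return True
          except Exception:
              # The method may be hit by the same failure that disconnected the host.
              continue
      return False  # fencing not confirmed; a remote restart is not safe yet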


Based on the three fencing approaches, a design of fencing support in OpenStack is given, covering Nova, Cinder and Neutron. This includes:

  1. Power fencing
  2. Fencing from storage in Cinder
  3. Fencing from network in Neutron
  4. Self fencing by Nova Compute
  5. Fencing awareness and management in Nova controller


It should be noted that this is a working document; thus, while some of the items above (2 and 5) are detailed designs, others (1, 3 and 4) require further elaboration.


Background

Scope

Handling a host disconnection event requires a variety of capabilities:

  1. Fault detection: a mechanism for monitoring, detecting and alerting on disconnected-host events. Such a mechanism may be based, for example, on Nova Compute heartbeats and/or Ceilometer.
  2. Fault management: listening to disconnected-host events, then choosing and initiating response actions. This may be done by Heat and/or an administrator.
  3. Correction capabilities in Nova, Cinder, and Neutron:
    1. Fence the instances of the unreachable host.
    2. Remote restart of an instance, or alternatively, starting a standby instance (an instance kept up to date by micro-checkpointing).
    3. Recovery of management operations that were disrupted by the host disconnection. For example, if an instance creation was disrupted by the host failure, it is required to clean up some of the changes already made and to create the instance on another host.


Remote restart is supported today in OpenStack through a special Nova API method, called "evacuate". As elaborated in another document, there is much room for fixes and enhancements of the remote restart ("evacuate") method. There is also a heartbeat infrastructure in Nova, which may be leveraged to provide disconnected-host events. None of the other items in the list above is currently supported.
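
To make these two existing building blocks concrete, the following is a minimal sketch, assuming python-novaclient (the 2014-era v1_1 client) with placeholder credentials and host names, of how an external tool might use the Nova service heartbeats to find a compute host reported as down and then remote restart ("evacuate") its instances to an explicitly chosen target. It deliberately omits the fencing step that this document argues must come first, and assumes the instance disks are on shared storage.

  from novaclient.v1_1 import client

  # Placeholder admin credentials and endpoint.
  nova = client.Client("admin", "password", "admin", "http://controller:5000/v2.0")

  # 1. Fault detection: nova-compute services whose heartbeat is stale are
  #    reported by Nova with state "down".
  down_hosts = [svc.host for svc in nova.services.list(binary="nova-compute")
                if svc.state == "down"]

  # 2. Remote restart: rebuild each instance of a down host on another host.
  #    In a complete flow the obsolete instances must be fenced before this step.
  for host in down_hosts:
      for server in nova.servers.list(search_opts={"host": host, "all_tenants": 1}):
          # "target-host" is a placeholder for an explicitly chosen destination;
          # on_shared_storage=True assumes the instance disks are on shared storage.
          nova.servers.evacuate(server, "target-host", on_shared_storage=True)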

While the goal is completing all of these items, this document focuses only on fencing. As later described, fencing itself is an extensive and non-trivial piece of functionality. It should be noted that data center managers do not rely solely on OpenStack; they have their own manual procedures and/or scripts to deal with host disconnection scenarios. Thus, adding a fencing mechanism, even before completing all the other items, would already be useful.

Another related issue, which is not addressed by this document, is the fencing of an unreachable instance in case the host is reachable. This may be done by less aggressive means than those elaborated in this document, for example by using the fence_virsh agent to fence an instance managed by libvirt.
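
As a hedged illustration of that lighter-weight case, the sketch below calls the fence_virsh agent from Python to power off a single libvirt domain on a still-reachable host. The host address, SSH credentials and domain name are placeholders, and the exact agent options should be verified against the installed fence-agents version.

  import subprocess

  def fence_instance(host_ip, ssh_user, ssh_password, domain):
      """Ask fence_virsh (which logs in to the host over SSH) to power off
      one libvirt domain, e.g. "instance-00000042"."""
      cmd = ["fence_virsh",
             "-a", host_ip,        # address of the (still reachable) compute host
             "-l", ssh_user,       # SSH login
             "-p", ssh_password,   # SSH password
             "-x",                 # connect over SSH
             "-n", domain,         # libvirt domain name of the instance
             "-o", "off"]          # requested action
      return subprocess.call(cmd) == 0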

While fencing and remote restart are necessary for application high availability, they are not a sufficient guarantee of it. When a host crashes, the state of the instance may be lost. To avoid that, the application state must be persisted and/or have a live copy. These issues are beyond the scope of this document.


Remote restart

Normally, when it is desired to move an instance from one host to another (for example in a planned maintenance operation, due to power consumption optimization, due to load balancing, etc.), it is best to use live migration, which is transparent to the application.

Sometimes live migration is not possible, due to some incompatibility between the source and the target host. In such cases, cold migration should be used. Cold migration flushes all state and data to the disk and then either copies the disk to the destination or points the new instance to shared storage.

In the current case, the host is unreachable for instance management operations (though the host may still be reachable for power operations). Thus neither live migration nor cold migration is possible; it is only possible to remote restart the instance.
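
For reference, the three operations correspond to distinct Nova calls. The sketch below, again assuming python-novaclient with placeholder names, shows them side by side; only the last one is applicable when the host is unreachable.

  from novaclient.v1_1 import client

  nova = client.Client("admin", "password", "admin", "http://controller:5000/v2.0")
  server = nova.servers.find(name="my-instance")   # placeholder instance

  # Planned move, host reachable, transparent to the application:
  nova.servers.live_migrate(server, "other-host",
                            block_migration=False, disk_over_commit=False)

  # Planned move, host reachable, instance is stopped, copied and restarted:
  nova.servers.migrate(server)

  # Host unreachable: rebuild the instance on another host ("evacuate"),
  # assuming shared storage:
  nova.servers.evacuate(server, "other-host", on_shared_storage=True)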

The problem

While the remote restart support (the "evacuate" API in Nova) is a good start, it is an incomplete and dangerous solution, since the obsolete instances may attempt to access the application storage and corrupt data, create an IP address conflict, or disrupt the new instance's communications.

Note that the only thing we know for sure is that the connection to the host was lost. This is indeed very often an indication that the host has crashed, but not always. A connection to a host may be lost for other reasons as well, and in some of these cases a conflict may happen. Specifically, the following causes may lead to a conflict:

  1. A transient failure of the network, which would lead to a conflict once the host is reconnected.
  2. A partial network failure, in which the OpenStack controller loses its connection to the host, yet the host's instances can still access the application storage and/or the data network.
  3. A partial failure of the host, which may disrupt its connection with the controller without killing the host's instances.