Difference between revisions of "TaskFlow"

Revision as of 22:03, 21 June 2013

Contributors: Keith Bray, Adrian Otto, Jessica Lucci, Joshua Harlow

Revised on: 6/21/2013 by Adrian Otto

Executive Summary

Taskflow is a Python library for OpenStack that helps make task execution easy, consistent, and reliable. It lets developers create lightweight task objects and/or functions and combine them into flows (aka: workflows). It includes components for running these flows in a manner that can be stopped, resumed, and safely reverted. Projects that use the Taskflow library gain state resiliency and fault tolerance, and crash recovery becomes simpler. Think of it as a way to protect an action, similar to the way transactions protect operations in an RDBMS. If a manager process is terminated while an action is in progress, unprotected code risks leaving the system in a degraded or inconsistent state. With Taskflow, interrupted actions may be resumed or rolled back automatically when the manager process is restarted.

Using Taskflow to organize actions into lightweight task objects makes atomic code sequences easily testable. Tasks are arranged into flows, and each flow executes a defined, ordered sequence of tasks. Because a flow is a structure (a set of tasks linked together), the calling code and the workflow are decoupled, so flows can be reused. Taskflow provides a few mechanisms for running flows and lets the developer choose the one that fits their needs.
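To make the separation between a flow (the structure) and the code that runs it more concrete, here is a minimal Python sketch of resumption. It is not the Taskflow API: `make_flow`, `run`, the `journal` list, and the two step functions are hypothetical names invented for illustration, and the in-memory `journal` stands in for whatever durable storage a real implementation would use.

```python
# Illustrative sketch only: hypothetical names, not the Taskflow API.

def make_flow(*steps):
    """A flow is just data: an ordered tuple of (name, callable) pairs."""
    return tuple(steps)


def run(flow, context, journal):
    """Run each step once, recording finished step names in `journal`.

    `journal` stands in for durable storage (a database row, a file).
    Steps already listed in it are skipped, so calling run() again after
    an interruption resumes where the previous run stopped.
    """
    for name, func in flow:
        if name in journal:
            continue  # finished during an earlier run
        func(context)
        journal.append(name)  # record progress only after the step succeeds


calls = []  # records every step invocation so the resume is observable


def reserve(context):
    calls.append("reserve")
    context["reserved"] = True


def configure(context):
    calls.append("configure")
    if context.get("crash"):
        raise RuntimeError("simulated process crash")
    context["configured"] = True


flow = make_flow(("reserve", reserve), ("configure", configure))

journal = []
context = {"crash": True}
try:
    run(flow, context, journal)
except RuntimeError:
    pass  # the "manager process" died partway through the flow

context["crash"] = False
run(flow, context, journal)  # resume: "reserve" is skipped, "configure" runs
```

Because the flow is plain data, the same `flow` value could be handed to a different runner without changing the steps. A real system would also need to persist the `context`, which this sketch keeps in memory.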

Conceptual Example

This pseudo code illustrates how a flow would work, for those who are familiar with SQL transactions.

START TRANSACTION
   task1: call nova API to launch a server || ROLLBACK
   task2: when task1 finished, call cinder API to attach block storage to the server || ROLLBACK
   ...perform other tasks...
COMMIT

The above flow could be used by Heat as part of an orchestration to add a server with block storage attached. Heat may run several of these flows in parallel to prepare a number of identical servers.
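The pseudo code above can be approximated in Python as follows. This is a sketch under stated assumptions, not the Taskflow API: the `Task` base class, `LaunchServer`, `AttachVolume`, and `run_flow` are hypothetical names, and the nova and cinder calls are replaced by stand-ins that only update a dictionary.

```python
# Illustrative sketch only: these classes are hypothetical, not the Taskflow API.

class Task:
    """A unit of work that knows how to undo itself."""

    def execute(self, context):
        raise NotImplementedError

    def revert(self, context):
        """Undo whatever execute() did; the default is a no-op."""


class LaunchServer(Task):
    def execute(self, context):
        # Stand-in for a call to the nova API.
        context["server_id"] = "server-1"

    def revert(self, context):
        # Stand-in for deleting the server again.
        context.pop("server_id", None)


class AttachVolume(Task):
    def __init__(self, fail=False):
        self.fail = fail  # lets the example simulate a cinder failure

    def execute(self, context):
        if self.fail:
            raise RuntimeError("volume attach failed")
        # Stand-in for a call to the cinder API.
        context["volume_attached_to"] = context["server_id"]

    def revert(self, context):
        context.pop("volume_attached_to", None)


def run_flow(tasks, context):
    """Run tasks in order; on failure, revert completed tasks in reverse order.

    Returns True if every task succeeded (the COMMIT case) and False if the
    flow was rolled back (the ROLLBACK case).
    """
    completed = []
    for task in tasks:
        try:
            task.execute(context)
        except Exception:
            for done in reversed(completed):
                done.revert(context)
            return False
        completed.append(task)
    return True


ok_context = {}
ok = run_flow([LaunchServer(), AttachVolume()], ok_context)

failed_context = {}
failed = run_flow([LaunchServer(), AttachVolume(fail=True)], failed_context)
```

In the second run the attach step raises, so the already-launched server is reverted and `failed_context` ends up empty again, which mirrors the `|| ROLLBACK` branch in the pseudo code.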

Why

OpenStack code has grown organically and does not have a standard, consistent way to run sequences of operations so that they can be safely resumed or rolled back if the calling process is unexpectedly terminated partway through. Most projects don't even attempt to make tasks restartable or revertible. Numerous failure scenarios are simply skipped, and some recovery scenarios are not possible in today's code. Taskflow makes it easy to address these concerns. With widespread use of Taskflow, OpenStack can become very predictable and reliable, even when it is not deployed in a high availability configuration.

Design

Key primitives: https://wiki.openstack.org/wiki/StructuredWorkflowPrimitives

Tasks

Flows

Activation

Distributed:

Traditional:

Reversion

Resumption

Examples

History

Taskflow started as a prototype for nova, developed with the NTTdata corporation and Yahoo!, and has since grown into a more general solution/library that can form the underlying structure of multiple OpenStack projects at once.

Wiki with requirements and more background:

https://wiki.openstack.org/wiki/StructuredStateManagement

Join us!

https://launchpad.net/taskflow

@@ Line 1: / Line 1: @@
+Contributors: Keith Bray, Adrian Otto, Jessica Lucci, Joshua Harlow
 Revised on: {{REVISIONMONTH1}}/{{REVISIONDAY}}/{{REVISIONYEAR}} by {{REVISIONUSER}}
@@ Line 10: / Line 11: @@
 ===Conceptual Example===
-This pseudo code illustrates what how a taskflow would work for those who are familiar with SQL transactions.
+This pseudo code illustrates what how a <code>flow</code> would work for those who are familiar with SQL transactions.
   START TRANSACTION
-     task1: call nova API to launch a server
+     task1: call nova API to launch a server || ROLLBACK
-     task2: when task1 finished, call cinder API to attach block storage to the server
+     task2: when task1 finished, call cinder API to attach block storage to the server || ROLLBACK
      ...perform other tasks...
   COMMIT
 The above <code>flow</code> could be used by Heat as part of an orchestration to add a server with block storage attached. It may launch several of these in parallel to prepare a number of identical servers.
-== History ==
-Taskflow started as a prototype with the ''NTTdata'' corporation along with ''Yahoo!'' for nova and has moved into a more general solution/library that can form the underlying structure of multiple OpenStack projects at once.
-'''Wiki with requirements and more background:'''
-* https://wiki.openstack.org/wiki/StructuredStateManagement
 == Why ==
-OpenStack code is highly unstructured (likely due to organic growth of code and architecture) and even though the unstructured code does ''work'' there are many failure scenarios skipped and/or recovery scenarios which are not possible in unstructured code. With the aid of taskflow and the structuring/organization it brings it allows for these scenarios to be a non-problem and allows for new & very desirable functionality to be introduced (resumption, reversion).
+OpenStack code has grown organically, and does not have a standard and consistent way to perform sequences of code in a way that can be safely resumed or rolled back if the calling process is unexpectedly terminated while the code is busy doing something. Most projects don't even attempt to make tasks restartable, or revertible. There are numerous failure scenarios that are simply skipped and/or recovery scenarios which are not possible in today's code. Taskflow makes it easy to address these concerns. With widespread use of Taskflow, OpenStack can become very predictable and reliable, even in situations where it's not deployed in high availability configurations.
 == Design ==
@@ Line 55: / Line 48: @@
 == Examples ==
+== History ==
+Taskflow started as a prototype with the ''NTTdata'' corporation along with ''Yahoo!'' for nova and has moved into a more general solution/library that can form the underlying structure of multiple OpenStack projects at once.
+'''Wiki with requirements and more background:'''
+* https://wiki.openstack.org/wiki/StructuredStateManagement
 == Join us! ==
 * https://launchpad.net/taskflow