
Difference between revisions of "NovaOrchestration/WorkflowEngines/SpiffWorkflow"

 

Revision as of 19:29, 6 April 2012

SpiffWorkflow notes

Summary

SpiffWorkflow is a pure-Python workflow framework based on thorough academic work documented at http://www.workflowpatterns.com.

The code is available here http://github.com/knipknap/SpiffWorkflow.

It seems to have had three spurts of activity since its creation in mid-2008 (one of which is recent and driven by me). The author (Samuel, a.k.a. knipknap) is extremely responsive on GitHub and on the mailing list: http://groups.google.com/group/spiff-devel

Documentation of the code is primarily in the code, and we've been adding to that recently. The concepts behind SpiffWorkflow are very well documented here http://www.workflowpatterns.com.

The code is clean (IMHO), but needs more tests.

Licensing: LGPLv2 https://github.com/knipknap/SpiffWorkflow/blob/master/COPYING

Packaging: Packages are listed in the Python Package Index and are installable with pip and easy_install: http://pypi.python.org/pypi/SpiffWorkflow/0.3.0.

Python versions: I used it with Python 2.7 with no issues and didn't see anything that seems version-specific.

Documentation: http://github.com/knipknap/SpiffWorkflow and https://github.com/knipknap/SpiffWorkflow/

Functionality

One concept that is critical to understanding the SpiffWorkflow code is the difference between a TaskSpec and a Task, and likewise between a WorkflowSpec and a Workflow.

WorkflowSpec and TaskSpec are used to define your workflow. All types of tasks (Join, Split, Execute, Wait, etc…) are derived from TaskSpec. Specs can be deserialized from known formats like OpenWFE.

When you want to actually run the process, you create a Workflow instance from the WorkflowSpec (pass the spec to the Workflow initializer).

How this works from there is based on the principles of computer programming (remember, this project comes from the academic world). A derivation tree is created from the spec using a hierarchy of Task objects (not TaskSpecs, but each Task points to the TaskSpec that created it). Each Task object is basically a node in the derivation tree. Each task in the tree links back to its parent (there are no connection objects). The processing is done by walking down the derivation tree one Task at a time and moving the task (and its children) through the state sequence. The states are documented here: https://github.com/knipknap/SpiffWorkflow/blob/master/SpiffWorkflow/Task.py
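The spec/instance split and the parent-linked tree can be illustrated with a toy model. This is a minimal sketch, not SpiffWorkflow's actual API: the class names mirror the concepts above, but every attribute and method here (`outputs`, `connect`, `task_tree`, `_expand`, `walk`) is a simplified assumption made for illustration.

```python
class TaskSpec:
    """Design-time definition of a step (toy stand-in, not the real class)."""

    def __init__(self, name):
        self.name = name
        self.outputs = []  # specs that follow this one

    def connect(self, spec):
        self.outputs.append(spec)
        return spec


class WorkflowSpec:
    """Design-time definition of a whole workflow."""

    def __init__(self):
        self.start = TaskSpec("Start")


class Task:
    """Run-time node in the derivation tree."""

    def __init__(self, spec, parent=None):
        self.spec = spec      # points back to the TaskSpec that created it
        self.parent = parent  # link to the parent node; no connection objects
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def walk(self):
        """Yield this task and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


class Workflow:
    """Run-time instance, created by passing a spec to the initializer."""

    def __init__(self, workflow_spec):
        self.spec = workflow_spec
        self.task_tree = Task(workflow_spec.start)
        self._expand(self.task_tree)

    def _expand(self, task):
        # Create one Task per outgoing spec, then recurse into it.
        for out_spec in task.spec.outputs:
            self._expand(Task(out_spec, parent=task))


spec = WorkflowSpec()
a = spec.start.connect(TaskSpec("A"))
spec.start.connect(TaskSpec("B"))
a.connect(TaskSpec("C"))

workflow = Workflow(spec)
names = [t.spec.name for t in workflow.task_tree.walk()]
print(names)
```

The toy expands the whole tree up front and would recurse forever on a cyclic spec; it is only meant to show that Tasks are separate objects that reference their TaskSpec and their parent.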

The Workflow and Task classes are in the root of the project. All the specs (TaskSpec, WorkflowSpec, and all derived classes) are in the specs subdirectory.

You can serialize/deserialize specs, and open standards like OpenWFE are supported (others can be coded in easily). You can also serialize/deserialize a running workflow (it will pull in its spec as well).
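The idea that a serialized running workflow carries its spec along can be sketched as follows. This is a toy illustration under stated assumptions: the classes, the `to_dict`/`from_dict` methods, and the use of JSON are all hypothetical and are not SpiffWorkflow's serializer interface.

```python
import json


class WorkflowSpec:
    """Toy spec: just a name and an ordered list of task names."""

    def __init__(self, name, task_names):
        self.name = name
        self.task_names = list(task_names)

    def to_dict(self):
        return {"name": self.name, "task_names": self.task_names}

    @classmethod
    def from_dict(cls, data):
        return cls(data["name"], data["task_names"])


class Workflow:
    """Toy running workflow: a spec plus the tasks completed so far."""

    def __init__(self, spec, completed=()):
        self.spec = spec
        self.completed = list(completed)

    def to_dict(self):
        # The running workflow embeds its spec so it can be restored on its own.
        return {"spec": self.spec.to_dict(), "completed": self.completed}

    @classmethod
    def from_dict(cls, data):
        return cls(WorkflowSpec.from_dict(data["spec"]), data["completed"])


spec = WorkflowSpec("deploy", ["build", "test", "release"])
running = Workflow(spec, completed=["build"])

payload = json.dumps(running.to_dict())
restored = Workflow.from_dict(json.loads(payload))
print(restored.spec.name, restored.completed)
```

Supporting another format in this sketch would mean adding another pair of encode/decode functions around the same dictionaries.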

Another important distinction is between properties and attributes. Properties belong to TaskSpecs. They are static at run-time and belong to the design of the workflow. Attributes are dynamic and assigned to Tasks (nodes in the execution path).
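A small sketch of the distinction, using toy classes rather than the library's own: the `properties` and `attributes` dictionaries and the `get_property` helper are assumptions made for illustration.

```python
class TaskSpec:
    """Design-time object: properties are fixed when the workflow is designed."""

    def __init__(self, name, **properties):
        self.name = name
        self.properties = dict(properties)


class Task:
    """Run-time node: attributes are set while the workflow executes."""

    def __init__(self, spec):
        self.spec = spec
        self.attributes = {}

    def get_property(self, key, default=None):
        # Properties are read through the spec, so every Task sees the same value.
        return self.spec.properties.get(key, default)


approve_spec = TaskSpec("approve", max_retries=3)

first_run = Task(approve_spec)
second_run = Task(approve_spec)
first_run.attributes["approved_by"] = "alice"
second_run.attributes["approved_by"] = "bob"

print(first_run.get_property("max_retries"), first_run.attributes)
print(second_run.get_property("max_retries"), second_run.attributes)
```

Both tasks report the same `max_retries` because it lives on the shared spec, while `approved_by` differs per task.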

There's a decent eventing model that allows you to tie in to and receive events (for each task, you can get event notifications from its TaskSpec). The events correspond with how the processing is going in the derivation tree, not necessarily how the workflow as a whole is moving. See https://github.com/knipknap/SpiffWorkflow/blob/master/SpiffWorkflow/specs/TaskSpec.py for docs on events.
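The general shape of subscribing to per-task events through a TaskSpec might look like the sketch below. It is a self-contained toy: the `Event` class and the `reached_event`/`completed_event` names are assumptions chosen for illustration, so check TaskSpec.py for the events the library really exposes.

```python
class Event:
    """Minimal signal: callbacks are invoked in the order they were connected."""

    def __init__(self):
        self._listeners = []

    def connect(self, callback):
        self._listeners.append(callback)

    def emit(self, *args):
        for callback in self._listeners:
            callback(*args)


class TaskSpec:
    def __init__(self, name):
        self.name = name
        self.reached_event = Event()    # hypothetical event name
        self.completed_event = Event()  # hypothetical event name


class Task:
    def __init__(self, spec):
        self.spec = spec

    def run(self):
        # Events fire on the spec, but carry the specific Task that triggered them.
        self.spec.reached_event.emit(self)
        self.spec.completed_event.emit(self)


log = []
spec = TaskSpec("approve")
spec.reached_event.connect(lambda task: log.append(("reached", task.spec.name)))
spec.completed_event.connect(lambda task: log.append(("completed", task.spec.name)))

# Two Task nodes created from the same spec both notify the same listeners.
Task(spec).run()
Task(spec).run()
print(log)
```

Because listeners attach to the spec, one subscription sees every Task node derived from it, which matches the point that events follow processing in the tree rather than the workflow as a whole.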

Understanding FUTURE, WAITING, READY, and COMPLETE states:

- FUTURE means the processor has predicted that this path will be taken and this task will definitely run.
- If a task is waiting on predecessors to run, then it is in FUTURE state (not WAITING).
- READY means "preconditions are met for marking this task as complete".
- You can try to complete a task at any point. If it is in FUTURE state and does not complete, it can fall back to READY state.

Waiting can be confusing:
- WAITING means "I am in the process of doing my work and have not finished. When the work is finished, then I will be READY for completion and will go to READY state."
- WAITING always comes after FUTURE and before READY.
- WAITING is an optional state.

Reached is confusing unless you remember that it means the processor has now 'reached' this task in the execution path:
- REACHED means processing has reached this task in the derivation tree. This is not a state, but an event.
- A task is always reached before it becomes READY.
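The state rules above can be summarized as a small state machine. This is a sketch based only on these notes, not on the library's Task class: the transition table, the method names, and the choice to leave the state unchanged when completion is attempted too early are all assumptions, and the last one simplifies the fallback behaviour described above.

```python
from enum import Enum


class State(Enum):
    FUTURE = 1
    WAITING = 2
    READY = 3
    COMPLETE = 4


# Transitions described in the notes; WAITING is optional.
ALLOWED = {
    State.FUTURE: {State.WAITING, State.READY},
    State.WAITING: {State.READY},
    State.READY: {State.COMPLETE},
    State.COMPLETE: set(),
}


class Task:
    def __init__(self, name):
        self.name = name
        self.state = State.FUTURE
        self.log = ["FUTURE"]

    def _move(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.name} -> {new_state.name} not allowed")
        self.state = new_state
        self.log.append(new_state.name)

    def reach(self, has_pending_work=False):
        # REACHED is recorded as an event, not stored as a state.
        self.log.append("reached (event)")
        self._move(State.WAITING if has_pending_work else State.READY)

    def work_finished(self):
        self._move(State.READY)

    def try_complete(self):
        # Completion can be attempted at any point; here it only succeeds from READY.
        if self.state is not State.READY:
            return False
        self._move(State.COMPLETE)
        return True


quick = Task("quick")
quick.reach()
quick.try_complete()

slow = Task("slow")
early = slow.try_complete()  # still FUTURE, so the attempt fails
slow.reach(has_pending_work=True)
slow.work_finished()
slow.try_complete()

print(quick.log)
print(slow.log)
```

The `quick` task skips WAITING entirely, while `slow` passes through it; in both logs the reached event appears before READY.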

General comments

You can nest workflows (using the SubWorkflowSpec).

The serialization code is done well, which makes it easy to add new formats if we need to support them.

More tests and documentation are needed, but the project looks to be well thought-out and organized to me. Some things I was stuck on turned out to be quite elegantly worked through once I talked to the author.

The documentation on http://github.com/knipknap/SpiffWorkflow is great; especially the flash animations showing how each type of task works.

The tasks labelled "ThreadXXXX" create logical threads based on the model in http://github.com/knipknap/SpiffWorkflow. No Python threading is implemented. However, there is some locking and mutex code in place.


There are no GUI or graphical tools for workflows, but the author has just imported a JavaScript wire-diagramming library…