
Difference between revisions of "TaskFlow/Engines"

(Engine)
(Supported Types)
 
(30 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 +
'''Revised on:''' {{REVISIONMONTH1}}/{{REVISIONDAY}}/{{REVISIONYEAR}} by {{REVISIONUSER}}
 +
 
== Engine ==
 
== Engine ==
  
Engine is what really runs the tasks.
+
Engines are what ''really'' runs your <code>tasks</code> and <code>flows</code>.
 +
 
 +
[[File:4StrokeEngine_Ortho_3D_Small_Mini.gif|frame]]
  
It should takes a flow structure (described by patterns) and uses it to decide which task to run and when.
+
An engine takes a <code>flow</code> structure (described by patterns) and uses it to decide which <code>task</code> to run and when.
  
[[File:Felipecaparelli_Gears_1.png|frame]]
+
There may be different implementation of engines. Some may be easier to use (ie, require no additional infrastructure setup) and understand, others might require more complicated setup but provide better scalability. The idea and ''ideal'' is that deployers/developers of a service that uses taskflow can select an engine that suites their setup best without modifying the code of said service. This allows for that deployer/developer to start off using a simpler  implementation and scaling out the service that is powered by taskflow as the service grows. In concept, all engines should implement the same interface to make it easy to replace one engine with another, and provide the same guarantees on how patterns are interpreted -- for example, if an engine runs a linear flow, the tasks should be run one after another in order no matter what type of engine is actually running that linear flow.
  
There may be different implementation of engines. Some may be easier to use (like, require no setup) and understand, others might require more complicated setup but provide better scalability. The idea and ideal is that deployers of a service that uses taskflow can select an engine that suites their setup best without modifying code of said service. This allows for starting off using a simpler  implementation and scaling out the service that is powered by taskflow as the service grows.
+
'''Note:''' engines might have different capabilities/configuration but overall the interface '''will''' remain the same and should be transparent to developers and users using taskflow.
  
'''Note:''' Engines might have different capabilities and different configuration but overall the interface should remain the same.
+
=== Supported Types ===
  
In concept, all engines should implement same interface to make it easy to replace one engine with another, and provide same guaranties on how patterns are interpreted -- for example, if an engine runs a linear flow, the tasks should be run one after another in order.
+
==== Distributed ====
  
Possible engines include:
+
When you want your applications <code>tasks</code> and <code>flows</code> to be performed in a system that is highly available & resilient to individual failure.
  
* Simple -- just takes e.g. linear flow and runs tasks from it one after another -- should be useful for debugging tasks and simple use cases;
+
[[Distributed_Task_Management_With_RPC|Distributed via RPC]]
* Threaded -- Runs tasks in separate threads enabling them to run in parallel (even several implementations are possible);
 
* Distributed -- loads tasks to celery (or some other external service) that uses tasks dependencies to determine ordering;
 
  
== How ==
+
Supports the following:
  
'''Blueprint:''' https://blueprints.launchpad.net/taskflow/+spec/patterns-and-engines
+
* Remote workers that connect over [http://kombu.readthedocs.org/ kombu] supported transports.
 +
* Combined with jobboards, provides a high-available engine ''orchestrator'' and worker combination.
 +
* ''And more...''
  
== Storage ==
+
==== Traditional ====
  
Storage is out of scope of [https://blueprints.launchpad.net/taskflow/+spec/patterns-and-engines the blueprint], but it is still worth to point out its role here.
+
When you want your <code>tasks</code> and <code>flows</code> to just run inside your applications existing framework and still take advantage of the functionality  offered.
  
We already have storage in taskflow -- that's logbook. But it should be emphasized that logbook should become the authoritative, and, preferably, the '''only''' source of runtime state information. When task returns result, it should be written directly to logbook. When task or flow state changes in any way, logbook is first to know. Flow should '''not''' store task results -- there is logbook for that.
+
Supports the following:
  
Logbook and a backend are responsible to store the actual data -- these together specify the persistence mechanism (how data is saved and where -- memory, database, whatever), and persistence policy (when data is saved -- every time it changes or at some particular moments or simply never).
+
* Threaded engine using a thread based [http://docs.python.org/dev/library/concurrent.futures.html#executor-objects executor].
 +
* Threaded engine using a provided [http://eventlet.net/ eventlet] greenthread based [http://docs.python.org/dev/library/concurrent.futures.html#executor-objects executor].
 +
* Single threaded engine using no threads.
 +
* ''And more...''

Latest revision as of 00:45, 15 March 2014

Revised on: 3/15/2014 by Harlowja

Engine

Engines are what really run your tasks and flows.


An engine takes a flow structure (described by patterns) and uses it to decide which task to run and when.

There may be different implementations of engines. Some may be easier to use (i.e., require no additional infrastructure setup) and understand; others might require more complicated setup but provide better scalability. The idea and ideal is that deployers/developers of a service that uses taskflow can select the engine that best suits their setup without modifying the code of said service. This allows a deployer/developer to start off with a simpler implementation and scale out the service that is powered by taskflow as it grows.

In concept, all engines should implement the same interface, to make it easy to replace one engine with another, and provide the same guarantees on how patterns are interpreted. For example, if an engine runs a linear flow, the tasks should be run one after another, in order, no matter what type of engine is actually running that linear flow.

Note: engines might differ in capabilities and configuration, but overall the interface will remain the same and should be transparent to developers and users of taskflow.
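
The shared-interface idea can be illustrated with a minimal sketch in plain Python. This is not taskflow's actual API: the names Engine, SerialEngine, ThreadedEngine, make_flow and run_service, and the representation of a linear flow as a list of callables, are all assumptions made for illustration. The point it tries to show is that the service code (run_service) stays the same whichever engine is plugged in, and that both engines honor the linear-flow ordering guarantee.

```python
import abc
from concurrent.futures import ThreadPoolExecutor


class Engine(abc.ABC):
    """Hypothetical common interface (not taskflow's real API)."""

    @abc.abstractmethod
    def run(self, linear_flow):
        """Run every task of a linear flow in order and return the results."""


class SerialEngine(Engine):
    """Runs tasks one after another in the calling thread."""

    def run(self, linear_flow):
        return [task() for task in linear_flow]


class ThreadedEngine(Engine):
    """Runs tasks in worker threads while keeping linear-flow ordering."""

    def __init__(self, max_workers=2):
        self._max_workers = max_workers

    def run(self, linear_flow):
        results = []
        with ThreadPoolExecutor(max_workers=self._max_workers) as executor:
            for task in linear_flow:
                # Wait for each task to finish before submitting the next,
                # so the ordering guarantee of a linear flow still holds.
                results.append(executor.submit(task).result())
        return results


def make_flow(log):
    """Build a hypothetical three-task linear flow that records run order."""
    def step(name):
        def task():
            log.append(name)
            return name.upper()
        return task
    return [step("fetch"), step("transform"), step("store")]


def run_service(engine):
    """Service code: identical no matter which engine is passed in."""
    log = []
    results = engine.run(make_flow(log))
    return log, results
```

Swapping SerialEngine() for ThreadedEngine() in a call to run_service changes where the tasks execute, but not the order in which they run or the results they produce.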

Supported Types

Distributed

When you want your application's tasks and flows to be performed in a system that is highly available and resilient to individual failure.

Distributed via RPC

Supports the following:

  • Remote workers that connect over kombu supported transports.
  • Combined with jobboards, provides a highly available engine orchestrator and worker combination.
  • And more...
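
As a rough illustration of the orchestrator and worker split described above, the following sketch uses in-process queues and threads as a stand-in for a real message transport. It does not use kombu, RPC, or any taskflow API: the TASKS registry, the message layout, and the worker and orchestrate functions are hypothetical, and a real deployment would place the workers in separate processes or hosts.

```python
import queue
import threading

# Hypothetical registry mapping task names to the code a worker can execute.
TASKS = {"double": lambda x: x * 2, "negate": lambda x: -x}


def worker(requests, replies):
    """Pull task requests until a None sentinel arrives, replying with results."""
    while True:
        message = requests.get()
        if message is None:
            break
        request_id, task_name, argument = message
        replies.put((request_id, TASKS[task_name](argument)))


def orchestrate(jobs, worker_count=2):
    """Send each (task_name, argument) job to the workers and collect replies."""
    requests, replies = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=worker, args=(requests, replies))
        for _ in range(worker_count)
    ]
    for thread in workers:
        thread.start()
    for request_id, (task_name, argument) in enumerate(jobs):
        requests.put((request_id, task_name, argument))
    # Replies may arrive in any order, so match them up by request id.
    results = dict(replies.get() for _ in jobs)
    for _ in workers:
        requests.put(None)  # one shutdown sentinel per worker
    for thread in workers:
        thread.join()
    return [results[index] for index in range(len(jobs))]
```

The orchestrator never runs task code itself; it only tracks which requests are outstanding, which is the part that would need to be made highly available in a real system.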

Traditional

When you want your tasks and flows to just run inside your application's existing framework and still take advantage of the functionality offered.

Supports the following:

  • Threaded engine using a thread based executor.
  • Threaded engine using a provided eventlet greenthread based executor.
  • Single threaded engine using no threads.
  • And more...
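
The difference between the threaded and single threaded variants listed above can be sketched with the standard library's concurrent.futures module. This is an assumption-laden illustration rather than taskflow's implementation: run_independent and the tasks dictionary are hypothetical names, and only ThreadPoolExecutor, submit() and result() are real standard-library calls.

```python
from concurrent.futures import ThreadPoolExecutor


def run_independent(tasks, executor=None):
    """Run tasks (a dict of name -> callable) that do not depend on each other.

    With no executor, tasks run inline in the calling thread (no threads).
    With an executor, each task is submitted to it and may run in parallel.
    """
    if executor is None:
        return {name: task() for name, task in tasks.items()}
    futures = {name: executor.submit(task) for name, task in tasks.items()}
    return {name: future.result() for name, future in futures.items()}


# Hypothetical tasks used only to exercise the sketch.
tasks = {"a": lambda: 1 + 1, "b": lambda: 2 * 3}

inline_results = run_independent(tasks)
with ThreadPoolExecutor(max_workers=2) as pool:
    threaded_results = run_independent(tasks, executor=pool)
```

Because the function only relies on submit() returning a future, any executor object offering a compatible interface could be passed in its place, which is the property that lets the same code use different threading strategies.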