TaskFlow/Patterns and Engines/Persistence

How can we persist a flow? Here is an informal description.

Big Picture

A flow is a set of tasks and relations between tasks. When a flow is loaded into an engine, there is a translation step that associates the flow with persistent data in storage. In particular, for each flow there is a FlowDetail record, and for each task there is a TaskDetail record.

To allow later resumption, TaskFlow must be able to re-create the flow. For that, a factory function should be provided. The fully qualified name of the function should be saved in the FlowDetail; then we'll be able to import and call this function again, e.g. after a service restart.

Putting flow creation into a separate function should not be too much of a burden -- the pattern is already in use; look, for example, at [cinder code].

A task is associated with its TaskDetail by name. This requires that task names are unique within a particular flow.
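
To make this concrete, here is a minimal sketch of such a factory function, using TaskFlow's linear flow pattern. The task classes and the flow, task, and function names (BootVolume, AttachVolume, boot_volume_flow, and so on) are purely illustrative, not taken from any real service.

    # A minimal flow-factory sketch; task classes and names are illustrative.
    from taskflow import task
    from taskflow.patterns import linear_flow


    class BootVolume(task.Task):
        def execute(self):
            pass  # real work would go here


    class AttachVolume(task.Task):
        def execute(self):
            pass  # real work would go here


    def boot_volume_flow():
        # Give every task an explicit, unique name so it can later be
        # matched with its TaskDetail record by that name.
        flow = linear_flow.Flow("boot-volume-flow")
        flow.add(BootVolume(name="boot-volume"),
                 AttachVolume(name="attach-volume"))
        return flow

The fully qualified name of this function (for example, mymodule.boot_volume_flow) is what would be saved in the FlowDetail so the flow can be re-created later.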

First-Time Loading

When a new flow is loaded into an engine, there is no persistent data for it yet, so a corresponding FlowDetail object should be created, as well as a TaskDetail object for each task.
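
As a rough illustration of what first-time loading has to produce, here is a sketch in which plain dictionaries stand in for the real persistence objects; the storage handle and the iteration over the flow's tasks are assumptions, not the actual TaskFlow API.

    import uuid


    def first_time_load(flow, factory_name, storage):
        # Create a FlowDetail-like record for the whole flow, remembering the
        # fully qualified factory name so the flow can be re-created later.
        flow_detail = {
            'uuid': str(uuid.uuid4()),
            'name': flow.name,
            'factory': factory_name,  # e.g. "mymodule.boot_volume_flow"
            'tasks': {},
        }
        # Create a TaskDetail-like record for every task, keyed by task name.
        for t in flow:  # iteration over the flow's tasks is assumed
            flow_detail['tasks'][t.name] = {
                'uuid': str(uuid.uuid4()),
                'state': 'PENDING',
                'results': None,
            }
        storage.save(flow_detail)  # hypothetical storage handle
        return flow_detail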

Resuming Same Flow

When we resume the flow from the database (for example, if the flow was interrupted and the engine destroyed to save resources, or if the service was restarted), we need to re-create the flow. For that, we call the factory function that was saved on first-time loading, which builds the flow for us.

Then we load the flow into the engine. For each task, there must already be a TaskDetail object. We associate each task with its TaskDetail using task names.

Task states and results should be the only information needed to resume the flow.
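
Continuing the dictionary-based sketch above (again, not the real storage API), resumption boils down to importing the saved factory by its fully qualified name, rebuilding the flow, and looking saved state up by task name:

    import importlib


    def resume_flow(flow_detail):
        # Import and call the factory whose fully qualified name was saved
        # in the FlowDetail, rebuilding the flow structure from code.
        module_name, func_name = flow_detail['factory'].rsplit('.', 1)
        factory = getattr(importlib.import_module(module_name), func_name)
        flow = factory()
        # Re-associate each task with its saved state and results by name.
        associations = {}
        for t in flow:  # iteration over the flow's tasks is assumed
            associations[t.name] = flow_detail['tasks'].get(t.name)
        return flow, associations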

Resuming Changed Flow

The upgrade use case is probably the most interesting and challenging. While there are several options, the best (and recommended) practice for upgrades is loading new (changed) flows with the state of older flows.

This is done with the same process as loading the same, unchanged flow: tasks are associated with their saved state by name when the flow is loaded into the engine.

This will not work out of the box for all use cases: sometimes data migrations (much like database migrations, but at a higher level) will be needed.

Let's consider several use cases.

Task was added

This is the simplest use case. As there is no state for the new task, a new TaskDetail record will be created for it automatically.
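
In terms of the dictionary-based sketch above, this is just a get-or-create by name (hypothetical helper, not the actual TaskFlow API):

    import uuid


    def get_or_create_task_detail(flow_detail, task_name):
        # A task name with no saved record yet (a task added in a newer
        # flow version) gets a fresh PENDING record created for it.
        detail = flow_detail['tasks'].get(task_name)
        if detail is None:
            detail = {'uuid': str(uuid.uuid4()),
                      'state': 'PENDING',
                      'results': None}
            flow_detail['tasks'][task_name] = detail
        return detail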


Task was removed

Task code was changed

Task was split in two

Flow structure was changed

Design Rationales

Flow Factory

How do we upgrade? We change the code that creates the flow, then we restart the service.

Then, when the flow is restored from storage, what we really want is to load the new flow, with updated structure and updated tasks, but preserve task states and results. At least, that is the most common case.

So, the code should be run to put the tasks into patterns and re-create the flow, but the logbook should be consulted to load the state and results of tasks.

So, creation of the flow should be put into a separate function.

Using Names To Associate Tasks

The engine gets a flow and its flow details, and should reconstruct its internal state from them.

Each task should somehow be matched with a TaskDetail. The match should:

  • be stable if tasks are added or removed;
  • not change when the service is restarted or upgraded;
  • be the same across all server instances in HA setups.

One option is that tasks should be matched with TaskDetails by task name (a small sketch follows the list below). This has several implications:

  • task names must be unique within a flow;
  • it becomes hard to change the name of a task.
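
A small sketch of this matching (hypothetical helper, using the same dictionary layout as above) makes both implications visible: names are the only lookup key, so duplicates have to be rejected, and a renamed task simply fails to find its old record:

    def associate_by_name(tasks, flow_detail):
        # Matching is by name only, so names must be unique within the flow.
        names = [t.name for t in tasks]
        if len(names) != len(set(names)):
            raise ValueError("task names must be unique within a flow")
        # A task whose name changed since its state was saved finds no record
        # here, which is why renaming tasks is hard under this scheme.
        return {t.name: flow_detail['tasks'].get(t.name) for t in tasks}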

Open Questions

Are there any good alternatives to using task names?