Difference between revisions of "TaskFlow/Persistence"

Revision as of 20:45, 26 April 2014

Revised on: 4/26/2014 by Harlowja

Overview

A persistence API and a set of base persistence types are provided with taskflow to ensure that jobs, flows, and their associated tasks can be backed up in a database, in memory, or elsewhere. When configuring the persistence API, the user chooses which backend to use and can then store and retrieve the data associated with the jobs, flows, and tasks in use.

For in-depth design details see: Details
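
As a rough illustration of the idea described above (pick a backend, save job/flow/task data, then read it back to resume work), here is a minimal, self-contained sketch. Every name in it (`TaskDetail`, `FlowDetail`, `LogBook`, `MemoryBackend`, `make_backend`, the state strings) is a hypothetical stand-in chosen for this example and is not taskflow's actual API; see the design details for the real types.

```python
# Hypothetical sketch only: these classes are illustrative stand-ins,
# not the real taskflow persistence API.
import copy
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class TaskDetail:
    """Recorded state and result of a single task."""
    name: str
    state: str = "PENDING"
    result: object = None


@dataclass
class FlowDetail:
    """Groups the task details that belong to one flow."""
    name: str
    tasks: dict = field(default_factory=dict)


@dataclass
class LogBook:
    """Top-level record holding the flow details of a job."""
    name: str
    book_id: str = field(default_factory=lambda: str(uuid4()))
    flows: dict = field(default_factory=dict)


class MemoryBackend:
    """Stores deep copies so later mutations do not leak into saved state."""

    def __init__(self):
        self._books = {}

    def save(self, book):
        self._books[book.book_id] = copy.deepcopy(book)

    def load(self, book_id):
        return copy.deepcopy(self._books[book_id])


def make_backend(kind):
    # Only an in-memory backend is sketched here; a database-backed one
    # would expose the same save/load interface.
    backends = {"memory": MemoryBackend}
    return backends[kind]()


backend = make_backend("memory")

book = LogBook("nightly-job")
flow = FlowDetail("resize-images")
flow.tasks["fetch"] = TaskDetail("fetch", state="SUCCESS", result=42)
flow.tasks["resize"] = TaskDetail("resize")
book.flows[flow.name] = flow
backend.save(book)

# Later (for example after a crash): reload and find the tasks still to run.
restored = backend.load(book.book_id)
pending = [task.name
           for task in restored.flows["resize-images"].tasks.values()
           if task.state != "SUCCESS"]
print(pending)
```

Keeping the backend behind a small `save`/`load` interface is what would let the caller swap memory for a database without touching the code that records or resumes work.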

Why?

Allows for reconstruction and resumption of flows and their associated tasks.
Allows for redundant checks that expected data is provided.
Allows the user to view the history of jobs, flows, and their associated tasks.
Facilitates debugging of taskflow usage and integration (and runtime/post-runtime analysis).

Checkpointing

Checkpointing is a work-in-progress topic that is still under discussion.

See: Checkpointing

@@ Line 15: / Line 15: @@
 * Allows the user to view the history of jobs, flows, and their associated tasks.
 * Facilitates debugging of taskflow usage and integration (and runtime/post-runtime analysis).
-== Storage ==
-Taskflow already has storage: the logbook (which is itself connected to, or derived from and saved to, a given backend). It should be emphasized that the logbook is the authoritative and, preferably, the '''only''' source of run-time state information. When a task returns a result, it should be written directly to the logbook. When a task or flow state changes in any way, the logbook is the first to know. A flow should '''not''' store task results; the logbook exists for that.
-The logbook and a backend are responsible for storing the actual data; together they specify the persistence mechanism (how and where data is saved: memory, database,
-whatever) and the persistence policy (when data is saved: every time it changes, at particular moments, or never). To make these components simpler to use we have come up with the concept of a storage API; this API allows engines to easily call into the storage layer and avoid the details of logbooks, flowdetails, taskdetails and backends.
 == Checkpointing ==