This page is an attempt to document the requirements for any task/workflow system to be used by Heat for orchestration.

Existing requirements

These requirements are met by the current task system in Heat, which uses coroutine-based tasks similar to the ones in Tulip (the project that became Python's asyncio module).

Task requirements

Tasks are implemented in the TaskRunner class in scheduler.py. It should be possible to completely swap out the task system by replacing TaskRunner with another task implementation (though it will have to cope with coroutines).

[jh]: that's heat specific code, but the concept of swapping out different backends will be provided

[zb]: yeah, this is not a requirement so much as a bit of documentation about how to test a prototype of any alternate implementation.

Tasks run in parallel

A lot of resources are slow to start so, where there are no dependencies preventing it, tasks must run in parallel.

[jh]: sure
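As a rough illustration of what "parallel" means for coroutine-based tasks, the sketch below steps several generator tasks in turn so that they all make progress together. It is not Heat's TaskRunner; `run_in_parallel` and `create_resource` are hypothetical names invented for this example.

```python
def run_in_parallel(tasks):
    """Step each generator-based task in turn until all have finished.

    `tasks` maps a (hypothetical) task name to a generator. Each `yield`
    hands control back so that the other tasks can make progress.
    """
    pending = dict(tasks)
    order = []  # records the interleaving, for illustration only
    while pending:
        for name, task in list(pending.items()):
            try:
                next(task)
                order.append(name)
            except StopIteration:
                del pending[name]
    return order


def create_resource(polls):
    # Stand-in for a slow resource: each yield is one "not ready yet" poll.
    for _ in range(polls):
        yield


order = run_in_parallel({
    "server": create_resource(2),
    "volume": create_resource(1),
})
```

Here the volume task gets a turn between the two steps of the server task, rather than waiting for the server to finish.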

Tasks can spawn other tasks

For example, creating a Nova server may require attaching multiple volumes to it. These volume attachments need to happen in parallel with each other, and with other tasks that are running at the same time.

We also support resources that are themselves stacks, and operations performed on the parent stack are typically mirrored on the resource-stacks. Once again, these must happen in parallel with other resources in the parent stack.

[jh]: are the stacks modified/added while running, or are all stacks 'compiled' and a fixed set of stacks is run?

[zb]: They're not predefined; users can define arbitrary stacks in their templates and we figure out the correct workflow for lifecycle operations based on that.

[jh]: ya, not predefined, but heat generates them into structures which don't modify themselves after heat creates the stacks from the template. so no post 'translation' modification occurs.

Tasks propagate exceptions

Errors are returned in the form of exceptions (not return values). Wrapping exceptions is OK though.

[jh]: if they run in parallel can't you have multiple exceptions being thrown at once (from different paths), how is this handled? which exception finally exits the task system?

[zb]: It's up to the workflow. Our current workflows cancel all of the other tasks and then report the error to the code that ran the workflow. Just reporting the first error would be fine. (Note that only one error can happen at a time in the current system, so multiple simultaneous exceptions aren't an issue for us at present.)

[jh]: seems ok, agree that it should be up to the workflow. If it's a parallel-via-threads workflow, then it will have the multiple possible exceptions; if it's a yielding type of pattern, then it might not.

Tasks can time out

A task can (optionally) be cancelled if it fails to complete within a given time. This should be (easily) configurable dynamically.

[jh]: sure
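One way a timeout could work for generator tasks is for the runner to throw an exception into the task once a deadline has passed. The sketch below assumes that approach; `Timeout` and `run_with_timeout` are names made up for this example, and a real runner would also sleep between steps instead of spinning.

```python
import itertools
import time


class Timeout(Exception):
    """Hypothetical exception thrown into a task that has run too long."""


def run_with_timeout(task, timeout, clock=time.monotonic):
    """Step a generator task to completion, throwing Timeout into it once
    `timeout` seconds (as measured by `clock`) have elapsed."""
    deadline = clock() + timeout
    while True:
        try:
            if clock() > deadline:
                task.throw(Timeout())
            else:
                next(task)
        except StopIteration:
            return


def slow_create():
    # Stand-in for a resource that never becomes ready.
    while True:
        yield


# A fake clock that advances one "second" per reading keeps the example
# deterministic; real code would use the default monotonic clock.
ticks = itertools.count()
try:
    run_with_timeout(slow_create(), timeout=3, clock=lambda: next(ticks))
    outcome = "completed"
except Timeout:
    outcome = "timed out"
```

Because the timeout is just an argument to the runner, it can be chosen per task at run time.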

Task cancellation should cancel any subtasks

If a task is cancelled, any tasks that it has spawned also need to be cancelled.

[jh]: depending on answer to spawning question this seems ok.
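With plain Python generators (3.3 or later), a parent task that delegates to a subtask using `yield from` gets this behaviour from the language: closing the parent closes the subtask first. The sketch below only covers a single delegated subtask, not subtasks running in parallel, and all the names in it are hypothetical.

```python
cancelled = []


def attach_volume(volume_id):
    # Hypothetical subtask that polls until it is cancelled.
    try:
        while True:
            yield
    except GeneratorExit:
        cancelled.append(volume_id)
        raise


def create_server():
    # Hypothetical parent task that delegates to a subtask.
    try:
        yield from attach_volume("vol-1")
    except GeneratorExit:
        cancelled.append("server")
        raise


task = create_server()
next(task)     # run up to the subtask's first yield
task.close()   # cancel the parent; the subtask is closed first
```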

Tasks can clean up after themselves

If a task is cancelled or timed out, it should have the chance to clean up (e.g. by catching an exception) before exiting.

[jh]: so this brings up a question of what is a task, and if a task needs to cleanup before it runs most of its code, why is it 1 instead of N tasks to begin with? and why is it running and then its cancelled, if something can run then be cancelled shouldn't the thing that ran before being cancelled be its own task and the thing that runs after be its own task as well?

[zb]: the point is that tasks shouldn't just die. e.g. if you send SIGINT to a (Python) program it doesn't just exit, the interpreter raises KeyboardInterrupt and you can handle it and clean up nicely if necessary; try/finally works, you can close open resources, log the error, dump cached data to the DB, &c. We need tasks to work the same way.

[jh]: i think this is manageable, providing an exit/cleanup strategy for tasks seems ok.
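The KeyboardInterrupt analogy maps directly onto generator tasks: if the runner cancels a task by throwing an exception into it, ordinary try/except/finally blocks in the task get a chance to run. A minimal sketch, assuming a hypothetical `Cancelled` exception:

```python
class Cancelled(Exception):
    """Hypothetical exception a runner throws into a task to cancel it."""


events = []


def create_with_cleanup():
    events.append("acquired")
    try:
        while True:
            yield                      # waiting on a slow operation
    except Cancelled:
        events.append("logged cancellation")
        raise                          # let the caller see the cancellation
    finally:
        events.append("released")      # runs however the task exits


task = create_with_cleanup()
next(task)
try:
    task.throw(Cancelled())
except Cancelled:
    events.append("caller notified")
```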

Tasks don't make debugging unnecessarily difficult

Given a debug log containing a stack trace, it should be easy to work out in which of several tasks running in parallel an error has occurred. Note that most tasks running in parallel are probably identical except for their arguments (e.g. they're all Resource.create(), but for different Resource objects).

[jh]: this seems like an application choice, not something a library should necessarily prescribe, if said library wants to run in a distributed manner then it may have to take hits with the ease of debugging. if it doesn't want to run in a distributed manner then it doesn't (or may not) have to take that hit.

[zb]: this is not talking about pdb-type debugging, it's more about making sure it is possible to e.g. identify the source of any log messages/errors. For example, the current system logs information of the following form whenever a task resumes (in this example, one task is calling a subtask):

heat.engine.scheduler: DEBUG: Task create_task from Stack "test_stack" running
heat.engine.scheduler: DEBUG: Task create from WaitConditionHandle "WaitHandle" running

And this information is inferred automatically using various heuristics - the application basically gets it for free. We also log when tasks start, complete, time out or are cancelled.

[jh]: sounds good to me. although it still depends on people behaving nicely when using the library in the first place, the library can provide the hooks to do this, but can't really enforce them in the end :-)
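To give a flavour of the kind of heuristic meant here, the sketch below derives a description from a bound method's name and the object it is bound to. It is an illustration only, not the code in scheduler.py; `task_description` and the `Stack` class are stand-ins written for this example.

```python
def task_description(task):
    """Return a human-readable description of a task for log messages."""
    name = getattr(task, "__name__", None)
    owner = getattr(task, "__self__", None)
    if name is not None and owner is not None:
        return "%s from %s" % (name, owner)   # bound method
    if name is not None:
        return name                           # plain function
    return repr(task)


class Stack:
    # Hypothetical stand-in, not Heat's Stack class.
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return 'Stack "%s"' % self.name

    def create_task(self):
        yield


message = "Task %s running" % task_description(Stack("test_stack").create_task)
```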

Tasks can modify state

When a task modifies some piece of application state, it shouldn't be necessary to reload everything from the database in order for that to be reflected in other tasks and in the caller. (This implies that tasks run in the same process.)

[jh]: this seems like an application choice, not something a library should necessarily prescribe, if said library wants to run in a distributed manner then it may have to take hits with reloading data. if it doesn't want to run in a distributed manner then it doesn't (or may not) have to take that hit.

[zb]: Agreed that this is an application problem. I'm happy to take the performance hit, the issue is that I'm not sure that reloading data from the DB is even feasible in Heat as it is currently written. We still rely on too much application state :(

Tasks can write to the database

Preferably without opening a new connection, or creating the risk of stale DB caches (for other tasks/the caller) in sqlalchemy.

[jh]: sure

Tasks should be inherently thread-safe

Eventlet-style hackery doesn't count.

[jh]: unsure what this means, tasks are developer provided code, it's not up to a task system to enforce that tasks should be thread-safe (how can it).

[zb]: Well, if only one thread is running, then that is inherently thread-safe. If tasks are running in multiple threads (or, to some extent, even greenthreads - although we have that problem anyway) then the developer of the code (which may well mean a third-party plugin author) has to start worrying about thread-safety. That's what we don't want.

[jh]: It seems like the pattern should be, if u want to use threads, use X pattern, it will use threads and your tasks will need to be thread-safe, if u don't fit this pattern, then use this other Y pattern instead. It doesn't feel right to try to force the usage of X or Y on developers. Right? I understand the need to isolate threads and to avoid them, just I feel like a library shouldn't say no threads for u, but should provide options for those that are ok with it and options for those that aren't.

[zb]: That sounds reasonable. In the case of Heat, the tasks are mainly running plugin code which (a) isn't particularly thread-safe now, and (b) will possibly be written by third-parties and therefore need to run in the simplest possible environment.

Existing integration tests must work

Most "unit" tests in Heat (and almost all of the important ones) are integration tests that typically involve running a whole workflow (not just a single task), e.g. creating a whole stack. These tests need to run in such a way that:

  • They are fairly representative of real use (e.g. tasks still run in parallel)
  • Mock objects are preserved
  • Calls to mock objects are recorded and ordered correctly, even across tasks
  • Exceptions are propagated back to the test

[jh]: ok, this seems like heat integration requirements

[zb]: yes, but this is also the most critical thing to us being able to adopt another solution in Heat.

Workflow requirements

There are two basic workflows implemented in Heat:

  1. PollingTaskGroup: starts a number of tasks in parallel and waits until they are all complete.
  2. DependencyTaskGroup: runs a parallelised workflow for an arbitrary dependency graph.

Both workflows are implemented in scheduler.py.

[jh]: ok, this is implemented in taskflow.

Workflows are runtime configurable

Dependency graphs are built at runtime based on the user's template, so it must be easy to dynamically put together a workflow.

[jh]: so this is not really runtime? this is arbitrary flow construction while the application is running, but not modification while the flow is running?

[zb]: Correct. We don't need to modify the workflow after it's constructed, but it is not hard-coded in the source.

Task dependencies can be arbitrarily complex

Parallel tasks should respect dependencies in the form of an arbitrary directed acyclic graph.

[jh]: sure
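A minimal sketch of running generator tasks over a dependency graph, assuming the graph is given as a mapping from each node to the set of nodes it requires. `run_graph` and `poll` are hypothetical names, and this is not the DependencyTaskGroup implementation.

```python
def run_graph(deps, make_task):
    """Run one generator task per node, starting a node only once every
    node it requires has finished. Returns nodes in completion order."""
    remaining = {node: set(reqs) for node, reqs in deps.items()}
    running = {}
    finished = []
    while remaining or running:
        # Start every task whose requirements are all satisfied.
        for node in [n for n, reqs in remaining.items() if not reqs]:
            del remaining[node]
            running[node] = make_task(node)
        if not running:
            raise ValueError("circular or unsatisfiable dependency")
        # Give each running task one step.
        for node, task in list(running.items()):
            try:
                next(task)
            except StopIteration:
                del running[node]
                finished.append(node)
                for reqs in remaining.values():
                    reqs.discard(node)
    return finished


def poll(times):
    # Stand-in for a slow operation that needs `times` polls.
    for _ in range(times):
        yield


polls = {"network": 2, "volume": 1, "server": 1}
deps = {"server": {"network", "volume"}, "network": set(), "volume": set()}
finished = run_graph(deps, lambda node: poll(polls[node]))
```

The network and volume tasks run side by side, and the server task starts only after both have completed.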

All workflow tasks are cancelled on failure

In the event of a task reporting a failure, all other tasks in the same workflow need to be cancelled.

[jh]: so this is a cancellation policy that we can support (not all users want to cancel all other tasks in the flow when one task fails)

[zb]: +1
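The cancel-everything policy could look something like the following for generator tasks: on the first exception, close every other task and then re-raise to the caller. `run_group` and the example tasks are hypothetical.

```python
def run_group(tasks):
    """Step generator tasks round-robin; on the first error, cancel the
    remaining tasks and re-raise the error to the caller."""
    pending = list(tasks)
    try:
        while pending:
            for task in list(pending):
                try:
                    next(task)
                except StopIteration:
                    pending.remove(task)
    except Exception:
        for task in pending:
            task.close()   # a no-op for the task that already failed
        raise


events = []


def healthy(name):
    try:
        while True:
            yield
    finally:
        events.append(name + " cancelled")


def failing():
    yield
    raise RuntimeError("create failed")


try:
    run_group([healthy("a"), failing(), healthy("b")])
except RuntimeError as exc:
    events.append(str(exc))
```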

Workflows can be cancelled

The user may decide to stop a workflow that is in progress; this should cancel all of the tasks within the workflow.

[jh]: sure

Future requirements

There are some features that we either don't have or don't use yet, but would like to use in the future.

Task requirements

Tasks have synchronisation points

Currently we add a dependency between any two resources where one gets data from the other (so the data reader can only start once the data source has finished). However, in most cases the data is actually available much earlier.

This is fairly easy to implement in the current coroutine system, since the explicit "yield" acts as a synchronisation point.

[jh]: i think something like this can be accommodated (without needing yield concepts), although needs more investigation.
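To make the idea concrete in the coroutine style, the sketch below has a reader task wait only until the data it needs has been published, not until the source task has finished. The shared `data` dict and the task names are assumptions made for this example.

```python
data = {}
order = []


def create_network():
    data["network_id"] = "net-1"       # the data is available early...
    order.append("network id published")
    yield
    yield                              # ...but the task keeps working
    order.append("network complete")


def create_server():
    while "network_id" not in data:
        yield                          # synchronisation point
    order.append("server using " + data["network_id"])


# Step both tasks round-robin until they have all finished.
pending = [create_server(), create_network()]
while pending:
    for task in list(pending):
        try:
            next(task)
        except StopIteration:
            pending.remove(task)
```

The server task proceeds as soon as the network ID exists, before the network task has completed.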

Tasks can be retried

If a task fails, we want to be able to (optionally) retry it. This should be (easily) configurable dynamically.

[jh]: sure
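One possible shape for retries, assuming Python 3.3 or later for `yield from`, is a wrapper that is itself a task, so it can be scheduled like any other. `with_retries` and `flaky_create` are hypothetical names.

```python
def with_retries(make_task, attempts):
    """Wrap a task factory so that a failed task is re-created and re-run,
    up to `attempts` times in total."""
    for attempt in range(1, attempts + 1):
        try:
            yield from make_task()
            return
        except Exception:
            # GeneratorExit is not an Exception subclass, so cancelling
            # the wrapper is never mistaken for a failure to retry.
            if attempt == attempts:
                raise


calls = []


def flaky_create():
    calls.append(len(calls) + 1)
    yield
    if len(calls) < 3:
        raise RuntimeError("transient failure")


for _ in with_retries(flaky_create, attempts=3):
    pass
```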

Workflow requirements

Workflows can be rolled back

If a task in the workflow fails then, as well as cancelling all tasks, any tasks that have been started should be rolled back.

[jh]: sure, although this is only one of many rollback strategies (rolling back all started when any task fails)

[zb]: +1
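The "roll back everything that was started" strategy can be sketched as follows. For brevity the steps run one after another, not in parallel, and `run_with_rollback` and `step` are hypothetical helpers.

```python
def run_with_rollback(steps):
    """Run (apply, undo) pairs in order; if one fails, undo every step
    that was started, most recent first, then re-raise."""
    started = []
    try:
        for apply, undo in steps:
            started.append(undo)   # counts as started even if apply fails
            apply()
    except Exception:
        for undo in reversed(started):
            undo()
        raise


events = []


def step(name, fail=False):
    def apply():
        if fail:
            raise RuntimeError(name + " failed")
        events.append("create " + name)

    def undo():
        events.append("delete " + name)

    return apply, undo


try:
    run_with_rollback([step("network"), step("server", fail=True), step("volume")])
except RuntimeError:
    events.append("error reported")
```

The volume step never starts, so it is not rolled back.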

Workflows report their state

Before a workflow begins rolling back, the caller needs to have the opportunity to:

  1. record this; and
  2. optionally, cancel it

So if e.g. a resource fails, we want to put the stack into the rollback state before rolling back.

(This is the hardest part to implement nicely in the current system.)

[jh]: sure

(asalkeld) Workflow Logging

I guess only partly related to workflow, but... If you have multiple workflows running concurrently the logging is really confusing and never available to the user. Suggestion:

  1. when a workflow starts create a log destination
  2. all tasks in the workflow log to that destination
  3. we can then return that / save it to db/swift

Neat as the user can retrieve it as a single entity, basically a nice record of how their request was handled/or failed.

[jh]: seems possible, might not be that hard?

[zb]: agreed, although there is probably a bunch of stuff we need to put in place first to do this, and it may be possible/desirable to implement it independently of the workflow system. For example, this shouldn't just be a straight store of logs: they should be localised in the user's language, not the system language.
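The three steps suggested above could be prototyped with the standard logging module by giving each workflow its own logger and an in-memory destination. This sketch ignores the localisation concern, and `workflow_logger` is a hypothetical helper.

```python
import io
import logging


def workflow_logger(workflow_id):
    """Return (logger, buffer): a logger dedicated to one workflow and the
    in-memory buffer that collects everything logged through it."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger = logging.getLogger("workflow.%s" % workflow_id)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False   # keep these records out of the shared log
    logger.addHandler(handler)
    return logger, buffer


log, buffer = workflow_logger("stack-create-1")
log.info("Task create from Server running")
log.error("Task create from Server failed")
record = buffer.getvalue()   # could be returned, or saved to the DB/Swift
```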