
Convection

Revision as of 07:44, 17 April 2013 by Kebray (talk | contribs)

PROPOSAL ONLY: Workflow-as-a-Service (Convection)

Please note that this is a PROPOSAL ONLY. This is not yet implemented.

What is Convection

Convection is a proposal for a new Workflow-as-a-Service project for OpenStack clouds. Convection could be a public-facing API service that provides task and state management capabilities, enabling OpenStack API consumers to build complex multi-step applications running on an OpenStack cloud, whether public, private, or hybrid. Convection could also be a service that other OpenStack projects leverage to perform work. For example, one way for Heat to orchestrate standing up cloud stacks could be to leverage a Workflow service for the task-oriented steps of spinning up and connecting cloud resources. Conversely, customers wanting to run meta-workflows could use Heat as one task in a larger meta-workflow, with orchestration of a stack being that task.

Why the name Convection?

Convection was a name proposed by Tim Simpson (Reddwarf developer). The idea is that (1) Convection "conveys," implying organization of order; and (2) convection is often thought of in the context of ovens, which produce heat, and the OpenStack project Heat could be one possible consumer of Workflow, with task flows being analogous to air flow in a convection oven.

What is a Workflow?

Definition Note: There are static workflows and dynamic workflows.

Isn't Workflow an overloaded term? YES! There are misconceptions about what the term Workflow actually means, and it is often used to mean things different from the definition above. For Convection conversation purposes, let's define the following terminology:

Workflow Terms

  1. Deterministic (Static) Workflow: In an academic context, a workflow is sometimes described as a collection of ordered tasks that occur with a defined start, order, and end. Some tasks may be able to execute in parallel, but a pre-determined tree of workflow steps (and parallel branches) is known before runtime, and the flow of the tree is followed upon every execution of the workflow.
  2. Event Based (Dynamic) Workflow: A collection of tasks, some of which may or may not have a required order of execution, where task execution is coordinated through communication of events by individual task start/stop/status notifications. In an event based workflow system, there could be a central task execution coordinator that handles listening for events of task completion and sending events for new tasks to start. Or, code that executes an individual task can encode its own logic to know when to execute based off events directly sent from other tasks.
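
The two terms above can be contrasted with a minimal Python sketch. Everything in it is hypothetical and exists only for illustration: the names `run_static` and `EventCoordinator`, the task names, and the `"task:result"` event naming scheme are assumptions, not part of any proposed Convection API. The static runner derives its order from a dependency graph known before runtime, while the coordinator decides what to start next from the events that tasks emit.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_static(dependencies, tasks):
    """Static workflow: the order is fixed by the dependency graph before runtime.

    `dependencies` maps each task name to the set of tasks it must wait for.
    """
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        tasks[name]()
    return order


class EventCoordinator:
    """Event-based workflow: a central coordinator starts tasks on events."""

    def __init__(self):
        self.listeners = defaultdict(list)  # event name -> task names to start
        self.tasks = {}
        self.history = []  # task names in the order they were started

    def register(self, name, func, after=None):
        """Register a task; `after` is the (hypothetical) event that triggers it."""
        self.tasks[name] = func
        if after is not None:
            self.listeners[after].append(name)

    def start(self, name):
        self.history.append(name)
        result = self.tasks[name]()
        # The task's result is folded into the event name, so the next step
        # is chosen at runtime rather than from a pre-determined tree.
        self.emit(f"{name}:{result}")

    def emit(self, event):
        for name in self.listeners[event]:
            self.start(name)
```

In this sketch, registering `add_to_lb` with `after="boot_server:ok"` and `cleanup` with `after="boot_server:failed"` means only one of the two runs, depending on what `boot_server` returns. The decentralized variant described above, where each task encodes its own trigger logic, is not shown.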


I do not wish to specify the ideal implementation here in this PROPOSAL. I simply want to document some workflow concepts, and leverage the community for collaborative design of a useful Workflow system for OpenStack based workloads.

Use Cases for Workflow-as-a-Service

We see merit in a standalone workflow service that would allow for a variety of functionality to be carried out by other services (e.g. Heat could be one service to make use of Workflow). While the OpenStack project Heat focuses on orchestration of resources and resource connections, workflows could be responsible for:

  • A sequence of tasks that have a start and end
  • Batch processes (multiple sets of sequences of tasks with starts and ends)
  • A persistent job/process (for example an Auto-Scale policy) that remains running until manually terminated
  • A job to run for a specified duration (such as run this automated stress test for 2 days, then exit).


At a high level, one can consider workflows as being either "batch" (with a start and end) or "long running," executing for some duration or until some triggering event occurs.

Potential Workflow-as-a-Service Capabilities

The following is a list of PROPOSED capabilities for Convection. These are not necessarily required for a minimum viable service and are just ideas of what a workflow service might entail:

Conceptual Components

  • Workflow Engine: A workflow engine could provide generic task and state management capabilities. A workflow engine could act as a central state coordinator, enabling workflow client applications to be distributed across public cloud and on-premise deployments. Workflow clients offload state management to the Workflow service thereby allowing the workflow clients to be stateless, scalable, and tolerant of process and client failures. The workflow engine could support configurable constraints at both the workflow and task level, e.g. timeouts, retry count, retry intervals, etc.
  • DSL to encapsulate workflow logic: A workflow system does not need to execute workflow logic, but it could as a value added enhancement. For example, in a simplistic implementation of a Workflow service, the service itself could maintain task state and leave it up to the clients of workflows to implement the business logic of workflow execution. An enhanced version of a Workflow service could allow a client to provide workflow business logic to the workflow service in a declarative DSL, and the workflow engine could enforce that business logic (e.g. notifying tasks when to run, stop, restart, etc.).
  • Command Line Tool / Dashboard: Since OpenStack is a cloud operating system, operating system tools, such as an equivalent of top that lists the jobs running in the cloud, could be very useful. Tools could provide a drill down of existing workflows, currently running workflows, and workflows in various states of execution (running, completed, failed, ready-to-run), and provide the ability to resubmit/retry failed workflow jobs. Workflow tools could also provide analytics: metrics which could help identify performance bottlenecks or common areas of failure in a workflow that is repeated over and over. Some possible metrics could be: average execution time for a workflow, average execution time for individual workflow tasks, task/workflow failure rates, etc.
  • Workflow Repository: A workflow repository could expose a set of pre-determined common task flows (e.g. spin up a server and add it to a load balancer). The Workflow Repository facilitates reuse and makes available a compelling set of pre-defined workflows.
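
The configurable constraints mentioned for the Workflow Engine (timeouts, retry count, retry intervals) could be modelled roughly as follows. This is a sketch under stated assumptions: the `TaskConstraints` fields, the `run_with_constraints` helper, and the `"completed"`/`"failed"` result strings are hypothetical names chosen for illustration, and the timeout here is only checked between attempts, which is one of several possible interpretations.

```python
import time
from dataclasses import dataclass


@dataclass
class TaskConstraints:
    """Hypothetical per-task constraints; field names are assumptions."""
    retry_count: int = 3         # retries allowed after the first attempt
    retry_interval: float = 0.0  # seconds to wait between attempts
    timeout: float = 60.0        # seconds allowed for all attempts combined


def run_with_constraints(task, constraints, sleep=time.sleep, clock=time.monotonic):
    """Run `task`, retrying on exceptions within the given constraints.

    Returns a (state, value, attempts) tuple. The timeout is checked only
    between attempts; an attempt that is already running is not interrupted.
    """
    deadline = clock() + constraints.timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            return "completed", task(), attempts
        except Exception as exc:
            out_of_retries = attempts > constraints.retry_count
            if out_of_retries or clock() >= deadline:
                return "failed", exc, attempts
            sleep(constraints.retry_interval)
```

Passing `sleep` and `clock` as parameters keeps the helper easy to exercise without real waiting. In a stateless-client design, the attempt counter would live in the Workflow service rather than in the worker process.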


One proposal for a Workflow service could be that it not require clients to upload code to the workflow service. Clients would have full flexibility in the language/execution/deployment for the workflow tasks. The only requirement is that the task workers are able to access the REST APIs exposed by the Workflow service and/or receive notifications from the Workflow system (e.g. via webhooks or some other mechanism).

Workflow Engine

Conceptually, a Workflow consists of a set of tasks that need to execute in a certain order. The order in which the tasks execute could be pre-determined; the ordering could also be determined dynamically based on execution results of a previous task.

Capabilities

A Workflow Engine could provide the following features:

  1. Register a Workflow and the tasks associated with the workflow via REST API calls
  2. Ability to specify configurable constraints at the workflow and the task level, e.g. timeouts, retry count, retry interval, etc.
  3. Invoke workflow instances
  4. Query the state of a Workflow instance
  5. Query for a list of all the running workflow instances for a given workflow definition
  6. Support versioning of Workflow definitions
  7. Cancel a workflow instance
  8. Support multiple, parallel invocations of workflows
  9. A workflow instance could invoke another workflow instance [Master-child workflows]
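
To make the capability list more concrete, here is a minimal in-memory sketch of the operations it names. It is not a proposed API: the class `InMemoryEngine`, its method names, and the `"cancelled"` state are assumptions made for illustration, and a real service would expose these operations over REST and persist them in a datastore.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkflowInstance:
    instance_id: int
    workflow: str
    version: int
    state: str = "running"
    parent_id: Optional[int] = None  # set when invoked by another instance


class InMemoryEngine:
    """Hypothetical stand-in for the engine; all names are assumptions."""

    def __init__(self):
        self.definitions = {}  # (name, version) -> list of task names
        self.latest = {}       # name -> most recent version number
        self.instances = {}    # instance_id -> WorkflowInstance
        self._ids = itertools.count(1)

    def register(self, name, tasks):
        """Register a workflow definition; re-registering adds a new version."""
        version = self.latest.get(name, 0) + 1
        self.definitions[(name, version)] = list(tasks)
        self.latest[name] = version
        return version

    def invoke(self, name, version=None, parent_id=None):
        """Start a new instance; many instances may run in parallel."""
        if version is None:
            version = self.latest[name]
        if (name, version) not in self.definitions:
            raise KeyError(f"unknown workflow {name!r} version {version}")
        instance = WorkflowInstance(next(self._ids), name, version, parent_id=parent_id)
        self.instances[instance.instance_id] = instance
        return instance.instance_id

    def state(self, instance_id):
        return self.instances[instance_id].state

    def running(self, name):
        """List ids of running instances for a given workflow definition."""
        return [
            i.instance_id
            for i in self.instances.values()
            if i.workflow == name and i.state == "running"
        ]

    def cancel(self, instance_id):
        instance = self.instances[instance_id]
        if instance.state == "running":
            instance.state = "cancelled"  # assumed state name
```

A master-child relationship is represented here only by `parent_id`; whether cancelling a parent should also cancel its children is left open, since the proposal does not say.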

Datastore

The following information could be stored in the Workflow service datastore:

  1. List of registered workflows, tasks, and the associated constraints like timeouts, retries
  2. Execution state for the workflow instances (completed, running, error, ready to run)
  3. Scheduled Task Queues. The workflow engine could maintain a task queue for each of the registered task types. The workflow engine could publish task items to the task queues when a task needs to be scheduled for execution
  4. Workflow Process Context containing the runtime information associated with a given workflow instance i.e. the input data that came from the application that invoked the workflow, the output data generated by the workflow tasks, and any other data needed for administering the workflow instance, like the start time, running duration, etc.
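
The four kinds of stored information listed above might map onto data structures like the following. This is an illustrative sketch only: `ProcessContext`, `Datastore`, and their field and method names are hypothetical, and an actual service would back them with a durable store rather than in-process dictionaries.

```python
import time
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class ProcessContext:
    """Runtime information for one workflow instance (field names assumed)."""
    input_data: dict
    output_data: dict = field(default_factory=dict)
    start_time: float = field(default_factory=time.time)


@dataclass
class Datastore:
    # 1. Registered workflows with their tasks and constraints.
    definitions: dict = field(default_factory=dict)
    # 2. Execution state per workflow instance id.
    states: dict = field(default_factory=dict)
    # 3. One FIFO task queue per registered task type.
    task_queues: defaultdict = field(default_factory=lambda: defaultdict(deque))
    # 4. Process context per workflow instance id.
    contexts: dict = field(default_factory=dict)

    def schedule(self, task_type, instance_id):
        """Publish a task item to the queue for its task type."""
        self.task_queues[task_type].append(instance_id)

    def next_task(self, task_type):
        """Hand the oldest queued item to a worker, or None if the queue is empty."""
        queue = self.task_queues[task_type]
        return queue.popleft() if queue else None
```

Keeping a separate queue per task type lets workers for one kind of task poll independently of the others, which fits the idea of stateless task workers pulling work from the service.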


Conceptual Diagram

The diagram below depicts a possible interaction between the Workflow engine and a workflow client making use of the workflow service. The green boxes are implemented by the workflow client.

Workflow.png