Difference between revisions of "Distributed Task Management With RPC"

Revision as of 21:34, 10 March 2014

Distributed task management built on RPC can be easily integrated with OpenStack (oslo.messaging.rpc can be used). RPC allows to build reliable and scalable system. This document proposes an architecture of distributed flow that can run tasks simultaneously on multiple workers. The main goal is to provide a such architecture that will allow to replace a simple flow with a distributed flow without changing a code.

See more details @ https://etherpad.openstack.org/p/TaskFlowWorkerBasedEngine

Architecture

Client is a machine that runs a distributed flow.
Worker is a server that executes distributed flows’ tasks.
Distributed task is a flow task that performs a remote procedure call.
Remote task is a task that runs on the worker side and executes some code to make a flow progress.

Distributed flow consists of client and workers. Client runs a flow. When the flow want to start a new task it makes RPC call to workers and passes client's ednpoint and task's arguments. One of the workers accepts the task and send a confirmation to the client. Then it starts to execute the task and sends heartbeats to the client. Client listens the worker. When the task is done the worker sends a result. The client considers worker as failed if it hadn't been receiving a task status message for a timeout period. There is a flow architecture on the following image.

Distributed Flow implementation

We don't want to make any difference between distributed and non-distributed flow descriptions. The difference should be only in flow engines. Distributed engine should work very similar to SingleThreadedEngine. But instead of TaskAction the DistributedTaskAction should be used. The DistributedTaskAction should make an RPC call instead of running Task.execute() method. On the worker's side a special interface should be implemented for each task.

   class MyTaskEndpoint(object):
       target = messaging.Target(topic='MyTask', server='server1')
       def execute(self, client_endpoint, task_args):
            RPCClient.register_listener(client_endpoint)
            task = tasks.MyTask()
            result = task.execute(**task_args)
            RPCClient.send_result(result, client_endpoint)
       def revert(self, client_endpoint, task_args):
            heartbeat.register_listener(client_endpoint)
            task = tasks.MyTask()
            task.revert(**task_args)
            RPCClient.notify_reverted(client_endpoint)

This interface has its unique topic that corresponds to the name of the task. It contains execute and revert methods. These methods accept a client endpoint to send a heartbeats to the client, and task args required to execute the task.

Revision as of 21:34, 10 March 2014 (view source) Harlowja (talk \| contribs) ← Older edit		Revision as of 21:34, 10 March 2014 (view source) Harlowja (talk \| contribs) Newer edit →
Line 4:		Line 4:

	<br />		<br />
−
−
	'''Architecture'''		'''Architecture'''