Jump to: navigation, search

Mistral/Cloud Cron details

< Mistral
Revision as of 05:59, 30 October 2013 by Rakhmerov (talk | contribs)

Tasks Scheduling - Cloud Cron

Problem Statement

Pretty often while administering a network of computers there’s a need to establish periodic on-schedule execution of maintenance jobs for doing various kinds of work that otherwise would have to be started manually by a system administrator. The set of such jobs ranges widely from cleaning up needless log files to health monitoring and reporting. One of the most commonly known tools in Unix world to set up and manage those periodic jobs is Cron. It perfectly fits the uses cases mentioned above. For example, using Cron we can easily schedule any system process for running every even day of week at 2.00 am. For a single machine it’s fairly straightforward how to administer jobs using Cron and the approach itself has been adopted by millions of IT folks all over the world. Now what if we want to be able to set up and manage on-schedule jobs for multiple machines? It would be very convenient to have a single point of control over their schedule and themselves (i.e. “when” and “what”). Furthermore, when it comes to a cloud environment the cloud provides additional RESTful services (and not only RESTful) that we may also want to call in on-schedule manner along with operating system local processes.

Picture 1. Managing on-schedule local jobs manually.


Mistral service for OpenStack cloud addresses this demand naturally. Its capabilities allow configuring any number of tasks to be run according to a specified schedule in a scale of a cloud. Here’s the list of some typical jobs we can choose from: Run a shell script on specified virtual instances (e.g. VM1, VM3 and VM27). Run an arbitrary system process on specified instances. Start/Reboot/Shutdown instances. Call an accessible cloud services (e.g. Trove). Add instances to a load balancer. Deploy an application on specified instances. This list is not full and any other user meaningful jobs can be added. To make it possible Mistral provides a plugin mechanism so that it’s pretty easy to add new functionality via supplying new Mistral plugins.

Picture 2. Mistral provides a single point of control over on-schedule cloud jobs.

Basically, Mistral acts as a mediator between a user, virtual instances and cloud services in a sense that it brings capabilities over them like task management (start, stop etc.), task state and execution monitoring (success, failure, in progress etc.) and task scheduling. Since Mistral is a distributed workflow engine those types of jobs listed above can be combined in a single logical unit, a workflow. For example, we can tell Mistral to take care of the following workflow for us: On every Monday at 1.00 am start grepping phrase “Hello, Mistral!” from log files located at /var/log/myapp.log on instances VM1, VM30, VM54 and put the results in Swift. On success: Generate the report based on the data in Swift. On success: Send the generated report to an email address. On failure: Send an SMS with error details to a system administrator. On failure: Send an SMS with error details to a system administrator. A workflow similar to the one described above may be of any complexity but still considered a single task from a user perspective. However, Mistral is smart enough to analyze the workflow and identify individual sequences that can be run in parallel thereby taking advantage of distribution and load balancing under the hood. It is worth noting that Mistral is nearly linearly scalable and hence is capable to schedule and process virtually any number of tasks simultaneously.


So in this use case description we tried to show how Mistral capabilities can be used for scheduling different user tasks in a cloud scale. Semantically it would be correct to call this use case Distributed Con or Cloud Cron. One of the advantages of using a service like Mistral in case like this is that along with base functionality to schedule and execute tasks it provides additional capabilities like navigating over task execution status and history (using web UI or REST API), replaying already finished tasks, on-demand task suspension and resumption and many other things that are useful for both system administrators and application developers.