RACK
RACK (Real Application Centric Kernel)
Project Resources
Resource | English | Japanese |
Wiki | https://wiki.openstack.org/wiki/RACK | https://wiki.openstack.org/wiki/RACK/ja |
Source code | https://github.com/stackforge/rack | |
Deployment guide | https://github.com/stackforge/rack/tree/master/tools/setup | https://github.com/stackforge/rack/blob/master/tools/setup/README_ja.md |
Sample applications | | |
Python RACK client | https://github.com/stackforge/python-rackclient | |
OpenStack Native Application
Today's applications were designed "before the cloud". Because they were not designed for the cloud, they need additional tools such as Chef, Puppet, Ansible, or Serf to take advantage of it, and systems therefore tend to become more complex on the cloud. It is time to think about "after the cloud" applications.
We think an “after the cloud” application satisfies the following conditions.
- The application logic itself determines the amount of resources it requires.
- The application logic repeatedly allocates resources from the cloud and releases them.
- Those allocations and releases behave like the creation and deletion of processes.
We call an application that behaves this way an “OpenStack Native Application”.
What is RACK?
RACK enables an application to control VMs as if they were Linux processes. It provides the application with PIDs, parent-child relationships, the ability to “Fork” and “Kill” VMs, and inter-VM communication and message exchange that do not require a VM's IP address. Ultimately, RACK enables you to implement a large-scale distributed system on OpenStack in a variety of programming languages.
VM like a Linux process?
Look at the picture below. It shows how the idea of a Linux process maps onto RACK. Generally, an executable is produced by compiling and linking, and the operating system manages it as a process once it is loaded into memory. In RACK, a VM image plays the role of the executable, and it includes the OS and middleware as well as the application. When the executable (actually a VM image) is launched on OpenStack, RACK gives it a process ID (PID). When that process executes "Fork", a child process is launched and given its own PID and a parent PID (PPID).
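The mapping described above can be sketched with a small in-memory model. This is only an illustration under our own assumptions: `Process`, `ProcessTable`, `launch`, `fork`, and `kill` are hypothetical names chosen for this sketch, not the RACK API, and no real VM is booted.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Process:
    pid: int
    ppid: Optional[int]  # None for a root process that has no parent
    image: str           # VM image name, playing the role of the executable


class ProcessTable:
    """Toy stand-in for the bookkeeping described above (hypothetical, not RACK's API)."""

    def __init__(self) -> None:
        self._next_pid = itertools.count(1)
        self.processes: Dict[int, Process] = {}

    def launch(self, image: str) -> Process:
        """Launch a root process from a VM image and give it a PID."""
        proc = Process(pid=next(self._next_pid), ppid=None, image=image)
        self.processes[proc.pid] = proc
        return proc

    def fork(self, parent: Process) -> Process:
        """Launch a child from the parent's image; it gets its own PID and a PPID."""
        child = Process(pid=next(self._next_pid), ppid=parent.pid, image=parent.image)
        self.processes[child.pid] = child
        return child

    def kill(self, pid: int) -> None:
        """Remove a process from the table (a real system would delete the VM)."""
        self.processes.pop(pid, None)


table = ProcessTable()
parent = table.launch("app-image")
child = table.fork(parent)
print(child.pid, child.ppid)
```

The point of the sketch is that the application only deals with PIDs and parent-child links; which VM backs each entry is left to the platform.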
Features
RACK provides the following features.
- Data Structure
To manage a VM like a Linux process, RACK adds attributes to it such as PID (process ID), PPID (parent process ID), and GID (process group ID).
- Interprocess Communication
Processes can send messages to each other without knowing each other's IP addresses.
- Shared Memory
Processes can share data, such as the input data to be processed and the resulting outputs.
- File System
The file system can be used in several ways, such as storing input data files and output files, or sharing data files among processes.
In the near future, we plan to provide the following features.
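As a rough illustration of interprocess communication without IP addresses, the sketch below addresses messages by PID only. `MessageBroker`, `send`, and `receive` are hypothetical names for this example and say nothing about how RACK implements the feature.

```python
from collections import defaultdict, deque
from typing import Any, Deque, Dict, Optional


class MessageBroker:
    """Toy mailbox service keyed by PID (hypothetical, not RACK's API)."""

    def __init__(self) -> None:
        # One FIFO mailbox per destination PID; no IP address is stored anywhere.
        self._mailboxes: Dict[int, Deque[Any]] = defaultdict(deque)

    def send(self, to_pid: int, message: Any) -> None:
        """Queue a message for the process identified by to_pid."""
        self._mailboxes[to_pid].append(message)

    def receive(self, pid: int) -> Optional[Any]:
        """Return the oldest pending message for pid, or None if there is none."""
        mailbox = self._mailboxes[pid]
        return mailbox.popleft() if mailbox else None


broker = MessageBroker()
broker.send(2, "process chunk 7")   # the sender only needs the receiver's PID
first = broker.receive(2)
second = broker.receive(2)          # mailbox is now empty
print(first, second)
```

Because the sender never sees a network address, a process can be moved or replaced without its peers changing how they reach it.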
- Pipeline
Chains one process to another through their standard streams, just like a Unix pipeline.
- Zombie Process Collector
A process becomes a zombie process when it cannot be killed, when it is stuck in an endless loop, or when its parent process has been killed. The zombie process collector detects these processes and cleans them up.
- Compiler
A tool that automates VM image creation. The compiler helps you create VM images adapted to RACK.
- Debugger
Useful when developing an application. A process is typically killed as soon as it finishes its job, so you cannot examine the program state or track down the origin of a problem. The debugger supports these tasks.
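To make the zombie process idea more concrete, here is a minimal sketch of how detection might work. It covers two of the cases listed above: a process whose parent no longer exists, and a process that has run longer than an assumed limit (used here as a stand-in for an endless loop). `Proc`, `find_zombies`, and `max_runtime` are assumptions made for this illustration; the planned collector may work differently.

```python
from typing import Dict, List, NamedTuple, Optional


class Proc(NamedTuple):
    pid: int
    ppid: Optional[int]   # None for a root process
    started_at: float     # start time in seconds (any monotonic reference)


def find_zombies(procs: Dict[int, Proc], now: float, max_runtime: float) -> List[int]:
    """Return the PIDs that look like zombies under the two assumed rules."""
    zombies = []
    for proc in procs.values():
        orphaned = proc.ppid is not None and proc.ppid not in procs
        overdue = now - proc.started_at > max_runtime
        if orphaned or overdue:
            zombies.append(proc.pid)
    return sorted(zombies)


procs = {
    1: Proc(pid=1, ppid=None, started_at=0.0),     # healthy root process
    2: Proc(pid=2, ppid=1, started_at=0.0),        # healthy child
    3: Proc(pid=3, ppid=99, started_at=0.0),       # parent 99 is gone
    4: Proc(pid=4, ppid=1, started_at=-1000.0),    # running far too long
}
zombies = find_zombies(procs, now=10.0, max_runtime=100.0)
print(zombies)
```

A collector built on this idea would then kill each reported PID; the runtime limit would have to be chosen per application, since long-running jobs are not necessarily stuck.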
Architecture Overview
RACK provides the above features as a RESTful service. An application communicates with RACK only through the library we provide.
Pseudo Code
This is an image of a program adapted to RACK. You can develop a distributed application simply, as below. You don't need to write complex, one-off code. In particular, you don't need to write cloud-aware code; that is, you don't need to know any IP addresses.
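The shape of such a program can be sketched as follows. Everything here is an assumption made for illustration: `ToyRack` is an in-memory stand-in with hypothetical `fork`, `kill`, `send`, and `receive` methods, not the real RACK library, and the workers run inline instead of in separate VMs.

```python
import itertools
from collections import defaultdict, deque


class ToyRack:
    """Hypothetical in-memory stand-in for the RACK library; not its real API."""

    def __init__(self):
        self._pids = itertools.count(1)
        self.parents = {}                      # pid -> ppid
        self._mailboxes = defaultdict(deque)   # pid -> pending messages

    def fork(self, ppid=None):
        pid = next(self._pids)
        self.parents[pid] = ppid
        return pid

    def kill(self, pid):
        self.parents.pop(pid, None)
        self._mailboxes.pop(pid, None)

    def send(self, to_pid, message):
        self._mailboxes[to_pid].append(message)

    def receive(self, pid):
        return self._mailboxes[pid].popleft()


def worker(rack, pid):
    """Child process: take one chunk, compute, and report to the parent by PID."""
    chunk = rack.receive(pid)
    rack.send(rack.parents[pid], sum(chunk))


def master(rack, data, chunk_size):
    """Parent process: size the worker pool from the data, then fork, collect, kill."""
    me = rack.fork()
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    children = [rack.fork(ppid=me) for _ in chunks]   # one child per chunk
    for pid, chunk in zip(children, chunks):
        rack.send(pid, chunk)
    for pid in children:
        worker(rack, pid)   # in a real deployment each child VM would run this itself
    total = sum(rack.receive(me) for _ in children)
    for pid in children:
        rack.kill(pid)      # release resources once the work is done
    return total, len(children)


total, worker_count = master(ToyRack(), list(range(10)), chunk_size=3)
print(total, worker_count)
```

Note that the application code never mentions a host name or IP address: it decides how many children it needs, forks them, exchanges messages by PID, and kills them when finished.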
Use Cases
You can use RACK in many situations. The following are some examples.
- You can implement applications with a new architecture. For example, you can build an application that calculates the amount of computing resources (i.e. instances) it needs from the data to be processed, and launches additional instances dynamically. The data is then processed very quickly because these instances work in parallel. This kind of application is well suited to processing large amounts of data.
- You can use RACK to integrate existing systems, such as Hadoop-based batch systems and Web applications. For example, RACK enables you to deploy a Hadoop cluster easily and to add an autoscaling function to your Web applications.
Benchmark of OpenStack Native Application
We conducted a benchmark test of an OpenStack Native Application that behaves like the first use case above. The following graph shows the result, and it illustrates one of the characteristics of an OpenStack Native Application. The application scales out its worker processes according to the number of datasets, and these processes work in parallel, so the total execution time stays constant. Theoretically, the more the application scales out, the shorter the execution time becomes.
This is good for both customers and cloud providers. For customers, using many low-spec instances to process data quickly is less expensive than using a single high-spec instance. For cloud providers, more resources are consumed.
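The reason the execution time stays constant can be seen with a simple back-of-the-envelope model. The function below is our own simplification, and the numbers are illustrative values, not benchmark results: it assumes every dataset takes the same time to process and that work is split evenly across workers.

```python
import math


def total_time(datasets: int, workers: int, seconds_per_dataset: float,
               startup_seconds: float = 0.0) -> float:
    """Estimated wall-clock time when datasets are split evenly across workers."""
    rounds = math.ceil(datasets / workers)   # datasets each worker handles in sequence
    return startup_seconds + rounds * seconds_per_dataset


for n in (10, 100, 1000):
    single = total_time(n, workers=1, seconds_per_dataset=60.0)
    scaled = total_time(n, workers=n, seconds_per_dataset=60.0)  # one worker per dataset
    print(n, single, scaled)
```

With one worker per dataset, the estimate does not depend on the number of datasets, while a single worker's time grows linearly. In practice, instance startup time and coordination overhead would add to the scaled-out figure.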