Meteos/Usecase


Revision as of 05:41, 20 October 2016

Predict sales using Meteos

In this example, the user creates a prediction model that predicts sales by using Linear Regression. Linear Regression is one of the algorithms used in supervised learning.

"Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal)." (From Wikipedia: https://en.wikipedia.org/wiki/Supervised_learning)

In this example, sales is the desired output value.
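To make the idea of "input object" and "desired output value" concrete, the following is a minimal sketch of linear regression on toy data with a single input feature. It uses the closed-form least-squares solution, which is an assumption made for brevity and is not necessarily how Meteos or MLlib trains its model. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy labeled training data: each pair is (input, desired output).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

In the sales example the input is a vector of several values (date, weather and so on) rather than a single number, but the principle is the same: the model learns coefficients that map inputs to the desired output.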

1. Create an experiment template

Create an experiment template. An experiment is a workspace for machine learning.

$ cat json/create_template.json

{
   "display_name": "example-template",
   "display_description": "This is a sample template of experiment",
   "image_id" : "c48f5dba-45be-4165-9825-f4564fecebcd",
   "master_nodes_num": 1,
   "worker_nodes_num": 2,
   "spark_version": "1.6"
}

$ meteos template-create --json json/create_template.json

+-------------+-----------------------------------------+
| Property    | Value                                   |
+-------------+-----------------------------------------+
| cluster_id  | 0984a3e7-cd8b-4b7e-b4f0-f58dc4fc6c28    |
| created_at  | 2016-10-20T02:10:13.483231              |
| description | This is a sample template of experiment |
| id          | 26cbc33b-179e-4d22-8b1b-23c9a305f196    |
| image_id    | c48f5dba-45be-4165-9825-f4564fecebcd    |
| name        | example-template                        |
| project_id  | 5f011d076a6f4a328989c57ac7b4e501        |
| status      | available                               |
+-------------+-----------------------------------------+

2. Create an experiment from the template

Create an experiment using the template created in the previous step.

$ cat json/create_experiment.json

{
   "display_name": "example-experiment",
   "display_description": "This is a sample experiment",
   "key_name": "key1",
   "net_id": "bc85cb2a-53ad-4375-801e-ed507d416e09",
   "template_id": "26cbc33b-179e-4d22-8b1b-23c9a305f196"
}

$ meteos experiment-create --json json/create_experiment.json

+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| created_at  | 2016-10-20T02:11:51.778930           |
| description | This is a sample experiment          |
| id          | 0550a7db-b148-4319-bb35-16a9e4500d4a |
| name        | example-experiment                   |
| project_id  | 5f011d076a6f4a328989c57ac7b4e501     |
| status      | available                            |
+-------------+--------------------------------------+

Meteos creates an experiment using the OpenStack Sahara Spark plugin. The user can see the Sahara cluster and the Nova VMs created by Meteos.

$ sahara cluster-list

+------------------+--------------------------------------+--------+------------+
| name             | id                                   | status | node_count |
+------------------+--------------------------------------+--------+------------+
| cluster-0550a7db | 13b04f2e-5605-4b6b-b9fb-1ddf3dd85550 | Active | 3          |
+------------------+--------------------------------------+--------+------------+

$ nova list

+--------------------------------------+----------------------------+---------+------------+-------------+-----------------------------------+
| ID                                   | Name                       | Status  | Task State | Power State | Networks                          |
+--------------------------------------+----------------------------+---------+------------+-------------+-----------------------------------+
| f526ec88-9000-4ddd-8f3b-ab60abbf7ca6 | cluster-0550a7db-master-0  | ACTIVE  | -          | Running     | private=10.0.0.140, 192.168.0.99  |
| 6812aaa6-430d-4622-acd6-a558e1505e05 | cluster-0550a7db-workers-0 | ACTIVE  | -          | Running     | private=10.0.0.141, 192.168.0.92  |
| 825727cf-f115-4d50-af26-99224ec94f8f | cluster-0550a7db-workers-1 | ACTIVE  | -          | Running     | private=10.0.0.139, 192.168.0.100 |
+--------------------------------------+----------------------------+---------+------------+-------------+-----------------------------------+

3. Upload raw data

Upload the raw data (sales data) to OpenStack Swift. Each line of the raw data contains, from left to right: "sales", "day", "month", "year", "day of week", "parameter which indicates weather", "degree", "humidity".

$ head meteos-test-data.txt

500000,1,10,2016,6,0,68,50
550000,2,10,2016,0,1,68,90
300000,3,10,2016,1,0,60,55
350000,4,10,2016,2,2,58,87
0,5,10,2016,3,3,58,60 # a holiday
400000,6,10,2016,4,3,60,60
330000,7,10,2016,5,2,62,87
550000,8,10,2016,6,1,66,92
600000,9,10,2016,0,1,55,93
330000,10,10,2016,1,0,57,55

$ swift upload meteos meteos-test-data.txt

meteos-test-data.txt
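As an illustration of the column layout described above, the following sketch maps one line of the raw data to named fields. The field identifiers and the helper `describe_row` are hypothetical names chosen for this example; they are not part of Meteos.

```python
# Column order as described above; the identifiers are hypothetical.
FIELDS = ["sales", "day", "month", "year", "day_of_week",
          "weather", "degree", "humidity"]


def describe_row(line):
    """Map one comma-separated line to a dict of integer fields."""
    # Drop a trailing "# ..." comment such as the one on the holiday row.
    data = line.split("#", 1)[0].strip()
    values = [int(v) for v in data.split(",")]
    return dict(zip(FIELDS, values))


row = describe_row("500000,1,10,2016,6,0,68,50")
print(row["sales"], row["day_of_week"], row["humidity"])  # 500000 6 50
```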

4. Download raw data to the experiment

Download the raw data from Swift to the experiment. The downloaded data is distributed in HDFS (Hadoop Distributed File System).

$ cat json/download_dataset.json

{
   "display_name": "sample-data",
   "display_description": "This is a sample dataset",
   "account": "demo:user01",
   "password": "0251c36e80584efd",
   "authurl": "http://192.168.0.4:5000/v2.0",
   "container_name": "meteos",
   "object_name": "meteos-test-data.txt",
   "experiment_id": "0550a7db-b148-4319-bb35-16a9e4500d4a"
}

$ meteos dataset-download --json json/download_dataset.json

+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| created_at  | 2016-10-20T02:17:27.831814           |
| description | This is a sample dataset             |
| id          | bd134722-22cd-4247-8e20-538933e0975d |
| name        | sample-data                          |
| project_id  | 5f011d076a6f4a328989c57ac7b4e501     |
| status      | available                            |
+-------------+--------------------------------------+

5. Parse raw data

Parse the raw data so that MLlib (Apache Spark's scalable machine learning library) can handle it. The required format depends on the machine learning algorithm. In this example, each line is converted to a list using the map method, and unwanted records (those whose sales value is 0) are removed using the filter method.

$ cat json/parse_dataset.json

{
   "id": "bd134722-22cd-4247-8e20-538933e0975d",
   "method": "parse",
   "params": [{"method": "map", "args": "lambda l: l.split(',')"},
              {"method": "filter", "args": "lambda l: l[0] != '0'"}],
   "experiment_id": "0550a7db-b148-4319-bb35-16a9e4500d4a"
}

$ meteos dataset-parse --json json/parse_dataset.json
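The effect of the two lambdas can be sketched in plain Python. This assumes that Meteos applies them in order to each line of the dataset (presumably as Spark RDD operations); here the built-in `map` and `filter` stand in for that, and the sample lines are taken from the raw data shown earlier.

```python
lines = [
    "500000,1,10,2016,6,0,68,50",
    "0,5,10,2016,3,3,58,60",
    "400000,6,10,2016,4,3,60,60",
]

# Step 1 ("map"): split every line into a list of string fields.
split_rows = list(map(lambda l: l.split(','), lines))

# Step 2 ("filter"): keep only rows whose first field (sales) is not '0'.
parsed = list(filter(lambda l: l[0] != '0', split_rows))

for row in parsed:
    print(row)
```

The row with sales equal to 0 (the holiday) is dropped, so it does not influence the model trained in the next step.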

6. Create a prediction model

$ cat json/create_model.json

{
   "display_name": "sample-lr-model",
   "display_description": "This is a sample model",
   "dataset_id": "bd134722-22cd-4247-8e20-538933e0975d",
   "method": "LinearRegression",
   "args": "{'numIterations': 10, 'desired_output':0}",
   "experiment_id": "c08027ab-2ea5-4b57-840c-563908cfca46"
}

$ meteos model-create --json json/create_model.json

+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| created_at  | 2016-10-20T02:17:27.831814           |
| description | This is a sample model               |
| id          | c48f5dba-45be-4165-9825-f4564fecebcd |
| name        | sample-lr-model                      |
| project_id  | 5f011d076a6f4a328989c57ac7b4e501     |
| status      | available                            |
+-------------+--------------------------------------+
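The `'desired_output': 0` argument appears to select which column of each parsed row is the value to predict (here column 0, sales). The following sketch shows that interpretation; it is an assumption about the argument's meaning, and `split_label_features` is a hypothetical helper, not a Meteos function.

```python
def split_label_features(row, desired_output=0):
    """Return (label, features) for one parsed row of string fields."""
    values = [float(v) for v in row]
    label = values[desired_output]
    # Every column except the desired output is used as an input feature.
    features = values[:desired_output] + values[desired_output + 1:]
    return label, features


label, features = split_label_features(
    ["500000", "1", "10", "2016", "6", "0", "68", "50"])
print(label)     # 500000.0
print(features)  # [1.0, 10.0, 2016.0, 6.0, 0.0, 68.0, 50.0]
```

Under this reading each training example has seven input features, which matches the seven comma-separated values passed as "args" in the prediction step.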

7. Predict

$ cat json/predict_data.json

{
   "display_name": "predict job",
   "display_description": "This is a sample job",
   "model_id": "c48f5dba-45be-4165-9825-f4564fecebcd",
   "method": "predict",
   "args": "11,10,2016,2,0,57,58",
   "experiment_id": "c08027ab-2ea5-4b57-840c-563908cfca46"
}

$ meteos job-create --json json/predict_data.json

+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| created_at  | 2016-10-20T02:17:37.644234           |
| description | This is a sample job                 |
| id          | c38b9c22-d72c-4255-8236-7e319d351fad |
| name        | predict_job                          |
| project_id  | 5f011d076a6f4a328989c57ac7b4e501     |
| status      | executing                            |
+-------------+--------------------------------------+

Retrieve the predicted value from the stdout of the job execution.

$ meteos job-show c38b9c22-d72c-4255-8236-7e319d351fad

+----------------------------+-------------------------------------------------+
| Property                   | Value                                           | 
+----------------------------+-------------------------------------------------+
| stdout                     | Value : (325944.477851)                         |
| stderr                     |                                                 |
+----------------------------+-------------------------------------------------+
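If the predicted value is needed in a script, it can be extracted from the stdout string. The sketch below assumes the stdout always has the form shown above, "Value : (<number>)"; the helper name `parse_predicted_value` is hypothetical.

```python
import re


def parse_predicted_value(stdout):
    """Extract the number from a string like 'Value : (325944.477851)'."""
    match = re.search(r"Value\s*:\s*\(([-+]?\d+(?:\.\d+)?)\)", stdout)
    if match is None:
        raise ValueError("no predicted value found in: %r" % stdout)
    return float(match.group(1))


print(parse_predicted_value("Value : (325944.477851)"))  # 325944.477851
```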