Difference between revisions of "Meteos/Usecase"

Revision as of 05:38, 20 October 2016

Predict sales using Meteos

In this example, the user creates a prediction model that predicts sales using Linear Regression. Linear Regression is a supervised learning algorithm.

"Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal)." (From Wikipedia https://en.wikipedia.org/wiki/Supervised_learning)

In this example, sales is the desired output value.
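
As a minimal illustration of "inferring a function from labeled training data", the following Python sketch fits a one-variable line by ordinary least squares. The toy data and the helper name `fit_line` are invented for illustration and are not part of Meteos, which delegates training to Spark MLlib.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by the variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled examples: each input x is paired with a desired output y.
inputs = [1.0, 2.0, 3.0, 4.0]
labels = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(inputs, labels)
print(slope, intercept)
```

The sales example works the same way in principle, except that each input is a vector of seven values rather than a single number.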

1. Create Template of Experiment

Create a template for the experiment. An experiment is a workspace for machine learning.

$ cat json/create_template.json

{
   "display_name": "example-template",
   "display_description": "This is a sample template of experiment",
   "image_id" : "c48f5dba-45be-4165-9825-f4564fecebcd",
   "master_nodes_num": 1,
   "worker_nodes_num": 2,
   "spark_version": "1.6"
}

$ meteos template-create --json json/create_template.json

+-------------+-----------------------------------------+
| Property    | Value                                   |
+-------------+-----------------------------------------+
| cluster_id  | 0984a3e7-cd8b-4b7e-b4f0-f58dc4fc6c28    |
| created_at  | 2016-10-20T02:10:13.483231              |
| description | This is a sample template of experiment |
| id          | 26cbc33b-179e-4d22-8b1b-23c9a305f196    |
| image_id    | c48f5dba-45be-4165-9825-f4564fecebcd    |
| name        | example-template                        |
| project_id  | 5f011d076a6f4a328989c57ac7b4e501        |
| status      | available                               |
+-------------+-----------------------------------------+

2. Create Experiment from Template

Create an experiment from the template created in the previous step.

$ cat json/create_experiment.json

{
   "display_name": "example-experiment",
   "display_description": "This is a sample experiment",
   "key_name": "key1",
   "net_id": "bc85cb2a-53ad-4375-801e-ed507d416e09",
   "template_id": "26cbc33b-179e-4d22-8b1b-23c9a305f196"
}

$ meteos experiment-create --json json/create_experiment.json

+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| created_at  | 2016-10-20T02:11:51.778930           |
| description | This is a sample experiment          |
| id          | 0550a7db-b148-4319-bb35-16a9e4500d4a |
| name        | example-experiment                   |
| project_id  | 5f011d076a6f4a328989c57ac7b4e501     |
| status      | available                            |
+-------------+--------------------------------------+

Meteos creates an experiment using the OpenStack Sahara Spark plugin. The user can see the Sahara cluster and the Nova VMs created by Meteos.

$ sahara cluster-list

+------------------+--------------------------------------+--------+------------+
| name             | id                                   | status | node_count |
+------------------+--------------------------------------+--------+------------+
| cluster-0550a7db | 13b04f2e-5605-4b6b-b9fb-1ddf3dd85550 | Active | 3          |
+------------------+--------------------------------------+--------+------------+

$ nova list

+--------------------------------------+----------------------------+---------+------------+-------------+-----------------------------------+
| ID                                   | Name                       | Status  | Task State | Power State | Networks                          |
+--------------------------------------+----------------------------+---------+------------+-------------+-----------------------------------+
| f526ec88-9000-4ddd-8f3b-ab60abbf7ca6 | cluster-0550a7db-master-0  | ACTIVE  | -          | Running     | private=10.0.0.140, 192.168.0.99  |
| 6812aaa6-430d-4622-acd6-a558e1505e05 | cluster-0550a7db-workers-0 | ACTIVE  | -          | Running     | private=10.0.0.141, 192.168.0.92  |
| 825727cf-f115-4d50-af26-99224ec94f8f | cluster-0550a7db-workers-1 | ACTIVE  | -          | Running     | private=10.0.0.139, 192.168.0.100 |
+--------------------------------------+----------------------------+---------+------------+-------------+-----------------------------------+

3. Upload Raw Data

Upload the raw data to Swift. Each line of the raw data contains, from left to right: "sales", "day", "month", "year", "day of week", "parameter which indicates weather", "degree", and "humidity".

$ head meteos-test-data.txt

500000,1,10,2016,6,0,68,50
550000,2,10,2016,0,1,68,90
300000,3,10,2016,1,0,60,55
350000,4,10,2016,2,2,58,87
0,5,10,2016,3,3,58,60 # a holiday
400000,6,10,2016,4,3,60,60
330000,7,10,2016,5,2,62,87
550000,8,10,2016,6,1,66,92
600000,9,10,2016,0,1,55,93
330000,10,10,2016,1,0,57,55

$ swift upload meteos meteos-test-data.txt

meteos-test-data.txt
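
To make the column order concrete, here is a small Python sketch that turns one line of the file into a named record. The field names are illustrative identifiers derived from the column descriptions above; Meteos itself does not define them.

```python
from collections import namedtuple

# Field names are assumptions chosen for readability; the source only
# describes the columns in prose.
Record = namedtuple(
    "Record",
    ["sales", "day", "month", "year", "day_of_week", "weather", "degree", "humidity"],
)


def parse_line(line):
    """Split one comma-separated line into a Record of integers."""
    return Record(*(int(value) for value in line.strip().split(",")))


first = parse_line("500000,1,10,2016,6,0,68,50")
print(first.sales, first.day_of_week, first.humidity)
```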

4. Download Raw Data to Experiment

Download the raw data from Swift to the experiment. The downloaded data is distributed across HDFS (Hadoop Distributed File System).

$ cat json/download_dataset.json

{
   "display_name": "sample-data",
   "display_description": "This is a sample dataset",
   "account": "demo:user01",
   "password": "0251c36e80584efd",
   "authurl": "http://192.168.0.4:5000/v2.0",
   "container_name": "meteos",
   "object_name": "meteos-test-data.txt",
   "experiment_id": "0550a7db-b148-4319-bb35-16a9e4500d4a"
}

$ meteos dataset-download --json json/download_dataset.json

+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| created_at  | 2016-10-20T02:17:27.831814           |
| description | This is a sample dataset             |
| id          | bd134722-22cd-4247-8e20-538933e0975d |
| name        | sample-data                          |
| project_id  | 5f011d076a6f4a328989c57ac7b4e501     |
| status      | available                            |
+-------------+--------------------------------------+

5. Parse Raw Data

Parse the raw data so that MLlib (Apache Spark's scalable machine learning library) can handle it. The required format depends on the machine learning algorithm. In this example, each line is converted to a list with the map method, and records that should not be used for training (those whose sales value is 0) are removed with the filter method.

$ cat json/parse_dataset.json

{
   "id": "bd134722-22cd-4247-8e20-538933e0975d",
   "method": "parse",
   "params": [{"method": "map", "args": "lambda l: l.split(',')"},
              {"method": "filter", "args": "lambda l: l[0] != '0'"}],
   "experiment_id": "0550a7db-b148-4319-bb35-16a9e4500d4a"
}

$ meteos dataset-parse --json json/parse_dataset.json
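
The effect of the two lambdas (split each line on commas, then drop rows whose first field is '0') can be tried locally with plain Python lists. This is only a sketch: Meteos runs the parse step on the Spark cluster, and the sample lines below are taken from the data file with the trailing holiday annotation left out.

```python
# Three sample lines from meteos-test-data.txt; the second one has sales of 0.
lines = [
    "500000,1,10,2016,6,0,68,50",
    "0,5,10,2016,3,3,58,60",
    "400000,6,10,2016,4,3,60,60",
]

# The same expressions as in the parse step, bound to local names.
split_line = lambda l: l.split(',')
has_sales = lambda l: l[0] != '0'

# map converts each line to a list of strings; filter drops the zero-sales row.
rows = list(filter(has_sales, map(split_line, lines)))
for row in rows:
    print(row)
```

Note that the fields are still strings after this step, so the comparison is against the string '0', not the number 0.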

6. Create Prediction Model

$ cat json/create_model.json

{
   "display_name": "sample-lr-model",
   "display_description": "This is a sample model",
   "dataset_id": "bd134722-22cd-4247-8e20-538933e0975d",
   "method": "LinearRegression",
   "args": "{'numIterations': 10, 'desired_output':0}",
   "experiment_id": "0550a7db-b148-4319-bb35-16a9e4500d4a"
}

$ meteos model-create --json json/create_model.json

+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| created_at  | 2016-10-20T02:17:27.831814           |
| description | This is a sample model               |
| id          | c48f5dba-45be-4165-9825-f4564fecebcd |
| name        | sample-lr-model                      |
| project_id  | 5f011d076a6f4a328989c57ac7b4e501     |
| status      | available                            |
+-------------+--------------------------------------+
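
The "args" value in create_model.json is a string containing a Python-style dict literal. The sketch below reads it with `ast.literal_eval` and shows one plausible reading of 'desired_output': the index of the column used as the label, with the remaining columns used as features. That interpretation and the helper name `to_labeled` are assumptions, not confirmed Meteos behavior.

```python
import ast

# Parse the args string exactly as it appears in create_model.json.
args = ast.literal_eval("{'numIterations': 10, 'desired_output':0}")


def to_labeled(row, desired_output):
    """Split a parsed row into (label, features).

    Assumption: desired_output is the index of the label column.
    """
    values = [float(v) for v in row]
    label = values[desired_output]
    features = values[:desired_output] + values[desired_output + 1:]
    return label, features


row = "500000,1,10,2016,6,0,68,50".split(",")
label, features = to_labeled(row, args["desired_output"])
print(label, features)
```

Under this reading, each training example has one label (sales) and seven features, which matches the seven values passed to the predict job in the next step.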

7. Predict

$ cat json/predict_data.json

{
   "display_name": "predict_job",
   "display_description": "This is a sample job",
   "model_id": "c48f5dba-45be-4165-9825-f4564fecebcd",
   "method": "predict",
   "args": "11,10,2016,2,0,57,58",
   "experiment_id": "0550a7db-b148-4319-bb35-16a9e4500d4a"
}

$ meteos job-create --json json/predict_data.json

+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| created_at  | 2016-10-20T02:17:37.644234           |
| description | This is a sample job                 |
| id          | c38b9c22-d72c-4255-8236-7e319d351fad |
| name        | predict_job                          |
| project_id  | 5f011d076a6f4a328989c57ac7b4e501     |
| status      | executing                            |
+-------------+--------------------------------------+

Retrieve the predicted value from the stdout of the job execution.

$ meteos job-show c38b9c22-d72c-4255-8236-7e319d351fad

+----------------------------+-------------------------------------------------+
| Property                   | Value                                           | 
+----------------------------+-------------------------------------------------+
| stdout                     | Value : (325944.477851)                         |
| stderr                     |                                                 |
+----------------------------+-------------------------------------------------+
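
If the predicted value needs to be used in a script, it can be extracted from the stdout string. The following is a minimal sketch that assumes the stdout always has the "Value : (number)" shape shown above; `parse_prediction` is a hypothetical helper, not part of the Meteos client.

```python
import re


def parse_prediction(stdout):
    """Extract the number inside the parentheses of the job stdout."""
    match = re.search(r"\(\s*([-+]?\d+(?:\.\d+)?)\s*\)", stdout)
    if match is None:
        raise ValueError("no numeric value found in job stdout")
    return float(match.group(1))


predicted_sales = parse_prediction("Value : (325944.477851)")
print(predicted_sales)
```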