Difference between revisions of "Meteos/ExampleLinear"

Revision as of 02:30, 8 December 2016

Predict Sales Figures Using Meteos

In this example, you create a prediction model that predicts sales by using Linear Regression.

Linear Regression is a supervised learning algorithm.
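
To make the idea concrete, the following is a minimal sketch of fitting a line to one feature with ordinary least squares. It is an illustration only and not how Meteos trains its model; the function names and the training values are hypothetical.

```python
# Minimal one-feature least-squares fit (illustration only, not Meteos code).

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new feature value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training pairs: feature values and their known labels.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

model = fit_line(xs, ys)
prediction = predict(model, 5.0)
```

Supervised learning here means that the labels (`ys`) are known for the training rows, and the fitted model is then used to estimate the label for unseen inputs.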

Usecase.png

1. Create an experiment template

Create a template for the experiment. An experiment is a workspace for machine learning.

Before creating a template, look up the Glance image ID of the meteos image and the ID of the public Neutron network (used as floating_ip_pool).

You can use the sample format located in python-meteosclient/sample/json/template.json

$ glance image-list | grep meteos
| 45de4bbd-8419-40ff-8ed7-fc065c05e34f | meteos                          |
$ neutron net-list | grep public
| 84c13e76-ced9-4142-a885-280784f1f7a3 | public  | a14de1c5-b8d4-434b-a056-9b0049b93402             |
$ vim sample/json/template.json
$ cat sample/json/template.json
{
    "display_name": "example-template",
    "display_description": "This is a sample template of experiment",
    "image_id" : "45de4bbd-8419-40ff-8ed7-fc065c05e34f",
    "master_nodes_num": 1,
    "master_flavor_id": "4",
    "worker_nodes_num": 2,
    "worker_flavor_id": "2",
    "spark_version": "1.6.0",
    "floating_ip_pool": "84c13e76-ced9-4142-a885-280784f1f7a3"
}
$ meteos template-create --json sample/json/template.json
+---------------+-----------------------------------------+
| Property      | Value                                   |
+---------------+-----------------------------------------+
| cluster_id    | None                                    |
| created_at    | 2016-12-04T07:16:29.000000              |
| description   | This is a sample template of experiment |
| id            | 8b7b9b89-f119-4b9b-b9b0-31598f819f1a    |
| master_flavor | 4                                       |
| master_nodes  | 1                                       |
| name          | example-template                        |
| project_id    | 67401cca74c2409b939e944bc6c8fcbe        |
| spark_version | 1.6.0                                   |
| status        | available                               |
| user_id       | 181b1caa9d5b470393ca66b9e511d5b0        |
| worker_flavor | 2                                       |
| worker_nodes  | 2                                       |
+---------------+-----------------------------------------+
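
If you prefer to generate the template file from a script instead of editing it by hand, a minimal sketch could look like the following. The keys and values are taken from the sample template.json shown above; the helper name `build_template` is hypothetical and not part of python-meteosclient.

```python
import json


def build_template(image_id, floating_ip_pool):
    """Build the template body with the same keys as sample/json/template.json."""
    return {
        "display_name": "example-template",
        "display_description": "This is a sample template of experiment",
        "image_id": image_id,
        "master_nodes_num": 1,
        "master_flavor_id": "4",
        "worker_nodes_num": 2,
        "worker_flavor_id": "2",
        "spark_version": "1.6.0",
        "floating_ip_pool": floating_ip_pool,
    }


# IDs as reported by "glance image-list" and "neutron net-list" in this example.
template = build_template(
    image_id="45de4bbd-8419-40ff-8ed7-fc065c05e34f",
    floating_ip_pool="84c13e76-ced9-4142-a885-280784f1f7a3",
)
template_json = json.dumps(template, indent=4)
```

Writing `template_json` to sample/json/template.json would give a file you can pass to "meteos template-create --json".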


2. Create an experiment from the template

Create an experiment from the template created in the previous step. Before creating the experiment, look up the ID of the private Neutron network and create a keypair.

You can use the sample format located in python-meteosclient/sample/json/experiment.json

$ nova keypair-add key1 > ~/key1.pem && chmod 600 ~/key1.pem
$ neutron net-list | grep private
| 8abc626e-2b06-4c67-9b2c-0231f0cef5b8 | private | cb58940f-859b-48c6-b92a-3861470f1fc1 20.0.0.0/26 |
$ vim sample/json/experiment.json
$ cat sample/json/experiment.json
{
    "display_name": "example-experiment",
    "display_description": "This is a sample experiment",
    "key_name": "key1",
    "neutron_management_network": "8abc626e-2b06-4c67-9b2c-0231f0cef5b8",
    "template_id": "8b7b9b89-f119-4b9b-b9b0-31598f819f1a"
}
$ meteos experiment-create --json sample/json/experiment.json
+--------------------+--------------------------------------+
| Property           | Value                                |
+--------------------+--------------------------------------+
| created_at         | 2016-12-04T07:20:11.000000           |
| description        | This is a sample experiment          |
| id                 | 91504a65-01cf-428f-81aa-596be7ca8619 |
| key_name           | key1                                 |
| management_network | 8abc626e-2b06-4c67-9b2c-0231f0cef5b8 |
| name               | example-experiment                   |
| project_id         | 67401cca74c2409b939e944bc6c8fcbe     |
| status             | creating                             |
| user_id            | 181b1caa9d5b470393ca66b9e511d5b0     |
+--------------------+--------------------------------------+

Meteos creates the experiment by using the OpenStack Sahara Spark plugin.

You can see the Sahara cluster and Nova VMs created by Meteos as shown below.

$ openstack dataprocessing cluster list   # or: sahara cluster-list
+------------------+--------------------------------------+-------------+----------------+----------+
| Name             | Id                                   | Plugin name | Plugin version | Status   |
+------------------+--------------------------------------+-------------+----------------+----------+
| cluster-91504a65 | 13418fd9-5d2a-4ee6-b384-cb250b7e7714 | spark       | 1.6.0          | Spawning |
+------------------+--------------------------------------+-------------+----------------+----------+
$ openstack server list   # or: nova list
+--------------------------------------+----------------------------+--------+----------+------------+
| ID                                   | Name                       | Status | Networks | Image Name |
+--------------------------------------+----------------------------+--------+----------+------------+
| 58818eb5-ade7-407c-8c76-9fd9809632b4 | cluster-91504a65-workers-1 | BUILD  |          | meteos     |
| a151dbd9-de51-43ca-afb8-1fdeecce2891 | cluster-91504a65-workers-0 | BUILD  |          | meteos     |
| d02d85c5-0960-4b7e-880c-26b73c5dd8ad | cluster-91504a65-master-0  | BUILD  |          | meteos     |
+--------------------------------------+----------------------------+--------+----------+------------+
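
The experiment starts in the "creating" status, so a script usually needs to read the status and id fields from the command output. The following is a minimal sketch that parses the two-column Property/Value table printed by the CLI into a dictionary. The function name `parse_property_table` is hypothetical, and the sample text is an abbreviated copy of the output shown above.

```python
def parse_property_table(text):
    """Parse a two-column '| Property | Value |' table into a dict."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2:
            continue  # not a Property/Value row
        key, value = cells
        if not key or key == "Property":
            continue  # skip the header row and continuation rows
        result[key] = value
    return result


sample_output = """
+--------------------+--------------------------------------+
| Property           | Value                                |
+--------------------+--------------------------------------+
| id                 | 91504a65-01cf-428f-81aa-596be7ca8619 |
| key_name           | key1                                 |
| name               | example-experiment                   |
| status             | creating                             |
+--------------------+--------------------------------------+
"""

experiment = parse_property_table(sample_output)
is_ready = experiment["status"] == "available"
```

This simple parser assumes that values do not contain the "|" character, which holds for the outputs in this example.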

3. Upload raw data

Upload the raw data (in this example, past sales figures) to OpenStack Swift.

You can use the sample data located in python-meteosclient/sample/data/linear_data.txt

Each row of the raw data contains, from left to right: "sales figures", "day", "month", "year", "day of week", "parameter which indicates weather", "degree" (temperature), and "humidity".

$ cd sample/data/
/sample/data$ head linear_data.txt
4500,1,1,2016,5,0,40,50
8000,2,1,2016,6,1,60,80
9500,3,1,2016,0,2,88,92
5000,4,1,2016,1,3,90,90
0,5,1,2016,2,2,90,80
4500,6,1,2016,3,3,80,90
4000,7,1,2016,4,1,60,80
4500,8,1,2016,5,0,40,50
8000,9,1,2016,6,0,30,50
9500,10,1,2016,0,0,40,50
/sample/data$ swift upload meteos linear_data.txt
linear_data.txt
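
The column layout described above can be sketched as a small parser that turns one row into named fields. The field names are labels chosen for this illustration (the file itself has no header), and treating every column as an integer is an assumption based on the sample rows.

```python
from collections import namedtuple

# Field names follow the column order described for linear_data.txt.
SalesRecord = namedtuple(
    "SalesRecord",
    ["sales", "day", "month", "year", "day_of_week", "weather", "degree", "humidity"],
)


def parse_record(line):
    """Split one comma-separated row and convert every column to int."""
    values = [int(value) for value in line.strip().split(",")]
    return SalesRecord(*values)


# First row of the sample data shown above.
record = parse_record("4500,1,1,2016,5,0,40,50")
```

Here `record.sales` is the value to predict, and the remaining fields are the inputs.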

4. Parse the raw data

Parse the raw data so that the prediction model can handle it. In this sample, the filter drops rows whose first field (sales figures) is 0.

You can use the sample format located in python-meteosclient/sample/json/dataset_parse.json

You can see the first rows of the parsed dataset by executing the "meteos dataset-show <dataset-uuid>" command.

$ vim ../python-meteosclient/sample/json/dataset_parse.json
$ cat ../python-meteosclient/sample/json/dataset_parse.json
{
    "source_dataset_url": "swift://meteos/linear_data.txt",
    "display_name": "sample-data",
    "display_description": "This is a sample dataset",
    "method": "parse",
    "params": [{"method": "filter", "args": "lambda l: l.split(',')[0] != '0'"}],
    "experiment_id": "91504a65-01cf-428f-81aa-596be7ca8619",
    "swift_tenant": "demo",
    "swift_username": "demo",
    "swift_password": "nova"
}
$ meteos dataset-create --json sample/json/dataset_parse.json
+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| created_at  | 2016-12-08T00:45:43.000000           |
| description | This is a sample dataset             |
| head        | None                                 |
| id          | b6054cab-77b8-4b2d-b6c8-861f69859107 |
| name        | sample-data                          |
| project_id  | 0e3e0f01952948848d8ae438279122fe     |
| status      | creating                             |
| stderr      | None                                 |
| user_id     | 2c3120e8228c4d0e9768f09346fe842d     |
+-------------+--------------------------------------+
$ meteos dataset-list
+--------------------------------------+-------------+--------------------------+-----------+--------------------------------+----------------------------+
| id                                   | name        | description              | status    | source_dataset_url             | created_at                 |
+--------------------------------------+-------------+--------------------------+-----------+--------------------------------+----------------------------+
| b6054cab-77b8-4b2d-b6c8-861f69859107 | sample-data | This is a sample dataset | available | swift://meteos/linear_data.txt | 2016-12-08T00:45:43.000000 |
+--------------------------------------+-------------+--------------------------+-----------+--------------------------------+----------------------------+
$ meteos dataset-show b6054cab-77b8-4b2d-b6c8-861f69859107
+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property    | Value                                                                                                                                                                                                                                                                                                    |
+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| created_at  | 2016-12-08T00:45:43.000000                                                                                                                                                                                                                                                                               |
| description | This is a sample dataset                                                                                                                                                                                                                                                                                 |
| head        | [u'15000,1,10,2016,6,0,68,50', u'14000,2,10,2016,0,0,68,50', u'5500,3,10,2016,1,1,68,90', u'4500,5,10,2016,3,0,60,55', u'5000,6,10,2016,4,0,60,55', u'4800,7,10,2016,5,2,58,87', u'16000,8,10,2016,6,3,58,60', u'15000,9,10,2016,0,3,60,60', u'3300,10,10,2016,1,2,62,87', u'6000,12,10,2016,3,1,55,93'] |
|             |                                                                                                                                                                                                                                                                                                          |
| id          | b6054cab-77b8-4b2d-b6c8-861f69859107                                                                                                                                                                                                                                                                     |
| name        | sample-data                                                                                                                                                                                                                                                                                              |
| project_id  | 0e3e0f01952948848d8ae438279122fe                                                                                                                                                                                                                                                                         |
| status      | available                                                                                                                                                                                                                                                                                                |
| stderr      |                                                                                                                                                                                                                                                                                                          |
| user_id     | 2c3120e8228c4d0e9768f09346fe842d                                                                                                                                                                                                                                                                         |
+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
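
To see what the filter expression in dataset_parse.json does, you can evaluate the same lambda in plain Python against the rows shown in step 3. This is only a local sketch; how Meteos applies the expression inside the Spark cluster is not shown here and is an assumption.

```python
# The filter expression from dataset_parse.json, evaluated as plain Python.
keep = lambda l: l.split(',')[0] != '0'

# The first ten rows of linear_data.txt as shown in step 3.
raw_lines = [
    "4500,1,1,2016,5,0,40,50",
    "8000,2,1,2016,6,1,60,80",
    "9500,3,1,2016,0,2,88,92",
    "5000,4,1,2016,1,3,90,90",
    "0,5,1,2016,2,2,90,80",
    "4500,6,1,2016,3,3,80,90",
    "4000,7,1,2016,4,1,60,80",
    "4500,8,1,2016,5,0,40,50",
    "8000,9,1,2016,6,0,30,50",
    "9500,10,1,2016,0,0,40,50",
]

# Keep only the rows whose first field (sales figures) is not exactly "0".
parsed = [line for line in raw_lines if keep(line)]
dropped = [line for line in raw_lines if not keep(line)]
```

The comparison is made against the whole first field, so rows such as "4500,..." or "8000,..." are kept and only rows with zero sales are dropped. This matches the head output above, where the rows jump from day 3 to day 5.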

5. Create a prediction model

In this example, you create a Linear Regression model from the parsed dataset.

The parsed dataset has already been distributed in the HDFS of the experiment environment.

Therefore, specify the internal URL (internal://<dataset-id>) in the source_dataset_url parameter.

$ vim sample/json/model_linear.json
$ cat sample/json/model_linear.json
{
    "display_name": "sample-linear-model",
    "display_description": "LinearRegression Model",
    "source_dataset_url": "internal://b6054cab-77b8-4b2d-b6c8-861f69859107",
    "model_type": "LinearRegression",
    "model_params": "{'numIterations': 10}",
    "experiment_id": "fad1b912-8f7d-4600-b048-7db27aae1af3"
}
$ meteos model-create --json sample/json/model_linear.json
+-------------+-----------------------------------------+
| Property    | Value                                   |
+-------------+-----------------------------------------+
| created_at  | 2016-12-08T00:52:25.000000              |
| description | LinearRegression Model                  |
| id          | f9b20d78-6f2f-439d-ad71-3db34f3791c9    |
| name        | sample-linear-model                     |
| params      | eydudW1JdGVyYXRpb25zJzogMTB9            |
| project_id  | 0e3e0f01952948848d8ae438279122fe        |
| status      | creating                                |
| stderr      | None                                    |
| stdout      | None                                    |
| type        | LinearRegression                        |
| user_id     | 2c3120e8228c4d0e9768f09346fe842d        |
+-------------+-----------------------------------------+
$ meteos model-list
+--------------------------------------+---------------------+------------------------+-----------+------------------+-------------------------------------------------+----------------------------+
| id                                   | name                | description            | status    | type             | source_dataset_url                              | created_at                 |
+--------------------------------------+---------------------+------------------------+-----------+------------------+-------------------------------------------------+----------------------------+
| f9b20d78-6f2f-439d-ad71-3db34f3791c9 | sample-linear-model | LinearRegression Model | available | LinearRegression | internal://b6054cab-77b8-4b2d-b6c8-861f69859107 | 2016-12-08T00:52:25.000000 |
+--------------------------------------+---------------------+------------------------+-----------+------------------+-------------------------------------------------+----------------------------+
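
The params value in the model-create output is not readable at first glance. Decoding it as base64 yields the model_params string from model_linear.json, as the following sketch shows. That the CLI always reports params in this form is an assumption based on this one output.

```python
import ast
import base64

# The "params" value printed by "meteos model-create" above.
encoded_params = "eydudW1JdGVyYXRpb25zJzogMTB9"

# Decode the base64 text back into the original parameter string.
decoded = base64.b64decode(encoded_params).decode("utf-8")

# The string uses Python literal syntax (single quotes), so parse it
# with ast.literal_eval rather than json.loads.
params = ast.literal_eval(decoded)
```

After running this, `decoded` is the same text that was passed as model_params, and `params` is a dictionary holding the numIterations setting.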