Difference between revisions of "MagnetoDB/specs/bulkload-api"

Revision as of 08:45, 3 July 2014

[Draft] [Outdated] MagnetoDB Bulk Load workflow and API

The page is outdated. For the bulk load API description please refer to MagnetoDB/streamingbulkload


Workflow

This page describes the process of loading large amounts of data into MagnetoDB.

Before uploading data, first make sure that the destination table exists, then initiate a new job via the 'Create new..' command. This command returns the Id of the created job, which is used for all subsequent operations on the job.

Once the job is created, one or more chunks of data can be uploaded into it via the 'Upload..' operation. The amount of data per chunk is not limited to any particular number, but may vary depending on hardware capabilities, the required granularity of operation control, etc. Depending on the implementation, data may start being inserted immediately upon arrival (through a streaming HTTP request) or only after the chunk has been completely uploaded. This may also influence the chunk size used.

A list of all jobs for a table can be obtained via the 'Get..' request.

The status of each job can be obtained via the 'Get..' request. Depending on the implementation, the result of the insert operation may be reported per item or per whole data chunk.
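The order of operations described above can be sketched as follows. This is only an illustration under stated assumptions: `run_bulk_load` and the `send` callable are hypothetical names that do not come from the spec, the destination table is assumed to exist already, and `send(method, url, body)` is assumed to perform the HTTP request and return the decoded JSON reply.

```python
def run_bulk_load(send, project_id, table_name, attribute_definitions, chunks):
    """Create a job, upload each chunk, then fetch the job status.

    `send` is a hypothetical transport callable: send(method, url, body) -> dict.
    """
    base = f"v1/{project_id}/data/tables/{table_name}/bulk_load"

    # Step 1: create the job; the reply carries the Id used by every later call.
    reply = send("POST", base, {"attribute_definitions": attribute_definitions})
    job_url = f"{base}/{reply['job_id']}"

    # Step 2: upload one or more chunks into the job.
    for chunk in chunks:
        send("POST", job_url, chunk)

    # Step 3: ask for the job status and hand it back to the caller.
    return send("GET", job_url, None)
```

Passing the transport in as a parameter keeps the sketch independent of any particular HTTP client and makes the call sequence easy to check with a stub.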

Operations

HTTP method  URL pattern                                                  Operation description
POST         v1/{project_id}/data/tables/{table_name}/bulk_load           Creates a new bulk load job
POST         v1/{project_id}/data/tables/{table_name}/bulk_load/{job_id}  Uploads chunk of data
GET          v1/{project_id}/data/tables/{table_name}/bulk_load/{job_id}  Gets status of the bulk load job
GET          v1/{project_id}/data/tables/{table_name}/bulk_load           Gets list of bulk load job ids
DELETE       v1/{project_id}/data/tables/{table_name}/bulk_load/{job_id}  Stops the bulk load job
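The five operations use only two URL shapes: the job collection and a single job. A minimal sketch of a helper that maps an operation name to its method and URL might look like this. `BASE_URL`, `OPERATIONS`, `bulk_load_url` and `request_line` are hypothetical names introduced for illustration, and the host is a placeholder.

```python
from typing import Optional, Tuple
from urllib.parse import quote

# Hypothetical endpoint root; the spec only defines the path below it.
BASE_URL = "http://magnetodb.example"

# Operation name -> (HTTP method, whether the URL ends with /{job_id}).
OPERATIONS = {
    "create_job": ("POST", False),
    "upload_chunk": ("POST", True),
    "get_status": ("GET", True),
    "list_jobs": ("GET", False),
    "stop_job": ("DELETE", True),
}


def bulk_load_url(project_id: str, table_name: str, job_id: Optional[str] = None) -> str:
    """Return the collection URL, or the single-job URL when job_id is given."""
    url = (
        f"{BASE_URL}/v1/{quote(project_id, safe='')}"
        f"/data/tables/{quote(table_name, safe='')}/bulk_load"
    )
    if job_id is not None:
        url += f"/{quote(job_id, safe='')}"
    return url


def request_line(operation: str, project_id: str, table_name: str,
                 job_id: Optional[str] = None) -> Tuple[str, str]:
    """Return (method, url) for one of the operations listed in the table."""
    method, needs_job_id = OPERATIONS[operation]
    if needs_job_id and job_id is None:
        raise ValueError(f"{operation} requires a job_id")
    return method, bulk_load_url(project_id, table_name, job_id if needs_job_id else None)
```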

Headers

  • User-Agent
  • Content-Type: application/json, text/plain
  • Accept: application/json
  • X-Auth-Token: Keystone auth token
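As an illustration, the headers listed above could be assembled like this. The function name `build_headers` and the User-Agent value are hypothetical. The spec lists two content types without saying which operation uses which; sending text/plain for chunk uploads and application/json elsewhere is an assumption.

```python
ALLOWED_CONTENT_TYPES = ("application/json", "text/plain")


def build_headers(auth_token: str, content_type: str = "application/json") -> dict:
    """Build the request headers listed in the spec for one request."""
    if content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"unsupported Content-Type: {content_type}")
    return {
        "User-Agent": "bulkload-example/0.1",  # hypothetical client name
        "Content-Type": content_type,
        "Accept": "application/json",
        "X-Auth-Token": auth_token,  # token previously obtained from Keystone
    }
```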

Common Errors

This section lists the common errors that all actions return. Any action-specific errors will be listed in the topic for the action.

TBD

Create new bulk load job

URL

POST v1/{project_id}/data/tables/{table_name}/bulk_load

Request Syntax
{
    "attribute_definitions": [
        {
            "attribute_name": "string",
            "attribute_type": "string"
        }
    ]
}
Request Parameters
attribute_definitions
An array of attributes that describe the schema for the upcoming data.
Type: array of AttributeDefinition objects
Required: Yes
Response Syntax
{
   "job_id": "string"
}
Response Elements
job_id
Id of the newly created job
Type: string
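A minimal sketch of building the request body and reading the reply, using only the field names shown above. The helper names are hypothetical, and the attribute type strings used in the usage note ("S", "N") are placeholders, since this page does not enumerate the allowed values of attribute_type.

```python
import json


def create_job_body(attributes):
    """Serialize (name, type) pairs into the 'Create new bulk load job' body.

    The order of the pairs is assumed to be the order of values in each
    uploaded line.
    """
    return json.dumps({
        "attribute_definitions": [
            {"attribute_name": name, "attribute_type": type_}
            for name, type_ in attributes
        ]
    })


def parse_job_id(response_text):
    """Extract job_id from the reply and check that it is a string."""
    job_id = json.loads(response_text)["job_id"]
    if not isinstance(job_id, str):
        raise ValueError("job_id is expected to be a string")
    return job_id
```

For example, `create_job_body([("id", "S"), ("score", "N")])` yields a body with two attribute definitions, and `parse_job_id` can then be applied to the text of the reply.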
Errors

TBD


Upload data chunk

URL

POST v1/{project_id}/data/tables/{table_name}/bulk_load/{job_id}

Request Syntax

The request body is a stream of '\n'-separated lines. Each line is a series of values separated by the 0x01 byte. The number of values and their types should match the attribute list provided when the bulk load job was created.
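The wire format above can be produced with a few lines of code. The sketch below makes several assumptions that the spec does not state: values are rendered with `str()`, the body is encoded as UTF-8, and no escaping mechanism exists, so a value containing either separator is rejected. `encode_chunk` is a hypothetical name.

```python
FIELD_SEP = "\x01"  # separates values within a line
LINE_SEP = "\n"     # separates lines (items) within a chunk


def encode_chunk(rows, n_attributes):
    """Encode rows as '\\n'-separated lines of 0x01-separated values."""
    lines = []
    for row in rows:
        if len(row) != n_attributes:
            raise ValueError(f"expected {n_attributes} values, got {len(row)}")
        fields = [str(value) for value in row]
        for field in fields:
            # Assumption: the format has no escaping, so separators are illegal.
            if FIELD_SEP in field or LINE_SEP in field:
                raise ValueError("value contains a separator character")
        lines.append(FIELD_SEP.join(fields))
    return LINE_SEP.join(lines).encode("utf-8")
```

Checking the value count per row on the client side catches mismatches with the job's attribute list before any data is sent.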

Request Parameters

N/A

Response Syntax
{
    "job_id": "string",
    "chunk_id": "string",
    "count": number
}
Response Elements
job_id
Id of the bulk load job
Type: string
chunk_id
Id of the uploaded chunk
Type: string
count
Number of received items in the chunk
Type: number
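Because the reply reports how many items were received, a client can compare that number with what it sent. A minimal sketch, with the hypothetical helper name `check_upload_response`; treating a mismatch as an error is a design choice of this example and not something the spec prescribes.

```python
import json


def check_upload_response(response_text, job_id, sent_count):
    """Validate an upload reply and return the chunk_id assigned to the chunk."""
    reply = json.loads(response_text)
    if reply["job_id"] != job_id:
        raise ValueError("reply refers to a different job")
    if reply["count"] != sent_count:
        raise ValueError(
            f"sent {sent_count} items but the reply reports {reply['count']}"
        )
    return reply["chunk_id"]
```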
Errors

TODO


Get bulk load job status

URL

GET v1/{project_id}/data/tables/{table_name}/bulk_load/{job_id}

Request Syntax

This operation does not require a request body

Response Syntax
{
    "job_id": "string",   
    "chunks": [
        {
            "chunk_id": "string",
            "count": number,
            "status": "string"
        }
    ]
}
Response Elements
job_id
Id of the bulk load job
Type: string
chunks
List of chunks uploaded into the job
Type: array of objects
chunk_id
Id of a chunk
Type: string
count
Number of received items in the chunk
Type: number
status
Status of processing of the chunk
Type: string
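A status reply with many chunks is easier to read once aggregated. The sketch below totals the item counts and groups chunks by status. The possible status values are not defined on this page, so they are treated as opaque strings; `summarize_job` is a hypothetical name.

```python
import json
from collections import Counter


def summarize_job(response_text):
    """Aggregate a job status reply into totals per job and per status."""
    reply = json.loads(response_text)
    chunks = reply["chunks"]
    return {
        "job_id": reply["job_id"],
        "total_items": sum(chunk["count"] for chunk in chunks),
        # Number of chunks in each status; status strings are used as-is.
        "by_status": dict(Counter(chunk["status"] for chunk in chunks)),
    }
```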
Errors

TBW


Get bulk load jobs list

URL

GET v1/{project_id}/data/tables/{table_name}/bulk_load

Request Syntax

This operation does not require a request body

Response Syntax
{
   "job_ids": [
       "string",
       "string",
       ...
   ]
}
Response Elements
job_ids
List of bulk load job ids
Type: array of strings
Errors

TBW


Delete bulk load job

URL

DELETE v1/{project_id}/data/tables/{table_name}/bulk_load/{job_id}

Request Syntax

This operation does not require a request body

Response Syntax

This operation does not return any response apart from standard HTTP response codes

Errors

TBW