[Draft] MagnetoDB Streaming Bulk Load workflow and API

Workflow

This page describes the process of loading large amounts of data into MagnetoDB.

Before uploading the data, one should first make sure that the destination table exists.

Data is uploaded in one streaming HTTP request.
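
As a minimal sketch, the existence check mentioned above might look like the Python snippet below. The describe-table URL is an assumption inferred from the bulk_load URL pattern in this spec, not something the spec defines; host, project id, and token are placeholders.

import requests

def table_exists(base_url, project_id, table_name, token):
    # Hypothetical describe-table call; the URL pattern is assumed
    # from the bulk_load endpoint described in this spec.
    resp = requests.get(
        "%s/v1/%s/data/tables/%s" % (base_url, project_id, table_name),
        headers={"Accept": "application/json", "X-Auth-Token": token},
    )
    return resp.status_code == 200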

URL

POST v1/{project_id}/data/tables/{table_name}/bulk_load

Headers

  • User-Agent
  • Content-Type: application/json
  • Accept: application/json
  • X-Auth-Token: Keystone auth token

Request Syntax

The data stream is plain text that contains a '\n'-separated sequence of JSON representations of the items to be inserted.

{ "attribute_name": { "attribute_type": "attribute_value"}, "attribute_name2": { "attribute_type": "attribute_value"}...}
{ "attribute_name": { "attribute_type": "attribute_value"}, "attribute_name2": { "attribute_type": "attribute_value"}...}

Response Syntax

If all incoming items were written successfully, the count of all written items is returned.

{
    "processed": "string"
}

In case of an error, the incoming data stream stops being read, and the response will contain counts of read, processed, and failed items, the last read item, and error messages for the failed items.

{
    "read": "string",
    "processed": "string",
    "failed": "string",
    "last_read": { "attribute_name": { "attribute_type": "attribute_value"}, "attribute_name2": { "attribute_type": "attribute_value"}...},
    "errors": [
        {
            "item": { "attribute_name": { "attribute_type": "attribute_value"}, "attribute_name2": { "attribute_type": "attribute_value"}...},
            "error": "string"
        },
        {
            "item": { "attribute_name": { "attribute_type": "attribute_value"}, "attribute_name2": { "attribute_type": "attribute_value"}...},
            "error": "string"
        },
        ...
    ]
}
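
A client might distinguish the two response shapes described above by the presence of the 'errors' field; the following is a sketch only, with result being the parsed JSON body of the bulk_load response.

def report(result):
    # result is the parsed JSON body of the bulk_load response.
    if "errors" in result:
        print("read: %s, processed: %s, failed: %s"
              % (result["read"], result["processed"], result["failed"]))
        print("last read item: %s" % result["last_read"])
        for failure in result["errors"]:
            print("failed item %s: %s" % (failure["item"], failure["error"]))
    else:
        print("all %s items written" % result["processed"])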

Due to the asynchronous processing of received items, 'PutItem' operations for several items may already be enqueued when an error is found. In that case the server waits for the results of all enqueued operations; some of those results may be errors too, so the response may contain more than one error.