MagnetoDB/specs/async-schema-operations


Latest revision as of 09:13, 23 October 2014

=== Problem Description ===

A large number of concurrent create/delete table operations puts a heavy load on Cassandra. The schema agreement process for some of those calls may take too long, which results in timeout errors, and the corresponding tables get stuck in the CREATING/DELETING state forever.

=== Status ===

'''Implemented'''

=== Proposed Change ===

==== QueuedStorageManager ====

Implement a QueuedStorageManager that, rather than executing create/delete table calls directly, enqueues them into the MQ shipped with OpenStack, via oslo.messaging.rpc. It should use non-blocking calls. RPC calls should include only the request context (as a dictionary) and a table name; all other information about the create/delete table parameters (table schema etc.) should be retrieved from table_info_repo. If an error occurs while creating/deleting a table on the RPC server side, the status of the corresponding table should be set to CREATE_FAILED or DELETE_FAILED.
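The intended control flow can be sketched as below. This is a minimal illustration, not MagnetoDB's actual code: a plain in-process queue stands in for the oslo.messaging transport, and `InMemoryBus`/`InMemoryRepo` are hypothetical stand-ins for the RPC client and table_info_repo.

```python
from queue import Queue


class QueuedStorageManager:
    """Sketch: enqueue create/delete table requests instead of executing them."""

    def __init__(self, rpc_client, table_info_repo):
        self._rpc = rpc_client
        self._repo = table_info_repo

    def create_table(self, context, table_name, table_schema):
        # Persist the full schema first; the RPC payload carries only
        # the context dictionary and the table name.
        self._repo.save(table_name, {"schema": table_schema, "status": "CREATING"})
        self._rpc.cast(context, "create", table_name=table_name)  # non-blocking

    def delete_table(self, context, table_name):
        self._repo.update_status(table_name, "DELETING")
        self._rpc.cast(context, "delete", table_name=table_name)  # non-blocking


class InMemoryBus:
    """Stand-in for the MQ: records casts without blocking the caller."""

    def __init__(self):
        self.messages = Queue()

    def cast(self, context, method, **kwargs):
        self.messages.put((method, context, kwargs))


class InMemoryRepo:
    """Stand-in for table_info_repo."""

    def __init__(self):
        self.tables = {}

    def save(self, name, info):
        self.tables[name] = info

    def update_status(self, name, status):
        self.tables[name]["status"] = status
```

The point of the sketch is the division of labour: the API node only records intent (status CREATING/DELETING in the repo) and fires a non-blocking cast, so a slow schema agreement in Cassandra can no longer block or time out the API request.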

==== Magnetodb Async Task Executor ====

Introduce a separate executable, "magnetodb-async-task-executor", that runs a ''blocking'' RPC server which executes create/delete table requests strictly one by one. The number of simultaneously running processes effectively defines the maximum allowed number of concurrent create/delete table requests.

Only a small number of magnetodb-async-task-executor instances is supposed to run (usually just one). This component should be designed to be 'ready to fail': external tools should monitor its presence and respawn it in case of failure.

Additionally, a table status update time should be introduced, and each table status change should update that attribute as well. During a describe table call this attribute should be analyzed to check whether the table has been in the CREATING or DELETING status for too long. If so, its status should be changed to CREATE_FAILED or DELETE_FAILED.
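The staleness check described above can be sketched as a pure function. The function name, attribute representation (a Unix timestamp), and the 600-second threshold are illustrative assumptions; the spec leaves the actual timeout unspecified.

```python
import time

# Illustrative threshold; in practice this would be a configuration option.
STALE_STATUS_TIMEOUT = 600  # seconds


def resolve_stale_status(status, status_updated_at, now=None):
    """Sketch of the describe-table check.

    If a table has sat in a transient state (CREATING/DELETING) longer
    than the timeout, report it as failed; otherwise return the stored
    status unchanged.
    """
    now = time.time() if now is None else now
    if now - status_updated_at <= STALE_STATUS_TIMEOUT:
        return status
    if status == "CREATING":
        return "CREATE_FAILED"
    if status == "DELETING":
        return "DELETE_FAILED"
    return status
```

Performing the check lazily on describe calls avoids a dedicated reaper process: a table abandoned by a crashed executor is reported as failed the next time anyone looks at it.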

==== RPC settings ====

* control_exchange: magnetodb
* amqp_durable_queues: True
* topic: schema
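In a deployment these settings would typically land in "magnetodb-api.conf". A sketch, assuming a RabbitMQ transport; the transport URL and credentials are illustrative placeholders, not values from this spec, and the topic is normally set on the RPC Target in code rather than in the config file:

```ini
[DEFAULT]
# RPC settings from this spec
control_exchange = magnetodb
amqp_durable_queues = True
# Deployment-specific transport (illustrative placeholder)
transport_url = rabbit://guest:guest@127.0.0.1:5672/
```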

==== RPC calls ====

* create(context, table_name)
* delete(context, table_name)
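The server side of these two calls can be sketched as a single blocking worker loop. `SchemaEndpoint`, `serve`, and the driver/repo shapes are illustrative assumptions; only the method names, their (context, table_name) signatures, and the CREATE_FAILED/DELETE_FAILED statuses come from this spec.

```python
class SchemaEndpoint:
    """Sketch of the magnetodb-async-task-executor endpoint.

    Expects a storage driver exposing create_table(name, schema) /
    delete_table(name), and a repo holding a `tables` dict of
    {name: {"schema": ..., "status": ...}} (both illustrative).
    """

    def __init__(self, storage_driver, table_info_repo):
        self._driver = storage_driver
        self._repo = table_info_repo

    def create(self, context, table_name):
        try:
            info = self._repo.tables[table_name]
            self._driver.create_table(table_name, info["schema"])
            info["status"] = "ACTIVE"
        except Exception:
            # Per the spec: on server-side failure, mark the table failed.
            self._repo.tables[table_name]["status"] = "CREATE_FAILED"

    def delete(self, context, table_name):
        try:
            self._driver.delete_table(table_name)
            del self._repo.tables[table_name]
        except Exception:
            self._repo.tables[table_name]["status"] = "DELETE_FAILED"


def serve(endpoint, messages):
    """Drain the queue sequentially.

    Executing requests strictly one by one is what caps concurrent
    schema operations; with a single executor process the cap is one.
    """
    while not messages.empty():
        method, context, kwargs = messages.get()
        getattr(endpoint, method)(context, **kwargs)
```

Because each message carries only the context and the table name, the executor re-reads the authoritative schema from the repo at execution time, so a request queued long ago still acts on current state.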

=== Notifications Impact ===

The RPC server should send notifications on table creation/deletion start, end, and error.

=== Other End User Impact ===

TBD

=== Performance Impact ===

Create/Delete table operations are expected to be slower but much more reliable.

=== Deployment Impact ===

# Magnetodb Async Task Executor should be deployed to a separate node or to one of the MagnetoDB API nodes.
# QueuedStorageManager should be used as the storage manager in "magnetodb-api.conf" for each MagnetoDB API instance.
# Oslo.Messaging should be configured in "magnetodb-api.conf" for each MagnetoDB API instance.

=== Developer Impact ===

None

=== Assignee(s) ===

Primary assignee:

 <ikhudoshyn>

Other contributors:

 <None>

=== Work Items ===

# implement QueuedStorageManager
# implement Magnetodb Async Task Executor
# update devstack integration scripts

=== Dependencies ===

[https://wiki.openstack.org/wiki/Oslo/Messaging Oslo.Messaging]

=== Documentation Impact ===

Magnetodb Async Task Executor deployment should be covered in the corresponding documentation (TBD).

=== References ===

[https://blueprints.launchpad.net/magnetodb/+spec/async-schema-operations Blueprint on Launchpad]