Zaqar/Performance/PubSub/Redis

Overview
These tests examined the pub-sub performance of the Redis driver (Juno release).


 * zaqar-bench for load generation and stats
 * 5 minute test duration at each load level
 * ~1K messages, 60-sec TTL
 * 1 message posted to the API per producer request
 * Up to 5 messages read from the API per observer request
 * 4 queues (messages distributed evenly)
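As a concrete illustration of these parameters, the sketch below builds the kind of request payloads a producer and observer would send against Zaqar's v1 messages resource (the queue names are hypothetical; the payload shape follows the v1 API, where messages are posted as a JSON list of {ttl, body} objects and listed with a limit parameter):

```python
import json

TTL_SEC = 60     # 60-sec TTL from the test parameters
NUM_QUEUES = 4   # messages distributed evenly across 4 queues
QUEUES = ["perf-queue-%d" % i for i in range(NUM_QUEUES)]

def producer_payload(body):
    # Each producer request posts exactly 1 message to the API.
    return json.dumps([{"ttl": TTL_SEC, "body": body}])

def observer_params():
    # Each observer request reads up to 5 messages (echo lets a
    # client see messages it posted itself).
    return {"limit": 5, "echo": True}

payload = producer_payload({"event": "ping"})
```

In the actual runs, zaqar-bench issued these requests concurrently from many producer and observer workers.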

Servers

 * Load Balancer
  * Hardware
   * 1x Intel Xeon E5-2680 v2 2.8GHz
   * 32 GB RAM
   * 10Gbps NIC
   * 32GB SATADOM
  * Software
   * Ubuntu 14.04
   * Nginx 1.4.6 (conf)
 * Load Generator
  * Hardware
   * 1x Intel Xeon E5-2680 v2 2.8GHz
   * 32 GB RAM
   * 10Gbps NIC
   * 32GB SATADOM
  * Software
   * Ubuntu 14.04
   * Python 2.7.6
   * zaqar-bench (patched)
 * Web Head
  * Hardware
   * 1x Intel Xeon E5-2680 v2 2.8GHz
   * 32 GB RAM
   * 10Gbps NIC
   * 32GB SATADOM
  * Software
   * Ubuntu 14.04
   * Python 2.7.6
   * zaqar server @ab3a4fb1
    * pooling=true
    * storage=mongodb (pooling catalog)
   * uWSGI 2.0.7 + gevent 1.0.1 (conf)
 * Redis
  * Hardware
   * 2x Intel Xeon E5-2680 v2 2.8GHz
   * 32 GB RAM
   * 10Gbps NIC
   * 2x LSI Nytro WarpDrive BLP4-1600
  * Software
   * Ubuntu 14.04
   * Redis 2.8.4
   * Default config, except snapshotting only once every 15 minutes
 * MongoDB
  * Hardware
   * 2x Intel Xeon E5-2680 v2 2.8GHz
   * 32 GB RAM
   * 10Gbps NIC
   * 2x LSI Nytro WarpDrive BLP4-1600
  * Software
   * Debian 7
   * MongoDB 2.6.4
   * Default config, except setting replSet and enabling periodic logging of CPU and I/O
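The Redis snapshotting tweak noted above amounts to trimming the save schedule in redis.conf; a minimal sketch (Redis 2.8 ships with a three-line save ladder, of which only the 15-minute rule is kept):

```
# Snapshot at most once every 15 minutes (900 s) if at least 1 key changed.
# The default 'save 300 10' and 'save 60 10000' lines are removed.
save 900 1
```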

''Note 1: CPU usage on the web head and Redis box was lower than in the pilot test for the same amounts of load. It is unclear how much of this was due to Ubuntu 14.04 vs. Debian 7, newer versions of uWSGI, Python, and Redis, or some other factor. The configurations used for the aforementioned software were very similar.''

''Note 2: The load generator's CPUs were close to saturation at some of the maximum load levels tested below. To drive larger configurations and/or push the servers to the point that requests begin timing out, zaqar-bench will need to be extended to support multiple load generator boxes. Even so, the results below remain useful.''

Configurations
In these tests, varying amounts of load were applied to the following configurations:

Configuration 1 (C1)

 * 1x4 load balancer (1x box, 4x worker procs)
 * 1x20 web head (1x box, 20x worker procs)
 * 1x1 redis (1x box, 1x redis proc)
 * 3x mongo nodes (single replica set) for the pooling catalog
 * This is very much overkill for the catalog store, but it was already set up from a previous test, and so was reused in the interest of time.
 * Similar results should be obtainable with much more modest hardware for the catalog DB.
 * Alternatively, an RDBMS could be used here (Zaqar supports both MongoDB and SQLAlchemy catalog drivers).
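For reference, the web head's pooling setup roughly corresponds to a zaqar.conf along these lines. This is a sketch against Juno-era options; the exact option names, sections, and the catalog URI shown are approximations, and the Redis pool itself is registered at runtime via the pools admin API rather than in this file:

```
[DEFAULT]
pooling = True

[drivers]
# Backing store for the pooling catalog (MongoDB here; SQLAlchemy also works)
storage = mongodb

[drivers:storage:mongodb]
# Hypothetical replica-set URI for the 3-node catalog cluster
uri = mongodb://catalog1,catalog2,catalog3/?replicaSet=catalog
```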

Configuration 2 (C2)
Same as Configuration 1 but adds an additional web head (2x20)

Configuration 3 (C3)
Same as Configuration 2 but adds an additional redis proc (1x2)

Scenario 1 (Read-Heavy)
In these tests, producers were held at 50 while the number of observer clients was steadily increased. The X axis denotes the total number of observers and producers. For some use cases that Zaqar targets, the number of observers polling the API far exceeds the number of messages being posted at any given time.

Mean Latency
The Y-axis denotes mean latency in milliseconds.

C1



C2



C3



Combined Throughput
The Y-axis denotes the combined throughput (req/sec) for all clients.

C1



C2



C3



Standard Deviation
The Y-axis denotes the standard deviation for per-request latency (ms). Even at small loads there were a few outliers, sitting outside the 99th percentile, that bumped up the stdev. Further experimentation is needed to find the root cause, whether it be in the client, the server, or the Redis instance.
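The statistics reported in these sections (mean, standard deviation, 99th percentile, maximum) can be reproduced from raw per-request latency samples with a few lines of stdlib Python; a sketch (the sample data is made up, and zaqar-bench's exact percentile method may differ):

```python
import statistics

def latency_stats(samples_ms):
    """Summarize per-request latencies (ms) the way these graphs do."""
    ordered = sorted(samples_ms)
    # Nearest-rank index of the 99th percentile.
    p99_idx = max(0, int(round(0.99 * len(ordered))) - 1)
    return {
        "mean": statistics.mean(ordered),
        "stdev": statistics.pstdev(ordered),  # population stdev
        "p99": ordered[p99_idx],
        "max": ordered[-1],
    }

# 99 well-behaved requests plus one outlier: the outlier sits beyond
# the 99th percentile, so p99 stays at 5 ms while the stdev and max
# jump -- the effect described above.
stats = latency_stats([5] * 99 + [250])
```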

C1



C2



C3



99th Percentile
The Y-axis denotes the per-request latency (ms) for 99% of client requests. In other words, 99% of requests completed within the given time duration.

C1



C2



C3



Maximum Latency
The Y-axis represents the maximum latency (ms) experienced by any client during a given run. Comparing this graph to the 99th percentile gives a rough idea of what sort of outliers were in the population. Note that the spikes may be due to Redis's snapshotting feature (TBD).

C1



C2



C3



Scenario 2 (Write-Heavy)
In these tests, observers were held at 50 while the number of producer clients was steadily increased. The X-axis denotes the total number of clients (producers + observers). Each level of load was executed for 5 minutes, after which the samples were used to calculate the following statistics.

Mean Latency
The Y-axis denotes mean latency in milliseconds.

C1



C2



C3



Combined Throughput
The Y-axis denotes the combined throughput (req/sec) for all clients.

C1



C2



C3



Standard Deviation
The Y-axis denotes the standard deviation for per-request latency (ms). Even at small loads there were a few outliers, sitting outside the 99th percentile, that bumped up the stdev. Further experimentation is needed to find the root cause, whether it be in the client, the server, or the Redis instance.

C1



C2



C3



99th Percentile
The Y-axis denotes the per-request latency (ms) for 99% of client requests. In other words, 99% of requests completed within the given time duration.

C1



C2



C3



Maximum Latency
The Y-axis represents the maximum latency (ms) experienced by any client during a given run. Comparing this graph to the 99th percentile gives a rough idea of what sort of outliers were in the population. Note that the spikes may be due to Redis's snapshotting feature (TBD).

C1



C2



C3



Scenario 3 (Balanced)
In these tests, both observer and producer clients were increased by the same amount each time. The X-axis denotes the total number of clients (producers + observers).

Mean Latency
The Y-axis denotes mean latency in milliseconds.

C1



C2



C3



Combined Throughput
The Y-axis denotes the combined throughput (req/sec) for all clients.

C1



C2



C3



Standard Deviation
The Y-axis denotes the standard deviation for per-request latency (ms). Even at small loads there were a few outliers, sitting outside the 99th percentile, that bumped up the stdev. Further experimentation is needed to find the root cause, whether it be in the client, the server, or the Redis instance.

C1



C2



C3



99th Percentile
The Y-axis denotes the per-request latency (ms) for 99% of client requests. In other words, 99% of requests completed within the given time duration.

C1



C2



C3



Maximum Latency
The Y-axis represents the maximum latency (ms) experienced by any client during a given run. Comparing this graph to the 99th percentile gives a rough idea of what sort of outliers were in the population. Note that the spikes may be due to Redis's snapshotting feature (TBD).

C1



C2



C3