Distributed Router for OVS

Scope
VDR (Virtual Distributed Router) gives users the option to create distributed routers, which allow one-hop forwarding for east-west traffic in virtualized networks. This dramatically increases the total bandwidth and brings other advantages as well.

Brief Introduction

 * The goal of this blueprint is to implement a distributed router in the openvswitch plugin, currently the most popular plugin. However, many users have found that under heavy east-west traffic the current l3-agent router becomes the bottleneck, as shown in the figure below:




 * With a distributed router in the OVS plugin, east-west traffic no longer needs to traverse the network node to get a routing decision. Instead, that traffic is delivered directly from the source hypervisor to the destination hypervisor, as shown below:




 * Test results from our proof of concept (POC) show a clear performance improvement in several respects. The advantages are:
 * 1. Total east-west throughput increases as more hypervisors are involved.
 * 2. Under high east-west concurrency, each VM obtains more bandwidth on average.
 * 3. North-south and east-west traffic no longer affect each other. North-south traffic has exclusive use of the router's full bandwidth.
 * 4. There is no unnecessary detour when an east-west session takes place between two VMs on the same hypervisor.
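
The difference in forwarding paths can be illustrated with a small model. This is only a sketch, not the plugin's code: the function names and the host labels (`hv1`, `hv2`, `network-node`) are hypothetical, and a "hop" here simply means one transfer between physical hosts.

```python
def centralized_path(src_hv, dst_hv, network_node="network-node"):
    """Hosts visited when the l3-agent on the network node routes the packet.

    Every routed east-west packet detours through the network node,
    even when both VMs sit on the same hypervisor.
    """
    return [src_hv, network_node, dst_hv]


def distributed_path(src_hv, dst_hv):
    """Hosts visited when the routing decision is made on the source hypervisor."""
    if src_hv == dst_hv:
        # Both VMs are local: the packet never leaves the hypervisor.
        return [src_hv]
    return [src_hv, dst_hv]


def physical_hops(path):
    """Number of host-to-host transfers along a path."""
    return len(path) - 1


if __name__ == "__main__":
    for src, dst in [("hv1", "hv2"), ("hv1", "hv1")]:
        print(src, "->", dst,
              "centralized hops:", physical_hops(centralized_path(src, dst)),
              "distributed hops:", physical_hops(distributed_path(src, dst)))
```

In this model the centralized router always costs two host-to-host transfers, while the distributed router costs one between different hypervisors and none when both VMs share a hypervisor.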

Performance Comparison
Here is a comparison between the original OpenStack and an OpenStack deployment with a distributed router.

With VDR, total bandwidth is not a fixed number: under high concurrency, the more hypervisors are involved, the more east-west bandwidth is available.
 * With the original implementation, the maximum total bandwidth stays the same even when more hypervisors take part in the communication.

With VDR, each pair of hypervisors always gets a stable bandwidth, equal to the maximum capacity of the physical link between them.
 * Under high concurrency, because the original router has a fixed maximum bandwidth, each VM obtains only a small share of it. If 20 VMs are communicating and 10 pairs of hypervisors are involved, each pair gets less than 100 Mb/s on average (in a Gigabit Ethernet environment).
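
The arithmetic behind the 100 Mb/s figure can be sketched as follows. This is an idealized upper bound, not a measurement: it assumes a 1000 Mb/s (Gigabit Ethernet) link everywhere, an even split of the router's bandwidth, and hypervisor pairs that do not share links with each other. Real throughput is lower because of protocol overhead, which is why the text says "less than". All names are hypothetical.

```python
GE_LINK_MBPS = 1000  # assumed Gigabit Ethernet link capacity


def centralized_per_pair_mbps(pairs, router_mbps=GE_LINK_MBPS):
    """Upper bound per hypervisor pair when all traffic shares one router."""
    return router_mbps / pairs


def distributed_per_pair_mbps(link_mbps=GE_LINK_MBPS):
    """Upper bound per hypervisor pair when each pair uses its own direct link."""
    return link_mbps


def distributed_total_mbps(pairs, link_mbps=GE_LINK_MBPS):
    """Aggregate upper bound with VDR; grows with the number of disjoint pairs."""
    return pairs * link_mbps


if __name__ == "__main__":
    pairs = 10  # 20 VMs communicating across 10 hypervisor pairs
    print("centralized, per pair:", centralized_per_pair_mbps(pairs), "Mb/s")
    print("distributed, per pair:", distributed_per_pair_mbps(), "Mb/s")
    print("distributed, total:   ", distributed_total_mbps(pairs), "Mb/s")
```

Under these assumptions the centralized bound stays at 1000 Mb/s in total however many pairs are added, while the distributed bound scales with the number of pairs.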


 * With the original router, both east-west and north-south traffic go to the server on which the l3-agent resides, so the two compete with each other. Their combined bandwidth is limited to the maximum capacity of the routing server, and each affects the other.


 * With VDR, the full bandwidth of the routing server is always available to north-south traffic. Better still, east-west bandwidth can exceed the maximum bandwidth of the routing server.
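
The contention described above can be expressed as a simple capacity model. It is a rough sketch under stated assumptions: the routing server is treated as a single shared budget in Mb/s, and east-west load is assumed to be served first, which is a simplification rather than how the l3-agent schedules traffic. The function name and parameters are hypothetical.

```python
def north_south_capacity_mbps(router_mbps, east_west_load_mbps, distributed):
    """Bandwidth left on the routing server for north-south traffic.

    distributed=False: east-west traffic also crosses the routing server,
                       so it eats into the same budget.
    distributed=True:  east-west traffic bypasses the routing server,
                       so north-south traffic keeps the whole budget.
    """
    if distributed:
        return router_mbps
    return max(0, router_mbps - east_west_load_mbps)


if __name__ == "__main__":
    for load in (0, 600, 1500):
        print("east-west load", load, "Mb/s:",
              "original =", north_south_capacity_mbps(1000, load, False),
              "VDR =", north_south_capacity_mbps(1000, load, True))
```

Note that an east-west load of 1500 Mb/s is only reachable in the VDR case, since the original router caps the combined traffic at the routing server's capacity.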

Links
Blueprint Page

Icehouse Summit Proposal