Difference between revisions of "Ovs-flow-logic"
Latest revision as of 20:36, 15 January 2014
OVS flows logic
In the openvswitch agent, tunnel packet processing in br-tun was based on the segmentation id. As a consequence, with the introduction of VXLAN support, two networks using different tunnel types (GRE and VXLAN) but sharing the same segmentation id would no longer be properly isolated.
The following review attempts to address this limitation: https://review.openstack.org/#/c/41239/
To ensure proper isolation within a single bridge, the NORMAL action can no longer be used, as it floods unknown unicasts on all of the bridge's ports. It is replaced by a learn action that dynamically sets up flows when packets are received from tunnel ports. Because MAC addresses are learnt as explicit flows (in table 20), a default action in that table can hand unknown unicasts over to table 21, which floods them to the right set of ports, in the same way as broadcast and multicast packets.
[[File:OVS_Flow_Tables.png]]
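The table logic described above can be sketched as a small Python model. This is only an illustration of the lookup order, not the agent's code: the class name, the method names, the network labels and the port names are all hypothetical, and real OVS flows match on packet fields rather than on dictionary keys.

```python
from dataclasses import dataclass, field


@dataclass
class TunnelBridgeModel:
    """Toy model of the unicast table (20) and the flood table (21)."""

    # Table 20 stand-in: (network, MAC) -> tunnel port learnt for that MAC.
    learned: dict = field(default_factory=dict)
    # Table 21 stand-in: network -> tunnel ports used for flooding.
    flood_ports: dict = field(default_factory=dict)

    def receive_from_tunnel(self, network, src_mac, tunnel_port):
        """Mimic the learn action: remember where src_mac was seen."""
        self.learned[(network, src_mac)] = tunnel_port

    def send_from_vm(self, network, dst_mac):
        """Return the set of tunnel ports a packet would be sent to."""
        # The lowest bit of the first octet marks broadcast/multicast.
        is_group_address = int(dst_mac.split(":")[0], 16) & 1
        if not is_group_address:
            port = self.learned.get((network, dst_mac))
            if port is not None:
                return {port}  # known unicast: one explicit flow
        # Unknown unicast, broadcast or multicast: flood, but only on
        # the ports that belong to this network.
        return set(self.flood_ports.get(network, ()))
```

Because both tables are keyed by network, a MAC learnt on one network never influences forwarding on another, even if the two networks share a segmentation id.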
This table structure will also allow us to implement tunnelling optimisations as proposed in bp/l2-population:
- Remote MAC addresses learnt by RPC can be placed in table 20
- In table 21, we'll be able to limit the number of ports on which packets are flooded on a per-network basis
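As a rough sketch of these two optimisations, the helpers below pre-populate a unicast table from RPC-style notifications and keep a per-network flood list limited to tunnels that still lead to a known remote port. The function names, arguments and dictionary layout are assumptions made for illustration; they are not taken from the l2-population blueprint, which may track flooding entries differently.

```python
def add_remote_port(unicast_table, flood_table, network, mac, tunnel_port):
    """Record a remote MAC announced by RPC (instead of data-plane learning)."""
    unicast_table[(network, mac)] = tunnel_port
    # The tunnel now leads to at least one port of this network,
    # so it must receive the network's flooded packets.
    flood_table.setdefault(network, set()).add(tunnel_port)


def remove_remote_port(unicast_table, flood_table, network, mac, tunnel_port):
    """Forget a remote MAC and prune the flood list when possible."""
    unicast_table.pop((network, mac), None)
    still_used = any(
        net == network and port == tunnel_port
        for (net, _mac), port in unicast_table.items()
    )
    if not still_used:
        # No remaining port of this network behind that tunnel:
        # stop flooding to it.
        flood_table.get(network, set()).discard(tunnel_port)
```

With this bookkeeping, a broadcast on a network only reaches hosts that actually run a port of that network, which is where the reduction in flooded packets comes from.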
This should improve agent scalability, by reducing the number of flooded packets.
Performance Impacts (WIP)
As it impacts how packets are handled in tunnels, the proposed change was benchmarked to verify how it could affect tunnelling performance. The current test was run between two VMs running on distinct hosts, measuring iperf throughput.
Note that these tests may be of limited relevance, as throughput seemed to be limited by the vhost process and packet segmentation rather than by flow management in OVS.
Before proposed change (using NORMAL action and set_tunnel in br-tun):
 root@test:~# iperf -t 50 -i 10 -c 192.168.1.105
 ------------------------------------------------------------
 Client connecting to 192.168.1.105, TCP port 5001
 TCP window size: 22.9 KByte (default)
 ------------------------------------------------------------
 [ 3] local 192.168.1.104 port 59027 connected with 192.168.1.105 port 5001
 [ ID] Interval Transfer Bandwidth
 [ 3] 0.0-10.0 sec 1.60 GBytes 1.37 Gbits/sec
 [ 3] 10.0-20.0 sec 1.58 GBytes 1.36 Gbits/sec
 [ 3] 20.0-30.0 sec 1.95 GBytes 1.68 Gbits/sec
 [ 3] 30.0-40.0 sec 1.75 GBytes 1.51 Gbits/sec
 [ 3] 40.0-50.0 sec 1.94 GBytes 1.67 Gbits/sec
 [ 3] 0.0-50.0 sec 8.82 GBytes 1.52 Gbits/sec
With the proposed change (using the logic described above):
 root@test:~# iperf -t 50 -i 10 -c 192.168.1.105
 ------------------------------------------------------------
 Client connecting to 192.168.1.105, TCP port 5001
 TCP window size: 22.9 KByte (default)
 ------------------------------------------------------------
 [ 3] local 192.168.1.104 port 59026 connected with 192.168.1.105 port 5001
 [ ID] Interval Transfer Bandwidth
 [ 3] 0.0-10.0 sec 1.58 GBytes 1.36 Gbits/sec
 [ 3] 10.0-20.0 sec 1.80 GBytes 1.55 Gbits/sec
 [ 3] 20.0-30.0 sec 1.70 GBytes 1.46 Gbits/sec
 [ 3] 30.0-40.0 sec 1.92 GBytes 1.65 Gbits/sec
 [ 3] 40.0-50.0 sec 1.82 GBytes 1.56 Gbits/sec
 [ 3] 0.0-50.0 sec 8.83 GBytes 1.52 Gbits/sec
As discussed on the mailing list, the same testbed was also used to measure the impact of using distinct bridges for tunnelling: a VXLAN tunnel port was added to br-int, and two flows were set up to hard-wire the VM's ports to tunnels on each host:
 add-flow br-int in_port=vm_ofport,actions="output:tunnel_ofport"
 add-flow br-int in_port=tunnel_ofport,actions="output:vm_ofport"
The results tend to show that the current bridge separation logic doesn't introduce a big performance penalty:
 root@test:~# iperf -t 50 -i 10 -c 192.168.1.105
 ------------------------------------------------------------
 Client connecting to 192.168.1.105, TCP port 5001
 TCP window size: 22.9 KByte (default)
 ------------------------------------------------------------
 [ 3] local 192.168.1.104 port 59037 connected with 192.168.1.105 port 5001
 [ ID] Interval Transfer Bandwidth
 [ 3] 0.0-10.0 sec 1.78 GBytes 1.53 Gbits/sec
 [ 3] 10.0-20.0 sec 1.79 GBytes 1.54 Gbits/sec
 [ 3] 20.0-30.0 sec 1.70 GBytes 1.46 Gbits/sec
 [ 3] 30.0-40.0 sec 1.70 GBytes 1.46 Gbits/sec
 [ 3] 40.0-50.0 sec 1.70 GBytes 1.51 Gbits/sec
 [ 3] 0.0-50.0 sec 8.73 GBytes 1.50 Gbits/sec
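To put the three runs side by side, the summary lines can be compared with a few lines of Python. This is a minimal sketch: the helper names and the labels given to the runs are made up for illustration, and only the final summary line of each iperf output above is used as input.

```python
import re

# Matches a bandwidth figure such as "1.52 Gbits/sec".
RATE = re.compile(r"([\d.]+)\s+Gbits/sec")


def summary_gbits(iperf_output: str) -> float:
    """Return the last Gbits/sec figure, i.e. the whole-run summary."""
    rates = RATE.findall(iperf_output)
    if not rates:
        raise ValueError("no Gbits/sec figure found")
    return float(rates[-1])


def relative_change_percent(baseline: float, candidate: float) -> float:
    """Signed difference of candidate against baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Summary lines copied from the three runs above.
runs = {
    "NORMAL action": "[ 3] 0.0-50.0 sec 8.82 GBytes 1.52 Gbits/sec",
    "learn-based flows": "[ 3] 0.0-50.0 sec 8.83 GBytes 1.52 Gbits/sec",
    "single bridge, hard-wired": "[ 3] 0.0-50.0 sec 8.73 GBytes 1.50 Gbits/sec",
}

baseline = summary_gbits(runs["NORMAL action"])
for name, line in runs.items():
    rate = summary_gbits(line)
    change = relative_change_percent(baseline, rate)
    print(f"{name}: {rate:.2f} Gbits/sec ({change:+.1f}% vs NORMAL action)")
```

The summaries differ by roughly one percent at most, which is well within the variation between the 10-second intervals of a single run.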
OVS flows logic with local ARP responder
With the ML2 plugin and the l2-pop mechanism driver, it's possible to answer a VM's ARP requests locally and avoid emulating ARP broadcasts on the overlay, which is costly.
[[File:OVS_flow_Tables_with_ARP_responder.svg]]
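If OVS supports matching and modifying ARP header fields (added in the OVS 2.1 branch [http://git.openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=commitdiff;h=f6c8a6b163af343c66aea54953553d84863835f7]), table 21 is added, with a redirection flow in table 21.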
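What a local ARP responder does to a packet can be illustrated in plain Python: take an ARP request, look up the target IP in a locally known table, and turn the request into a reply by swapping the sender and target fields. This is a sketch of the header rewrite only, under the assumption that the IP-to-MAC table has been filled by l2-pop; the function names and the example addresses are hypothetical, and in OVS the same rewrite would be expressed as flow actions rather than Python.

```python
import socket
import struct

# ARP payload for Ethernet/IPv4: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP (28 bytes).
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_REQUEST, ARP_REPLY = 1, 2


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to its 6-byte form."""
    return bytes.fromhex(mac.replace(":", ""))


def answer_arp_locally(request: bytes, known_hosts: dict):
    """Build an ARP reply for a request, or return None to fall back
    to flooding when the request cannot be answered locally."""
    htype, ptype, hlen, plen, oper, sha, spa, _tha, tpa = struct.unpack(
        ARP_FORMAT, request
    )
    if oper != ARP_REQUEST:
        return None
    target_mac = known_hosts.get(socket.inet_ntoa(tpa))
    if target_mac is None:
        return None
    # The reply comes "from" the target: its MAC and IP become the
    # sender fields, and the original sender becomes the target.
    return struct.pack(
        ARP_FORMAT, htype, ptype, hlen, plen, ARP_REPLY,
        mac_to_bytes(target_mac), tpa, sha, spa,
    )
```

Returning None for unknown targets mirrors the fallback described earlier: anything that cannot be answered locally still goes through the flooding path.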