Neutron/L3 High Availability VRRP

Abstract
The aim of this blueprint is to add high availability features to virtual routers.

High availability features will be implemented as extensions and drivers. A first extension/driver will be based on VRRP.

A new scheduler will also be added in order to spawn multiple instances of the same router on several agents.

Use Case
Currently we are able to spawn many L3 agents; however, each L3 agent is a SPOF. If an L3 agent fails, all virtual routers hosted by this agent will be lost, and consequently all VMs connected to these virtual routers will be isolated.

The main purpose here is to be able to spawn HA virtual routers.

At router creation:

 * we should be able to specify whether the router will be hosted twice or not.
 * An L3 agent will host both master and slave instances of virtual routers, so every L3 agent stays active and can host any kind of router.
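The scheduling step implied by these points can be sketched as follows. This is a hypothetical illustration, not an actual Neutron API: the function name, the agent tuples, and the load metric are all assumptions; the real scheduler would query agent bindings through the plugin.

```python
def schedule_ha_router(agents, router_id):
    """Pick two distinct L3 agents for an HA router.

    agents: list of (agent_id, hosted_router_count) tuples.
    The least-loaded agent hosts the master instance, the next
    one hosts the slave, so any agent may host either role.
    """
    if len(agents) < 2:
        raise ValueError("an HA router needs at least two L3 agents")
    by_load = sorted(agents, key=lambda a: a[1])
    return {"router_id": router_id,
            "master": by_load[0][0],
            "slave": by_load[1][0]}
```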

When a virtual router fails:

 * the VIP driver of the agent will send a notification to the controller.
 * The controller could respawn the failed virtual router on another L3 agent as a slave.
 * The controller and/or the agent hosting the failed router could also launch additional scripts during the state transition.
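The failover flow above can be sketched as a simplified controller-side function. This is an illustrative sketch only: the binding dictionary shape is an assumption, and real rescheduling would go through the plugin and the scheduler rather than a plain function.

```python
def failover(binding, failed_agent, candidate_agents):
    """Recompute a router's HA binding after an agent failure.

    binding: {"master": agent_id, "slave": agent_id}.
    When the master fails, the slave is promoted and a new slave is
    respawned on a remaining agent (None when no agent is left).
    """
    new = dict(binding)
    spares = [a for a in candidate_agents
              if a != failed_agent and a not in binding.values()]
    if failed_agent == binding["master"]:
        new["master"] = binding["slave"]
        new["slave"] = spares[0] if spares else None
    elif failed_agent == binding["slave"]:
        new["slave"] = spares[0] if spares else None
    return new
```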

Management:

We should be able to manually mark the master as failed, for maintenance purposes for example. This action should switch the slave to master mode and reschedule the router as backup on another L3 agent. The keepalived priority parameter could be used instead of the state parameter: in that case every keepalived instance would be eligible to become master, and the "current" master would be elected according to its priority.
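The priority-based alternative can be illustrated with a keepalived fragment; the interface name, VRID and priority values below are purely illustrative:

```
vrrp_instance VI_HA {
    state MASTER          # same initial state on both agents
    interface ha-0
    virtual_router_id 42
    priority 150          # the peer uses a lower value, e.g. 100;
                          # the instance with the highest priority
                          # ends up holding the VIPs
}
```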

Implementation Overview
The main purpose is to address this issue by adding a new type of router, which will be spawned twice on two different agents. One agent will be in charge of the master instance of the router, and another L3 agent will be in charge of the slave instance. Two new interfaces will be added to each virtual router in order to allow the exchange of administrative traffic, such as the health state of the routers or TCP connection sessions. The original interfaces, internal and external, will be turned into VIP interfaces.

VIP management
Keepalived will be used to manage the VIP interfaces, with one instance of keepalived per virtual router, i.e. one per namespace.

TCP sessions
Conntrackd will be used to maintain the TCP sessions going through the router, with one instance of conntrackd per virtual router, i.e. one per namespace.
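Since each virtual router lives in its own namespace, the agent can launch the per-router keepalived and conntrackd processes with `ip netns exec`. A minimal sketch, assuming the usual `qrouter-<id>` namespace naming and a per-router configuration directory; the directory path and the helper name are assumptions:

```python
def ha_process_cmds(router_id, conf_dir="/var/lib/neutron/vrrp"):
    """Build the commands used to run one keepalived and one
    conntrackd instance inside the router's namespace."""
    ns = "qrouter-%s" % router_id
    base = "%s/%s" % (conf_dir, router_id)
    keepalived = ["ip", "netns", "exec", ns, "keepalived",
                  "-f", "%s/keepalived.conf" % base,  # config file
                  "-p", "%s/keepalived.pid" % base,   # pid file
                  "-n"]                               # don't daemonize
    conntrackd = ["ip", "netns", "exec", ns, "conntrackd",
                  "-C", "%s/conntrackd.conf" % base,  # config file
                  "-d"]                               # daemon mode
    return keepalived, conntrackd
```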

Administrative traffic
A new network could be created per tenant to carry the administrative traffic exchanges. This new network would isolate the administrative traffic from the tenant traffic. An alternative would be to use the private network.

Data Model Changes
Tables which will be added or modified:

routers_vrrp
New table associating a router with a VR_ID.
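Since the VRID is an 8-bit field with values 1-255, the routers_vrrp table lends itself to a simple first-free allocation scheme per administrative network. A minimal sketch; the function name is illustrative and the real allocation would run inside a database transaction:

```python
VRID_MIN, VRID_MAX = 1, 255  # VRRP VRID is 8 bits; 0 is reserved

def allocate_vr_id(allocated):
    """Return the first free VRID on a tenant's administrative network.

    allocated: set of VRIDs already recorded in routers_vrrp.
    """
    for vr_id in range(VRID_MIN, VRID_MAX + 1):
        if vr_id not in allocated:
            return vr_id
    raise RuntimeError("all 255 VRIDs are in use for this tenant")
```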

routers_vrrp_port
New table associating a VRRP router with its VRRP administrative ports.

RouterL3AgentBinding
RouterL3AgentBinding will also be modified, or a new table will be added, in order to indicate which instance (master or slave) of the router is hosted by each agent.

Configuration variables
A new parameter will be added to set the path of the keepalived/conntrackd configuration files generated by the agent. Another parameter will be added to the Neutron server configuration to set the mode and the number of router instances.

Plugin configuration
/etc/neutron/neutron.conf

[DEFAULT]
l3_agents_per_router = 3

 * Number of L3 agents scheduled to host a virtual router. Setting this enables VRRP.

L3 agent configuration
/etc/neutron/l3_agent.ini

[DEFAULT]
vrrp_confs = $state_path/vrrp

 * Path where the generated keepalived/conntrackd configuration files are stored.

Limitations

 * There is a limit of 255 HA virtual routers per tenant, since the VRID field is 8 bits long and this proposal uses only one administrative network per tenant.

Links
Blueprint

RFC VRRP

Appendix
Below are the two templates which will be used to generate the configuration files for keepalived and conntrackd.

Keepalived template
global_defs {
    router_id ${VR_ID}
}

vrrp_sync_group VG${VR_GROUP_ID} {
    group {
        VI_HA
    }
% if NOTIFY_SCRIPT:
    notify_master ${NOTIFY_SCRIPT}
% endif
}

vrrp_instance VI_HA {
% if TYPE == 'MASTER':
    state MASTER
% else:
    state BACKUP
% endif
    interface ${L3_AGENT.get_ha_device_name(TRACK_PORT_ID)}
    virtual_router_id ${VR_ID}
    priority ${PRIORITY}
    track_interface {
        ${L3_AGENT.get_ha_device_name(TRACK_PORT_ID)}
    }

    virtual_ipaddress {
% if EXTERNAL_PORT:
        ${EXTERNAL_PORT['ip_cidr']} dev ${L3_AGENT.get_external_device_name(EXTERNAL_PORT['id'])}
% if FLOATING_IPS:
        ${FLOATING_IPS[0]['floating_ip_address']}/32 dev ${L3_AGENT.get_external_device_name(EXTERNAL_PORT['id'])}
% endif
% endif
% if INTERNAL_PORTS:
        ${INTERNAL_PORTS[0]['ip_cidr']} dev ${L3_AGENT.get_internal_device_name(INTERNAL_PORTS[0]['id'])}
% endif
    }

    virtual_ipaddress_excluded {
% if EXTERNAL_PORT:
% for FLOATING_IP in FLOATING_IPS[1:]:
        ${FLOATING_IP['floating_ip_address']}/32 dev ${L3_AGENT.get_external_device_name(EXTERNAL_PORT['id'])}
% endfor
% endif
% for INTERNAL_PORT in INTERNAL_PORTS[1:]:
        ${INTERNAL_PORT['ip_cidr']} dev ${L3_AGENT.get_internal_device_name(INTERNAL_PORT['id'])}
% endfor
    }

% if EXTERNAL_PORT:
    virtual_routes {
        0.0.0.0/0 via ${EXTERNAL_PORT['ip_cidr'].split('/')[0]} dev ${L3_AGENT.get_external_device_name(EXTERNAL_PORT['id'])}
    }
% endif
}

Conntrackd template
General {
    HashSize 8192
    HashLimit 65535
    Syslog on

    LockFile ${LOCK}
    UNIX {
        Path ${SOCK}
        Backlog 20
    }

    SocketBufferSize 262142
    SocketBufferSizeMaxGrown 655355
    Filter {
        Protocol Accept {
            TCP
        }
        Address Ignore {
            IPv4_address 127.0.0.1
        }
    }
}

Sync {
    Mode FTFW {
    }

    UDP Default {
        IPv4_address ${TRACK_PORT_LOCAL['ip_cidr'].split('/')[0]}
        IPv4_Destination_Address ${TRACK_PORT_REMOTE['ip_cidr'].split('/')[0]}
        Port 3780
        Interface ${L3_AGENT.get_ha_device_name(TRACK_PORT_ID)}
        SndSocketBuffer 24985600
        RcvSocketBuffer 24985600
        Checksum on
    }
}