
Neutron/L3 High Availability VRRP


The aim of this blueprint is to add high availability features to virtual routers.

High availability features will be implemented as extensions and drivers. A first extension/driver will be based on VRRP.

A new scheduler will also be added so that multiple instances of the same router can be spawned on several agents.

Use Case

Currently we are able to spawn many l3 agents; however, each l3 agent is a single point of failure (SPOF). If an l3 agent fails, all virtual routers hosted by this agent will be lost, and consequently all VMs connected to these virtual routers will be isolated.

The main purpose here is to be able to spawn HA virtual routers.

At router creation:

  • We should be able to specify whether the router will be hosted twice or not.
  • An l3 agent will host both master and slave versions of virtual routers, so both l3 agents stay active and can host any kind of router.

When a virtual router fails:

  • The VIP driver of the agent will send a notification to the controller.
  • The controller could respawn the failed virtual router on another l3 agent as slave.
  • The controller and/or the agent hosting the failed router could also launch additional scripts during the state transition.


We should be able to manually set the master as failed, for maintenance purposes for example. This action should switch the slave to master mode and reschedule the router as backup on another l3 agent. The priority parameter of keepalived could also be used instead of the state parameter. In this case all keepalived instances will be configured with the master state, and the “current” master will be elected according to its priority.
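The priority-based election described above can be illustrated with a minimal sketch. The `Instance` class and `elect_master` function are hypothetical names used only for illustration; they model the outcome of the election, not keepalived's actual protocol exchange, and ties between equal priorities are not handled the way VRRP would handle them.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Instance:
    """One keepalived instance of an HA router (hypothetical model)."""
    agent: str          # l3 agent hosting this instance
    priority: int       # keepalived priority of this instance
    alive: bool = True  # False once the instance is failed or set as failed


def elect_master(instances: Sequence[Instance]) -> Optional[Instance]:
    """Return the live instance with the highest priority, or None."""
    candidates = [i for i in instances if i.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda i: i.priority)


instances = [
    Instance("agent-1", priority=100),
    Instance("agent-2", priority=50),
]
print(elect_master(instances).agent)  # agent-1

# Setting agent-1 as failed (e.g. for maintenance) moves mastership.
instances[0] = Instance("agent-1", priority=100, alive=False)
print(elect_master(instances).agent)  # agent-2
```

With this approach a manual failover amounts to taking the current master out of the election, rather than rewriting the state parameter on every instance.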

Implementation Overview

The main purpose is to address this issue by adding a new type of router, which will be spawned twice on two different agents. One l3 agent will be in charge of the master version of this router, and another l3 agent will be in charge of the slave version. Two new interfaces will be added to each virtual router to carry administrative traffic, such as the health state of the routers or TCP connection sessions. The original interfaces, internal and external, will be turned into VIP interfaces.
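As a rough sketch of how such a router could be placed on two different agents, the function below picks the least-loaded agents and marks the first one as master. The function name, the load-based policy and the `agent_load` mapping are assumptions made for illustration; the blueprint does not specify the scheduling policy.

```python
from typing import Dict, List, Tuple


def schedule_ha_router(agent_load: Dict[str, int],
                       agents_per_router: int = 2) -> List[Tuple[str, str]]:
    """Pick the least-loaded agents; the first is master, the rest are slaves.

    agent_load maps a (hypothetical) agent name to the number of routers
    it currently hosts.
    """
    if agents_per_router < 2:
        raise ValueError("an HA router needs at least two agents")
    if len(agent_load) < agents_per_router:
        raise ValueError("not enough l3 agents available")
    # Sort by load, then by name so the result is deterministic.
    ordered = sorted(agent_load, key=lambda a: (agent_load[a], a))
    chosen = ordered[:agents_per_router]
    return [(agent, "master" if idx == 0 else "slave")
            for idx, agent in enumerate(chosen)]


bindings = schedule_ha_router({"agent-1": 4, "agent-2": 1, "agent-3": 2})
print(bindings)  # [('agent-2', 'master'), ('agent-3', 'slave')]
```

Each returned pair corresponds to one router instance bound to one agent, which keeps every agent eligible to host master and slave instances of different routers.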

VIP management

Keepalived will be used to manage the VIP interfaces. There will be one instance of keepalived per virtual router, hence one per namespace.

TCP sessions

Conntrackd will be used to maintain the TCP sessions going through the router. There will be one instance of conntrackd per virtual router, hence one per namespace.

Administrative traffic

A new network could be created per tenant for the administrative traffic exchanges, in order to isolate this traffic from the tenant traffic. Another solution would be to use the private network.

Proposal with dedicated network

Proposal with dedicated link

Data Model Changes

Tables which will be added.


New table to associate a Router to a VR_ID.

Key         Value     Notes
router_id   enum      Router id
vr_id       integer   Virtual Router IDentifier (used by keepalived template, see appendix)


New table to associate a VRRP router with its VRRP administrative ports.

Key         Value     Notes
router_id   enum      Router id
port_id     uuid      Id of the port used by the master version of the HA router for administrative traffic


RouterL3AgentBinding will also be modified, or a new table will be added, in order to indicate which version (master/slave) of the router is hosted by an agent.

Key             Value     Notes
running_state   enum      Version hosted: Master/Slave
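To make the shape of these records concrete, here is a minimal in-memory sketch using plain dataclasses rather than the actual database models. The class names `RouterVrId` and `RouterHaPort`, the `l3_agent_id` field and the string-typed ids are assumptions for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class RunningState(Enum):
    """Version of the router hosted by an agent."""
    MASTER = "master"
    SLAVE = "slave"


@dataclass
class RouterVrId:
    """Associates a router with its VR_ID (hypothetical class name)."""
    router_id: str
    vr_id: int

    def __post_init__(self) -> None:
        # The VRID is an 8-bit field; 0 is not a usable value.
        if not 1 <= self.vr_id <= 255:
            raise ValueError("vr_id must be in the range 1..255")


@dataclass
class RouterHaPort:
    """Associates a router with one administrative port (hypothetical)."""
    router_id: str
    port_id: str


@dataclass
class RouterL3AgentBinding:
    """Binding extended with the running state; l3_agent_id is assumed."""
    router_id: str
    l3_agent_id: str
    running_state: RunningState


binding = RouterL3AgentBinding("router-1", "agent-1", RunningState.MASTER)
print(binding.running_state.value)  # master
```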


Router creation

l3 ha router creation

Router updates: floating-ip/gw/int

l3 ha router update


l3 ha router failure

Configuration variables

A new parameter will be added to set the path of the keepalived/conntrackd configuration files generated by the agent. A new parameter will also be added to the neutron server configuration to set the mode and the number of router instances.

Plugin configuration


# Number of L3 agents scheduled to host a virtual router. This enables VRRP.
l3_agents_per_router = 3

L3 agent configuration


vrrp_confs = $state_path/vrrp

CLI Requirements


  • There is a limit of 255 HA virtual routers per tenant, since the VRID is 8 bits long and with this proposal there is only one administrative network per tenant.
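The limit follows from every HA router of a tenant needing a distinct VRID on the shared administrative network. A minimal sketch of a per-tenant allocator, assuming an in-memory store and hypothetical names (`VrIdAllocator`, `VrIdExhausted`), could look like this:

```python
from typing import Dict, Set


class VrIdExhausted(Exception):
    """Raised when a tenant already uses all 255 VRIDs."""


class VrIdAllocator:
    """Hands out VRIDs 1..255 per tenant (in-memory, hypothetical)."""

    def __init__(self) -> None:
        self._used: Dict[str, Set[int]] = {}

    def allocate(self, tenant_id: str) -> int:
        """Return the lowest free VRID for the tenant."""
        used = self._used.setdefault(tenant_id, set())
        for vr_id in range(1, 256):
            if vr_id not in used:
                used.add(vr_id)
                return vr_id
        raise VrIdExhausted(tenant_id)

    def release(self, tenant_id: str, vr_id: int) -> None:
        """Free a VRID, e.g. when its router is deleted."""
        self._used.get(tenant_id, set()).discard(vr_id)


allocator = VrIdAllocator()
first = allocator.allocate("tenant-a")
second = allocator.allocate("tenant-a")
allocator.release("tenant-a", first)
print(second, allocator.allocate("tenant-a"))  # 2 1
```

Creating a 256th HA router for the same tenant would raise `VrIdExhausted` in this sketch, which is the point at which the proposal's single administrative network per tenant becomes the limiting factor.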





Below are the two templates which will be used to generate the configuration files for keepalived and conntrackd.

Keepalived template

global_defs {
    router_id ${VR_ID}
}
vrrp_sync_group VG${VR_GROUP_ID} {
    group {
        VI_HA
    }
    % if NOTIFY_SCRIPT:
    notify_master ${NOTIFY_SCRIPT}
    % endif
}

vrrp_instance VI_HA {
    % if TYPE == 'MASTER':
    state MASTER
    % else:
    state BACKUP
    % endif
    interface ${L3_AGENT.get_ha_device_name(TRACK_PORT_ID)}
    virtual_router_id ${VR_ID}
    priority ${PRIORITY}
    track_interface {
    }
    virtual_ipaddress {
        % if EXTERNAL_PORT:
        ${EXTERNAL_PORT['ip_cidr']} dev ${L3_AGENT.get_external_device_name(EXTERNAL_PORT['id'])}
        % if FLOATING_IPS:
        ${FLOATING_IPS[0]['floating_ip_address']}/32 dev ${L3_AGENT.get_external_device_name(EXTERNAL_PORT['id'])}
        % endif
        % endif

        % if INTERNAL_PORTS:
        ${INTERNAL_PORTS[0]['ip_cidr']} dev ${L3_AGENT.get_internal_device_name(INTERNAL_PORTS[0]['id'])}
        % endif
    }
    virtual_ipaddress_excluded {
        % if EXTERNAL_PORT:
        % for FLOATING_IP in FLOATING_IPS[1:]:
        ${FLOATING_IP['floating_ip_address']}/32 dev ${L3_AGENT.get_external_device_name(EXTERNAL_PORT['id'])}
        % endfor
        % endif

        % for INTERNAL_PORT in INTERNAL_PORTS[1:]:
        ${INTERNAL_PORT['ip_cidr']} dev ${L3_AGENT.get_internal_device_name(INTERNAL_PORT['id'])}
        % endfor
    }

    % if EXTERNAL_PORT:
    virtual_routes {
        via ${EXTERNAL_PORT['ip_cidr'].split('/')[0]} dev ${L3_AGENT.get_external_device_name(EXTERNAL_PORT['id'])}
    }
    % endif
}

Conntrackd template

General {
    HashSize 8192
    HashLimit 65535
    Syslog on
    LockFile ${LOCK}
    UNIX {
        Path ${SOCK}
        Backlog 20
    }
    SocketBufferSize 262142
    SocketBufferSizeMaxGrown 655355
    Filter {
        Protocol Accept {
        }
        Address Ignore {
        }
    }
}
Sync {
    Mode FTFW {
    }
    UDP Default {
        IPv4_address ${TRACK_PORT_LOCAL['ip_cidr'].split('/')[0]}
        IPv4_Destination_Address ${TRACK_PORT_REMOTE['ip_cidr'].split('/')[0]}
        Port 3780
        Interface ${L3_AGENT.get_ha_device_name(TRACK_PORT_ID)}
        SndSocketBuffer 24985600
        RcvSocketBuffer 24985600
        Checksum on
    }
}