NovaNetNeutronRecipes

[OBSOLETE to be deleted]

This is just a rough import of parts of an earlier parity document. It does have a general outline/description of network models and deployment styles that might be relevant for constructing recipes->guidelines of moving from nova to neutron.

Nova Networking Deployment Models
Nova networking provides three principal network manager implementations. The implementations roughly map to levels of complexity and capabilities. Only one network manager type can be deployed per nova-networking instance. The FlatDHCP and VLAN manager network types share services and general functionality, but are directed at very specific use cases. Creating a network definition is tightly coupled to the chosen network manager. Changing a network manager in a deployment typically means migrating all deployed servers to the new definition.

Nova-networking’s network managers are very direct and specialized. This specialization minimizes initial complexity and makes typical deployments fairly easy to implement. There is a cost, however: specialization introduces constraints that can be relatively complex or infeasible to overcome.

The situation is quite different in Neutron-based deployments. Specific networking elements are made available and typical configurations are “suggested” but not required. Neutron allows multiple network “types” to be deployed without extra effort; services like L3 routing and DHCP are added or not depending on a tenant’s requirements. While it is not clear that it was a design objective, if you can conceive how to configure it “IRL” with hardware, there is an analog in neutron. This flexibility comes at a high cost with respect to complexity and ease of use. From a design and implementation perspective, the additional complexity adds significantly more risk with respect to code and performance. All are areas of concern with respect to parity and deprecation.


 * ease of use
 * performance
 * quality
 * safe migration paths

The Floating IP Question
The definition of a floating IP and its exact purpose seem a little hard to pin down. It is fair to say that it is an address that a.) may be associated and disassociated with a VM without affecting the configuration of that VM, and b.) is not directly reflected as a physical interface; instead it is a mapped address from a gateway to an actual VIF, allowing it to “float”. In practice, a floating IP is often referred to as the “public” IP address.

One of the more difficult changes from nova-networking to neutron involves how floating IPs are implemented and “work”. Floating IP addresses in nova-networking result in modifications to NAT rules to support mapping the address to a server instance’s private interface. The floating IP is also added as an alias to the gateway (public) interface. The result is that routing from the controller works immediately. It is a straightforward, simple approach that is easy to troubleshoot. In addition to simplifying floating IP address management, this approach means that outgoing traffic from a VM “always works”, regardless of whether floating IPs are assigned, as long as the node where the relevant nova-networking instance is running has access to external networks.
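The nova-networking behaviour described above can be approximated with a short sketch. It only builds the command strings; it does not run them. The use of the built-in PREROUTING and POSTROUTING chains is a simplifying assumption (nova-network manages its own chains), and the function name, the interface name `eth0` and the documentation-range addresses are all hypothetical.

```python
import ipaddress


def floating_ip_commands(floating_ip, fixed_ip, public_iface):
    """Return shell commands approximating a floating IP association.

    Assumption: the built-in nat chains are used for clarity; this is an
    illustration, not nova-network's actual rule layout.
    """
    # ip_address() raises ValueError on malformed input.
    fip = ipaddress.ip_address(floating_ip)
    fixed = ipaddress.ip_address(fixed_ip)
    return [
        # Add the floating IP as an alias on the public (gateway) interface.
        f"ip addr add {fip}/32 dev {public_iface}",
        # Incoming traffic: rewrite the destination to the fixed address.
        f"iptables -t nat -A PREROUTING -d {fip} -j DNAT --to-destination {fixed}",
        # Outgoing traffic: rewrite the source to the floating address.
        f"iptables -t nat -A POSTROUTING -s {fixed} -j SNAT --to-source {fip}",
    ]


if __name__ == "__main__":
    for command in floating_ip_commands("203.0.113.10", "10.0.0.5", "eth0"):
        print(command)
```

Because every step is an ordinary address alias or NAT rule on one host, troubleshooting amounts to listing the interface addresses and the nat table.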

In Neutron, the work of mapping floating IPs to private addresses is commonly done by one or more L3 agents. The L3 agent effectively directs outgoing traffic from a tap interface on the integration bridge to an external gateway interface. This is independent of whether the source of this traffic has a floating IP address or not. Incoming traffic is accepted through the external gateway interface and mapped to an appropriate internal address using iptables NAT rules. There are multiple consequences of this approach with respect to parity:

 * Access to external networks is only possible through the L3 agent and so requires extra configuration even if floating IPs are not in use.
 * As the external gateway is a network bridge interface, special consideration may be necessary for the bridge interface to connect to the external network. How to go about this may be outside of the experience of non-networking oriented users.
 * Packet flow through the external interfaces and integration bridge introduces a greater degree of complexity, even for simple deployments.

Another nova-networking floating IP related feature is the ability to automatically assign addresses from the floating IP pool when a VM is instantiated. Implementing this functionality in neutron is more complicated due to the details of floating IP management and the finer-grained control over network definitions. This may be an example of a capability where parity might be best achieved indirectly. Regardless of the approach, this is a case where the intent of the capability is as important as how it works. It may be that neutron allows for a different approach that more directly addresses the intent.
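To make the intent of automatic floating IP assignment concrete, here is a minimal sketch of a pool that hands the next free address to each new instance. The class and method names are hypothetical and the pool is held in memory; it is not the data model of nova-networking or neutron.

```python
import ipaddress


class FloatingIPPool:
    """Hypothetical in-memory floating IP pool, for illustration only."""

    def __init__(self, cidr):
        # hosts() skips the network and broadcast addresses.
        self._free = list(ipaddress.ip_network(cidr).hosts())
        self._assigned = {}  # instance id -> floating IP

    def auto_assign(self, instance_id):
        """Give a newly created instance the next free address."""
        if instance_id in self._assigned:
            return self._assigned[instance_id]
        if not self._free:
            raise RuntimeError("floating IP pool exhausted")
        address = self._free.pop(0)
        self._assigned[instance_id] = address
        return address

    def release(self, instance_id):
        """Return an instance's address to the pool when it is deleted."""
        address = self._assigned.pop(instance_id, None)
        if address is not None:
            self._free.append(address)


if __name__ == "__main__":
    pool = FloatingIPPool("203.0.113.0/29")
    print(pool.auto_assign("vm-1"))
    print(pool.auto_assign("vm-2"))
```

The point of the sketch is the intent: the user asks for an instance and receives external reachability without a separate allocation step, however the underlying platform chooses to implement it.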

Obsoleted/Not Making the Cut
There are two areas of nova-networking that are already deprecated within the context of nova-networking. It is unreasonable to extend the lifetime of these capabilities as a part of establishing parity.


 * nova-networking can configure default private and public address ranges through configuration files that are immediately available after installing and deploying;
 * nova-networking implemented networks can be modified via the nova-manage command line utility.

The Tenant Networks
There are two basic tenant network models that are implemented in nova-networking and neutron: flat and isolated. The implementation details differ, but the same abstract functionality is provided in both nova-networking and neutron. In nova-networking, a model and its variants are implemented through separate manager types. Neutron is composed of implementations of flexible abstractions, so a model is realized primarily through configuration. A benefit of neutron’s flexibility is that multiple models can be realized concurrently, whereas nova-networking is restricted to one type per instance at a time.

The Flat Model
A “flat” network model has a few key distinguishing attributes:
 * The “tenant network” is shared among all tenants. The implication is that they share the same L2 space. While multiple address spaces are possible, all OpenStack instances could potentially access other tenants' data.
 * Tenant network communication between multiple OpenStack nodes requires a bridge to a network interface device on each OpenStack node, and those devices must be connected to the same physical network.

Services provided to the tenant network such as DHCP address management, DNS, routing and NAT are specializations of the flat model. We will examine the flat model with and without services.

Flat Network – No Services
Nova-networking supports a basic flat network model through the FlatNetworkManager class. The flat manager provides basic network address management and allocation but requires direct modification of the server instance to configure the interface (earlier versions of the FlatNetworkManager did not provide this functionality). The tenant network must be bridged with an active network link to access external devices (e.g. the Internet, DNS servers, DHCP servers). Historically this has been achieved by bridging either the primary interface on the hypervisor node or a secondary interface with the required external connectivity. Services such as L3 routing and NAT are handled outside of OpenStack. While the functionality provided by the FlatNetworkManager is minimal, it creates a network environment that is free from obstruction. Useful for simple proof-of-concept networks, it is also useful for more complex deployments that require specialized network management tools.

In neutron, this model is realized through configuration. It is not as straightforward as the nova-networking approach, insofar as you need to understand what the model is and the steps required to implement it. Neutron uses a single bridge called the “integration bridge” (often named br-int) that all server instances are connected to. This is essentially the same as the linux bridge used to connect VMs to a physical interface in the nova-networking flat model implementation. Since connectivity with an external network is required for external access or multi-node deployments, the integration bridge must be connected to a physical network interface, either directly or through a bridge. This is configured by a combination of system configuration and configuration settings of the L2 plugin.

The FlatNetworkManager (thanks rkukura for spelling this out!):


 * flat networks
 * flat networks

(would this work for FlatDHCPNetworkManager as well?)

Flat Network – with Services (DHCP, Routing and NAT)
With simple services like DHCP, NAT and routing, the flat network model becomes a complete, self-contained, virtual network package suitable for proof-of-concept, development and private clouds. The lack of tenant isolation makes the model unsuitable for larger multi-tenant deployments or private environments where data network isolation is required. It is similar to the networking features provided in the typical home router.

Nova-networking provides these services through the FlatDHCPNetworkManager. DHCP services are provided by configuring dnsmasq, and routing and NAT are handled through configuration of the network controller's iptables firewall and external interface(s).

Neutron implements this type of model through configuration of agents such as the DHCP agent and the routing agent. The configuration is more involved as it exposes the details of how this type of network actually works. Deployers who have configured their own dual interface PC with a DHCP server, iptables, etc. will find neutron’s abstractions familiar. A significant conceptual gap exists in the configuration of the external gateway for the routing agent.

Isolated Tenant Networks
Isolated tenant networks implement some form of isolation of L2 traffic between distinct networks. VLAN tagging is a key concept, where traffic is “tagged” with an ordinal identifier for the VLAN. Only devices with the same associated VLAN id are able to communicate with each other. Untagged data or data with a different VLAN tag is ignored. Isolated network implementations may or may not include additional services like DHCP, NAT and L3 routing.
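The tagging rule above can be illustrated with a toy model of frame delivery. This is a sketch of the concept only; the `Port` type, the `deliver` function and the port names are hypothetical and do not correspond to any nova-networking or neutron code.

```python
from collections import namedtuple

# vlan_id is the VLAN associated with the port.
Port = namedtuple("Port", ["name", "vlan_id"])


def deliver(frame_vlan_id, source_name, ports):
    """Return the names of ports that accept a frame with the given tag."""
    if frame_vlan_id is None:
        # Untagged data is ignored in this model.
        return []
    return [
        port.name
        for port in ports
        if port.vlan_id == frame_vlan_id and port.name != source_name
    ]


if __name__ == "__main__":
    ports = [
        Port("tenant-a-vm1", 100),
        Port("tenant-a-vm2", 100),
        Port("tenant-b-vm1", 200),
    ]
    # Only the other port on VLAN 100 receives the frame.
    print(deliver(100, "tenant-a-vm1", ports))
```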

Nova-networking provides isolated tenant networks through the VlanNetworkManager class. It is similar to the FlatDHCPNetworkManager in its operations except that it creates datapaths that tag and untag data to provide the required isolation. Network definitions for VLANs must take into account the absence of overlapping network address namespaces. This can occasionally make network management a little awkward.
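One reading of the constraint above is that, without network namespaces, the address ranges of different VLAN networks on the same controller must not overlap. Under that assumption, a planned set of network definitions can be checked before it is created. The function name and the sample networks are hypothetical.

```python
import ipaddress
from itertools import combinations


def find_overlaps(networks):
    """Return pairs of network names whose CIDR ranges overlap.

    networks maps a label (hypothetical, e.g. "vlan100") to a CIDR string.
    """
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in networks.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(parsed.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    planned = {
        "vlan100": "10.0.0.0/24",
        "vlan101": "10.0.1.0/24",
        "vlan102": "10.0.0.128/25",  # falls inside vlan100's range
    }
    print(find_overlaps(planned))
```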

With network namespace support and flexible VLAN id mapping, neutron’s flexibility and natural application to this type of deployment starts to pay off. DHCP services may or may not be configured, as well as L3 services such as NAT and routing to external networks.

Deployment Models
It is instructive to describe some “idealized” deployment models that illustrate the key deployment issues. For clarity, the terms used in the descriptions of the deployments are defined here:
 * network controller - software agents that manage network related configuration of OpenStack service hosts;
 * compute node - an OpenStack service host that provides hypervisor and server instance lifetime management functionality.

For each deployment model, the following topics are discussed for nova-networking and neutron:


 * configuration;
 * significant implementation details;
 * known constraints and related minutiae;
 * known parity issues.

Single Collocated Network Controller and Compute Node
While the “all services running on a single host” installation is an example of this deployment model, we are only concerned with the situation where the network controller and compute node are running on the same host. Also, there can be only one such node. Other services such as nova API, scheduler, etc. as well as the other OpenStack components may be running on other nodes without altering the relevant details of this deployment model.

A single compute node with a collocated network controller is the simplest configuration. If extension of tenant networks across nodes is not an issue, there is no need to establish inter-node connectivity for the tenant network by bridging the primary interface, creating a second interface or providing some form of tunnelling.

In nova networking, the VlanNetworkManager and FlatDHCPNetworkManager implementations handle floating IP management. A default SNAT rule along with iptables stateful NAT allows all server instances to access the external networks immediately. In addition to configuring SNAT/DNAT rules for the floating IP-fixed IP mapping, the nova networking network driver adds the floating IP to the interface connected to the public network. This allows direct access to VMs (security groups notwithstanding) through the associated floating IP address without extra system configuration. DHCP is handled through a dnsmasq instance for a particular address space.

In neutron, the fully functional collocated networking and compute node is made up of several pieces. Each piece is configured and operates independently of the others. Floating IP management is handled through the routing agent, which manages an independent network namespace, bridge interfaces and iptables to map floating IP addresses to private network addresses on the tenant networks. The router is configured with a gateway bridge and traffic to external networks flows through that bridge. The gateway bridge may be externally managed or may reflect a physical network mapping configured in the L2 plugin. Private networks are explicitly added to the router and cannot otherwise gain access to external networks. The fine-grained control allows a great deal of flexibility but does not provide the same level of user experience as nova-networking.
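The rule that a private network reaches external networks only when it is explicitly attached to a router that has a gateway can be captured in a small model. This is a conceptual sketch; the `Router` class and its methods are hypothetical and are not neutron's API.

```python
class Router:
    """Toy model of a router with an external gateway and attached subnets."""

    def __init__(self, name):
        self.name = name
        self.gateway = None      # external network, once configured
        self.interfaces = set()  # private subnets explicitly attached

    def set_gateway(self, external_network):
        self.gateway = external_network

    def add_interface(self, subnet):
        self.interfaces.add(subnet)

    def has_external_access(self, subnet):
        # Both conditions must hold: a gateway exists and the subnet
        # was explicitly added to this router.
        return self.gateway is not None and subnet in self.interfaces


if __name__ == "__main__":
    router = Router("router1")
    router.add_interface("private-a")
    print(router.has_external_access("private-a"))  # no gateway yet
    router.set_gateway("public")
    print(router.has_external_access("private-a"))
    print(router.has_external_access("private-b"))  # never attached
```

Contrast this with nova-networking, where the default SNAT rule gives every instance outbound access without any per-network step.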

Configuring the external gateway interface has proved problematic for newcomers and experienced nova networking gurus alike. The key differentiator is that the interface that neutron’s router uses (the public interface in nova-networking) cannot be the same as the host’s primary “management network” interface. This has proven difficult for experimenters with a single NIC. The gateway interface is actually a bridge, and bridging the primary interface causes it to lose its addressing, which can cut the host off from the actual network. A second NIC makes this easier, but there are ways to solve it with a single interface. Perhaps the most direct way is to bridge the primary interface at boot time and set the host’s IP address on the bridge. Access to the physical network is retained as well as the ability to access the host system from the management network. This bridge can then be used to connect the router to the outside world. The exact mechanics of how it is done depend on the preferences or needs of the deployer. This is one area where there is a significant usability gap between nova-network and neutron.
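The single-NIC workaround described above (bridge the primary interface and move the host's address onto the bridge) can be sketched as a list of iproute2 commands for a plain Linux bridge. The sketch only builds the strings. The bridge name `br-ex`, the interface name `eth0` and the addresses are assumptions, and a deployment using Open vSwitch would use its own tooling instead. Running such commands over a remote session on the same interface can drop the connection, which is why the text suggests doing this at boot time.

```python
import ipaddress


def bridge_primary_interface_commands(iface, bridge, host_cidr, gateway):
    """Return iproute2 commands that move a host address onto a bridge.

    Assumption: a plain Linux bridge; all names and addresses are examples.
    """
    # Validate inputs; both calls raise ValueError on malformed values.
    host = ipaddress.ip_interface(host_cidr)
    gw = ipaddress.ip_address(gateway)
    return [
        f"ip link add name {bridge} type bridge",
        f"ip link set dev {bridge} up",
        # Enslave the physical interface to the bridge.
        f"ip link set dev {iface} master {bridge}",
        # The physical interface gives up its address...
        f"ip addr flush dev {iface}",
        # ...and the bridge takes it over.
        f"ip addr add {host} dev {bridge}",
        f"ip route add default via {gw} dev {bridge}",
    ]


if __name__ == "__main__":
    for command in bridge_primary_interface_commands(
        "eth0", "br-ex", "192.0.2.10/24", "192.0.2.1"
    ):
        print(command)
```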

Configuration of private networks in nova-networking and neutron does not differ much in this type of deployment. Configuration of bridges, VIFs and rules for security groups, etc. differs somewhat but operates under similar principles. Actual command lines used to create networks, etc. will differ and the introduction of network namespaces may make some approaches to managing VLANs irrelevant, but beyond that most changes will not have a great impact.

Single Network Controller Node, Multiple Compute Nodes
Multiple compute nodes make it necessary to extend the tenant network across multiple hosts. The most direct approach is to provide a second physical network and “private interfaces”. These private interfaces are generally not used for anything other than bridging tenant networks. With an isolated network, the private interface may be configured as a “trunk”, or independent interfaces may be used for each VLAN. If the compute nodes are limited to a single network interface, it may be bridged and additional virtual network interfaces defined and added to the bridge. This may or may not be ideal for a production environment, but is useful for proof-of-concept, development or small or private clouds. Alternatively, the tenant network data may be encapsulated and transferred between nodes through a tunnelling protocol (e.g. GRE).

This is an area where nova-networking and neutron are, in some respects, not very different. The exception would be that neutron provides more options such as GRE tunnelling and multiple physical network and VLAN mappings. The major differences are arguably in troubleshooting and debugging. For example, if Open vSwitch is used, knowledge of the relevant command line tools, etc. is required and may be new to some users.

The other major difference between nova networking and neutron is the “multi-host” capability. The “multi-host” deployment is a nova-networking concept that includes details of how components are deployed as well as how networks are created and configured. It is a pervasive deployment decision that extends to the requirement of each compute node having some access to the external network. Multi-host also affects network creation and is immutable. “Changing your mind” after a complete deployment is not a simple matter as it usually entails starting from scratch, including recreating the network and redeploying OpenStack services and re-instantiating any OpenStack servers.

The details of the “multi-host” feature are influenced by the realities of nova-network’s implementation. With neutron the goal should be to achieve the benefits of multi-host without incurring the limitations and, at the same time, adhering to the “spirit of neutron”. If the end result is cleaner, more flexible and subjectively more appealing, then there is some validation in neutron’s approach. Regardless, the multi-host question represents a parity gap and must be addressed as part of the deprecation efforts.

Support for Multiple NICs
Supporting multiple NICs in nova-networking is invasive and involves configuring multiple redundant data paths to the OpenStack server instances. Each instance is configured with a network interface for each data path. The idea is that if one of these paths should fail, traffic can continue through the alternate path. For migration or backwards compatibility purposes, Neutron can easily support multiple networks for OpenStack server instances. However, support for redundant networks can be implemented more elegantly as multiple physical network bindings on the integration bridge. This is an improvement as it is completely transparent to the OpenStack servers.

Every Compute Node Is Also a Network Controller

 * runs nova-network multi-host HA FlatDHCP
 * Every compute node (tested up to hundreds) runs nova-network, meaning:
   * no requirement for additional resources for a "network controller" - important for high throughput deployments
   * no bottleneck in network traffic as with a shared "network controller"
   * higher availability: if nova-network on that compute host goes down, only the VMs on that compute host are affected
 * dnsmasq configuration allows use of hardware gateway, meaning very little L3 is done on the host
 * shared_dhcp option means only one IP address is necessary/used on the publicnet
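The availability argument in the list above can be made concrete with a toy failure-domain calculation. This is a sketch of the reasoning only; the function, its parameters and the host and instance names are hypothetical.

```python
def affected_instances(placement, failed_host, multi_host, controller_host=None):
    """Return instances that lose network service when one host's
    network service fails.

    placement maps an instance name to the compute host it runs on.
    """
    if multi_host:
        # Each compute host serves only its own instances.
        return sorted(vm for vm, host in placement.items() if host == failed_host)
    if failed_host == controller_host:
        # A single shared network controller serves every instance.
        return sorted(placement)
    # In single-controller mode no network service runs on other hosts.
    return []


if __name__ == "__main__":
    placement = {"vm-1": "compute-1", "vm-2": "compute-1", "vm-3": "compute-2"}
    print(affected_instances(placement, "compute-1", multi_host=True))
    print(affected_instances(placement, "net-ctl", multi_host=False,
                             controller_host="net-ctl"))
```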

(jaypipes): Note that multi-host nova-network also works well with VlanManager. In fact, in all our production deployments, this setup of multi-host nova-network with VlanManager has been by far the most reliable, stable, and performant networking solution.

So far, not solved by neutron.

An Aside: Arbitrary Distribution
The two deployment models supported by nova-networking are multi-host and single controller. While neutron can easily be adapted for arbitrary deployment scenarios, there are no other nova-networking related options that are germane to this discussion.

Client Functionality (python-novaclient)
A working list of relevant client operations must be maintained as they are bound to change. The following is a minimal subset of potentially related client operations that should perform equivalent operations in nova networking and neutron. Test drivers should exercise these operations with both “back-ends” and use the same acceptance criteria. The following is required for each operation: