Nova-neutron-sriov

= Nova: support Neutron SR-IOV ports =

Background
This blueprint is based on the discussions documented in the wiki [PCI Passthrough Meeting].

While the blueprint [PCI Passthrough SR-IOV Support] addresses the common SR-IOV support in Nova, this blueprint captures the changes required in Nova to support neutron SR-IOV ports.

Traditionally, a neutron port is a virtual port that is attached to either a Linux bridge or an Open vSwitch bridge on a compute node. With the introduction of SR-IOV, the intermediate virtual bridge is no longer required. Instead, the SR-IOV port is associated with a virtual function (VF) that is supported by the vNIC adapter. In addition, the SR-IOV port may be extended to an upstream physical switch (IEEE 802.1BR), in which case the port's configuration takes place in that switch. The SR-IOV port can also be connected to a macvtap device that resides on the host, which is in turn connected to a VF on the vNIC. The benefit of using a macvtap device is that it makes live migration with SR-IOV possible. We'll use a combination of vnic-type and vif-type to support the above requirements.

In this document, we use the term neutron SR-IOV port to refer to a VF that can be configured as a network interface.

A vNIC may be configured with multiple PFs, with each of them supporting the configuration of multiple VFs. This means that neutron SR-IOV ports are limited resources on a compute node. In addition, a neutron SR-IOV port is connected to a physical network, and different SR-IOV ports may be connected to different physical networks. Therefore, a VM that requires an SR-IOV port on a particular network needs to be placed on a compute node that supports neutron SR-IOV ports on that network.
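The placement constraint described above can be illustrated with a minimal sketch. It assumes a hypothetical per-host view that maps each physical network name to the number of free VFs; the function name, the host names, and the network names are illustrative only and are not part of Nova.

```python
from typing import Dict, List

# Hypothetical per-host view: physical network name -> number of free VFs.
HostPools = Dict[str, int]


def hosts_with_free_vf(hosts: Dict[str, HostPools],
                       physical_network: str,
                       count: int = 1) -> List[str]:
    """Return the hosts that still have `count` free VFs on `physical_network`."""
    return sorted(
        name for name, pools in hosts.items()
        if pools.get(physical_network, 0) >= count
    )


# Illustrative data only: three compute nodes with different VF availability.
hosts = {
    "compute-1": {"service_provider_net": 2, "customer_net": 0},
    "compute-2": {"customer_net": 4},
    "compute-3": {"service_provider_net": 0},
}

candidates = hosts_with_free_vf(hosts, "service_provider_net")
print(candidates)
```

A host that has no VFs on the requested network, or whose VFs on that network are exhausted, is simply not a candidate.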

nova boot: Specify a neutron SR-IOV port
Given the limited time we have in Icehouse, we decided not to change the syntax of the nova boot API initially. In order to specify a neutron SR-IOV port for a VM, however, the semantics of the port-id parameter in the --nic option will be extended to support SR-IOV ports. With these blueprints, each neutron port will be associated with a binding:profile dictionary, in which the port's vnic-type and pci flavor are defined.

a pci-extra-attr net-group
It is assumed, and highly likely, that the pci flavor APIs specified in the wiki [PCI Passthrough SR-IOV Support] will not be available in the Icehouse release. To support neutron SR-IOV ports in Icehouse, a pci-extra-attr named net-group is defined. The values of this attribute are the names of all the physical networks that are supported in a cloud. Further, PCI stats will be collected with net-group as the grouping key.

PCI Passthrough Device List
In the wiki [PCI Passthrough SR-IOV Support], this is called PCI information. To support neutron SR-IOV ports, define the PCI passthrough device list on each compute node so that neutron PCI devices are tagged with net-group. For example, if a compute node supports two physical networks, service_provider_net and customer_net, then PCI devices for networking in the PCI passthrough device list can be tagged with either "net-group: service_provider_net" or "net-group: customer_net".
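As a rough sketch of how PCI stats might be collected with net-group as the grouping key, the snippet below counts tagged devices per physical network. The device dictionaries, the PCI addresses, and the function name are assumptions made for illustration; the actual format of the PCI passthrough device list is defined elsewhere.

```python
from collections import Counter
from typing import Dict, Iterable


def count_by_net_group(devices: Iterable[Dict[str, str]]) -> Counter:
    """Count devices per net-group; devices without the tag are ignored."""
    return Counter(dev["net-group"] for dev in devices if "net-group" in dev)


# Illustrative device list for one compute node (addresses are made up).
devices = [
    {"address": "0000:06:10.0", "net-group": "service_provider_net"},
    {"address": "0000:06:10.1", "net-group": "service_provider_net"},
    {"address": "0000:07:10.0", "net-group": "customer_net"},
    {"address": "0000:08:00.0"},  # not a networking device, so no tag
]

stats = count_by_net_group(devices)
print(dict(stats))
```

Grouping on a single key keeps the stats small: one counter per physical network rather than one entry per device.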

vnic-type
Each neutron port has a vnic-type. Three vnic types are defined:
 * virtio: the traditional virtual port
 * direct: direct pci passthrough without macvtap
 * macvtap: pci passthrough with macvtap
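A minimal sketch of how the vnic-type could be read from a port's binding:profile to decide whether a PCI device is needed. The key name "vnic_type", the default of virtio for ports without a profile, and both helper names are assumptions for illustration, not the agreed neutron or nova interface.

```python
from typing import Any, Dict

# The three vnic types defined in this blueprint.
VNIC_TYPES = ("virtio", "direct", "macvtap")


def vnic_type_of(port: Dict[str, Any]) -> str:
    """Return the port's vnic type (assumption: default is virtio)."""
    profile = port.get("binding:profile") or {}
    vnic_type = profile.get("vnic_type", "virtio")
    if vnic_type not in VNIC_TYPES:
        raise ValueError("unknown vnic type: %r" % (vnic_type,))
    return vnic_type


def needs_pci_device(port: Dict[str, Any]) -> bool:
    """Only direct and macvtap ports are backed by a VF."""
    return vnic_type_of(port) != "virtio"


# Hypothetical port dictionaries as they might be returned by neutron.
sriov_port = {"id": "hypothetical-port-id",
              "binding:profile": {"vnic_type": "direct"}}
plain_port = {"id": "another-hypothetical-id"}

print(needs_pci_device(sriov_port), needs_pci_device(plain_port))
```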

vif-type
Each neutron port is associated with a vif-type. Three vif-types are of interest here:
 * VIF_TYPE_802_QBG: corresponds to IEEE 802.1Qbg. However, this existing vif type may not be useful now because the libvirt parameters for 802.1Qbg (managerid, typeidversion and instanceid) are not supported by known neutron plugins that support SR-IOV.
 * VIF_TYPE_802_QBH: corresponds to IEEE 802.1BR (formerly IEEE 802.1Qbh)
 * VIF_TYPE_HW_VEB: for vNIC adapters that support virtual embedded switching themselves.

IEEE 802.1Qbh/802.1BR profileid
The profileid is a name that defines a template for the port configuration in the upstream switch. The device driver communicates the profileid to the upstream switch and, as a result, a port is configured and brought up. Note that the association of an SR-IOV port with a profileid is made when the neutron port is created. The profileid is an optional parameter, and is only required when the vNICs on the host support it.

vlan id
The VLAN ID is required by IEEE 802.1Qbg (VIF_TYPE_802_QBG) and by VIF_TYPE_HW_VEB.
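The per-vif-type requirements above can be summarized in a small validation sketch. The string values of the constants, the detail key names ("vlan", "profileid"), and the helper name are assumptions for illustration; only the rule itself (VLAN ID required for 802.1Qbg and HW_VEB, profileid optional) comes from this document.

```python
from typing import Any, Dict, List

# Assumed string values for the vif types discussed in this blueprint.
VIF_TYPE_802_QBG = "802.1qbg"
VIF_TYPE_802_QBH = "802.1qbh"
VIF_TYPE_HW_VEB = "hw_veb"

# Details that must be present for each vif type.
REQUIRED_DETAILS = {
    VIF_TYPE_802_QBG: ("vlan",),
    VIF_TYPE_802_QBH: (),  # profileid is optional, so nothing is mandatory
    VIF_TYPE_HW_VEB: ("vlan",),
}


def missing_details(vif_type: str, details: Dict[str, Any]) -> List[str]:
    """Return the required detail keys that are absent for this vif type."""
    if vif_type not in REQUIRED_DETAILS:
        raise ValueError("unsupported vif type: %r" % (vif_type,))
    return [key for key in REQUIRED_DETAILS[vif_type]
            if details.get(key) is None]


print(missing_details(VIF_TYPE_HW_VEB, {}))
print(missing_details(VIF_TYPE_802_QBH, {"profileid": "hypothetical-profile"}))
```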

Putting It All Together
On the controller node, make sure net-group is included in the pci_flavor_attrs. Make sure the neutron plugin is properly configured, especially the physical network configuration. On a compute node, make sure PCI devices for networking are tagged with "net-group: " in the PCI passthrough device list. Assume that this compute node supports two physical networks: service_provider_net and customer_net. After the nova compute service has started on this compute node, the number of PCI devices belonging to each net-group is calculated and later retrieved by the controller when needed.

In order to create a VM with an SR-IOV port, a user would do the following:

1: create a neutron network
 neutron net-create --provider:physical_network=service_provider_net --provider:network_type=vlan --provider:segmentation_id=100
Note that the --provider:* arguments may be omitted. In that case, appropriate values for each argument are chosen based on the configuration. With the above command, a neutron network is created and associated with a physical network.

2: create a neutron subnetwork
This step is required by neutron but does not affect SR-IOV operation, so we skip the example.

3: create a neutron port
Using direct PCI passthrough as an example:
 neutron port-create  --name sriov_port --vnic-type direct
The port sriov_port is created and associated with the network created in step 1. This port is therefore on the physical network service_provider_net.

4: boot up an instance
 nova boot --flavor m1.large --image  --nic port-id= 


 * nova-api validates the port-id by invoking the neutronv2 API. As part of the validation, the information that nova retrieves from neutron will include a valid physical network name. (Note that, when pci flavor is available and required, the retrieved information will include a valid pci flavor name.) In our example above, the physical network name is service_provider_net.
 * nova-api creates a PCI request with "net-group: service_provider_net"
 * nova scheduler makes its decision by calling the PCI passthrough filter with the PCI request, and decides the compute node where the instance will be placed
 * on the compute node where the instance is placed, a PCI device is allocated. Nova compute invokes neutronv2 API to update the port, especially providing the detailed PCI information. Neutron returns detailed port information. As a result, a vif object is created. vnic-type, vif-type, vlan-id and profileid will be stored in the vif object.
 * nova compute generates the device config and the interface xml to be part of the domain xml
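The last step can be sketched as follows for the direct (non-macvtap) case. This is not the nova.virt.libvirt implementation; it only builds a libvirt hostdev interface element with Python's standard xml.etree.ElementTree. The function name, MAC address, and PCI address are hypothetical, and the macvtap case is not covered.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def hostdev_interface_xml(mac: str,
                          pci_address: str,
                          vlan: Optional[int] = None,
                          profileid: Optional[str] = None) -> str:
    """Build a libvirt <interface type='hostdev'> element for one VF."""
    # pci_address is expected as "domain:bus:slot.function", e.g. "0000:06:10.0".
    domain, bus, slot_func = pci_address.split(":")
    slot, function = slot_func.split(".")

    iface = ET.Element("interface", type="hostdev", managed="yes")
    ET.SubElement(iface, "mac", address=mac)
    source = ET.SubElement(iface, "source")
    ET.SubElement(source, "address", type="pci",
                  domain="0x" + domain, bus="0x" + bus,
                  slot="0x" + slot, function="0x" + function)
    if vlan is not None:
        # VLAN tag, as needed for the hardware VEB case.
        vlan_el = ET.SubElement(iface, "vlan")
        ET.SubElement(vlan_el, "tag", id=str(vlan))
    if profileid is not None:
        # Port profile, as needed for the 802.1Qbh case.
        vport = ET.SubElement(iface, "virtualport", type="802.1Qbh")
        ET.SubElement(vport, "parameters", profileid=profileid)
    return ET.tostring(iface, encoding="unicode")


# Illustrative values only.
veb_xml = hostdev_interface_xml("fa:16:3e:00:00:01", "0000:06:10.0", vlan=100)
print(veb_xml)
```

Passing profileid instead of vlan yields the virtualport form, so the same helper covers both vif-types that apply to direct passthrough.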

Scope of changes

 * interpret the enhanced port-id parameter. For each neutron SR-IOV port, create a PCI request
 * nova.network.neutronv2: changes required to support binding:profile
 * vif dictionary: add vlan id.
 * nova.virt.libvirt: add support to generate configs and interface XML for neutron SR-IOV ports
 * live migration: macvtap plus per-interface network XML. This will be a stretch goal for the initial release