PCI passthrough SRIOV support
Revision as of 14:01, 12 December 2013
Contents
- 1 background
- 2 PCI configuration API use cases
- 2.1 Use cases
- 2.1.1 admin check PCI devices present per host
- 2.1.2 admin check available pci flavor (white-list)
- 2.1.3 admin create a pci flavor (white-list)
- 2.1.4 Take advantage of host aggregate
- 2.1.5 admin delete a pci flavor (white-list)
- 2.1.6 admin configures extra spec in flavor to request pci device
- 2.1.7 admin boot VM with this flavor
- 2.1.8 admin configures SRIOV flavor
- 2.1.9 Admin config SRIOV
- 2.2 transition from config file to API
- 3 DB for pci configuration
- 4 Requirements from SRIOV
- 5 Implement the grouping
- 6 Implement device mark from the pci-flavor
background
PCI devices have not only standard PCI properties such as BDF and vendor_id, but also extra information that may be application specific, for example the attached network switch for a NIC or the resolution for a GPU. This information cannot be obtained through the hypervisor and may have to be provided externally, for example through a configuration file.
Currently nova PCI support has basic support for such extra information in the database and object layers. More work is needed, including reading this information from a configuration file and grouping devices that share the same extra information values.
This design is based on the following discussion document, limited to the parts on which agreement was reached: https://docs.google.com/document/d/1EMwDg9J8zOxzvTnQJ9HwZdiotaVstFWKIuKrPse6JOs/edit?pli=1#heading=h.30de7p6sgoxp
Link back to the blueprint: https://blueprints.launchpad.net/nova/+spec/pci-extra-info
PCI configuration API use cases
To better separate the user view from the admin view, and to remove redundant code from pci, alias will fade out and the white-list will be used to map devices to a pci-flavor: the group used for scheduling and configuration. This approach keeps the possibility of taking advantage of host aggregates.
User will see flavors like:
- flavor that gives you a cheap GPU
- flavor that gives you a big GPU
- flavor that gives you two SSD disks (of varying types depending on where it lands) and a big GPU
- flavor that gives you your public network via SRIOV (which could be one of several makes of network card, depending on the host picked)
And for the admin...
Admin sees:
- per host devices (adding things to os-host/<host>/pci-device)
- lists pci devices present
- lists pci devices that are exposed to users, and which are in use or free
- per pci-flavor
- creates a pci-device description
- specifies vendor-id, address, uuid, name, etc
- this used to be a combination of alias and whitelist
- this could be overlapping descriptions
- flavor extra specs
- this has entries that describe:
- list of possible pci-flavors that could be picked
- use the key: pci-passthrough:<label_just_to_make_unique> value: <pci-flavor-uuid-1>,<pci-flavor-uuid-2>
- for multiple devices, you just add multiple entries
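The extra-spec convention described above could be read roughly as follows. This is a minimal sketch, not nova code: the function name `pci_requests_from_extra_specs` and the sample keys and UUIDs are hypothetical, and only the `pci-passthrough:` key prefix and the comma-separated value come from the text.

```python
PREFIX = "pci-passthrough:"  # key prefix taken from the text above


def pci_requests_from_extra_specs(extra_specs):
    """Return one list of candidate pci-flavor UUIDs per requested device.

    Hypothetical helper: each 'pci-passthrough:<label>' entry asks for one
    device, and its value lists the pci-flavors that may satisfy it.
    """
    requests = []
    for key in sorted(extra_specs):
        if not key.startswith(PREFIX):
            continue  # unrelated extra spec, ignore it
        candidates = [u.strip() for u in extra_specs[key].split(",") if u.strip()]
        requests.append(candidates)
    return requests


# Made-up sample data for illustration only.
extra_specs = {
    "pci-passthrough:gpu": "uuid-a,uuid-b",
    "pci-passthrough:nic": "uuid-c",
    "some:other-key": "ignored",
}
print(pci_requests_from_extra_specs(extra_specs))
```

Requesting several devices then simply means adding more entries with distinct labels.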
Take advantage of host aggregate:
- host aggregates used to map hosts to pci-flavor
- use host aggregates to expose specific pci-flavors as available on a particular host
Use cases
admin check PCI devices present per host
An admin might want to know whether any pci devices are available for use, so it is convenient to be able to query this information.
nova host-list pci-device
GET v2/{tenant_id}/os-hosts/{host_name}/os-pci-devices
returns summary information about the pci devices on this host:
os-pci-devices: {
  [
    {
      'vendor_id': '8086',
      'product_id': 'xxx',
      'address': '0000:01:00.7',
      ...
      'pci-type': 'VF',
      'status': 'available',
    },
  ]
}
To find which pci-flavor a device belongs to, and its status, we have to query the database.
Currently the pci devices in the database are those left after filtering. If we wanted to inspect every device on a node from the db, all devices would have to go into the database. The db would then grow too large and eventually slow down queries, becoming a scaling problem. So we use an RPC call for this instead: fetch the result from the node over RPC and show it to the admin.
admin check available pci flavor (white-list)
- list all available pci flavors (whitelist)
nova pci-flavor-list
GET v2/{tenant_id}/os-pci-flavors
data:
os-pci-flavors: {
  [
    {
      'UUID': 'xxxx-xx-xx',
      'description': 'xxxx'
      .....
      'pci-flavor': 'xxx',
    },
  ]
}
- list available pci flavors on a host (white list)
nova host-list pci-flavor
GET v2/{tenant_id}/os-hosts/{host_name}/os-pci-flavors
data:
os-pci-flavors: {
  [
    { 'pci_flavor_uuid': <uuid>, 'total': 10, 'available': 6, 'in_use': 2 },
  ]
}
- get detailed information about one pci-flavor:
nova pci-flavor-show <UUID>
GET v2/{tenant_id}/os-pci-flavor/<UUID>
data:
os-pci-flavor: {
  'UUID': 'xxxx-xx-xx',
  'description': 'xxxx'
  ...
  'pci-flavor': 'xxx',
}
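The per-host pci-flavor listing above reports `total`, `available` and `in_use` counts. A rough sketch of how such a summary might be derived from a list of device records is shown here; the function name `summarize_by_flavor`, the device dictionary layout and the status strings are assumptions made for illustration, not the actual nova data model.

```python
from collections import Counter, defaultdict


def summarize_by_flavor(devices):
    """Count devices per pci-flavor and per status (hypothetical helper)."""
    counts = defaultdict(Counter)
    for dev in devices:
        counts[dev["pci_flavor_uuid"]][dev["status"]] += 1
    return [
        {
            "pci_flavor_uuid": uuid,
            "total": sum(by_status.values()),
            # Counter returns 0 for statuses that never occurred.
            "available": by_status["available"],
            "in_use": by_status["in_use"],
        }
        for uuid, by_status in sorted(counts.items())
    ]


# Made-up sample data; status names are assumed.
devices = [
    {"pci_flavor_uuid": "f-1", "status": "available"},
    {"pci_flavor_uuid": "f-1", "status": "in_use"},
    {"pci_flavor_uuid": "f-1", "status": "claimed"},
    {"pci_flavor_uuid": "f-2", "status": "available"},
]
print(summarize_by_flavor(devices))
```

Because devices can be in other states as well, `total` is not required to equal `available + in_use`, which matches the sample response above.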
admin create a pci flavor (white-list)
- create flavor
nova pci-flavor-create name 'GetMePowerfulldevice' description "xxxxx"
API: POST v2/{tenant_id}/os-pci-flavors
data: pci-flavor: { 'name': 'GetMePowerfulldevice', 'description': "xxxxx" }
action: create a database entry for this flavor.
- update flavor definition
nova pci-flavor-update UUID set 'description'='xxxx' 'address'= '0000:01:*.7', 'host'='compute-id'
PUT v2/{tenant_id}/os-pci-flavors/<UUID>
with data: { 'action': "update", 'pci-flavor': { 'description': 'xxxx', 'address': '0000:01:*.7' } }
action: set this as the new definition of the pci flavor.
Take advantage of host aggregate
Host aggregates can be used to enhance the scheduler for PCI.
- create aggregate
nova aggregate-create pci-aware-group
nova aggregate-add-host host1
nova aggregate-add-host host2
- map flavor to aggregate
nova aggregate-set-metadata pci-aware-group set 'pci-flavor'='intelNICpublic, intelNICprivate, nvidiaGPUnew, nvidiaGPUolder'
This means every host in the aggregate can provide these pci-flavors, as long as the host has a free device. The information is also useful for the pci flavor filter on these hosts: we only need to check these flavors on these hosts, not every flavor in the DB.
- set instance flavor key to enhance the PCI scheduler
nova flavor-create --is-public true m1.iwantPCI 100 2048 20 2
nova flavor-key 100 set 'pci-flavor'='1:intelNICprivate; 1:intelNICprivate; 1:nvidiaGPUnew, nvidiaGPUolder'
This information can be used to select the aggregate, or to keep instances from being scheduled to those hosts if the instance does not want pci passthrough.
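The `pci-flavor` flavor key and the aggregate metadata shown above can be compared mechanically. The following is a minimal sketch under the assumption that `;` separates device requests, `count:` prefixes each request, and `,` separates acceptable alternatives; the function names are hypothetical and this is not the nova scheduler filter.

```python
def parse_pci_flavor_spec(spec):
    """Parse '1:a; 2:b, c' into [(1, ['a']), (2, ['b', 'c'])]."""
    requests = []
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue
        count, _, names = part.partition(":")
        alternatives = [n.strip() for n in names.split(",") if n.strip()]
        requests.append((int(count), alternatives))
    return requests


def aggregate_can_serve(aggregate_metadata, requests):
    """True if every request has at least one alternative the aggregate offers.

    This only checks what the aggregate advertises, not whether a host
    currently has a free device.
    """
    raw = aggregate_metadata.get("pci-flavor", "")
    offered = {n.strip() for n in raw.split(",") if n.strip()}
    return all(offered.intersection(alts) for _, alts in requests)


metadata = {"pci-flavor": "intelNICpublic, intelNICprivate, nvidiaGPUnew, nvidiaGPUolder"}
requests = parse_pci_flavor_spec(
    "1:intelNICprivate; 1:intelNICprivate; 1:nvidiaGPUnew, nvidiaGPUolder"
)
print(requests)
print(aggregate_can_serve(metadata, requests))
```

An instance flavor without any pci request yields an empty request list, for which the check passes trivially; keeping such instances away from PCI hosts would need a separate rule.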
admin delete a pci flavor (white-list)
nova pci-flavor-delete <UUID>
API will be: DELETE v2/{tenant_id}/os-pci-flavor/<UUID>
flow: delete it from the database
admin configures extra spec in flavor to request pci device
To allocate a device from a pci flavor, just put the pci flavor into the flavor's extra spec:
nova flavor-key 100 set 'pci-flavor'='1:intelNICprivate; 1:intelNICpublic'
admin boot VM with this flavor
nova boot mytest --flavor m1.small --image=cirros-0.3.1-x86_64-uec
admin configures SRIOV flavor
- create a pci flavor for the SRIOV
nova pci-flavor-create name 'vlan-SRIOV' description "xxxxx"
nova pci-flavor-update UUID set 'description'='xxxx' 'address'= '0000:01:*.7'
Admin config SRIOV
- create pci-flavor:
{"name": "privateNIC", "neutron-network-uuid": "uuid-1", ...}
{"name": "publicNIC", "neutron-network-uuid": "uuid-2", ...}
{"name": "smallGPU", "neutron-network-uuid": "", ...}
- set aggregate metadata according to the flavors that exist on the hosts
nova aggregate-set-metadata pci-aware-group set 'pci-flavor'='smallGPU,oldGPU, privateNIC,privateNIC'
- create instance flavor for sriov
flavor extra-specs, for a VM that gets two small GPUs and VIFs attached from the above SRIOV NICs:
nova flavor-key 100 set 'pci-flavor'='1:privateNIC; 1: publicNIC; 2:smallGPU,oldGPU'
- User just specifies a quantum port as normal:
nova boot --flavor "sriov-plus-two-gpu" --image img --nic net-id=uuid-2 --nic net-id=uuid-1 vm-name
uuid-1 and uuid-2 map to a "provider" network (with VLAN config, etc.) that is implemented using the privateNIC and publicNIC flavors. The flavor is already bound to the network uuid via the "neutron-network-uuid" key, so network-specific code can identify the device bound to that network/interface.
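The binding between a requested network and a pci-flavor described above could be resolved with a simple lookup on the "neutron-network-uuid" key. This is an illustrative sketch only; `flavor_for_network` is a hypothetical helper and the flavor records mirror the sample definitions in this section.

```python
def flavor_for_network(pci_flavors, net_id):
    """Return the name of the pci-flavor bound to net_id, or None."""
    if not net_id:
        return None  # an empty uuid means "not bound to any network"
    for flavor in pci_flavors:
        if flavor.get("neutron-network-uuid") == net_id:
            return flavor["name"]
    return None


pci_flavors = [
    {"name": "privateNIC", "neutron-network-uuid": "uuid-1"},
    {"name": "publicNIC", "neutron-network-uuid": "uuid-2"},
    {"name": "smallGPU", "neutron-network-uuid": ""},
]

# Same order as the --nic arguments in the boot command.
requested_nics = ["uuid-2", "uuid-1"]
print([flavor_for_network(pci_flavors, n) for n in requested_nics])
```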
transition from config file to API
- the config file for alias and whitelist definitions is going to be deprecated.
- if the database is not empty, the configuration file is ignored and a deprecation warning is given.
- if the database is empty, the config is read from the file:
- the white list/alias schema still works
- a deprecation notice is also given; alias will fade out and will be removed starting from the next release.
With this solution, we move pci config from file to API.
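The precedence rule between database and config file could look roughly like this. It is a sketch under assumptions: `load_pci_flavors` is a hypothetical function, both sources are modelled as plain lists, and the warning texts are invented.

```python
import warnings


def load_pci_flavors(db_flavors, file_whitelist):
    """Pick the pci configuration source (hypothetical helper).

    Database entries win; the config file is only a deprecated fallback.
    """
    if db_flavors:
        if file_whitelist:
            warnings.warn(
                "pci whitelist/alias in the config file is ignored because "
                "pci flavors exist in the database; the file options are deprecated",
                DeprecationWarning,
            )
        return db_flavors
    if file_whitelist:
        warnings.warn(
            "reading pci whitelist/alias from the config file is deprecated",
            DeprecationWarning,
        )
    return file_whitelist


# Database is empty, so the file content is used.
print(load_pci_flavors([], [{"vendor_id": "8086"}]))
```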
DB for pci configuration
Each pci flavor will be a set of (k,v) pairs, and pci flavors do not need to contain the same keys. Another problem this definition tries to solve: SRIOV, for example, also wants feature autodiscovery (under discussion). With that, the flavor would need a 'feature' key to be added if it were not stored as (k,v) pairs. The (k,v) pair definition also lets more extra information be stored in the pci device.
table: pci_flavor {
  id: database id of this (k,v) pair
  UUID: which pci-flavor the (k,v) pair belongs to
  key
  value (might be a simple value or a reduced regular expression)
}
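To illustrate the (k,v) layout, rows of the table above can be folded back into one dictionary per pci-flavor. This is a minimal sketch; the row dictionaries and the function name `flavors_from_rows` are assumptions, not the real schema or DB API.

```python
from collections import defaultdict


def flavors_from_rows(rows):
    """Group (key, value) rows by flavor UUID into one dict per pci-flavor."""
    flavors = defaultdict(dict)
    for row in rows:
        flavors[row["uuid"]][row["key"]] = row["value"]
    return dict(flavors)


# Made-up rows; note the two flavors do not share the same set of keys.
rows = [
    {"id": 1, "uuid": "f-1", "key": "name", "value": "privateNIC"},
    {"id": 2, "uuid": "f-1", "key": "address", "value": "0000:01:*.7"},
    {"id": 3, "uuid": "f-2", "key": "name", "value": "smallGPU"},
]
print(flavors_from_rows(rows))
```

Adding a new attribute such as a 'feature' key then needs only new rows, not a schema change.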
API interface
- get pci device information on a host
nova host-list pci-device
GET v2/{tenant_id}/os-hosts/{host_name}/os-pci-devices
returns summary information about the pci devices on this host:
os-pci-devices: {
  [
    {
      'vendor_id': '8086',
      'product_id': 'xxx',
      ....
      'pci-type': 'VF',
      'status': 'available',
    },
  ]
}
- list available pci flavors (white list)
nova pci-flavor-list
GET v2/{tenant_id}/os-pci-flavors
data:
os-pci-flavors: {
  [
    {
      'UUID': 'xxxx-xx-xx',
      'description': 'xxxx'
      'vendor_id': '8086',
      ....
      'pci-flavor': 'xxx',
    },
  ]
}
- list available pci flavors on a host (white list)
nova host-list pci-flavor
GET v2/{tenant_id}/os-hosts/{host_name}/os-pci-flavors
data:
os-pci-flavors: {
  [
    { 'pci_flavor_uuid': <uuid>, 'total': 10, 'available': 6, 'in_use': 2 },
  ]
}
- get detailed information about one pci-flavor:
nova pci-flavor-show <UUID>
GET v2/{tenant_id}/os-pci-flavor/<UUID>
data:
os-pci-flavor: {
  'UUID': 'xxxx-xx-xx',
  'description': 'xxxx'
  ....
  'address': '0000:01:*.7',
  'pci-flavor': 'xxx',
}
- create pci flavor
nova pci-flavor-create name 'GetMePowerfulldevice' description "xxxxx"
API: POST v2/{tenant_id}/os-pci-flavors
data: pci-flavor: { 'name': 'GetMePowerfulldevice', 'description': "xxxxx" }
action: create a database entry for this flavor.
- update the pci flavor
nova pci-flavor-update UUID set 'description'='xxxx' 'address'= '0000:01:*.7', 'host'='compute-id'
PUT v2/{tenant_id}/os-pci-flavors/<UUID>
with data: { 'action': "update", 'pci-flavor': { 'description': 'xxxx', 'address': '0000:01:*.7' } }
action: set this as the new definition of the pci flavor.
- delete a pci flavor
nova pci-flavor-delete <UUID>
DELETE v2/{tenant_id}/os-pci-flavor/<UUID>
Requirements from SRIOV
- group device
For SRIOV, all VFs that belong to the same PF share the same physical network reachability. So if you want to deploy, say, a vlan network, you need to choose a VF of the right PF, otherwise the network will not work for you. The pci flavor handles this well.
- mark the device allocated to the flavor
Networking and other special devices are not as simple as passing the device through to the VM; more configuration is needed. To achieve this, SRIOV must know which devices were allocated for the specific flavor.
Implement the grouping
Concepts introduced here:
- spec: a filter defined by (k,v) pairs whose k is among the pci object fields, meaning those (k,v) are pci device properties such as vendor_id, 'address', pci-type, etc.
- extra_spec: a filter defined by (k,v) pairs whose k is not among the pci object fields.
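The split between spec and extra_spec can be sketched as a partition of the flavor's keys. The field set `PCI_OBJECT_FIELDS` below is an assumed subset chosen for illustration, and `split_specs` is a hypothetical name, not the class-level extract interface itself.

```python
# Assumed subset of pci object fields, for illustration only.
PCI_OBJECT_FIELDS = {"vendor_id", "product_id", "address", "pci-type"}


def split_specs(flavor):
    """Partition a flavor's (k,v) pairs into (spec, extra_spec)."""
    spec = {k: v for k, v in flavor.items() if k in PCI_OBJECT_FIELDS}
    extra_spec = {k: v for k, v in flavor.items() if k not in PCI_OBJECT_FIELDS}
    return spec, extra_spec


flavor = {
    "vendor_id": "8086",
    "address": "0000:01:*.7",
    "neutron-network-uuid": "uuid-1",
}
print(split_specs(flavor))
```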
pci utils/objects support grouping
- pci utils (k,v) matching supports a reduced regular expression for the address
- objects provide a class-level extract interface to extract the base spec and the extra spec
pci-flavor(white list) support address set
- white list supports comparing 'address' against a reduced regular expression
- white list supports any other (k,v) pair, to group devices or to store special information
- objects extract specs and extra_info: specs are used as the whitelist spec, and extra info is written to the device's extra_info field
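The text does not define the "reduced regular expression" syntax. As a stand-in, the sketch below uses shell-style wildcards from Python's standard `fnmatch` module, which is enough to handle a pattern like '0000:01:*.7'; the helper name `device_matches` and the device layout are assumptions.

```python
from fnmatch import fnmatchcase


def device_matches(device, spec):
    """True if every (k, pattern) in spec matches the device's value for k.

    Shell-style wildcards stand in for the 'reduced regular expression';
    the real matching syntax may differ.
    """
    for key, pattern in spec.items():
        value = device.get(key)
        if value is None or not fnmatchcase(str(value), str(pattern)):
            return False
    return True


device = {"vendor_id": "8086", "address": "0000:01:00.7"}
print(device_matches(device, {"address": "0000:01:*.7"}))
print(device_matches(device, {"address": "0000:02:*.7"}))
```

Plain values without wildcards are compared literally, so the same function covers keys such as vendor_id.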
enable flavor support pci-flavor
- the pci-flavor's name is set in the extra spec of the instance type
- pci manager uses the extract method to extract the specs and extra_specs, and matches them against the pci object and object.extra_info
pci stats grouping device base on pci flavor
- current grouping is based on [vendor_id, product_id, extra_info]
- going to use 'pci-flavor' to group the devices
- still compatible by default; a new config option switches to the new grouping policy
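The two grouping policies can be contrasted with a small sketch. The flag `group_by_flavor` stands for the new config option, whose real name the text does not give; `pool_key`, `build_pools` and the device layout are likewise hypothetical.

```python
from collections import Counter


def pool_key(device, group_by_flavor):
    """Return the stats pool key for a device under the chosen policy."""
    if group_by_flavor:
        return device["pci-flavor"]
    # Legacy policy: vendor_id, product_id and extra_info together.
    extra = tuple(sorted(device.get("extra_info", {}).items()))
    return (device["vendor_id"], device["product_id"], extra)


def build_pools(devices, group_by_flavor=False):
    """Count free devices per pool; the default keeps the legacy grouping."""
    return Counter(pool_key(d, group_by_flavor) for d in devices)


# Two different cards that the admin placed in the same pci-flavor.
devices = [
    {"vendor_id": "8086", "product_id": "aaaa", "pci-flavor": "privateNIC"},
    {"vendor_id": "8086", "product_id": "bbbb", "pci-flavor": "privateNIC"},
]
print(build_pools(devices))
print(build_pools(devices, group_by_flavor=True))
```

Under the legacy policy the two cards land in separate pools, while grouping by pci-flavor puts them in one pool of two.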
Implement device mark from the pci-flavor
Here is the idea for how a user can identify which device was allocated for the pci-flavor:
- while defining the flavor, put a marker (network uuid) into the flavor; it is then stored in the device's extra_info field
- after allocation finishes, the user can search an instance's pci devices to find the specific device and do further configuration
The marker data travels from the user to the device via the pci_request, which is converted from the pci-flavor.
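The marker flow could be sketched as two steps: copy the marker from the request into the device's extra_info at allocation time, then search the instance's devices for it. All names here (`allocate`, `find_marked_devices`, the `marker` and `instance_uuid` keys) are hypothetical and only illustrate the idea; they are not the nova pci_request structure.

```python
def allocate(device, pci_request):
    """Return a copy of device claimed for the request, carrying its marker."""
    claimed = dict(device)
    extra = dict(device.get("extra_info", {}))
    extra.update(pci_request.get("marker", {}))  # e.g. the network uuid
    claimed["extra_info"] = extra
    claimed["instance_uuid"] = pci_request["instance_uuid"]
    return claimed


def find_marked_devices(devices, instance_uuid, key, value):
    """Find an instance's devices whose extra_info carries the given marker."""
    return [
        d for d in devices
        if d.get("instance_uuid") == instance_uuid
        and d.get("extra_info", {}).get(key) == value
    ]


request = {"instance_uuid": "vm-1", "marker": {"neutron-network-uuid": "uuid-1"}}
free_device = {"address": "0000:01:00.7"}
allocated = [allocate(free_device, request)]
print(find_marked_devices(allocated, "vm-1", "neutron-network-uuid", "uuid-1"))
```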