
Difference between revisions of "PCI passthrough SRIOV support"

(admin boot VM with this flavours)
m (Common PCI SRIOV design)
 
(77 intermediate revisions by 6 users not shown)
Line 1: Line 1:
 +
'''!!!This is a design discussion document, not for end user reference!!! '''
  
===background===
+
===Related Resource===
PCI devices have not only standard PCI properties such as BDF and vendor_id, but also extra information that may be application specific, for example the attached network switch for a NIC or the resolution for a GPU. This information cannot be obtained from the hypervisor and may have to be provided externally, for instance through a configuration file.
+
This design is based on the PCI pass-through IRC meetings and provides common support for PCI SRIOV:
 +
*https://wiki.openstack.org/wiki/Meetings/Passthrough
 +
This document was used to finalise the design:
 +
*https://docs.google.com/document/d/1vadqmurlnlvZ5bv3BlUbFeXRS_wh-dsgi5plSjimWjU/edit#
 +
link back to bp:
 +
*https://blueprints.launchpad.net/nova/+spec/pci-extra-info
 +
 
 +
===Common PCI SRIOV design ===
 +
 
 +
PCI devices have standard PCI properties such as address (BDF), vendor_id, product_id, etc. Virtual functions also have a property that refers to the address of their physical function. Application-specific or installation-specific extra information can be attached to a PCI device, such as the physical network connectivity used by Neutron SRIOV.
 +
 
 +
This blueprint focuses on the functionality needed to provide common PCI SRIOV support.
 +
 
 +
* On the compute node, pci_information (the white-list) defines a set of filters. PCI devices that pass a filter become available for allocation; at the same time, extra information is attached to the PCI device.
 +
* The compute node reports PCI stats information to the scheduler. PCI stats contain several pools. Each pool is defined by several PCI properties, controlled by the local configuration item pci_flavor_attrs, whose default value is vendor_id, product_id, extra_info.
 +
* The PCI flavor defines the user's view of a PCI device selector: a PCI flavor provides a set of (k,v) pairs that form specs to select among the available devices (those that passed the pci_information/white-list filter).
 +
 
 +
The new PCI design is based on 2 key changes: the PCI flavor and the extra information attached to a PCI device. The following diagram summarises the design:
 +
 
 +
[[Image:PCI-sriov.jpg]]
 +
 
 +
====Design Choice ====
 +
   
 +
      PCI flavor:  Users don't want to know the details of a PCI device, all of its attrs, and the extra information attached to it.
 +
      (pci_flavor_attr:  the admin needs to know all PCI attrs in order to define flavors for users/tenants.)
 +
      PCI Stats:  a compute node might have many devices, most with identical properties; reporting a summary (stats) to the scheduler reduces DB load and simplifies scheduling.
 +
      PCI extra info:  a PCI device might be attached to a specific network or a specific resource; this can be recorded on the device and used for scheduling.
 +
 
 +
 
 +
====PCI flavor====
  
Currently nova PCI support has basic support for such extra information in the database and object layers. More work is needed, including reading such information from the configuration file and grouping devices that have the same extra-information values.
+
For OS users, a PCI flavor is a meaningful name such as 'oldGPU', 'FastGPU', '10GNIC' or 'SSD' that describes one kind of PCI device. Users use the PCI flavor to select available PCI devices. Internally, PCI flavors are created through a set of APIs and saved in a DB table, which keeps them available across the whole cloud.
  
This design is based on the parts of the following discussion document on which agreement was reached:
 
https://docs.google.com/document/d/1EMwDg9J8zOxzvTnQJ9HwZdiotaVstFWKIuKrPse6JOs/edit?pli=1#heading=h.30de7p6sgoxp
 
  
link back to bp:
+
The administrator defines a PCI flavor via a matching expression that selects among the available devices (those offered by the white-list), plus a meaningful name. The matching expression is a set of (k,v) pairs, where k is a PCI property and v is its value. Not every PCI property is available to PCI flavors: only a selected set can be used, and the selected properties should be global to the cloud, like vendor_id/product_id, and cannot be the BDF or host of a PCI device. The selected PCI properties are defined via compute-local configuration:
https://blueprints.launchpad.net/nova/+spec/pci-extra-info
+
 
 +
    pci_flavor_attrs = vendor_id, product_id, ...
 +
 
 +
An important behavior is that PCI flavors can overlap: the same device on the same machine may be matched by multiple flavors.
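The overlap can be illustrated with a small sketch. The flavor names, the in-memory `PCI_FLAVORS` table and the helper functions below are hypothetical, and the sketch assumes flavor values are matched as full Python regular expressions, which the design does not mandate.

```python
import re

# Hypothetical flavor table: flavor name -> {PCI property: value pattern}.
PCI_FLAVORS = {
    "IntelGPU": {"vendor_id": "8086", "product_id": "0001"},
    "AnyIntel": {"vendor_id": "8086"},
}


def flavor_matches(spec, device):
    """Return True when every (k, v) of the flavor spec matches the device."""
    return all(
        key in device and re.fullmatch(pattern, device[key]) is not None
        for key, pattern in spec.items()
    )


def matching_flavors(device):
    """Return the sorted names of all flavors that match the device."""
    return sorted(
        name for name, spec in PCI_FLAVORS.items()
        if flavor_matches(spec, device)
    )


# One device, two flavors: the flavors overlap on this device.
print(matching_flavors({"vendor_id": "8086", "product_id": "0001"}))
```

Because overlap is allowed at the flavor level, any accounting of free devices has to happen on something other than flavors, which is the role the non-overlapping stats pools play in this design.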
 +
 
 +
====Use PCI flavor in instance flavor extra info====
 +
 
 +
The user sets PCI flavors in the instance flavor's extra info to specify how many PCI devices, and of which PCI flavors, the VM should boot with.
 +
 
 +
    nova flavor-key m1.small set pci_passthrough:pci_flavor= <pci flavor spec list>
 +
 
 +
    pci flavor spec:
 +
                    num1:flavor1,flavor2
 +
    meaning: request <num1> PCI devices, each taken from flavor1 or flavor2
 +
   
 +
    pci flavor spec list:
 +
                <pci flavor spec1>; <pci flavor spec2>
 +
 
 +
 
 +
for example:
 +
   
 +
    nova flavor-key m1.small set pci_passthrough:pci_flavor= 1:IntelGPU,NvGPU;1:intelQuickAssist;
 +
   
 +
    which defines the requirement:
 +
          boot with 1 of IntelGPU or NvGPU, and 1 IntelQuickAssist card.
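As an illustration of the spec-list syntax above, the following sketch parses a value such as `1:IntelGPU,NvGPU;1:intelQuickAssist;` into (count, candidate flavors) pairs. The function name `parse_pci_flavor_spec_list` is hypothetical and not part of nova; tolerating a trailing `;` is an assumption taken from the example.

```python
def parse_pci_flavor_spec_list(text):
    """Parse '<num>:<flavor>,<flavor>;...' into [(count, [flavors]), ...]."""
    requests = []
    for spec in text.split(";"):
        spec = spec.strip()
        if not spec:  # tolerate a trailing ';' as in the example above
            continue
        count, _, names = spec.partition(":")
        flavors = [n.strip() for n in names.split(",") if n.strip()]
        if not flavors:
            raise ValueError("no flavor named in spec: %r" % spec)
        requests.append((int(count), flavors))
    return requests


for count, flavors in parse_pci_flavor_spec_list(
        "1:IntelGPU,NvGPU;1:intelQuickAssist;"):
    print(count, "device(s) from any of", flavors)
```

Each tuple reads as "this many devices, each from any one of these flavors".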
  
 +
====PCI pci_flavor_attrs ====
  
===PCI configration API use cases ===
+
This configuration is kept local to every compute node, so the deployment process can decide locally which PCI properties the node will expose.
To get a better separation between user and admin, and to remove redundant code from PCI, alias will fade out and the white-list will be used to map devices to a pci-flavor: the group used for scheduling and configuration. This approach keeps the possibility of taking advantage of aggregates.
 
  
 +
    pci_flavor_attrs = vendor_id, product_id, ...
  
User will see flavors like:
+
The compute node publishes its local PCI extra properties to the PCI flavor database, which is accessible through the flavor API, so that these PCI properties can be used to define flavors.
* flavor that gives you a cheap GPU
 
* flavor that gives you a big GPU
 
* flavor that gives you two SSD disks (of varying types depending on where it lands) and a big GPU
 
* flavor that gives you your public network via SRIOV (which could be one of several makes of network card, depending on the host picked)
 
  
And for the admin...
+
pci_flavor_attrs is stored in the flavor DB as a normal flavor whose name is "_flavor_attrs" and whose UUID is "0":
  
Admin sees:
+
    {"vendor_id":"True", "product_id":"True", ... }
  
* per host devices (adding things to os-host/<host>/pci-device)
+
This flavor contains all the attrs that can be used to define a PCI flavor. To list its content, use:
** lists pci devices present
 
** lists pci devices that are exposed to users, and which are in use or free
 
  
* per pci-flavor
+
    nova pci-flavor-show  0
** creates a pci-device description
+
    GET v2/​{tenant_id}​/os-pci-flavor/<0>
** specifies vendor-id, address, uuid, name, etc
+
    data:
** this used to be a combination of alias and whitelist
+
        os-pci-flavor: {
** this could be overlapping descriptions
+
                                'UUID':'0' ,  
 +
                                'description':'Available flavor attrs ',
 +
                                'name':'_flavor_attrs',  
 +
                                'vendor_id': "True",
 +
                                'product_id': "True",
 +
                                ....
 +
          }
  
* flavor extra specs
+
====PCI request====
** this has entries that describe:
 
*** list of possible pci-flavors that could be picked
 
*** use the key: pci-passthrough:<label_just_to_make_unique>  value: <pci-flavor-uuid-1>,<pci-flavor-uuid-2>
 
** for multiple devices, you just add multiple entries
 
  
Take advantage of host aggregate:
+
A PCI request is an internal structure that represents all the PCI devices a VM wants to have.
* host aggregates used to map hosts to pci-flavor
 
** use host aggregates to expose specific pci-flavors as available on a particular host
 
  
====Use cases ====
+
    request = {'count': int(count),
 +
              'spec': [{"vendor_id":"8086", "phynetwork":"phy1"}, ...],
 +
              'alias_name': "Intel.NIC"}
 +
[[Image:PCI-Request.jpg]]
  
=====admin check PCI devices present per host =====
+
====Extra information of PCI device  ====
The admin might want to know whether there are PCI devices available for use; it is convenient for the admin to have this information.
 
  
    nova host-list pci-device
+
The compute nodes offer available PCI devices for pass-through. Since the list of devices doesn't usually change unless someone tinkers with the hardware, the matching expression used to create this list of offered devices is stored in the compute node config.
    GET  v2/​{tenant_id}​/os-hosts/​{host_name}​/os-pci-devices
 
  
return summary information about the PCI devices on this host:
 
  
        os-pci-devices:{
+
    *the device information (device ID, vendor ID, BDF etc.) is discovered from the device and stored as part of the PCI device, as in the current implementation.
                      [ 
+
    *on the compute node, additional arbitrary extra information, in the form of key-value pairs, can be added to the config and is included in the PCI device
                              {
+
 
                                  'vendor_id':'8086',  
+
This is achieved by extending the PCI white-list to:
                                  'product_id':'xxx',
 
                                    'address': '0000:01:00.7',  
 
                                    ...
 
                                    'pci-type': 'VF',
 
                                    'status': 'available',
 
                                },  
 
                      ]
 
        }
 
  
To find which pci-flavor a device belongs to, and its status, we have to query the database.
 
  
Currently the PCI devices in the database are those remaining after filtering. To inspect all devices on a node from the DB, we would have to let every device go into the database; the DB would then grow too large and eventually slow down queries, becoming a scaling problem. So for this purpose we use an RPC call to get the result from a node and show it to the admin.
+
  *pci_information = [ { pci-regex } ,{pci-extra-attrs } ]
 +
  *pci-regex is a dict of { string-key: string-value } pairs , it can only match device properties, like vendor_id, address, product_id,etc.
 +
  *pci-extra-attrs is a dict of { string-key: string-value } pairs. The values can be arbitrary. The total size of the extra attrs may be restricted. All these extra attrs will be stored in the extra info field of the PCI device table, and they should use the naming scheme e.attrname.
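A minimal sketch of how a compute node might apply pci_information to a discovered device follows. It rests on several assumptions that the text above does not settle: the configuration is modelled as a list of (pci-regex, pci-extra-attrs) pairs, patterns use Python regular expression syntax matched against the whole value, and the function name `apply_pci_information` is hypothetical.

```python
import re


def apply_pci_information(pci_information, device):
    """Return a copy of the device with extra attrs attached, or None.

    pci_information: list of (pci_regex, pci_extra_attrs) dict pairs.
    device: dict of discovered properties (vendor_id, device_id, address...).
    """
    for pci_regex, extra_attrs in pci_information:
        # Extra attrs are expected to follow the e.attrname naming scheme.
        bad = [k for k in extra_attrs if not k.startswith("e.")]
        if bad:
            raise ValueError("extra attrs must be named e.<attrname>: %s" % bad)
        # Every key of pci-regex must match the corresponding device property.
        if all(re.fullmatch(pattern, device.get(key, ""))
               for key, pattern in pci_regex.items()):
            return dict(device, extra_info=dict(extra_attrs))
    return None  # filtered out: the device is not offered for allocation


info = [({"vendor_id": "8086", "device_id": "000[1-2]"}, {"e.group": "gpu"})]
print(apply_pci_information(
    info, {"vendor_id": "8086", "device_id": "0002", "address": "0000:01:00.7"}))
```

A device that matches no entry is dropped, which mirrors the white-list behavior; a device that matches carries its extra attrs in `extra_info`.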
  
=====admin check available pci flavor (white-list)  =====
+
====PCI stats group devices based on pci_flavor_attrs ====
 +
       
 +
A PCI stats pool summarises the devices of a compute node, and the scheduler uses the flavor's matching specs to select an available host for the VM. The stats pools must contain the PCI properties used by the PCI flavor.
  
* list all available pci flavor (whitelist)
+
  * current grouping is based on  [vendor_id, product_id, extra_info]
    nova pci-flavor-list
+
  *  going to group by keys specified by  pci_flavor_attrs.
 
   
 
   
     GET v2/​{tenant_id}​/os-pci-flavors
+
The stats reporting algorithm should meet this requirement: one device should be in only one PCI stats pool, which means PCI stats pools cannot overlap. This simplifies the scheduler design.
    data:
+
 
    os-pci-flavors:{
+
On the compute node, pci_flavor_attrs provides the specs by which PCI stats group their pools. On the controller node, pci_flavor_attrs collects the attrs that can be used to define a pci_flavor. The definition of pci_flavor_attrs on the controller should contain all of the pci_flavor_attrs content of every compute node.
                  [  
+
 
                            {  
+
*A compute node reporting its stats pools by a subset of the controller's pci_flavor_attrs is acceptable; in that scenario, the compute node can only provide devices with those properties.*
                                'UUID':'xxxx-xx-xx' ,
+
 
                                'description':'xxxx'  
+
===Use cases ===
                                  .....
+
 
                                  'pci-flavor':'xxx',  
+
==== General PCI pass through  ====
                                } ,
+
Given compute nodes that contain 1 GPU with vendor:device 8086:0001:
                ]
+
 
      }
+
*on the compute nodes, config the pci_information
 +
     pci_information =  { { 'vendor_id': "8086", 'device_id': "0001" }, {} }
 +
 
 +
    pci_flavor_attrs ='device_id','vendor_id'
 +
 
 +
the compute node would report PCI stats group by ('device_id', 'vendor_id').
 +
pci stats will report one pool:
 +
  {'device_id':'0001', 'vendor_id':'8086', 'count': 1 }
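The grouping that produces such a pool report can be sketched as follows. This is an illustration only: `build_pci_stats` and `attr_value` are hypothetical names, and the sketch assumes that 'e.'-prefixed attrs are read from the device's extra info while all other attrs are read from the device's own properties. Keying each device by the tuple of its pci_flavor_attrs values places every device in exactly one pool, so pools cannot overlap.

```python
from collections import Counter


def attr_value(device, attr):
    """Read a standard property, or an 'e.'-prefixed one from extra_info."""
    if attr.startswith("e."):
        return device.get("extra_info", {}).get(attr)
    return device.get(attr)


def build_pci_stats(devices, pci_flavor_attrs):
    """Group devices into non-overlapping pools keyed by pci_flavor_attrs."""
    counts = Counter(
        tuple(attr_value(dev, attr) for attr in pci_flavor_attrs)
        for dev in devices
    )
    return [
        dict(zip(pci_flavor_attrs, key), count=n)
        for key, n in counts.items()
    ]


devices = [
    {"vendor_id": "8086", "device_id": "0001", "extra_info": {"e.group": "gpu"}},
    {"vendor_id": "8086", "device_id": "0002", "extra_info": {"e.group": "gpu"}},
]
print(build_pci_stats(devices, ["device_id", "vendor_id"]))
print(build_pci_stats(devices, ["e.group"]))
```

Grouping the same two devices by `e.group` instead of by the IDs collapses them into a single pool, which is the effect the grouping-tag use case relies on.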
 +
 
 +
* create PCI flavor
 +
 
 +
  nova pci-flavor-create  name 'bigGPU'  description "passthrough Intel's on-die GPU"
 +
  nova pci-flavor-update  name 'bigGPU'  set    'vendor_id'='8086'  'product_id'='0001'
 +
 
 +
* create flavor and boot with it ( same as current PCI passthrough)
 +
 
 +
  nova flavor-key m1.small set pci_passthrough:pci_flavor= 1:bigGPU;
 +
  nova boot  mytest  --flavor m1.small  --image=cirros-0.3.1-x86_64-uec
 +
 
 +
==== General PCI pass through with multi PCI flavor candidate ====
 +
 
 +
Given compute nodes that contain 2 types of GPU, with vendor:device 8086:0001 or vendor:device 8086:0002:
 +
 
 +
*on the compute nodes, config the pci_information
 +
    pci_information =  { { 'vendor_id': "8086", 'device_id': "000[1-2]" }, {} }
 +
 
 +
* on controller
 +
  pci_flavor_attrs = ['device_id', 'vendor_id']
 +
 
 +
the compute node would report PCI stats group by ('device_id', 'vendor_id').
 +
pci stats will report 2 pools:
 +
 
 +
  {'device_id':'0001', 'vendor_id':'8086', 'count': 1 }
 +
  {'device_id':'0002', 'vendor_id':'8086', 'count': 1 }
 +
 
 +
* create PCI flavor
 +
  nova pci-flavor-create  name 'bigGPU'  description "passthrough Intel's on-die GPU"
 +
  nova pci-flavor-update  name 'bigGPU'  set    'vendor_id'='8086'  'product_id'='0001'
 +
  nova pci-flavor-create  name 'bigGPU2' description "passthrough Intel's on-die GPU"
 +
  nova pci-flavor-update  name 'bigGPU2'  set    'vendor_id'='8086'  'product_id'='0002'
 +
 
 +
* create flavor and boot with it
 +
  nova flavor-key m1.small set pci_passthrough:pci_flavor= '1:bigGPU,bigGPU2;'
 +
  nova boot  mytest  --flavor m1.small  --image=cirros-0.3.1-x86_64-uec
 +
 
 +
==== General PCI pass through wildcard PCI flavor ====
 +
 
 +
Given compute nodes that contain 2 types of GPU, with vendor:device 8086:0001 or vendor:device 8086:0002:
 +
 
 +
*on the compute nodes, config the pci_information
 +
    pci_information =  { { 'vendor_id': "8086", 'device_id': "000[1-2]" }, {} }
 +
    pci_flavor_attrs = ['device_id', 'vendor_id']
 +
 
 +
the compute node would report PCI stats group by ('device_id', 'vendor_id').
 +
pci stats will report 2 pools:
 +
 
 +
  {'device_id':'0001', 'vendor_id':'8086', 'count': 1 }
 +
  {'device_id':'0002', 'vendor_id':'8086', 'count': 1 }
 +
 
 +
* create PCI flavor
 +
 
 +
  nova pci-flavor-create  name 'bigGPU'  description "passthrough Intel's on-die GPU"
 +
  nova pci-flavor-update  name 'bigGPU'  set    'vendor_id'='8086'  'product_id'='000[1-2]'
 +
 
 +
* create flavor and boot with it
 +
 
 +
  nova flavor-key m1.small set pci_passthrough:pci_flavor= '1:bigGPU;'
 +
  nova boot  mytest  --flavor m1.small  --image=cirros-0.3.1-x86_64-uec
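How a scheduler filter might evaluate a wildcard flavor against the reported pools can be sketched as below. The names `pool_matches` and `host_can_satisfy` are hypothetical, flavor values are assumed to be Python regular expressions matched against the whole pool value, and the flavor is assumed to use the same key names as the pools (device_id here). Because pools do not overlap, the counts of all matching pools can simply be summed.

```python
import re


def pool_matches(flavor_spec, pool):
    """True when every (k, v) of the flavor spec matches the pool's keys."""
    return all(
        key in pool and re.fullmatch(pattern, str(pool[key])) is not None
        for key, pattern in flavor_spec.items()
    )


def host_can_satisfy(pools, flavor_specs, count):
    """True when pools matched by any candidate flavor hold >= count devices."""
    available = sum(
        pool["count"] for pool in pools
        if any(pool_matches(spec, pool) for spec in flavor_specs)
    )
    return available >= count


pools = [
    {"device_id": "0001", "vendor_id": "8086", "count": 1},
    {"device_id": "0002", "vendor_id": "8086", "count": 1},
]
big_gpu = {"vendor_id": "8086", "device_id": "000[1-2]"}
print(host_can_satisfy(pools, [big_gpu], 1))
```

Passing several specs in `flavor_specs` models a request such as `1:bigGPU,bigGPU2`, where any of the candidate flavors may supply the device.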
 +
 
 +
====  PCI pass through support grouping tag ====
 +
 
 +
Given compute nodes that contain 2 types of GPU, with vendor:device 8086:0001 or vendor:device 8086:0002:
  
 +
*on the compute nodes, config the pci_information
 +
    pci_information =  { { 'vendor_id': "8086", 'device_id': "000[1-2]" }, { 'e.group':'gpu' } }
  
* list available pci flavor on host (white list)
+
     pci_flavor_attrs = ['e.group']
     nova host-list pci-flavor
 
  
    GET  v2/​{tenant_id}​/os-hosts/​{host_name}​/os-pci-flavors
+
the compute node would report PCI stats group by ('e.group').
    data:
+
pci stats will report 1 pool:
    os-pci-flavors{
+
{'e.group':'gpu', 'count': 2 }
                  [
 
                            {  
 
                                    'pci_flavor_uuid': <uuid>,
 
                                    total: 10, 
 
                                    available: 6,
 
                                    in_use: 2
 
                                } ,
 
                ]
 
      }
 
  
  
* get detailed information about one pci-flavor:
+
* create PCI flavor
 
   
 
   
    nova pci-flavor-show  <UUID>
+
  nova pci-flavor-create  name 'bigGPU'  description "passthrough Intel's on-die GPU"
 
+
  nova pci-flavor-update  name 'bigGPU'   set    'e.group'='gpu'
    GET v2/​{tenant_id}​/os-pci-flavor/<UUID>
 
    data:
 
        os-pci-flavor: {
 
                                'UUID':'xxxx-xx-xx' ,
 
                                'description':'xxxx'
 
                                  ...
 
                                  'pci-flavor':'xxx',
 
          }
 
  
=====admin create a  pci flavor (white-list)  =====
+
* create flavor and boot with it
  
#  create flavor  
+
  nova flavor-key m1.small set pci_passthrough:pci_flavor= '1:bigGPU;'
  nova pci-flavor-create  name 'GetMePowerfulldevice'  description "xxxxx"
+
  nova boot  mytest  --flavor m1.small  --image=cirros-0.3.1-x86_64-uec
  
  API:
+
====  PCI SRIOV with tagged flavor ====
  POST  v2/​{tenant_id}​/os-pci-flavors
+
Given compute nodes that contain 5 PCI NICs, vendor:device 8086:0022, connected to physical network "X":
 
 
  data:  
 
      pci-flavor: {
 
              'name':'GetMePowerfulldevice',
 
              description: "xxxxx"  
 
      }
 
  action:  create database entry for this flavor.
 
  
#  update flavor definition
+
*on the compute nodes, config the pci_information
    nova pci-flavor-update UUID  set    'description'='xxxx'  'address'= '0000:01:*.7', 'host'='compute-id'
 
     
 
    PUT v2/​{tenant_id}​/os-pci-flavors/<UUID>
 
    with data  :
 
          { 'action': "update",  
 
            'pci-flavor':
 
                          {
 
                              'description':'xxxx',
 
                              'address': '0000:01:*.7'
 
                          }
 
          }
 
    action: set this as the new definition of the pci flavor.
 
  
=====Take advantage of host aggregate =====
+
    pci_information = { { 'vendor_id': "8086", 'device_id': "0022" }, { 'e.physical_netowrk': 'X' } }
  
Host aggregates can be used to enhance the scheduler for PCI.
+
    pci_flavor_attrs = 'e.physical_netowrk'
  
* create aggregate
+
the compute node would report PCI stats group by ('e.physical_netowrk').
    nova aggregate-create  pci-aware-group
+
PCI stats will report 1 pool:
    nova aggregate-add-host  host1
 
    nova aggregate-add-host  host2
 
  
* map flavor to aggregate
 
    nova aggregate-set-metadata pci-aware-group set 'pci-flavor'='intelNICpublic, intelNICprivate, nvidiaGPUnew, nvidiaGPUolder'
 
  
    This means all hosts in the aggregate can provide these pci-flavors if the host has a free one. This information is also useful for the PCI flavor filter on these hosts: we can check only these flavors on these hosts, and don't need to check every flavor we have in the DB.
+
  {'e.physical_netowrk':'X', 'count': 5 }
  
* set instance flavor key to enhance the PCI scheduler
 
    nova flavor-create --is-public true m1.iwantPCI 100 2048 20 2
 
    nova flavor-key 100 set  'pci-flavor'='1:intelNICprivate; 1:intelNICprivate; 1:nvidiaGPUnew, nvidiaGPUolder'
 
  
This information can be used to select the aggregate, or to keep an instance from being scheduled to those hosts if it does not want PCI passthrough.
+
* create PCI flavor
 +
  nova pci-flavor-create  name 'phyX_NIC'  description 'passthrough NIC connect to physical network X'
 +
  nova pci-flavor-update  name 'phyX_NIC'  set    'e.physical_netowrk'='X'
  
=====admin delete a  pci flavor (white-list)  =====
 
 
 
    nova pci-flavor-delete <UUID>
 
  
    API will be:
+
* create flavor and boot with it
    DELETE v2/​{tenant_id}​/os-pci-flavor/<UUID>
+
  nova boot  mytest  --flavor m1.tiny  --image=cirros-0.3.1-x86_64-uec  --nic  net-id=network_X  pci_flavor= '1:phyX_NIC;'
    flow: delete it from database
 
 
 
=====admin configures extra spec in flavor request pci device =====
 
  
to allocate the device from a pci flavor, just fill pci flavor into the flavor's extra spec:
+
====encryption card use case ====
    nova flavor-key 100 set  'pci-flavor'='1:intelNICprivate; 1:intelNICpublic'
 
  
=====admin boot VM with this flavours =====
+
There are 3 encryption cards: [V1 and V2 here stand for vendor_id numbers.]
    nova boot  mytest  --flavor m1.small --image=cirros-0.3.1-x86_64-uec
 
  
=====admin configures SRIOV flavor =====
+
  card 1 (vendor_id is V1, device_id =0xa) 
 +
  card 2 (vendor_id is V1, device_id=0xb)
 +
  card 3 (vendor_id is V2, device_id=0xb)
 +
Suppose there are two images. One image supports only card 1 and the other supports cards 1 and 3 (or any other combination of the 3 card types).
  
* create a pci flavor for the SRIOV
+
*on the compute nodes, config the pci_information
  nova pci-flavor-create  name 'vlan-SRIOV'  description "xxxxx"
 
  nova pci-flavor-update UUID  set    'description'='xxxx'  'address'= '0000:01:*.7'
 
  
 +
    pci_information =  { { 'device_id': "0xa", 'vendor_id': "v1" }, { 'e.QAclass':'1' } }
 +
    pci_information =  { { 'device_id': "0xb", 'vendor_id': "v1" }, { 'e.QAclass':'2' } }
 +
    pci_information =  { { 'device_id': "0xb", 'vendor_id': "v2" }, { 'e.QAclass':'3' }}
  
=====Admin config SRIOV=====
+
    pci_flavor_attrs = ['e.QAclass']
  
* create pci-flavor :
+
the compute node would report PCI stats group by (['e.QAclass']).
    {"name": "privateNIC", "neutron-network-uuid": "uuid-1", ...}
+
pci stats will report 3 pools:
    {"name": "publicNIC", "neutron-network-uuid": "uuid-2", ...}
 
    {"name": "smallGPU", "neutron-network-uuid": "", ...}
 
  
* set aggregate meta according the flavors existed in the hosts
 
flavor extra-specs, for a VM that gets two small GPUs and VIFs attached from the above SRIOV NICs:
 
    nova aggregate-set-metadata pci-aware-group set 'pci-flavor'='smallGPU,oldGPU, privateNIC,privateNIC'
 
  
* create instance flavor for sriov
+
  { 'e.QAclass':'1' , 'count': 1 }
    nova flavor-key 100 set 'pci-flavor'='1:privateNIC; 1: publicNIC; 2:smallGPU,oldGPU'
+
  { 'e.QAclass':'2' , 'count': 1 }
 +
  { 'e.QAclass':'3' , 'count': 1 }
  
*User just specifies a quantum port as normal:
+
* create PCI flavor  
    nova boot --flavor "sriov-plus-two-gpu" --image img --nic net-id=uuid-2 --nic net-id=uuid-1 vm-name
 
  
uuid-1 and uuid-2 map to a "provider" network (with VLAN config, etc.) that is implemented using the privateNIC and publicNIC flavors; the flavor is already bound to the network uuid via the "neutron-network-uuid" key.
+
  nova pci-flavor-create  name 'QA1'  description  'QuickAssist card version 1'
 +
  nova pci-flavor-update  name 'QA1'    set 'e.QAclass'='1'
 +
  nova pci-flavor-create  name 'QA13'  description 'QuickAssist card version 1 and version 3'
 +
  nova pci-flavor-update name 'QA13'    set 'e.QAclass'='(1|3)'
  
Reviewer comment: the flavor creation is an admin operation, not a user operation.
+
* create flavor and boot with it
Reviewer comments:
+
  nova boot  mytest  --flavor m1.tiny  --image=QA1_image  --nic  net-id=network_X  pci_flavor= '1:QA1;'
* I still think this is wrong...
+
  nova boot  mytest  --flavor m1.tiny  --image=QA13_image  --nic  net-id=network_X  pci_flavor= '3:QA13;'
* can't a pci-flavour just take a quantum network uuid? when network uuid is specified, the device is only ever attached by the VIF driver
 
* then the user requests a flavor where there are some SRIOV options that the VIF attach, and the VIF driver does what it can.
 
here is the example...
 
  
====Transition from config file to API ====
+
===Common PCI SRIOV Configuration detail ===
#  the config file for alias and whitelist definition is going to be deprecated.
 
#  if the database is not NULL, the configuration is ignored and a deprecation warning is given.
 
#  if the database is NULL, config is read from the file,
 
    *white list/alias schema still work
 
    * a deprecation notice is also given; alias will fade out and will be removed starting from the next release.
 
  
With this solution, we move PCI config from file to API.
+
====Compute host====
 +
pci_information = [ {pci-regex},{pci-extra-attrs} ]
 +
pci_flavor_attrs=attr,attr,attr
  
Reviewer comments: This sounds good, but I am stopping my review here.
+
For instance, when using device and vendor ID this would read:
 +
    pci_flavor_attrs=device_id,vendor_id
 +
When the back end adds an arbitrary ‘group’ attribute to all PCI devices:
 +
    pci_flavor_attrs=e.group
 +
When you wish to find an appropriate device and perhaps also filter by the connection tagged on that device, where the tag is an extra-info attribute specified in the compute node config:
 +
pci_flavor_attrs=device_id,vendor_id,e.connection
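A small sketch of how such a pci_flavor_attrs value could be parsed is given below. The function name `parse_pci_flavor_attrs` is hypothetical, and separating standard properties from 'e.'-prefixed extra-info attrs is an assumption made for illustration; stray quotes are stripped because other examples in this document write the attrs quoted.

```python
def parse_pci_flavor_attrs(value):
    """Split 'attr,attr,...' into (standard attrs, extra-info attrs)."""
    attrs = [a.strip().strip("'\"") for a in value.split(",") if a.strip()]
    base = [a for a in attrs if not a.startswith("e.")]
    extra = [a for a in attrs if a.startswith("e.")]
    return base, extra


base, extra = parse_pci_flavor_attrs("device_id,vendor_id,e.connection")
print("standard properties:", base)
print("extra-info attrs:", extra)
```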
  
===DB for pci configuration===
 
Each PCI flavor will be a set of (k,v) pairs, and PCI flavors don't need to contain the same keys. Another problem this definition tries to solve: SRIOV also wants feature autodiscovery (under discussion); with that, the flavor might need a 'feature' key to be added if it were not stored as (k,v) pairs. The (k,v) pair definition lets more extra information be stored in the PCI device.
 
  
  table: pci_flavor{
+
====flavor API====
                id  :  database id of this k,v pair
 
                UUID:  which pci-flavor the  k,v belong to
 
                key
 
                value (might be a simple value or a reduced regular expression)
 
            }
 
  
====API interface====
+
* overall
 +
nova pci-flavor-list
 +
nova pci-flavor-show    name|UUID  <name|UUID>
 +
nova pci-flavor-create  name|UUID  <name|UUID>  description <desc>
 +
nova pci-flavor-update  name|UUID  <name|UUID>  set    'description'='xxxx'  'e.group'= 'A'
 +
nova pci-flavor-delete  name|UUID  <name|UUID>
  
*  get pci devices information on host
 
  nova  host-list pci-device
 
  GET  v2/​{tenant_id}​/os-hosts/​{host_name}​/os-pci-devices
 
  return summary information about the PCI devices on this host:
 
  os-pci-devices:{
 
                    [ 
 
                            {
 
                                  'vendor_id':'8086',
 
                                  'product_id':'xxx', 
 
                                  ....
 
                                  'pci-type': 'VF',
 
                                    'status': 'available',
 
                              },
 
                    ]
 
        }
 
  
 
+
  * list available pci flavor  (white list)
  * list avaliable pci flavor  (white list)
+
    nova pci-flavor-list  
 
 
     GET v2/​{tenant_id}​/os-pci-flavors
 
 
     data:
 
Line 271: Line 319:
 
                                 'vendor_id':'8086',  
 
 
                                   ....
 
                                 'pci-flavor':'xxx',  
+
                                 'name':'xxx',  
 
                               } ,
 
 
                 ]
 
Line 277: Line 325:
  
  
 
+
* get detailed information about one pci-flavor:  
*list available pci flavor on host (white list)
+
    nova pci-flavor-show  <UUID|name>
 
+
     GET v2/​{tenant_id}​/os-pci-flavor/<UUID|name>
  nova host-list pci-flavor
 
  GET  v2/​{tenant_id}​/os-hosts/​{host_name}​/os-pci-flavors
 
  data:
 
    os-pci-flavors{
 
                [
 
                            {
 
                                  'pci_flavor_uuid': <uuid>,
 
                                    total: 10, 
 
                                    available: 6,
 
                                    in_use: 2
 
                              } ,
 
                ]
 
    }
 
 
 
 
 
* get detailed infomation about one pci-flavor:  
 
      nova pci-flavor-show  <UUID>
 
     GET v2/​{tenant_id}​/os-pci-flavor/<UUID>
 
 
     data:
 
 
         os-pci-flavor: {  
 
Line 303: Line 333:
 
                                 'description':'xxxx'  
 
 
                                   ....
 
                                 'address': '0000:01:*.7',
+
                                 'name':'xxx',  
                                'pci-flavor':'xxx',  
 
 
           }  
 
  
Line 320: Line 349:
  
 
*update the pci flavor  
 
     nova pci-flavor-update UUID  set    'description'='xxxx'  'address'= '0000:01:*.7', 'host'='compute-id'
+
     nova pci-flavor-update UUID  set    'description'='xxxx'  'e.group'= 'A'
 
     PUT v2/​{tenant_id}​/os-pci-flavors/<UUID>
 
 
     with data  :
 
Line 327: Line 356:
 
                           {  
 
 
                             'description':'xxxx',
 
                             'address': '0000:01:*.7'}
+
                             'vendor': '8086',
 +
                            'e.group': 'A',
 +
                              ....
 
                           }
 
 
         }
 
     action: set this as the new defination of the pci flavor.
+
     action: set this as the new definition of the pci flavor.
  
 
* delete a pci flavor
 
Line 336: Line 367:
 
   DELETE v2/​{tenant_id}​/os-pci-flavor/<UUID>
 
  
===Requirements from SRIOV===
+
=== Current PCI implementation gaps ===
*group device
+
 
  For SRIOV, all VFs belonging to the same PF share the same physical network reachability. So if you want to, say, deploy a VLAN network, you need to choose a VF of the right PF, otherwise the network will not work for you. The pci flavor handles this well.
+
Concepts introduced here:
*tracking devices allocated to the NIC
+
spec: a filter defined by (k,v) pairs where k is one of the PCI object fields, meaning those (k,v) pairs are PCI device properties like vendor_id, address, pci-type, etc.
  Networking and other special devices are not as simple as passing a device through to the VM; more configuration is needed. To achieve this, SRIOV must know which device was allocated to a specific NIC. The pci flavor can map devices to a neutron port.
+
extra_spec: a filter defined by (k,v) pairs where k is not one of the PCI object fields.
 +
 
 +
 
 +
====pci utils support extra property ====
 +
      * pci utils (k,v) matching supports reduced regular expressions for the address
 +
      * utils provide a global extract interface to extract the base properties and extra properties of a PCI device.
 +
      * extra information should also use the naming scheme 'e.name'
 +
 
 +
====PCI information(extended the white-list) support extra tag====
 +
      * PCI information supports reduced regular expression comparison to match PCI devices
 +
      * PCI information supports storing any other (k,v) pairs in the PCI device's extra info
 +
      * the k and v of any extra tag are strings.
 +
 
 +
====pci_flavor_attrs====
 +
      * implement the attrs parser, with the result updated to the flavor database
 +
 
 +
====support pci-flavor ====
 +
      * pci-flavor is stored in the DB
 +
      * pci-flavor is configured via the API
 +
      * the pci manager uses the extract method to extract the specs and extra_specs, and matches them against the PCI object and object.extra_info.
 +
 
 +
====PCI scheduler: PCI filter ====
 +
    When scheduling, the matcher should apply the regular expressions stored in the named flavor, which are read from the DB.
 +
 
 +
 
 +
==== convert pci flavor from SRIOV ====
 +
in API stage the network parser should convert the pci flavor in the --nic option to pci request and save them into instance meta data.
 +
 
 +
1. translate pci flavor spec to request
 +
    *translate_pci_flavor_to_requests(flavor_spec)
 +
    * input flavor_spce is a list of pci flavor and requested number:  "flavor_name:number, flavor_name2:number2,..." if number is 1, it can be omit.
 +
    * output is pci_request, a internal represents data structure
 +
 
 +
2. save request to instance meta data:
 +
    *update_pci_request_to_metadata(metadata, new_requests, prefix='')
 +
 
 +
==== find specific device of a instance based on request ====
 +
 
 +
to boot VM with PCI SRIOV devices, there might need more configuration action to pci device instead of just a pci host dev. to achieve this, common SRIOV need a interface to query the device allocated to a specific usage, like the SRIOV network.
  
=== Implement the grouping===
+
3 steps facility to achive this:
  spec: a filter defined by (k,v) paris
 
  extra_spec: the filter defined by (k, v) and k not in the pci object fileds.
 
  
====pci utils/objects support grouping ====
+
    * Mark the PCI request use UUID, so the pci request is distinguishable.
      * pci utils k,v match support the list values
+
    * remember the devices allocated the this request.
      * objects provide a class level extrac interface to extract base spec and extra spec
+
    * a interface function provide to get these device from a instance.
  
====pci-flavor(white list) support address set====
+
====DB for PCI flavor ====
      * white list support 'address':[bdf1, ....]
+
each pci flavor will be a set of (k,v), store the (k,v) pair in DB.  both k, v is string, and the value could be a simple  regular expression, support wild-cast, address range operators.
      * white list support  any other (k,v) pair to group or store special infomation
 
      * object extrac specs and extra_info, specs use as whitelist spec, extra info will be updated to device's extra_info fields
 
  
====enable flavor support pci-flavor ====
 
      * pci-flavor's name set in the extra spec in the instance type
 
      * pci manager use extrac method extrac the specs and extra_specs, match them agains  the pci object & object.extra_info.
 
  
====pci stats grouping device on demand====
 
        * pci_grouping_key configration option define a set of key name which will used to group the device to stats
 
        * default value is  [vendor_id, product_id], this current implemtation
 
        * limited support to 3 keys grouping for algorithm simplicity.
 
  
=== Implement tracking device allocated for the pci-flavor===
+
Talbe: pci_flavor
  
here is the idea how user can identify which device allocated for the pci-flavor.
+
  {
 +
              UUID:  which pci-flavor the k,v belong to
 +
              name: the pci flavor's name, we need this filed to index the DB with flavor's name
 +
              key
 +
              value (might be a simple string value or reduce Regular express)
 +
  }
  
    *while allocated device, user put a  marker into the device ( into the pci device extra_info fileds)
+
DB interface:
    *after finished allocation, user can seach a instance's pci devices to find the specific marker
 
  
    the way marker data transfer from user to device utilize the pci_request, which convert from the pci-flavor.
+
        get_pci_flavor_by_name
 +
        get_pci_flavor_by_UUID
 +
        get_pci_flavor_all
 +
        update_pci_flavor_by_name
 +
        update_pci_flavor_by_UUID
 +
        delete_pci_flavor_by_name
 +
        delete_pci_flavor_by_UUID
  
Reviewer: please see the extensible resource manager blueprint
+
===transient config file to API ===
 +
    *the config file for alias and white-list definition is going to deprecated.
 +
    *new config pci_information will replace white-list
 +
    *pci flavor will replace alias
 +
    *white list/alias schema still work, and given a deprecated notice, will fade out  which will be remove start from next release.
 +
    *if pci_flavor_attrs is not defined it will default to vendor_id, product_id and extra_info. this is keep compatible with old system.

Latest revision as of 03:21, 27 July 2016

!!!This is a design discussion document, not for end user reference!!!

Related Resource

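This design is based on the PCI pass-through IRC meetings and provides common support for PCI SRIOV:

  • https://wiki.openstack.org/wiki/Meetings/Passthrough

This document was used to finalise the design:

  • https://docs.google.com/document/d/1vadqmurlnlvZ5bv3BlUbFeXRS_wh-dsgi5plSjimWjU/edit#

Link back to the blueprint:

  • https://blueprints.launchpad.net/nova/+spec/pci-extra-info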

Common PCI SRIOV design

PCI devices have standard PCI properties such as address (BDF), vendor_id, and product_id. Virtual functions also have a property that refers to the address of their physical function. Application-specific or installation-specific extra information can be attached to a PCI device, such as the physical network connectivity used by Neutron SRIOV.

This blueprint focuses on the functionality needed to provide common PCI SRIOV support.

  • On the compute node, pci_information (the extended white-list) defines a set of filters. PCI devices that pass a filter are available for allocation, and at the same time extra information is attached to them.
  • The compute node reports PCI stats to the scheduler. PCI stats contain several pools; each pool is defined by several PCI properties, controlled by the local configuration item pci_flavor_attrs, whose default value is vendor_id, product_id, extra_info.
  • A PCI flavor defines the user's view of a PCI device selector: it provides a set of (k,v) pairs that form specs to select among the available devices (those already filtered by pci_information/white-list).

The new PCI design is based on 2 key changes: the PCI flavor and the extra information attached to PCI devices. The following diagram summarises the design:

PCI-sriov.jpg

Design Choice

     PCI flavor:  users don't want to know the details of a PCI device, all of its attrs, and the extra information attached to it.
     (pci_flavor_attrs:  the admin needs to know all PCI attrs in order to define flavors for users/tenants.)
     PCI stats:   a compute node might have many devices, and most device properties are the same; summarising them as stats for the scheduler reduces DB load and simplifies scheduling.
     PCI extra info:  a PCI device might be attached to a specific network or a specific resource; this information can be attached to the device and used for scheduling.


PCI flavor

For OpenStack users, a PCI flavor is a meaningful name such as 'oldGPU', 'FastGPU', '10GNIC', or 'SSD' that describes one kind of PCI device. Users use the PCI flavor to select available PCI devices. Internally, PCI flavors are created through a set of APIs and saved in a DB table, which keeps them available across the whole cloud.


The administrator defines PCI flavors via a matching expression that selects available devices (those offered by the white-list), plus a meaningful name. A PCI flavor matching expression is a set of (k,v) pairs, where k is a PCI property and v is its value. Not every PCI property is available to PCI flavors: only a selected set of properties can be used to define a PCI flavor. The selected properties should be global to the cloud, like vendor_id/product_id, and cannot be the BDF or host of a PCI device. These selected PCI properties are defined via compute-local configuration:

    pci_flavor_attrs = vendor_id, product_id, ...

An important behavior is that PCI flavors can overlap: the same device on the same machine may be matched by multiple flavors.

Use PCI flavor in instance flavor extra info

The user sets PCI flavors in the instance flavor's extra info to specify how many PCI devices, and of which PCI flavors, the VM should boot with.

   nova flavor-key m1.small set pci_passthrough:pci_flavor= <pci flavor spec list>
   pci flavor spec:
                   <number>:<flavor1>,<flavor2>
   meaning: want <number> PCI devices from flavor1 or flavor2
   
   pci flavor spec list:
               <pci flavor spec1>; <pci flavor spec2>


For example:

   nova flavor-key m1.small set pci_passthrough:pci_flavor= '1:IntelGPU,NvGPU;1:intelQuickAssist;'
   
   which defines the requirements: 
         boot with 1 IntelGPU or NvGPU device, and 1 intelQuickAssist card.
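
The spec list syntax above can be illustrated with a small parser. This is only a sketch under stated assumptions: the function name parse_pci_flavor_spec_list and the (count, flavors) tuple it returns are hypothetical and are not part of the design; only the "<number>:<flavor1>,<flavor2>;" syntax comes from the text.

```python
def parse_pci_flavor_spec_list(spec_list):
    """Parse '<number>:<flavor1>,<flavor2>; ...' into (count, [flavor names]) tuples.

    Hypothetical helper: the name and return shape are assumptions for illustration.
    """
    requests = []
    for spec in spec_list.split(";"):
        spec = spec.strip()
        if not spec:
            continue  # tolerate the trailing ';' used in the examples
        count_text, sep, flavors_text = spec.partition(":")
        if not sep:
            raise ValueError("missing ':' in pci flavor spec: %r" % spec)
        count = int(count_text)
        if count < 1:
            raise ValueError("device count must be positive: %r" % spec)
        flavors = [name.strip() for name in flavors_text.split(",") if name.strip()]
        if not flavors:
            raise ValueError("no flavor named in pci flavor spec: %r" % spec)
        requests.append((count, flavors))
    return requests


if __name__ == "__main__":
    print(parse_pci_flavor_spec_list("1:IntelGPU,NvGPU;1:intelQuickAssist;"))
```

Each tuple reads as "count devices from any one of the listed flavors", which mirrors the "flavor1 or flavor2" meaning described above.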

PCI pci_flavor_attrs

This configuration is kept local to each compute node, so that the deployment process can decide locally which PCI properties the node exposes.

   pci_flavor_attrs = vendor_id, product_id, ...

The compute node updates its local PCI extra properties to the PCI flavor database, which is accessible through the flavor API and provides the PCI properties that can be used to define flavors.

pci_flavor_attrs is stored in the flavor DB as a normal flavor; its name is "_flavor_attrs" and its UUID is "0":

   {"vendor_id":"True", "product_id":"True", ... }

This flavor contains all the attrs that can be used to define a PCI flavor. List its content with:

    nova pci-flavor-show  0
    GET v2/{tenant_id}/os-pci-flavor/<0>
    data:
       os-pci-flavor: { 
                               'UUID':'0', 
                               'description':'Available flavor attrs', 
                               'name':'_flavor_attrs', 
                               'vendor_id': 'True',
                               'product_id': 'True',
                                ....
         }

PCI request

A PCI request is an internal structure that represents all the PCI devices a VM wants to have.

   request = {'count': int(count),
             'spec': [{"vendor_id":"8086", "phynetwork":"phy1"}, ...],
             'alias_name': "Intel.NIC"}

PCI-Request.jpg

Extra information of PCI device

The compute nodes offer available PCI devices for pass-through. Since the list of devices doesn't usually change unless someone tinkers with the hardware, the matching expression used to create this list of offered devices is stored in the compute node config.


   *the device information (device ID, vendor ID, BDF etc.) is discovered from the device and stored as part of the PCI device, same as in the current implementation.
   *on the compute node, additional arbitrary extra information, in the form of key-value pairs, can be added to the config and is included in the PCI device.

This is achieved by extending the PCI white-list to:


  *pci_information = [ { pci-regex } ,{pci-extra-attrs } ]
  *pci-regex is a dict of { string-key: string-value } pairs; it can only match device properties such as vendor_id, address, product_id, etc.
  *pci-extra-attrs is a dict of { string-key: string-value } pairs. The values can be arbitrary. The total size of the extra attrs may be restricted. All of these extra attrs are stored in the PCI device table's extra info field, and they should use this naming schema: e.attrname
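
As an illustration of how a pci_information entry could both filter a device and tag it, here is a minimal sketch. It assumes that each pci-regex value is an ordinary regular expression that must match the whole property value, and that the config holds a list of (pci-regex, pci-extra-attrs) pairs; the function name apply_pci_information and the device dict layout are hypothetical.

```python
import re


def _matches(device, pci_regex):
    """True if every (key, pattern) in pci_regex fully matches the device property."""
    return all(
        key in device and re.fullmatch(pattern, str(device[key])) is not None
        for key, pattern in pci_regex.items()
    )


def apply_pci_information(device, pci_information):
    """Return a tagged copy of the device, or None if no entry admits it.

    pci_information is assumed to be a list of (pci_regex, pci_extra_attrs) pairs.
    The first matching entry wins (an assumption made for this sketch).
    """
    for pci_regex, extra_attrs in pci_information:
        if _matches(device, pci_regex):
            tagged = dict(device)
            tagged["extra_info"] = dict(extra_attrs)  # e.g. {'e.group': 'gpu'}
            return tagged
    return None


if __name__ == "__main__":
    info = [({"vendor_id": "8086", "device_id": "000[1-2]"}, {"e.group": "gpu"})]
    dev = {"address": "0000:01:00.0", "vendor_id": "8086", "device_id": "0002"}
    print(apply_pci_information(dev, info))
```

A device that matches no entry is simply not offered for pass-through, which corresponds to the white-list behaviour described above.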

PCI stats grouping device base on pci_flavor_attrs

The PCI stats pools summarise the devices of a compute node, and the scheduler uses the flavor's matching specs to select an available host for the VM. The stats pools must contain the PCI properties used by PCI flavors.

  *  current grouping is based on  [vendor_id, product_id, extra_info]
  *  the new grouping will use the keys specified by pci_flavor_attrs.

The algorithm for the stats report should meet this requirement: one device should be in only one PCI stats pool, which means PCI stats pools cannot overlap. This simplifies the scheduler design.

On the compute node, pci_flavor_attrs provides the specs by which PCI stats group devices into pools. On the controller node, pci_flavor_attrs collects the attrs that can be used to define a pci_flavor. The definition of pci_flavor_attrs on the controller should contain all of the pci_flavor_attrs content of every compute node.

  • It is acceptable for a compute node to report stats pools by a subset of the controller's pci_flavor_attrs; in that scenario, the compute node can only provide devices with those properties.
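
The grouping rule can be sketched as follows. Because the pool key is computed from the device's own attribute values, each device lands in exactly one pool, so pools cannot overlap. The function name build_pci_stats, the device dict layout, and the convention that "e." attributes live in an extra_info dict are assumptions made for this illustration.

```python
from collections import Counter


def _lookup(device, attr):
    """Read a base property, or an 'e.'-prefixed extra attribute from extra_info."""
    if attr.startswith("e."):
        return device.get("extra_info", {}).get(attr)
    return device.get(attr)


def build_pci_stats(devices, pci_flavor_attrs):
    """Group devices into non-overlapping pools keyed by pci_flavor_attrs."""
    counts = Counter()
    for device in devices:
        key = tuple(_lookup(device, attr) for attr in pci_flavor_attrs)
        counts[key] += 1
    # One dict per pool: the grouping attrs plus the number of devices in the pool.
    return [dict(zip(pci_flavor_attrs, key), count=n) for key, n in counts.items()]


if __name__ == "__main__":
    devices = [
        {"vendor_id": "8086", "device_id": "0001", "extra_info": {"e.group": "gpu"}},
        {"vendor_id": "8086", "device_id": "0002", "extra_info": {"e.group": "gpu"}},
    ]
    print(build_pci_stats(devices, ["device_id", "vendor_id"]))
    print(build_pci_stats(devices, ["e.group"]))
```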

Use cases

General PCI pass through

Given compute nodes that contain 1 GPU with vendor:device 8086:0001:

  • on the compute nodes, configure pci_information
   pci_information =  [ { 'vendor_id': "8086", 'device_id': "0001" }, {} ]
   pci_flavor_attrs ='device_id','vendor_id'

The compute node reports PCI stats grouped by ('device_id', 'vendor_id'). The PCI stats report one pool:

 {'device_id':'0001', 'vendor_id':'8086', 'count': 1 }
  • create the PCI flavor
 nova pci-flavor-create  name 'bigGPU'  description "passthrough Intel's on-die GPU"
 nova pci-flavor-update  name 'bigGPU'   set    'vendor_id'='8086'   'product_id'='0001'
  • create a flavor and boot with it (same as current PCI passthrough)
 nova flavor-key m1.small set pci_passthrough:pci_flavor= '1:bigGPU;'
 nova boot  mytest  --flavor m1.small  --image=cirros-0.3.1-x86_64-uec

General PCI pass through with multi PCI flavor candidate

Given compute nodes that contain 2 types of GPU, with vendor:device 8086:0001 or vendor:device 8086:0002:

  • on the compute nodes, configure pci_information
   pci_information =  [ { 'vendor_id': "8086", 'device_id': "000[1-2]" }, {} ]
  • on the controller
  pci_flavor_attrs = ['device_id', 'vendor_id']

The compute node reports PCI stats grouped by ('device_id', 'vendor_id'). The PCI stats report 2 pools:

 {'device_id':'0001', 'vendor_id':'8086', 'count': 1 }
 {'device_id':'0002', 'vendor_id':'8086', 'count': 1 }
  • create the PCI flavors
 nova pci-flavor-create  name 'bigGPU'  description "passthrough Intel's on-die GPU"
 nova pci-flavor-update  name 'bigGPU'   set    'vendor_id'='8086'   'product_id'='0001'
 nova pci-flavor-create  name 'bigGPU2'  description "passthrough Intel's on-die GPU"
 nova pci-flavor-update  name 'bigGPU2'   set    'vendor_id'='8086'   'product_id'='0002'
  • create a flavor and boot with it
 nova flavor-key m1.small set pci_passthrough:pci_flavor= '1:bigGPU,bigGPU2;'
 nova boot  mytest  --flavor m1.small  --image=cirros-0.3.1-x86_64-uec

General PCI pass through with wildcard PCI flavor

Given compute nodes that contain 2 types of GPU, with vendor:device 8086:0001 or vendor:device 8086:0002:

  • on the compute nodes, configure pci_information
   pci_information =  [ { 'vendor_id': "8086", 'device_id': "000[1-2]" }, {} ]
   pci_flavor_attrs = ['device_id', 'vendor_id']

The compute node reports PCI stats grouped by ('device_id', 'vendor_id'). The PCI stats report 2 pools:

 {'device_id':'0001', 'vendor_id':'8086', 'count': 1 }
 {'device_id':'0002', 'vendor_id':'8086', 'count': 1 }
  • create the PCI flavor
 nova pci-flavor-create  name 'bigGPU'  description "passthrough Intel's on-die GPU"
 nova pci-flavor-update  name 'bigGPU'   set    'vendor_id'='8086'   'product_id'='000[1-2]'
  • create a flavor and boot with it
 nova flavor-key m1.small set pci_passthrough:pci_flavor= '1:bigGPU;'
 nova boot  mytest  --flavor m1.small  --image=cirros-0.3.1-x86_64-uec

PCI pass through support grouping tag

Given compute nodes that contain 2 types of GPU, with vendor:device 8086:0001 or vendor:device 8086:0002:

  • on the compute nodes, configure pci_information
   pci_information =  [ { 'vendor_id': "8086", 'device_id': "000[1-2]" }, { 'e.group':'gpu' } ]
   pci_flavor_attrs = ['e.group']

The compute node reports PCI stats grouped by ('e.group'). The PCI stats report 1 pool:

{'e.group':'gpu', 'count': 2 }


  • create the PCI flavor
 nova pci-flavor-create  name 'bigGPU'  description "passthrough Intel's on-die GPU"
 nova pci-flavor-update  name 'bigGPU'   set    'e.group'='gpu'
  • create a flavor and boot with it
 nova flavor-key m1.small set pci_passthrough:pci_flavor= '1:bigGPU;'
 nova boot  mytest  --flavor m1.small  --image=cirros-0.3.1-x86_64-uec

PCI SRIOV with tagged flavor

Given compute nodes that contain 5 PCI NICs, vendor:device 8086:0022, connected to physical network "X":

  • on the compute nodes, configure pci_information
   pci_information =  [ { 'vendor_id': "8086", 'device_id': "0022" }, { 'e.physical_network': 'X' } ]
   pci_flavor_attrs = 'e.physical_network'

The compute node reports PCI stats grouped by ('e.physical_network'). The PCI stats report 1 pool:


 {'e.physical_network':'X', 'count': 5 }


  • create the PCI flavor
 nova pci-flavor-create  name 'phyX_NIC'  description 'passthrough NIC connect to physical network X'
 nova pci-flavor-update  name 'phyX_NIC'   set    'e.physical_network'='X'


  • boot with the PCI flavor on the NIC
 nova boot  mytest  --flavor m1.tiny  --image=cirros-0.3.1-x86_64-uec  --nic  net-id=network_X  pci_flavor= '1:phyX_NIC;'

encryption card use case

There are 3 encryption cards (V1 and V2 here stand for vendor_id numbers):

  card 1 (vendor_id is V1, device_id=0xa)  
  card 2 (vendor_id is V1, device_id=0xb)
  card 3 (vendor_id is V2, device_id=0xb)

Suppose there are two images. One image supports only card 1, and the other supports cards 1 and 3 (or any other combination of the 3 card types).

  • on the compute nodes, configure pci_information
   pci_information =  [ { 'device_id': "0xa", 'vendor_id': "V1" }, { 'e.QAclass':'1' } ]
   pci_information =  [ { 'device_id': "0xb", 'vendor_id': "V1" }, { 'e.QAclass':'2' } ]
   pci_information =  [ { 'device_id': "0xb", 'vendor_id': "V2" }, { 'e.QAclass':'3' } ]
   pci_flavor_attrs = ['e.QAclass']

The compute node reports PCI stats grouped by (['e.QAclass']). The PCI stats report 3 pools:


 { 'e.QAclass':'1',  'count': 1 }
 { 'e.QAclass':'2',  'count': 1 }
 { 'e.QAclass':'3',  'count': 1 }
  • create the PCI flavors
 nova pci-flavor-create  name 'QA1'  description  'QuickAssist card version 1'
 nova pci-flavor-update  name 'QA1'    set 'e.QAclass'='1'
 nova pci-flavor-create  name 'QA13'   description 'QuickAssist card version 1 and version 3'
 nova pci-flavor-update  name 'QA13'    set 'e.QAclass'='(1|3)'
  • boot with the PCI flavors
 nova boot  mytest  --flavor m1.tiny  --image=QA1_image  --nic  net-id=network_X  pci_flavor= '1:QA1;'
 nova boot  mytest  --flavor m1.tiny  --image=QA13_image  --nic  net-id=network_X  pci_flavor= '3:QA13;'
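
The use cases above rely on a flavor's (k,v) values being matched as regular expressions against the reported pools. A minimal sketch of that matching, assuming whole-value regular expression matching and the pool layout shown in the use cases (the function names are hypothetical):

```python
import re


def pool_matches_flavor(pool, flavor_spec):
    """True if every flavor (key, pattern) fully matches the pool's value for key."""
    return all(
        key in pool and re.fullmatch(pattern, str(pool[key])) is not None
        for key, pattern in flavor_spec.items()
    )


def available_count(pools, flavor_spec):
    """Total number of devices on a host that a flavor can select."""
    return sum(pool["count"] for pool in pools if pool_matches_flavor(pool, flavor_spec))


if __name__ == "__main__":
    pools = [
        {"e.QAclass": "1", "count": 1},
        {"e.QAclass": "2", "count": 1},
        {"e.QAclass": "3", "count": 1},
    ]
    print(available_count(pools, {"e.QAclass": "1"}))      # flavor QA1
    print(available_count(pools, {"e.QAclass": "(1|3)"}))  # flavor QA13
```

With the three encryption card pools, the QA1 spec selects one device and the QA13 spec selects two, since QA13 overlaps QA1 on the class 1 pool.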

Common PCI SRIOV Configuration detail

Compute host

    pci_information = [ {pci-regex},{pci-extra-attrs} ]
    pci_flavor_attrs=attr,attr,attr

For instance, when using device and vendor ID this would read:

    pci_flavor_attrs=device_id,vendor_id

When the back end adds an arbitrary ‘group’ attribute to all PCI devices:

    pci_flavor_attrs=e.group

When you wish to find an appropriate device and perhaps also filter by the connection tagged on that device, which you specify with an extra-info attribute in the compute node config:

    pci_flavor_attrs=device_id,vendor_id,e.connection


flavor API

  • overall

    nova pci-flavor-list
    nova pci-flavor-show name|UUID <name|UUID>
    nova pci-flavor-create name|UUID <name|UUID> description <desc>
    nova pci-flavor-update name|UUID <name|UUID> set 'description'='xxxx' 'e.group'= 'A'
    nova pci-flavor-delete <name|UUID> name|UUID


  • list available pci flavors (white list)
   nova pci-flavor-list 
   GET v2/{tenant_id}/os-pci-flavors
   data:
    os-pci-flavors: [ 
                           { 
                               'UUID':'xxxx-xx-xx', 
                               'description':'xxxx', 
                               'vendor_id':'8086', 
                                 ....
                               'name':'xxx'
                           },
    ]


  • get detailed information about one pci-flavor:
    nova pci-flavor-show  <UUID|name>
    GET v2/{tenant_id}/os-pci-flavor/<UUID|name>
    data:
       os-pci-flavor: { 
                               'UUID':'xxxx-xx-xx', 
                               'description':'xxxx', 
                                  ....
                               'name':'xxx'
         } 
  • create a pci flavor
 nova pci-flavor-create  name 'GetMePowerfulldevice'  description "xxxxx"
 API:
 POST  v2/{tenant_id}/os-pci-flavors
 data: 
     pci-flavor: { 
            'name':'GetMePowerfulldevice',
            'description': "xxxxx" 
     }
 action:  create a database entry for this flavor.


  • update the pci flavor
    nova pci-flavor-update UUID  set    'description'='xxxx'   'e.group'= 'A'
    PUT v2/{tenant_id}/os-pci-flavors/<UUID>
    with data:
        { 'action': "update", 
          'pci-flavor':
                         { 
                            'description':'xxxx',
                            'vendor_id': '8086',
                            'e.group': 'A',
                             ....
                         }
        }
   action: set this as the new definition of the pci flavor.
  • delete a pci flavor
  nova pci-flavor-delete <UUID>
  DELETE v2/{tenant_id}/os-pci-flavor/<UUID>

Current PCI implementation gaps

Concepts introduced here:

spec: a filter defined by (k,v) pairs whose k is among the PCI object fields; that is, each (k,v) is a PCI device property such as vendor_id, 'address', pci-type, etc.

extra_spec: a filter defined by (k,v) pairs whose k is not among the PCI object fields.


pci utils support extra property

      * pci utils (k,v) matching supports reduced regular expressions for the address
      * utils provide a global extract interface to extract the base properties and extra properties of a PCI device.
      * extra information should also use the naming schema 'e.name'

PCI information(extended the white-list) support extra tag

      * PCI information supports reduced regular expression comparison to match PCI devices 
      * PCI information supports storing any other (k,v) pairs as the PCI device's extra info
      * the k and v of any extra tag are strings.

pci_flavor_attrs

      * implement the attrs parser, and update the attrs to the flavor database

support pci-flavor

      * pci-flavors are stored in the DB
      * pci-flavors are configured via the API
      * the PCI manager uses the extract method to extract the specs and extra_specs, and matches them against the PCI object & object.extra_info.

PCI scheduler: PCI filter

   When scheduling, the matcher should apply the regular expressions stored in the named flavor, which is read from the DB. 


convert pci flavor from SRIOV

In the API stage, the network parser should convert the PCI flavor in the --nic option into PCI requests and save them in the instance metadata.

1. translate the pci flavor spec to requests

   *translate_pci_flavor_to_requests(flavor_spec)
   * the input flavor_spec is a list of pci flavors and requested numbers:  "flavor_name:number, flavor_name2:number2,...". If number is 1, it can be omitted.
   * the output is pci_request, an internal data structure

2. save the requests to the instance metadata:

    *update_pci_request_to_metadata(metadata, new_requests, prefix='')
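
The two steps could look roughly like the sketch below. Only the two function names and the input syntax come from the text. Everything else is an assumption for illustration: the extra flavors argument (a name-to-spec mapping standing in for the flavor DB), the request dict layout (borrowed from the PCI request section), and the metadata key prefix + 'pci_requests' with a JSON-encoded value.

```python
import json


def translate_pci_flavor_to_requests(flavor_spec, flavors):
    """Turn 'flavor_name:number, flavor_name2, ...' into a list of request dicts.

    flavors is a hypothetical {flavor_name: {key: value}} mapping standing in
    for a flavor DB lookup. A missing number means 1.
    """
    requests = []
    for item in flavor_spec.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, number = item.partition(":")
        name = name.strip()
        count = int(number) if sep else 1
        requests.append({"count": count, "spec": [flavors[name]], "alias_name": name})
    return requests


def update_pci_request_to_metadata(metadata, new_requests, prefix=""):
    """Append requests to the JSON list kept under an assumed metadata key."""
    key = prefix + "pci_requests"
    existing = json.loads(metadata.get(key, "[]"))
    metadata[key] = json.dumps(existing + new_requests)
    return metadata


if __name__ == "__main__":
    flavors = {"phyX_NIC": {"e.physical_network": "X"}}
    reqs = translate_pci_flavor_to_requests("phyX_NIC, phyX_NIC:2", flavors)
    print(update_pci_request_to_metadata({}, reqs))
```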

find specific devices of an instance based on request

To boot a VM with PCI SRIOV devices, more configuration may be needed for the PCI device than just adding a PCI host device. To achieve this, common SRIOV needs an interface to query the devices allocated to a specific usage, such as the SRIOV network.

3 steps to achieve this:

   * Mark each PCI request with a UUID, so that PCI requests are distinguishable.
   * remember the devices allocated to this request.
   * provide an interface function to get these devices from an instance.
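
The three steps above can be sketched as follows. This is a simplified illustration rather than the Nova implementation: the request_id field, the in-memory device dicts, and all function names are assumptions.

```python
import uuid


def new_pci_request(count, spec):
    """Step 1: give every request its own UUID so it can be told apart later."""
    return {"request_id": str(uuid.uuid4()), "count": count, "spec": spec}


def allocate(request, free_devices):
    """Step 2: claim devices and remember which request they were allocated for."""
    if len(free_devices) < request["count"]:
        raise ValueError("not enough free PCI devices for request")
    claimed = [free_devices.pop(0) for _ in range(request["count"])]
    for device in claimed:
        device["request_id"] = request["request_id"]
    return claimed


def get_devices_by_request(instance_devices, request_id):
    """Step 3: look up the devices of an instance that serve a given request."""
    return [dev for dev in instance_devices if dev.get("request_id") == request_id]


if __name__ == "__main__":
    free = [{"address": "0000:01:10.0"}, {"address": "0000:01:10.1"}]
    nic_request = new_pci_request(1, [{"e.physical_network": "X"}])
    instance_devices = allocate(nic_request, free)
    print(get_devices_by_request(instance_devices, nic_request["request_id"]))
```

A consumer such as the SRIOV network code would keep the request UUID it created and later use it to find the exact devices it needs to configure.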

DB for PCI flavor

Each pci flavor is a set of (k,v) pairs, and the (k,v) pairs are stored in the DB. Both k and v are strings; the value can be a simple regular expression that supports wildcard and address range operators.


Table: pci_flavor

 {
              UUID:  which pci-flavor the (k,v) belongs to
              name: the pci flavor's name; this field is needed to index the DB by flavor name
              key 
              value (might be a simple string value or a reduced regular expression)
  }

DB interface:

        get_pci_flavor_by_name
        get_pci_flavor_by_UUID
        get_pci_flavor_all
        update_pci_flavor_by_name
        update_pci_flavor_by_UUID
        delete_pci_flavor_by_name
        delete_pci_flavor_by_UUID
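
To make the one-row-per-(k,v) layout concrete, here is a small sketch using an in-memory SQLite database. The column types, the primary key, and the function signatures are assumptions; only the table name, the four fields, and the by-name function names come from the text, and Nova's own DB layer is not shown.

```python
import sqlite3


def create_pci_flavor_table(conn):
    # One row per (k, v) pair; the UUID ties the rows of one flavor together.
    conn.execute(
        'CREATE TABLE pci_flavor ('
        ' uuid TEXT NOT NULL, name TEXT NOT NULL,'
        ' "key" TEXT NOT NULL, value TEXT NOT NULL,'
        ' PRIMARY KEY (uuid, "key"))'
    )


def update_pci_flavor_by_name(conn, flavor_uuid, name, specs):
    """Replace the stored (k, v) pairs of the named flavor with specs."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM pci_flavor WHERE name = ?", (name,))
        conn.executemany(
            'INSERT INTO pci_flavor (uuid, name, "key", value) VALUES (?, ?, ?, ?)',
            [(flavor_uuid, name, k, v) for k, v in specs.items()],
        )


def get_pci_flavor_by_name(conn, name):
    """Return the flavor's (k, v) pairs as a dict (empty if the flavor is unknown)."""
    rows = conn.execute(
        'SELECT "key", value FROM pci_flavor WHERE name = ?', (name,)
    ).fetchall()
    return dict(rows)


def delete_pci_flavor_by_name(conn, name):
    with conn:
        conn.execute("DELETE FROM pci_flavor WHERE name = ?", (name,))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_pci_flavor_table(conn)
    update_pci_flavor_by_name(conn, "uuid-1", "bigGPU", {"vendor_id": "8086", "product_id": "000[1-2]"})
    print(get_pci_flavor_by_name(conn, "bigGPU"))
```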

Transition from config file to API

   *the config file for alias and white-list definitions is going to be deprecated.
   *the new config pci_information will replace white-list 
   *pci flavor will replace alias
   *the white-list/alias schema still works, with a deprecation notice, and will be removed starting from the next release.
   *if pci_flavor_attrs is not defined, it defaults to vendor_id, product_id and extra_info. This keeps compatibility with the old system.