StarlingX/Containers/Applications/app-intel-device-plugins
Application: app-intel-device-plugins
Source
Building
- From the Debian Build environment:
Build dependent packages:
build-pkgs -p helm build-info
Build helm chart packages and python plugin:
build-pkgs -p python3-k8sapp-intel-device-plugins-operator intel-device-plugins-dsa-helm intel-device-plugins-gpu-helm intel-device-plugins-qat-helm intel-device-plugins-operator-helm intel-device-plugins-secret-observer-helm
Build final helm application:
build-pkgs -p stx-intel-device-plugins-operator-helm
Testing
- Upload and apply the node-feature-discovery app
- Upload and apply the intel-device-plugins-operator app
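The two steps above can be sketched with the `system application-*` commands. This is only an illustration: the helper name `deploy_app` and the tarball paths are hypothetical, the app name `node-feature-discovery` is assumed from the step text, and the status check assumes the default table output of `system application-show`.

```shell
# Hypothetical helper: upload an application tarball, wait until the upload
# has finished, then apply the application.
# Assumes the keystone_admin credentials have already been sourced.
deploy_app() {
    local tarball="$1" app="$2" status=""
    system application-upload "$tarball" || return 1
    # Poll the status row of the table output, e.g. "| status | uploaded |".
    for _ in $(seq 1 30); do
        status=$(system application-show "$app" | awk '$2 == "status" {print $4}')
        [ "$status" = "uploaded" ] && break
        sleep "${POLL_INTERVAL:-10}"
    done
    if [ "$status" != "uploaded" ]; then
        echo "upload of $app did not complete (status: $status)" >&2
        return 1
    fi
    system application-apply "$app"
}

# Usage (paths are placeholders):
#   deploy_app /path/to/node-feature-discovery.tgz node-feature-discovery
#   deploy_app /path/to/intel-device-plugins-operator.tgz intel-device-plugins-operator
```

The apply step is asynchronous as well; `system application-list` shows when each app reaches the applied state.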
Testing DSA device plugin
- Enable DSA device plugin helm chart:
system helm-chart-attribute-modify --enabled true intel-device-plugins-operator intel-device-plugins-dsa intel-device-plugins-operator
- Apply intel-device-plugins-operator again
- Confirm that DSA resources are available:
[sysadmin@controller-0 ~(keystone_admin)]$ kubectl get nodes -o go-template='{{range .items}}{{.metadata.name}}{{"\n"}}{{range $k,$v:=.status.allocatable}}{{" "}}{{$k}}{{": "}}{{$v}}{{"\n"}}{{end}}{{end}}' | grep '^\([^ ]\)\|\( dsa\)'
controller-0
 dsa.intel.com/wq-user-shared: 40
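For scripted checks, the same allocatable value can be read for a single node with a jsonpath query. A minimal sketch; the helper name `dsa_allocatable` is hypothetical, and the resource name defaults to the `wq-user-shared` resource shown above.

```shell
# Hypothetical helper: print the allocatable count of a DSA resource on a node.
# Prints nothing if the node does not advertise the resource.
dsa_allocatable() {
    local node="$1" resource="${2:-wq-user-shared}"
    # Dots inside the resource key must be escaped in kubectl jsonpath.
    kubectl get node "$node" \
        -o jsonpath="{.status.allocatable.dsa\.intel\.com/${resource}}"
}

# Usage:
#   count=$(dsa_allocatable controller-0)
#   [ "${count:-0}" -gt 0 ] && echo "DSA shared work queues available: $count"
```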
The plugin can be tested by deploying a pod using the VRAN tools image:
apiVersion: v1
kind: Pod
metadata:
  name: dsa-accel-config-demo
  labels:
    app: dsa-accel-config-demo
spec:
  containers:
  - name: dsa-accel-config-demo
    image: registry.local:9001/docker.io/starlingx/stx-debian-tools-dev:stx.10.0-v1.0.0
    imagePullPolicy: "Always"
    workingDir: "/usr/libexec/accel-config/test/"
    command:
    - "./dsa_user_test_runner.sh"
    args:
    - "--skip-config"
    resources:
      limits:
        dsa.intel.com/wq-user-shared: 1
  restartPolicy: Never
  imagePullSecrets:
  - name: default-registry-key
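One way to launch the pod and wait for the test run to finish is sketched below. It assumes the manifest above was saved to a file (the file name `dsa-accel-config-demo.yaml` and the helper name `run_dsa_demo` are hypothetical) and that kubectl is v1.23 or newer, which is required for `--for=jsonpath`.

```shell
# Hypothetical helper: create the demo pod and block until it has completed.
run_dsa_demo() {
    local manifest="${1:-dsa-accel-config-demo.yaml}" timeout="${2:-300s}"
    kubectl apply -f "$manifest" || return 1
    # The pod uses restartPolicy: Never, so a passing run ends in phase Succeeded.
    kubectl wait --for=jsonpath='{.status.phase}'=Succeeded \
        pod/dsa-accel-config-demo --timeout="$timeout"
}

# Usage:
#   run_dsa_demo dsa-accel-config-demo.yaml 300s
```

If the wait times out, the pod is either still Pending or the test failed; both cases can be told apart with `kubectl get pods`.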
Review the pod's log:
$ kubectl logs dsa-accel-config-demo | tail
[debug] PF in sub-task[6], consider as passed
[debug] PF in sub-task[7], consider as passed
[debug] PF in sub-task[8], consider as passed
[debug] PF in sub-task[9], consider as passed
[debug] PF in sub-task[10], consider as passed
[debug] PF in sub-task[11], consider as passed
[debug] PF in sub-task[12], consider as passed
[debug] PF in sub-task[13], consider as passed
[debug] PF in sub-task[14], consider as passed
[debug] PF in sub-task[15], consider as passed
If the pod did not successfully launch, possibly because it could not obtain the DSA resource, it will be stuck in the Pending status:
$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
dsa-accel-config-demo   0/1     Pending   0          7s
This can be verified by checking the Events of the pod:
$ kubectl describe pod dsa-accel-config-demo | grep -A3 Events:
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  2m26s  default-scheduler  0/1 nodes are available: 1 Insufficient dsa.intel.com/wq-user-dedicated, 1 Insufficient dsa.intel.com/wq-user-shared.
Customize the configuration
The default configuration uses shared queues on the controller-0 node and dedicated queues on the remaining nodes. A node-specific configuration can be provided by naming the config dsa-<node-name>.conf. The default config is as follows:
dsa.conf: |
  [
    {
      "dev":"dsaX",
      "read_buffer_limit":0,
      "groups":[
        {
          "dev":"groupX.0",
          "read_buffers_reserved":0,
          "use_read_buffer_limit":0,
          "read_buffers_allowed":8,
          "grouped_workqueues":[
            {
              "dev":"wqX.0",
              "mode":"dedicated",
              "size":16,
              "group_id":0,
              "priority":10,
              "block_on_fault":1,
              "type":"user",
              "name":"dpdk_appX0",
              "driver_name":"user",
              "threshold":15
            }
          ],
          "grouped_engines":[
            {
              "dev":"engineX.0",
              "group_id":0
            },
          ]
        },
        {
          "dev":"groupX.1",
          "read_buffers_reserved":0,
          "use_read_buffer_limit":0,
          "read_buffers_allowed":8,
          "grouped_workqueues":[
            {
              "dev":"wqX.1",
              "mode":"dedicated",
              "size":16,
              "group_id":1,
              "priority":10,
              "block_on_fault":1,
              "type":"user",
              "name":"dpdk_appX1",
              "driver_name":"user",
              "threshold":15
            }
          ],
          "grouped_engines":[
            {
              "dev":"engineX.1",
              "group_id":1
            },
          ]
        },
        {
          "dev":"groupX.2",
          "read_buffers_reserved":0,
          "use_read_buffer_limit":0,
          "read_buffers_allowed":8,
          "grouped_workqueues":[
            {
              "dev":"wqX.2",
              "mode":"dedicated",
              "size":16,
              "group_id":2,
              "priority":10,
              "block_on_fault":1,
              "type":"user",
              "name":"dpdk_appX2",
              "driver_name":"user",
              "threshold":15
            }
          ],
          "grouped_engines":[
            {
              "dev":"engineX.2",
              "group_id":2
            },
          ]
        },
        {
          "dev":"groupX.3",
          "read_buffers_reserved":0,
          "use_read_buffer_limit":0,
          "read_buffers_allowed":8,
          "grouped_workqueues":[
            {
              "dev":"wqX.3",
              "mode":"dedicated",
              "size":16,
              "group_id":3,
              "priority":10,
              "block_on_fault":1,
              "type":"user",
              "name":"dpdk_appX3",
              "driver_name":"user",
              "threshold":15
            }
          ],
          "grouped_engines":[
            {
              "dev":"engineX.3",
              "group_id":3
            },
          ]
        },
      ]
    }
  ]
dsa-controller-0.conf: |
  [
    {
      "dev":"dsaX",
      "read_buffer_limit":0,
      "groups":[
        {
          "dev":"groupX.0",
          "read_buffers_reserved":0,
          "use_read_buffer_limit":0,
          "read_buffers_allowed":8,
          "grouped_workqueues":[
            {
              "dev":"wqX.0",
              "mode":"shared",
              "size":16,
              "group_id":0,
              "priority":10,
              "block_on_fault":1,
              "type":"user",
              "name":"dpdk_appX0",
              "threshold":15
            }
          ],
          "grouped_engines":[
            {
              "dev":"engineX.0",
              "group_id":0
            },
          ]
        },
        {
          "dev":"groupX.1",
          "read_buffers_reserved":0,
          "use_read_buffer_limit":0,
          "read_buffers_allowed":8,
          "grouped_workqueues":[
            {
              "dev":"wqX.1",
              "mode":"shared",
              "size":16,
              "group_id":1,
              "priority":10,
              "block_on_fault":1,
              "type":"user",
              "name":"dpdk_appX1",
              "driver_name":"user",
              "threshold":15
            }
          ],
          "grouped_engines":[
            {
              "dev":"engineX.1",
              "group_id":1
            },
          ]
        },
        {
          "dev":"groupX.2",
          "read_buffers_reserved":0,
          "use_read_buffer_limit":0,
          "read_buffers_allowed":8,
          "grouped_workqueues":[
            {
              "dev":"wqX.2",
              "mode":"shared",
              "size":16,
              "group_id":2,
              "priority":10,
              "block_on_fault":1,
              "type":"user",
              "name":"dpdk_appX2",
              "threshold":15
            }
          ],
          "grouped_engines":[
            {
              "dev":"engineX.2",
              "group_id":2
            },
          ]
        },
        {
          "dev":"groupX.3",
          "read_buffers_reserved":0,
          "use_read_buffer_limit":0,
          "read_buffers_allowed":8,
          "grouped_workqueues":[
            {
              "dev":"wqX.3",
              "mode":"shared",
              "size":16,
              "group_id":3,
              "priority":10,
              "block_on_fault":1,
              "type":"user",
              "name":"dpdk_appX3",
              "driver_name":"user",
              "threshold":15
            }
          ],
          "grouped_engines":[
            {
              "dev":"engineX.3",
              "group_id":3
            },
          ]
        },
      ]
    }
  ]
This configuration is based on the upstream default configuration file: https://github.com/intel/intel-device-plugins-for-kubernetes/blob/main/deployments/dsa_plugin/overlays/dsa_initcontainer/dsa-config.yaml
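The node-specific naming rule amounts to a lookup with a fallback: use dsa-<node-name>.conf if it exists, otherwise dsa.conf. The sketch below only illustrates that rule and is not the plugin's actual implementation; the helper name `select_dsa_conf` and the assumption that the configs are available as files in one directory are hypothetical.

```shell
# Hypothetical illustration of the per-node config lookup.
# Prints the path of the config that would apply to the given node.
select_dsa_conf() {
    local conf_dir="$1" node="$2"
    if [ -f "${conf_dir}/dsa-${node}.conf" ]; then
        # A node-specific config takes precedence.
        echo "${conf_dir}/dsa-${node}.conf"
    else
        # All other nodes fall back to the default config.
        echo "${conf_dir}/dsa.conf"
    fi
}

# Usage (directory is a placeholder):
#   select_dsa_conf /path/to/configs controller-0
```

With the defaults shown above, controller-0 resolves to dsa-controller-0.conf (shared queues) and every other node resolves to dsa.conf (dedicated queues).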
The DSA device configuration can be customized via application overrides. For instance, the following config uses dedicated queues for all nodes:
overrideConfig:
  dsa.conf: |
    [
      {
        "dev":"dsaX",
        "read_buffer_limit":0,
        "groups":[
          {
            "dev":"groupX.0",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.0",
                "mode":"dedicated",
                "size":16,
                "group_id":0,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"dpdk_appX0",
                "driver_name":"user",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.0",
                "group_id":0
              },
            ]
          },
          {
            "dev":"groupX.1",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.1",
                "mode":"dedicated",
                "size":16,
                "group_id":1,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"dpdk_appX1",
                "driver_name":"user",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.1",
                "group_id":1
              },
            ]
          },
          {
            "dev":"groupX.2",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.2",
                "mode":"dedicated",
                "size":16,
                "group_id":2,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"dpdk_appX2",
                "driver_name":"user",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.2",
                "group_id":2
              },
            ]
          },
          {
            "dev":"groupX.3",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.3",
                "mode":"dedicated",
                "size":16,
                "group_id":3,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"dpdk_appX3",
                "driver_name":"user",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.3",
                "group_id":3
              },
            ]
          },
        ]
      }
    ]
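Before using an override file, it can help to confirm which work queue modes it actually sets. A minimal sketch using standard text tools; the helper name `count_wq_modes` is hypothetical, and it relies on the `"mode":"…"` spelling used in the configs above (no spaces around the colon).

```shell
# Hypothetical helper: count work queue modes in a DSA config or override file.
count_wq_modes() {
    # Extract every "mode":"<value>" pair, keep the value, and tally it.
    grep -o '"mode":"[a-z]*"' "$1" \
        | cut -d'"' -f4 \
        | sort | uniq -c \
        | awk '{print $2 ": " $1}'
}

# Usage (file name is a placeholder):
#   count_wq_modes my-dsa-override.yaml
```

For the override above, which defines four dedicated work queues, the expected result is a single line reporting four dedicated queues and no shared ones.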
The custom config can be applied with:
$ system helm-override-update intel-device-plugins-operator intel-device-plugins-dsa intel-device-plugins-operator --values <your-override-file>.yaml
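The full sequence of updating the override, reviewing it and re-applying the application can be sketched as below. The helper name `apply_dsa_override` is hypothetical; the app, chart and namespace names are taken from the command above.

```shell
# Hypothetical helper: store a DSA override file and re-apply the application.
apply_dsa_override() {
    local values="$1"
    local app=intel-device-plugins-operator
    local chart=intel-device-plugins-dsa
    local ns=intel-device-plugins-operator
    system helm-override-update "$app" "$chart" "$ns" --values "$values" || return 1
    # Show the stored overrides so the change can be reviewed.
    system helm-override-show "$app" "$chart" "$ns" || return 1
    # Overrides take effect when the application is applied again.
    system application-apply "$app"
}

# Usage (file name is a placeholder):
#   apply_dsa_override my-dsa-override.yaml
```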
Testing QAT device plugin
The host must have Intel QAT hardware. Installation and testing steps are mentioned here. After installation, verify that the Intel QAT plugin pods are running on each host where application pods can be scheduled to consume QAT resources.
Testing GPU device plugin
The host must have Intel GPU hardware. Installation and testing steps are mentioned here. After installation, verify that the Intel GPU plugin pods are running on each host where application pods can be scheduled to consume GPU resources.