Cannot mount /dev/dri in Pod: invalid volume specification #1612

Closed
marhav20 opened this issue Dec 6, 2023 · 2 comments

marhav20 commented Dec 6, 2023

The /dev/dri host directory needs to be mounted into a pod so that the integrated GPU can be used from within the pod, e.g. via the onnxruntime OpenVINOExecutionProvider. Attempting the mount fails with an "invalid volume specification" error reported by kubelet. The reason is that the i915 driver creates files with a ":" in their names, e.g. /dev/dri/by-path/pci-0000:00:02.0-card.

It is possible to delete the by-path subdirectory without any obvious impact, but after a node reboot the i915 driver re-creates the directory.

To reproduce:
Kubernetes v1.26.2
cri-docker 0.3.1
docker 24.0.5

Example manifest:

apiVersion: v1
kind: Pod
metadata:
  name: test-dri-mount
spec:
  containers:
  - name: test-dri-mount
    image: openvino/onnxruntime_ep_ubuntu20:2023.1.0
    imagePullPolicy: Always
    resources:
      limits:
        gpu.intel.com/i915: 1
    stdin: true
    tty: true
    volumeMounts:
    - mountPath: /dev/dri
      name: dri-device
  volumes:
  - name: dri-device
    hostPath:
      path: /dev/dri

Error reported with kubectl describe pod test-dri-mount:

Events:
  Type     Reason   Age                            From     Message
  ----     ------   ----                           ----     -------
  ...
  Warning  Failed   <invalid> (x3 over <invalid>)  kubelet  Error: Error response from daemon: invalid volume specification: '/dev/dri/by-path/pci-0000:00:02.0-card:/dev/dri/by-path/pci-0000:00:02.0-card:ro'
  ...
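
To illustrate why the path in the error above is a problem, here is a minimal Python sketch. It assumes a colon-delimited `source:destination:mode` volume specification, which is the shape of the string in the error message. The helper `split_volume_spec` is hypothetical and is not Docker's actual parser; it only shows that extra colons in the path make the field boundaries ambiguous.

```python
def split_volume_spec(spec: str) -> list[str]:
    """Naively split a 'source:destination:mode' string on ':'.

    Illustration only: a hypothetical helper, not Docker's real parsing code.
    """
    return spec.split(":")


# Spec taken verbatim from the kubelet error message.
by_path_spec = (
    "/dev/dri/by-path/pci-0000:00:02.0-card"
    ":/dev/dri/by-path/pci-0000:00:02.0-card"
    ":ro"
)

# A spec for a device node without colons in its name, for comparison.
plain_spec = "/dev/dri/card0:/dev/dri/card0:ro"

print(len(split_volume_spec(plain_spec)))    # 3 fields: source, destination, mode
print(len(split_volume_spec(by_path_spec)))  # 7 fields: boundaries are ambiguous
```

With the colon-free path the split yields the expected three fields, while the by-path name yields seven, so a parser working on this format cannot tell where the source ends and the destination begins.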

Directory /dev/dri:

├── by-path
│   ├── pci-0000:00:02.0-card -> ../card0
│   └── pci-0000:00:02.0-render -> ../renderD128
├── card0
└── renderD128

tkatila (Contributor) commented Dec 7, 2023

Hi @marhav20, are you using dockerd as the CRI? If you are, you have two options:

  1. Downgrade the GPU plugin to 0.26.0. From a functional point of view, you won't lose anything.
  2. Move from dockerd to containerd or cri-o. The mount works fine with those.

There is an existing issue about this: #1564
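
As a rough aid for answering the runtime question, the node's `status.nodeInfo.containerRuntimeVersion` field (shown in the CONTAINER-RUNTIME column of `kubectl get nodes -o wide`) reports values such as `docker://24.0.5`. The sketch below is only an assumption-laden helper: `runtime_name` and `uses_dockerd` are hypothetical names, and it assumes the value has the `name://version` shape.

```python
def runtime_name(container_runtime_version: str) -> str:
    """Return the runtime name from a value like 'docker://24.0.5'.

    Hypothetical helper; returns an empty string if the '://' separator is missing.
    """
    name, sep, _version = container_runtime_version.partition("://")
    return name if sep else ""


def uses_dockerd(container_runtime_version: str) -> bool:
    """True when the node reports docker as its container runtime."""
    return runtime_name(container_runtime_version) == "docker"


print(uses_dockerd("docker://24.0.5"))     # runtime from the report above
print(uses_dockerd("containerd://1.7.2"))  # example value, version is made up
```

If this returns True for a node, the two options listed above apply to it.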

tkatila (Contributor) commented Dec 8, 2023

Duplicate of #1564

tkatila closed this as not planned on Dec 8, 2023