Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable CDI in containerd configuration #2100

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dims
Copy link
Member

@dims dims commented Dec 26, 2024

enable_cdi = true is set to true in containerd 2.0:
containerd/containerd@c8e8a09

Some DRA drivers are using CDI:
https://github.com/search?q=repo%3ANVIDIA%2Fk8s-dra-driver%20cdi&type=code

A bunch of CI jobs in upstream are now enabling CDI:
https://cs.k8s.io/?q=enable_cdi&i=nope&files=&excludeFiles=&repos=

So let's switch it on in our images as well please.

@dims dims force-pushed the enable-cdi-in-containerd-configuration branch from d3ecb51 to 5657448 Compare December 26, 2024 03:27
@dims dims force-pushed the enable-cdi-in-containerd-configuration branch from 5657448 to 567ee65 Compare December 26, 2024 04:06
@dims
Copy link
Member Author

dims commented Jan 6, 2025

@cartermckinnon does this look ok to you?

@cartermckinnon
Copy link
Member

@dims sorry, catching up on this one. I don't know in detail what this option enables, and I'm hesitant to switch it on in 1.7.x until the early adopters bake it a bit. IIUC this won't be functional for DRA on EKS anyway until that feature is default on (in 1.33, TBD)?

@dims
Copy link
Member Author

dims commented Jan 16, 2025

@cartermckinnon you can read about it here - https://github.com/cncf-tags/container-device-interface

yes 1.33 is targetted for DRA/Beta to be enabled by default.

@cartermckinnon
Copy link
Member

Ah thanks! I'm mostly concerned about what is toggled in the containerd internals, e.g. what if the cdi_spec_dirs don't exist/can't be created or can't be watched? https://github.com/containerd/containerd/blob/2ab62acd6379cda100b5af7e3e0b65877359c8b1/internal/cri/server/service_linux.go#L107

It looks like those errors are not currently surfaced here, but that's an impl detail: https://github.com/cncf-tags/container-device-interface/blob/4149a8a524dfe96686d97789833cc78f61eb6631/pkg/cdi/cache.go#L102

This part looks safe, if the CRI spec doesn't have CDI devices or annotations, it's a no-op: https://github.com/containerd/containerd/blob/2ab62acd6379cda100b5af7e3e0b65877359c8b1/internal/cri/server/container_create_linux.go#L124

I'm on board with enabling it for 1.32 for now, since we'll need it in 1.33 for DRA

@dims
Copy link
Member Author

dims commented Jan 17, 2025

This part looks safe, if the CRI spec doesn't have CDI devices or annotations, it's a no-op:

yep!

@austinvazquez sounds about right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants