Partitioned NodePool Multi-node Consolidation #853
Comments
We've talked about this a little bit for deprovisioning improvements for the exact reason that you called out: if we order candidates without considering NodePool boundaries, we can get stuck when those boundaries are independent of each other. There's potential to perhaps try another form of multi-node consolidation where we perform [...]
Also, I'm curious: are you still seeing single-node consolidation? I would expect that if we were failing to multi-node consolidate, we would still attempt single-node consolidation if there are nodes available.
I remember this being an issue on an older version, but I'm not seeing it getting blocked on single-node consolidation anymore. I think before it would just short-circuit if there were any Pods in pending state.
But what if there are 2+ NodePools in each "partition"? For example, with a different [...]
Has introducing the concept of NodePool groups as an explicit API ever been discussed? Something like having a [...]
I think #488 is a similar idea and would also be a performance improvement.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues.
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle rotten
I've been looking at multi-node consolidation and independently arrived at the same findings. Consolidation runs across all NodePools, which is not ideal if you have very different NodePool configurations in your cluster. If the NodePools are similar, I could see this being a positive, but that is not how our clusters are designed.
There is one other factor that we found to be equally significant, if not more so: multi-node consolidation also does not take node architecture into account. So an amd64 node may be consolidated with an arm node. In theory this could succeed if the workloads were all multi-arch compatible, but that is not the case in our workload clusters, so this consolidation always fails.
The combination of NodePool mixing and architecture mixing means multi-node consolidation effectively never finds a successful simulation in our clusters.
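To make the architecture point concrete, here is a minimal sketch in Go that checks whether a batch of candidate nodes mixes architectures. The `Node` type and the `Architectures` helper are hypothetical and are not Karpenter's API; the only real identifier is the well-known `kubernetes.io/arch` node label, and the label values are example data.

```go
package main

import (
	"fmt"
	"sort"
)

// archLabel is the well-known Kubernetes node label that carries the CPU architecture.
const archLabel = "kubernetes.io/arch"

// Node is a hypothetical, simplified stand-in for a candidate node.
type Node struct {
	Name   string
	Labels map[string]string
}

// Architectures returns the sorted, distinct architecture values found on the nodes.
// Nodes without the architecture label are ignored.
func Architectures(nodes []Node) []string {
	seen := map[string]bool{}
	for _, n := range nodes {
		if a, ok := n.Labels[archLabel]; ok {
			seen[a] = true
		}
	}
	archs := make([]string, 0, len(seen))
	for a := range seen {
		archs = append(archs, a)
	}
	sort.Strings(archs)
	return archs
}

func main() {
	// Example batch that mixes two architectures.
	batch := []Node{
		{Name: "node-1", Labels: map[string]string{archLabel: "amd64"}},
		{Name: "node-2", Labels: map[string]string{archLabel: "arm64"}},
		{Name: "node-3", Labels: map[string]string{archLabel: "amd64"}},
	}
	archs := Architectures(batch)
	fmt.Println(archs, "mixed:", len(archs) > 1)
}
```

A batch for which this reports more than one architecture is the kind that, per the comment above, can only consolidate if every workload on it is multi-arch compatible.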
/remove-lifecycle rotten
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Description
Observed Behavior:
I have a few NodePools that I'm using in a "partitioned" manner. Basically, each NodePool is made independent using user-defined requirements and taints, and Pods in different namespaces use different NodePools.
This works fine for the most part, but I'm observing issues with multi-node consolidation.
As far as I can tell, multi-node consolidation looks at all deprovisionable Nodes together:
https://github.com/kubernetes-sigs/karpenter/blob/cc54b340f630b46a26d19a3cbd49d90c8b3a6d45/pkg/controllers/disruption/multinodeconsolidation.go#L44C42-L44C42
I think this means that no multi-node consolidation is happening (or that it is sub-optimal at best). Shouldn't this be done independently on groups of compatible NodePools?
Another place where I think this is a problem is the scheduling simulation, which looks at all Pending Pods:
https://github.com/kubernetes-sigs/karpenter/blob/cc54b340f630b46a26d19a3cbd49d90c8b3a6d45/pkg/controllers/disruption/helpers.go#L97C35-L97C35
But if one of those Pending Pods is not compatible with the firstN candidates chosen from all Nodes, then the simulation will always complain about unschedulable Pods (which becomes highly likely as the number of nodes/partitions increases).
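As a toy model of the concern above (not Karpenter's actual scheduling simulation, which is far more involved), the following Go sketch shows how a single Pending Pod from another partition ends up flagged as unschedulable when the nodes under consideration all come from one partition. The `Node` and `Pod` types, the `UnschedulablePods` helper, and the `example.com/partition` label are all hypothetical.

```go
package main

import "fmt"

// Node and Pod are hypothetical, simplified types for illustration only.
type Node struct {
	Name   string
	Labels map[string]string
}

type Pod struct {
	Name         string
	NodeSelector map[string]string
}

// fits reports whether every key/value in the pod's node selector is present on the node.
func fits(p Pod, n Node) bool {
	for k, v := range p.NodeSelector {
		if n.Labels[k] != v {
			return false
		}
	}
	return true
}

// UnschedulablePods returns the names of pending pods that fit none of the given nodes.
func UnschedulablePods(pending []Pod, nodes []Node) []string {
	var out []string
	for _, p := range pending {
		placed := false
		for _, n := range nodes {
			if fits(p, n) {
				placed = true
				break
			}
		}
		if !placed {
			out = append(out, p.Name)
		}
	}
	return out
}

func main() {
	// The nodes considered here happen to come from partition "a" only.
	nodes := []Node{
		{Name: "node-1", Labels: map[string]string{"example.com/partition": "a"}},
		{Name: "node-2", Labels: map[string]string{"example.com/partition": "a"}},
	}
	// All Pending Pods in the cluster are included, regardless of partition.
	pending := []Pod{
		{Name: "pod-a", NodeSelector: map[string]string{"example.com/partition": "a"}},
		{Name: "pod-b", NodeSelector: map[string]string{"example.com/partition": "b"}},
	}
	fmt.Println(UnschedulablePods(pending, nodes))
}
```

In this model `pod-b` is reported as unschedulable even though it has nothing to do with partition "a", which is the situation described above.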
Expected Behavior:
NodePools should be consolidated in groups computed based on their requirements or based on some configurable partition key.
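A rough sketch of what grouping by a configurable partition key could look like, written in Go. Everything here is an assumption for illustration: the `Candidate` type, the `GroupByPartition` helper, the `example.com/partition` label key, and the fallback to the NodePool name are hypothetical and do not reflect Karpenter's real types or behavior.

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is a hypothetical stand-in for a consolidation candidate node.
type Candidate struct {
	Name     string
	NodePool string
	Labels   map[string]string
}

// GroupByPartition buckets candidates by the value of the given partition label key.
// Candidates that lack the label fall back to a group named after their NodePool.
func GroupByPartition(cands []Candidate, key string) map[string][]Candidate {
	groups := make(map[string][]Candidate)
	for _, c := range cands {
		partition, ok := c.Labels[key]
		if !ok {
			partition = "nodepool/" + c.NodePool
		}
		groups[partition] = append(groups[partition], c)
	}
	return groups
}

func main() {
	const key = "example.com/partition" // hypothetical partition label
	cands := []Candidate{
		{Name: "node-1", NodePool: "team-a-spot", Labels: map[string]string{key: "team-a"}},
		{Name: "node-2", NodePool: "team-a-on-demand", Labels: map[string]string{key: "team-a"}},
		{Name: "node-3", NodePool: "team-b", Labels: map[string]string{key: "team-b"}},
		{Name: "node-4", NodePool: "shared"},
	}
	groups := GroupByPartition(cands, key)

	// Sort group names so the output order is stable.
	names := make([]string, 0, len(groups))
	for name := range groups {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s: %d candidate(s)\n", name, len(groups[name]))
	}
}
```

Multi-node consolidation could then be attempted once per group instead of once over all candidates. Using a label key rather than the NodePool name lets several NodePools share a partition.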
Reproduction Steps (Please include YAML): See above
Versions:
Kubernetes Version (kubectl version): 1.27