Unexpected false Requeue in result.Min
#1889
Comments
Why should this change be made? If initialization is false, that means some condition on the Node has not yet been satisfied (e.g. missing resources, startup-taints present, etc). For those conditions to be satisfied, an update must be performed against the node (out-of-band of Karpenter) which will trigger the reconciler. Is there a case that this doesn't hold true for?
/triage needs-information
Yes. I'm using kwok as the cloud provider to benchmark Karpenter. When I scale up to 2000 NodeClaims at once, this issue occurs: the controller seems to miss the node readiness event, so the NodeClaim is never reconciled again and stays in the not-ready condition.
Interesting, are you able to reproduce this consistently or was it a one off? I'm wondering if there's some other condition which caused the Node to not be considered initialized or if we really are missing the event. If we're missing the event, we should figure out if it's due to the predicates in the registration function, or an actual issue in controller-runtime / client-go (less likely). I don't think we want to jump ahead and just add a requeue though until we understand the root cause.
It's reproducible. Here is my reproduction process:
Then up to 2000 nodes are created and reach Ready status as expected, but some NodeClaims stay not ready forever.
Description
Observed Behavior:
If a NodeClaim's Initialized condition is false and its Registered condition is true, all the items in results will have a zero RequeueAfter, and this function will return a result where Requeue is false and RequeueAfter is zero.
karpenter/pkg/utils/result/result.go
Lines 27 to 40 in 79fe772
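To make the observed behavior concrete, here is a minimal sketch of a "soonest requeue wins" merge. It is an assumption-laden approximation, not the code at the referenced lines: the Result type is a local stand-in for controller-runtime's reconcile.Result, and this Min is a hypothetical reimplementation written only for illustration. When every input is the zero value, the merged result also requests no requeue, which is the situation described above.

```go
package main

import (
	"fmt"
	"time"
)

// Result is a local stand-in (assumption) for controller-runtime's
// reconcile.Result, reduced to the two fields discussed in this issue.
type Result struct {
	Requeue      bool
	RequeueAfter time.Duration
}

// IsZero reports whether the result requests no requeue at all.
func (r Result) IsZero() bool {
	return !r.Requeue && r.RequeueAfter == 0
}

// Min is a hypothetical approximation of a merge that keeps the result
// asking to requeue the soonest. It is not copied from Karpenter.
func Min(results ...Result) Result {
	var soonest Result
	for _, r := range results {
		if r.IsZero() {
			continue // this sub-reconciler requested nothing
		}
		if r.Requeue && r.RequeueAfter == 0 {
			return r // an immediate requeue beats any delay
		}
		if soonest.IsZero() || r.RequeueAfter < soonest.RequeueAfter {
			soonest = r
		}
	}
	return soonest
}

func main() {
	// Registered is true and Initialized is false: in the scenario from
	// this issue, every sub-reconciler returns the zero value.
	merged := Min(Result{}, Result{}, Result{})
	fmt.Printf("Requeue=%v RequeueAfter=%v\n", merged.Requeue, merged.RequeueAfter)
	// prints "Requeue=false RequeueAfter=0s"
}
```

With a merge like this, the NodeClaim is reconciled again only when a watch event arrives, so a missed node readiness event would leave it not ready indefinitely.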
Expected Behavior:
Maybe we should set Requeue to true.
Reproduction Steps (Please include YAML):
Versions:
Kubernetes Version (kubectl version): 1.30.0