Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add minimumKubeletVersion #2059

Merged

Conversation

haircommander
Copy link
Member

No description provided.

Copy link
Contributor

openshift-ci bot commented Oct 9, 2024

Hello @haircommander! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci openshift-ci bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 9, 2024
@openshift-ci openshift-ci bot requested review from deads2k and sjenning October 9, 2024 17:06
haircommander added a commit to haircommander/kubernetes that referenced this pull request Oct 9, 2024
@openshift-ci openshift-ci bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 11, 2024
haircommander added a commit to haircommander/kubernetes that referenced this pull request Oct 17, 2024
haircommander added a commit to haircommander/kubernetes that referenced this pull request Oct 18, 2024
@haircommander
Copy link
Member Author

/retest

@haircommander
Copy link
Member Author

/retest

@openshift-ci openshift-ci bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 22, 2024
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 22, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 22, 2024
@haircommander haircommander force-pushed the min-kubelet-version branch 2 times, most recently from 905a012 to 6471c90 Compare October 24, 2024 14:51
@@ -62,6 +62,11 @@ type KubeAPIServerConfig struct {

// TODO this needs to be removed.
APIServerArguments map[string]Arguments `json:"apiServerArguments"`

// MinimumKubeletVersion is the lowest version of a kubelet that can meaningfully join the cluster.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/can meaningfully join/is allowed to join/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's allowed to join but it's blocked from doing anything meaningful in the cluster.

@haircommander
Copy link
Member Author

/retest

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 24, 2024
features/features.go Outdated Show resolved Hide resolved
config/v1/types_node.go Outdated Show resolved Hide resolved
config/v1/types_node.go Outdated Show resolved Hide resolved
// Specifically, the apiserver will deny (most) authorization requests of kubelets that are older
// than the specified version.
// +kubebuilder:validation:Pattern=`^[0-9]*\.[0-9]*\.[0-9]*$`
// +openshift:enable:FeatureGate=MinimumKubeletVersion
Copy link
Member

@enxebre enxebre Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this field is not marked as immutable and it's optional, what happens if I change it to a version value higher than existing kubelets?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an admission plugin in the kube-apiserver that denies such changes https://github.com/openshift/kubernetes/pull/2104/files#diff-5c437405eeba0789e9c42802e3f36cf6bdafd59d6d5c5dbaa6b66a2e02948bd7R100-R125
something similar will be used in hypershift, but set a condition on the hosted cluster object, rather than block on admission

Copy link
Member

@enxebre enxebre Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack. I think we should probably clarify somewhere that this validation on admission is by the nature of a distributed system only a best effort. Inflight operations might still result in older kubelet bypassing it if the right lucky timing takes place.

For hypershift yes the norm is signal via conditions. However for improving consumer UX we might also want to explore introducing VAP and let the current minimal kubelet be reported via vap parameter that is inferred from e.g. the HC.status.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I can mention it in the library-go PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but set a condition on the hosted cluster object, rather than block on admission

Why can't we block on admission?

config/v1/types_node.go Outdated Show resolved Hide resolved
config/v1/types_node.go Outdated Show resolved Hide resolved
@haircommander
Copy link
Member Author

@enxebre there's also openshift/library-go#1831 and openshift/cluster-kube-apiserver-operator#1754

the overarching epic is https://issues.redhat.com/browse/OCPNODE-2506. Note a number of the cards tracking these PRs are closed because they were originally scoped to drafts for the work until I got feedback and could size the time to respond

@haircommander
Copy link
Member Author

thanks @enxebre @JoelSpeed I've updated for your comments, PTAL 🙂

@haircommander
Copy link
Member Author

/retest

config/v1/types_node.go Outdated Show resolved Hide resolved
Comment on lines +51 to +53
// Specifically, the apiserver will deny most authorization requests of kubelets that are older
// than the specified version, only allowing the kubelet to get and update its node object, and perform
// subjectaccessreviews.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting technical detail, but, what does this mean to an end user?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question, I added it from @enxebre 's review, I'm not feeling too opinionated on keeping it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Alberto's question was fair, I'm just not sure we are adding the right context

What does it look like to an end user when the kubelet is having these various authzs fail?

Does it mean pods won't schedule there? Does it mean the node goes not ready? Is the node actually still functional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does

	// minimumKubeletVersion is the lowest version of a kubelet that can join the cluster.
	// Specifically, the apiserver will deny most authorization requests of kubelets that are older
	// than the specified version, only allowing the kubelet to get and update its node object, and perform
	// subjectaccessreviews.
	// This means the kubelet won't be able to view API objects it's responsible for running,
	// and will eventually be marked as NotReady.

work for you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to a running pod if this version is changed to mean then that the kubelet no longer meets the requirements? Is that possible?

Does won't be able to view API objects it's responsible for running mean that existing pods break, or, new pods won't be able to execute?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that should be blocked at admission: https://github.com/openshift/kubernetes/pull/2104/files#diff-5c437405eeba0789e9c42802e3f36cf6bdafd59d6d5c5dbaa6b66a2e02948bd7R117-R122
So really this will only affect new kubelets that attempt to join the cluster after the min version is in place. those kubelets won't be able to run anything

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So perhaps then we take what you had above with a slight tweak?

// This means, any kubelet that attempts to join the cluster will not be able to run any assigned workloads, and will eventually be marked as not ready.

When you say will eventually be marked not ready, surely it will never get ready, since the CNI won't be able to initialise the network right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah unless it manages to upgrade itself which would require manual intervetion

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to your suggestion

config/v1/types_node.go Show resolved Hide resolved
config/v1/types_node.go Outdated Show resolved Hide resolved
kubecontrolplane/v1/types.go Outdated Show resolved Hide resolved
config/v1/types_node.go Show resolved Hide resolved
@haircommander haircommander force-pushed the min-kubelet-version branch 2 times, most recently from 865cb02 to 85078a8 Compare October 30, 2024 17:14
haircommander added a commit to haircommander/kubernetes that referenced this pull request Oct 30, 2024
@haircommander
Copy link
Member Author

/retest

@JoelSpeed @enxebre any other thoughts?

@haircommander haircommander force-pushed the min-kubelet-version branch 2 times, most recently from e8e382b to 75510c3 Compare October 31, 2024 18:09
@haircommander
Copy link
Member Author

/retest

Copy link
Contributor

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we have this final test case, I'm good to merge, thank you

apiVersion: config.openshift.io/v1
kind: Node
spec:
minimumKubeletVersion: 1.30.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test that shows the empty string passes validation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

@JoelSpeed
Copy link
Contributor

/lgtm

Thanks @haircommander

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 1, 2024
Copy link
Contributor

openshift-ci bot commented Nov 1, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: haircommander, JoelSpeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 1, 2024
@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD ddd439f and 2 for PR HEAD 808d881 in total

Copy link
Contributor

openshift-ci bot commented Nov 1, 2024

@haircommander: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-azure 808d881 link false /test e2e-azure

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 04eb3fd into openshift:master Nov 1, 2024
18 of 19 checks passed
@openshift-bot
Copy link

[ART PR BUILD NOTIFIER]

Distgit: ose-cluster-config-api
This PR has been included in build ose-cluster-config-api-container-v4.18.0-202411012108.p0.g04eb3fd.assembly.stream.el9.
All builds following this will include this PR.

haircommander added a commit to haircommander/kubernetes that referenced this pull request Nov 4, 2024
haircommander added a commit to haircommander/kubernetes that referenced this pull request Nov 4, 2024
haircommander added a commit to haircommander/kubernetes that referenced this pull request Nov 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants