🐛 Set Machine's BootstrapReady when there is no ConfigRef #11459

Open: wants to merge 1 commit into base: main

@zaneb zaneb commented Nov 21, 2024

If there is no ConfigRef but the bootstrap data secret is set by the user (instead of by a bootstrap provider), then BootstrapReady should be true. This is the case for MachineSet, and was originally the case for Machine since 5113f80. However, in d93eadc this changed as a side effect of ensuring that the bootstrap config object can continue to be reconciled after the bootstrap provider has produced the bootstrap data secret.

This change ensures that, once a bootstrap data secret exists, in the case of a ConfigRef it can still be reconciled, while in the case there is no ConfigRef, BootstrapReady is set.

/area machine

@k8s-ci-robot k8s-ci-robot added area/machine Issues or PRs related to machine lifecycle management cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 21, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign neolit123 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

Welcome @zaneb!

It looks like this is your first PR to kubernetes-sigs/cluster-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 21, 2024
@k8s-ci-robot
Contributor

Hi @zaneb. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@JoelSpeed
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 21, 2024
}
s.bootstrapConfig = obj

// If the bootstrap data is populated, set ready and return.
if m.Spec.Bootstrap.DataSecretName != nil {
Contributor

Why does this come after reconciling the bootstrap config? The data secret name is either set by the user, in which case the bootstrap config is not needed, or it is set by the bootstrap provider, in which case the bootstrap config has already been reconciled. I wonder whether we just need to move this block above?

Author

I believe #7394 has the answer to that:

> to ensure the ownerReference, and other changes, are set before the dataSecretName is checked. This is designed to help in backup and restore use cases, or any other case where the ownerReference is removed.
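The ordering concern quoted above can be illustrated with a minimal, self-contained sketch. The types and helper names here (`Machine`, `BootstrapConfig`, `ensureOwnerRef`, `checkSecretFirst`, `reconcileConfigFirst`) are hypothetical simplifications and not the cluster-api types; the sketch only assumes that reconciling the bootstrap config is what restores a missing ownerReference.

```go
package main

import "fmt"

// BootstrapConfig is a hypothetical stand-in for a bootstrap config object.
type BootstrapConfig struct {
	OwnerRefs []string
}

// Machine is a simplified model, not the real clusterv1.Machine.
type Machine struct {
	Name           string
	DataSecretName *string
	BootstrapReady bool
}

// ensureOwnerRef adds the owner to the config if it is missing,
// for example after a backup and restore dropped it.
func ensureOwnerRef(cfg *BootstrapConfig, owner string) {
	for _, o := range cfg.OwnerRefs {
		if o == owner {
			return
		}
	}
	cfg.OwnerRefs = append(cfg.OwnerRefs, owner)
}

// checkSecretFirst returns as soon as the data secret is set,
// so the config is never touched and the ownerReference stays missing.
func checkSecretFirst(m *Machine, cfg *BootstrapConfig) {
	if m.DataSecretName != nil {
		m.BootstrapReady = true
		return
	}
	ensureOwnerRef(cfg, m.Name)
}

// reconcileConfigFirst repairs the config before looking at the data secret.
func reconcileConfigFirst(m *Machine, cfg *BootstrapConfig) {
	ensureOwnerRef(cfg, m.Name)
	if m.DataSecretName != nil {
		m.BootstrapReady = true
	}
}

func main() {
	secret := "bootstrap-data"

	// Simulate a restored Machine whose config lost its ownerReference.
	m, cfg := &Machine{Name: "machine-0", DataSecretName: &secret}, &BootstrapConfig{}
	checkSecretFirst(m, cfg)
	fmt.Println("secret checked first, owner refs:", cfg.OwnerRefs)

	m, cfg = &Machine{Name: "machine-0", DataSecretName: &secret}, &BootstrapConfig{}
	reconcileConfigFirst(m, cfg)
	fmt.Println("config reconciled first, owner refs:", cfg.OwnerRefs)
}
```

Both orderings mark the Machine ready; only the second one repairs the config object on the way, which is the property the quoted explanation is protecting.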

// Tolerate bootstrap object not found when the machine is being deleted.
// TODO: we can also relax this and tolerate the absence of the bootstrap ref way before, e.g. after node ref is set
return ctrl.Result{}, nil
if m.Spec.Bootstrap.ConfigRef != nil {
Member

What about

Suggested change
- if m.Spec.Bootstrap.ConfigRef != nil {
+ // If the bootstrap data secret is set by the user, set ready and return.
+ if m.Spec.Bootstrap.ConfigRef == nil {
+     if m.Spec.Bootstrap.DataSecretName == nil {
+         return ctrl.Result{}, errors.New("either spec.bootstrap.dataSecretName or spec.bootstrap.configRef must be populated")
+     }
+     m.Status.BootstrapReady = true
+     conditions.MarkTrue(m, clusterv1.BootstrapReadyCondition)
+     return ctrl.Result{}, nil
+ }
+ // Call generic external reconciler if we have an external reference.
+ ...
+ // Drop >> If the bootstrap data is populated, set ready and return.
+ ...

So we take care of user provided data secret first, and then we handle when data secret is generated by the bootstrap provider (without mixing the two use cases)
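As a rough illustration of that separation, here is a self-contained sketch of the early-return control flow. It is not the controller code: `Machine`, `ObjectRef`, and `reconcileBootstrap` are hypothetical simplified stand-ins, the external reconciler is passed in as a plain function, and the remainder of the provider path (elided in the suggestion above) is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// ObjectRef is a hypothetical stand-in for the type behind spec.bootstrap.configRef.
type ObjectRef struct {
	Kind string
	Name string
}

// Machine is a simplified model, not the real clusterv1.Machine.
type Machine struct {
	ConfigRef      *ObjectRef
	DataSecretName *string
	BootstrapReady bool
}

// reconcileBootstrap handles the user-provided data secret first and returns
// early, so the rest of the function only deals with the provider case.
func reconcileBootstrap(m *Machine, reconcileExternal func(*ObjectRef) error) error {
	// Use case 1: no ConfigRef, the user supplied the data secret directly.
	if m.ConfigRef == nil {
		if m.DataSecretName == nil {
			return errors.New("either spec.bootstrap.dataSecretName or spec.bootstrap.configRef must be populated")
		}
		m.BootstrapReady = true
		return nil
	}

	// Use case 2: a bootstrap provider owns the config object, so it is
	// always reconciled, even when the data secret already exists.
	if err := reconcileExternal(m.ConfigRef); err != nil {
		return err
	}
	// The rest of the provider path is elided in the suggestion and omitted here.
	return nil
}

func main() {
	secret := "user-provided-bootstrap-data"
	m := &Machine{DataSecretName: &secret}
	err := reconcileBootstrap(m, func(*ObjectRef) error { return nil })
	fmt.Println("ready:", m.BootstrapReady, "err:", err)
}
```

The trade-off discussed in this thread is visible here: the ready-marking logic lives in the early-return branch, and the provider branch would need its own equivalent once the config reports readiness.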

Author

That looks like it would work (though people thought the same about this). I guess it comes down to whether it's worse to duplicate the code that marks the bootstrap ready, or to duplicate an if statement.

Member

@fabriziopandini fabriziopandini Nov 25, 2024

Feel free to add more tests to ensure the solution works and we don't have regressions (the fact that the tests did not catch the issue with the previous PR is a clear signal that we need more tests).

With respect to the implementation options, I would prefer not to mix the two use cases in the code, because the resulting code is simpler to read and to reason about, and this usually pays off in the long run.

Member

Note for self: Separation between use cases doesn't seem to be addressed in the latest commit; unit tests are also missing.

Author

There are obviously no unit tests yet, but I thought the latest commit was exactly what you asked for in terms of separating the use cases?

Member

@fabriziopandini fabriziopandini Dec 20, 2024

Sorry, but I don't think the latest commits address the separation between use cases the way the snippet above would.

Author

I'm afraid I'm going to need you to explain in more detail, because the control flow in the updated patch is identical to the snippet you gave above.
It checks for a missing bootstrap ref and handles that case and returns early, leaving the rest of the function to handle the case where there is a bootstrap ref. That's what I understood from when you said:

> take care of user provided data secret first, and then we handle when data secret is generated by the bootstrap provider (without mixing the two use cases)

@zaneb zaneb force-pushed the machine-bootstrap-ready branch from 8ebd235 to 834bead Compare November 28, 2024 02:43
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 28, 2024
If there is no ConfigRef but the bootstrap data secret is set by the
user (instead of a bootstrap provider), then BootstrapReady should be
true. This is the case for MachineSet, and was originally the case for
Machine since 5113f80. However, in
d93eadc this changed as a side effect
of ensuring that bootstrap config object can continue to be reconciled
after the bootstrap provider has produced the bootstrap data secret.

This change ensures that, once a bootstrap data secret exists, in the
case of a ConfigRef it can still be reconciled, while in the case there
is no ConfigRef, BootstrapReady is set.

Signed-off-by: Zane Bitter <[email protected]>
@zaneb zaneb force-pushed the machine-bootstrap-ready branch from 834bead to f74c2e3 Compare November 28, 2024 02:44
Member

@chrischdi chrischdi left a comment

Could we also add some test coverage for the resulting machine to ensure we don't have regressions in future for this?
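One possible shape for such coverage is a table of cases over the two spec fields. The sketch below is only an assumption about what could be tested: `bootstrapSpec`, `bootstrapReady`, and `runCases` are hypothetical names, the function under test is a simplified model rather than the controller, and in the repository this would normally be a `_test.go` file using the standard `testing` package instead of a `main` function. For the ConfigRef case the model treats a populated data secret as ready, mirroring the "If the bootstrap data is populated, set ready" check quoted earlier in this review.

```go
package main

import (
	"errors"
	"fmt"
)

// bootstrapSpec is a hypothetical, flattened view of spec.bootstrap.
type bootstrapSpec struct {
	hasConfigRef   bool
	dataSecretName *string
}

// bootstrapReady is a simplified model of the behaviour under discussion.
func bootstrapReady(s bootstrapSpec) (bool, error) {
	if !s.hasConfigRef {
		if s.dataSecretName == nil {
			return false, errors.New("either dataSecretName or configRef must be populated")
		}
		return true, nil
	}
	// Assumption: with a ConfigRef, a populated data secret means ready.
	return s.dataSecretName != nil, nil
}

type testCase struct {
	name      string
	spec      bootstrapSpec
	wantReady bool
	wantErr   bool
}

// runCases evaluates every case and returns a description of each failure.
func runCases() []string {
	secret := "bootstrap-data"
	cases := []testCase{
		{"user secret, no configRef", bootstrapSpec{false, &secret}, true, false},
		{"neither set", bootstrapSpec{false, nil}, false, true},
		{"configRef, secret not yet produced", bootstrapSpec{true, nil}, false, false},
		{"configRef, secret produced", bootstrapSpec{true, &secret}, true, false},
	}

	var failures []string
	for _, tc := range cases {
		ready, err := bootstrapReady(tc.spec)
		if ready != tc.wantReady || (err != nil) != tc.wantErr {
			failures = append(failures, fmt.Sprintf("%s: ready=%v err=%v", tc.name, ready, err))
		}
	}
	return failures
}

func main() {
	failures := runCases()
	for _, f := range failures {
		fmt.Println("FAIL:", f)
	}
	fmt.Println("failures:", len(failures))
}
```

The first case is the regression this PR targets: a Machine with a user-provided data secret and no ConfigRef must end up with BootstrapReady set.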

Labels
area/machine Issues or PRs related to machine lifecycle management cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
5 participants