Implement agent reuse, toggled by a cluster profile config option #355
Thanks for this. Appreciate all the work here. Unfortunately I have limited capacity to review, try out and give useful insight so might take me some time 🙏
Nice work, +1 on this
Hi there @chadlwilson! Is there any chance that this PR could move forward in the next month or so? Is there anything I can do to help, or make you more comfortable merging this? If not - no worries - I would probably take a stab at forking the repo, building from a fork, and testing this out more in our production GoCD server. If I find any issues from that I'll push them up to this PR.
Thanks for the ping. Honestly, the challenge with these elastic agent plugins is just my concerns around regression: most don't have any particular "integration" tests that actually work with GoCD, and there are a lot of weird timing and other scenarios that can arise with the elastic agent plugins (e.g. what happens if the GoCD server dies: when it comes back, does it still correctly adopt the agents or do they end up orphaned? What happens if a pod dies that was expecting to be reused? What if the container restarts inside?). This plugin also happens to probably be the most actively used elastic agent approach. But ya, the real limitation has been time and energy, distraction and the extra 'friction' on validating this one. It is still on my radar, I've just been kicking the can down the road. But it's a bit easier for me to validate locally now that we have smaller non-Alpine arm64 container images to use on elastic agents. I naively expected you were probably running this off your own build anyway, is that not the case? :-)
After an inordinate amount of time, I gave this a quick whirl locally in a colima/k3s single node cluster. Certainly not tested in anger, nor fully, so I only have a few observations so far.
Generally speaking I'm finding this hard to review due to the other "supporting" and style-oriented changes in here which distract me a bit from the core feature. Largely this is probably because I didn't write any of the original code, and so it's not always obvious to me why the supporting changes were needed, or what the implications are.
Basically, this might be too complex for me to review. My main concern is to not break existing things, as I'm basically the only maintainer who looks at anything across all of GoCD server plus plugins and this is one of the most heavily used plugins. If we can't get it so it's easier for me to reason about and figure out how it might change the as-is behaviour or some other help to review/test so it's not all on me, it might be better to maintain a fork.
Description
This implements agent reuse following the approach outlined in these comments: #53 (comment) #53 (comment). Resolves #53
The cluster profile has an option to enable agent reuse, which defaults to false.
The naming, and presentation of this option, are open to feedback! 🙂
Changes
When pods are created, they are annotated with a hash of the elastic config. This ensures that when elastic config is changed, agents created from the old config will not be reused, and will eventually expire after the timeout.
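The config-hash idea above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the class name `ElasticConfigHash`, the annotation key `example.invalid/elastic-config-hash`, the choice of SHA-256, and the representation of the elastic config as a `Map<String, String>` with non-null values are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: names and the hashing scheme are assumptions, not this PR's code.
class ElasticConfigHash {
    // Hypothetical annotation key under which the hash would be stored on the pod.
    static final String HASH_ANNOTATION = "example.invalid/elastic-config-hash";

    // Returns a hex SHA-256 digest of the config, independent of map iteration order.
    static String hash(Map<String, String> elasticConfig) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            // Sort by key so that two maps with equal entries always hash the same.
            for (Map.Entry<String, String> e : new TreeMap<>(elasticConfig).entrySet()) {
                update(digest, e.getKey());
                update(digest, e.getValue());
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(ex);
        }
    }

    // Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
    private static void update(MessageDigest digest, String field) {
        byte[] bytes = field.getBytes(StandardCharsets.UTF_8);
        digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
        digest.update(bytes);
    }

    // A pod is a reuse candidate only if its stored hash matches the requested config.
    static boolean canReuse(Map<String, String> podAnnotations, Map<String, String> requestedConfig) {
        return hash(requestedConfig).equals(podAnnotations.get(HASH_ANNOTATION));
    }

    public static void main(String[] args) {
        Map<String, String> original = Map.of("Image", "gocd/agent:v1", "Memory", "1G");
        Map<String, String> changed = Map.of("Image", "gocd/agent:v2", "Memory", "1G");
        // Annotations as they might look on a pod created from the original config.
        Map<String, String> annotations = Map.of(HASH_ANNOTATION, hash(original));

        System.out.println(canReuse(annotations, original));
        System.out.println(canReuse(annotations, changed));
    }
}
```

In this sketch, a pod created from the old config simply never matches a job that uses the edited config, so it is left idle until whatever timeout applies removes it. A pod without the annotation is also treated as not reusable.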
When agent reuse is enabled, the main behavior changes are:
Other supporting changes:
properties) for clarity between cluster profile properties and elastic profile properties.
This ended up being a pretty large set of changes. Happy to explain in more detail if needed!
Testing
Many unit tests were updated and added, and run with ./gradlew test. I've also tested this by running GoCD on my machine with a Kubernetes cluster in kind.