new project/feature "Karpenter Downscaler" #1800

Open
flbla opened this issue Nov 7, 2024 · 5 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments


flbla commented Nov 7, 2024

Description

What problem are you trying to solve?

My team and I have started working on a project to stop nodes managed by Karpenter at specific times.
We call it "Karpenter Downscaler".

Karpenter Downscaler operates as a controller with a CRD.
Based on the schedules we set, it automatically scales the nodes managed by Karpenter down to zero, freeing up resources.
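
The project is private, so its CRD schema is not known. As a rough illustration of what "based on the schedules we set" could involve, here is a minimal Python sketch of checking whether a given moment falls inside a downscale window. The `DownscaleWindow` type and its fields are hypothetical and are not taken from Karpenter Downscaler.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class DownscaleWindow:
    """Hypothetical schedule entry: nodes should be at zero inside the window."""

    weekdays: frozenset  # days on which the window starts; 0 = Monday, 6 = Sunday
    start: time
    end: time

    def contains(self, now: datetime) -> bool:
        t = now.time()
        if self.start <= self.end:
            # Same-day window, e.g. 12:00 to 14:00.
            return now.weekday() in self.weekdays and self.start <= t < self.end
        # Overnight window, e.g. 20:00 to 06:00.
        if t >= self.start:
            return now.weekday() in self.weekdays
        if t < self.end:
            # After midnight, the window belongs to the previous day.
            return (now.weekday() - 1) % 7 in self.weekdays
        return False


# Weeknights from 20:00 until 06:00 the next morning.
weeknights = DownscaleWindow(frozenset({0, 1, 2, 3, 4}), time(20, 0), time(6, 0))
print(weeknights.contains(datetime.now()))
```

A controller built this way would evaluate the window on each reconcile and decide whether nodes should exist; how the real project models schedules may differ entirely.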

Right now it's a private project, but we would like to open-source it.

Would the Karpenter community be interested in such a tool, or should we open-source it under our own org?

Thanks

How important is this feature to you?

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@flbla flbla added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 7, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If Karpenter contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Nov 7, 2024

alter3d commented Nov 12, 2024

I think this is best handled outside of, but in conjunction with, Karpenter, using existing tools such as KEDA. We scale our dev cluster down to (almost) zero overnight using KEDA's ScaledObjects, and as KEDA scales down all the deployments, etc., Karpenter automatically scales down the nodes.

Karpenter isn't application-aware and I don't think it should be. The only signal it needs to scale down is unused capacity in the cluster (i.e. nodes that can be consolidated or removed), and there are existing tools for that.
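
For readers unfamiliar with this approach, here is a minimal sketch of a KEDA `ScaledObject` with a `cron` trigger, built as a Python dict and printed as JSON (which `kubectl apply -f` accepts). The deployment name, namespace, timezone, and hours are illustrative assumptions, not the commenter's actual configuration.

```python
import json


def cron_scaled_object(deployment, namespace, timezone, start, end, day_replicas):
    """Build a KEDA ScaledObject manifest with a single cron trigger.

    Inside the start/end window KEDA targets `day_replicas`; outside it the
    workload can fall back to minReplicaCount (0 here).
    """
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": f"{deployment}-office-hours", "namespace": namespace},
        "spec": {
            "scaleTargetRef": {"name": deployment},
            "minReplicaCount": 0,
            "maxReplicaCount": day_replicas,
            "triggers": [
                {
                    "type": "cron",
                    "metadata": {
                        "timezone": timezone,
                        "start": start,  # cron expression: window opens
                        "end": end,      # cron expression: window closes
                        "desiredReplicas": str(day_replicas),
                    },
                }
            ],
        },
    }


# Assumed example values: a "web" deployment kept up on weekdays, 08:00 to 19:00.
manifest = cron_scaled_object("web", "dev", "Europe/Paris", "0 8 * * 1-5", "0 19 * * 1-5", 3)
print(json.dumps(manifest, indent=2))
```

One such object is needed per workload, which is the per-application burden discussed in the replies below.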

@yrotilio

I agree that in many cases, independent tools like KEDA or kube-downscaler are sufficient to scale down most of a cluster using Karpenter's built-in disruption behavior. But, as you say, these tools are made to scale applications, not clusters.

There are at least two gaps that this tool intends to address:

  • Scaling down the whole cluster, including base system tools, not just apps. (It will use cloud APIs to do so, unless Karpenter finds a way to run outside the cluster.)
  • Lowering the scheduling burden on application teams. (Relying on a central tool is, I agree, not the best practice, but it is still a necessity when you face hundreds of teams with very heterogeneous skills and knowledge of the Kubernetes ecosystem.)

The goal of such a tool is not to make Karpenter application-aware (it isn't connected in any way to application manifests) but to give infra teams a way to force a cluster shutdown to manage costs, without relying on application teams.


mariuskimmina commented Nov 12, 2024

Related to: #1177

We also ran into this, and since we currently don't have KEDA in our cluster, we had to find another solution. In our cluster, Karpenter runs on Fargate (as does CoreDNS). We have a CronJob that patches the NodePool CPU limits to 0 on a schedule, after which only Karpenter and CoreDNS remain. Another CronJob (also on Fargate) then patches the NodePool back to its original limit to bring nodes up again.
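
A minimal sketch of the patch such a CronJob might apply, assuming it shells out to `kubectl` with a JSON merge patch on the NodePool's `spec.limits.cpu`. The NodePool name `default` and the restored limit of `1000` are placeholders, and the comment does not describe how the actual jobs are implemented.

```python
import json


def nodepool_cpu_patch(cpu_limit):
    """JSON merge patch body that sets spec.limits.cpu on a Karpenter NodePool."""
    return {"spec": {"limits": {"cpu": str(cpu_limit)}}}


def kubectl_patch_args(nodepool, cpu_limit):
    """argv for `kubectl patch`; a job could pass it to subprocess.run(args, check=True)."""
    return [
        "kubectl", "patch", "nodepool", nodepool,
        "--type", "merge",
        "-p", json.dumps(nodepool_cpu_patch(cpu_limit)),
    ]


# Evening job: cap the pool at zero CPU. "default" is an assumed NodePool name.
print(" ".join(kubectl_patch_args("default", 0)))
# Morning job: restore the limit. 1000 is a placeholder, not a value from the comment.
print(" ".join(kubectl_patch_args("default", 1000)))
```

The job's service account would need RBAC permission to patch `nodepools` in the `karpenter.sh` API group.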


sftim commented Dec 5, 2024

As a project, we love it when people publish software outside of Kubernetes that can work with Kubernetes.

There is a process for donating components to Kubernetes, and we like code donations too, but it is more work. Talk to SIG Autoscaling if you want to donate a controller repository to Kubernetes.


6 participants