[Question] Anti affinity rules for function pods without need to create separate Profile for each function #783
Comments
The first thing to note is that (anti)-affinity operators support:
Second, every function has a label
Last, I would be very curious to know what the actual use case is. This feels like something that you really shouldn't care about. I don't mean to discount that there are use-cases that might really need this. But if the goal is to help reserve resources for each instance (e.g. CPU or GPU) then resource request/limits are going to be a better way to express this. A concrete example of what you are trying to do and why it is needed would be very helpful to help guide toward a better solution.
I could see reasons to keep "stacks" of functions co-located (for example to reduce latency of requests between them, e.g. with a cache). In this case, using affinity with the
Hi Lucas, thanks for your answer. I've checked the docs and saw the options to use In, NotIn, Exists, and DoesNotExist, but they still won't help me achieve what I want. The actual use case is that, in a 3-node k8s cluster, I want the pods to be spread evenly, so that if one node goes down, function invocations can continue without disruption. If for some reason all 3 pods of the function live on the same node and that node goes down, there is downtime, even though k8s will bring the pods back up on one of the surviving nodes. I think the use case is pretty simple, and I was looking for a way to avoid having yet another entity to manage.
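The failure scenario described above can be made concrete with a small sketch. It is illustrative only: the node and pod names are hypothetical, and it does not model the real scheduler; it simply counts how many replicas remain when one node disappears.

```python
from collections import Counter


def replicas_per_node(placement, nodes):
    """Count replicas on every node, including nodes that host none."""
    counts = Counter(placement.values())
    return {node: counts.get(node, 0) for node in nodes}


def max_skew(placement, nodes):
    """Difference between the most and the least loaded node."""
    counts = replicas_per_node(placement, nodes).values()
    return max(counts) - min(counts)


def survivors(placement, failed_node):
    """Replicas still running right after `failed_node` goes down."""
    return [pod for pod, node in placement.items() if node != failed_node]


# Hypothetical node and pod names, for illustration only.
NODES = ["node-a", "node-b", "node-c"]
stacked = {"fn-0": "node-a", "fn-1": "node-a", "fn-2": "node-a"}
spread = {"fn-0": "node-a", "fn-1": "node-b", "fn-2": "node-c"}

for name, placement in (("stacked", stacked), ("spread", spread)):
    left = survivors(placement, "node-a")
    print(f"{name}: skew={max_skew(placement, NODES)}, "
          f"replicas left after node-a fails={len(left)}")
```

With all three replicas stacked on one node, losing that node leaves nothing to serve invocations until the pods are rescheduled; with an even spread, two replicas keep running.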
I like your use case/example. I wonder if we could add support for this in the Profiles: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/ Depending on how your cluster is deployed, you might also be able to use this: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/#cluster-level-default-constraints Then it should be possible to have a single Profile for all of your functions and still achieve an even spread between nodes. Those features require k8s 1.19, but the following should be available in all clusters and should also do exactly what you want: https://kubernetes.io/docs/reference/scheduling/policies/#:~:text=ServiceSpreadingPriority I haven't used it, but I think it requires configuring an
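For reference, a pod topology spread constraint of the kind linked above might look roughly like the following. This is a sketch, not something OpenFaaS Profiles are confirmed to accept: the field names follow the Kubernetes pod spec, while the label key and value passed in are placeholders, since the thread does not spell out the function label.

```python
import json


def topology_spread_constraint(label_key, label_value, max_skew=1, hard=True):
    """Build one entry for a pod spec's `topologySpreadConstraints` list.

    label_key / label_value select the pods that are counted together;
    both are placeholders supplied by the caller.
    """
    return {
        "maxSkew": max_skew,
        # Spread across nodes; a zone label could be used instead.
        "topologyKey": "kubernetes.io/hostname",
        # DoNotSchedule is a hard rule, ScheduleAnyway a soft preference.
        "whenUnsatisfiable": "DoNotSchedule" if hard else "ScheduleAnyway",
        "labelSelector": {"matchLabels": {label_key: label_value}},
    }


# "function-label" and "my-function" are hypothetical placeholders.
constraint = topology_spread_constraint("function-label", "my-function")
print(json.dumps({"topologySpreadConstraints": [constraint]}, indent=2))
```

The selector in this sketch still names a single function, which is the same per-function limitation raised in the original question. The cluster-level default constraints linked above appear to avoid it, because the scheduler derives the selector from the workload that owns the pod.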
Welcome to the community. I see Lucas has been able to give you some suggestions.
I've repopulated the issue template that you deleted. This is a required part of community participation, and we would like you to fill it out with all the details we ask for. How many replicas and nodes do you expect to have on average? What work have you done to determine that the spread of functions across your nodes is uneven? How did the function replicas spread when you scaled from 1 to 3 replicas in your 3-node cluster? Related to this: how often have you observed node failures in production? I'm curious what workarounds you have considered already. A couple that came to mind were:
More importantly, you should apply whatever strategy you choose not just to the functions (which we create via faas-netes), but to any components you deploy, including all the OpenFaaS core services in the helm chart. If you can find a way to update your default scheduler, that may be a quick win that gives you the result you're looking for.
Hi everyone, sharing my similar problems! However, the challenge is that the above solutions work only when we are dealing with different functions or different versions of a function, not different replicas of a function. In fact, we need to treat the replicas (Pods) of a function (Deployment) differently, so that we can schedule the X-th replica on a particular node or route Y% of traffic to particular replicas based on our custom metrics. I feel such challenges come from the stateless deployment model followed by serverless platforms. For instance, traffic splitting for replicas of a function could probably be achieved if functions were deployed as a StatefulSet, so that they can communicate with EndpointSlice, whereas in a stateless Deployment object, EndpointSlice won't work as desired (I'm not sure about what I am saying here). Going through the documentation, I find it more feasible to rely on topology-based designs, where we can schedule the X-th replica of a function in a certain zone, region, etc., or give different weights to each zone, region, etc., so that the replicas scheduled in that zone receive Y% of the traffic (invocations). To implement such policies, I see that, as Lucas said, PodTopologySpreadConstraints can help with the scheduling problem, and Topology Aware Hints (and EndpointSlice) can help with traffic management. Any suggestions are appreciated.
Coming back to this later on, have you considered simply creating one profile per function? Why did you conclude that this would not work? Do you have any data? For Pod Topology Spread Constraints, we've had another request for this, but only for the core components deployed via helm #856
Expected Behaviour
Current Behaviour
Are you a GitHub Sponsor (Yes/No?)
Check at: https://github.com/sponsors/openfaas
List All Possible Solutions and Workarounds
Which Solution Do You Recommend?
Steps to Reproduce (for bugs)
Context
Your Environment
FaaS-CLI version (full output from: faas-cli version):
Docker version, from: docker version (e.g. Docker 17.0.05):
What version and distribution of Kubernetes are you using? (kubectl version)
Operating System and version (e.g. Linux, Windows, MacOS):
Link to your project or a code example to reproduce issue:
What network driver are you using and what CIDR? i.e. Weave net / Flannel
First of all, hello everyone!
I am looking for a way to spread function pods evenly across all available k8s nodes. I have been searching for information and tinkering with what is available so far, but I don't see a way to implement this without a separate Profile for each function.
Looking at the OpenFaaS docs, I understood that I need to create a Profile containing the anti-affinity rule and then link the function to this Profile with an annotation; OpenFaaS will then put the anti-affinity rule from the Profile into the function's deployment spec.
According to the Profiles docs, I think this is done so that a single Profile can be reused for multiple functions, based on each function's requirements. In my case, if I have a 3-node k8s cluster and I want the function to scale up to 3 pods, then I want each of the function's pods to run on a different node.
Every function has a label with a unique ID. If my own tool were creating the k8s deployment, I could simply put this label with its ID in the pod anti-affinity rule. However, I cannot set this directly on the function; the only way is to use a Profile. It seems to me that I currently cannot create a single Profile with such an anti-affinity rule and pass it to every function, because every function has a unique ID and anti-affinity rules in Kubernetes require specifying both the key and the value of the label.
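To illustrate the limitation, here is a rough sketch of the anti-affinity stanza each function would need. The structure follows the Kubernetes pod affinity API; the label key and function IDs are hypothetical placeholders, and nothing here is OpenFaaS code.

```python
import json


def anti_affinity_for(label_key, function_id):
    """Pod anti-affinity keeping replicas of ONE function on separate nodes."""
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {
                        "matchExpressions": [
                            {
                                "key": label_key,
                                "operator": "In",
                                # The value differs per function, so the rule does too.
                                "values": [function_id],
                            }
                        ]
                    },
                    # One replica per node.
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }


# Hypothetical label key and function IDs.
rules = {fn: anti_affinity_for("function-label", fn) for fn in ("fn-a", "fn-b")}
print(json.dumps(rules["fn-a"], indent=2))
print("rules identical:", rules["fn-a"] == rules["fn-b"])
```

The two generated rules differ only in the label value, which is why a single shared Profile with a fixed value cannot cover both functions.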
I was wondering if anybody in the community has had the same problem and found a solution that does not require a separate Profile for each function.
It is my first time creating an issue to ask something like this. Since this is not an actual bug, I've skipped the issue template; I hope that's not a problem.
Regards,
Angel!