[SERVE] Allow adjustment of scaling policies without redeployment #4442
Comments
cc'ing @cblmemo
Hi @JGSweets, thanks for reporting this! However, I think we already have this feature, see: skypilot/sky/serve/replica_managers.py Lines 1189 to 1223 in ee3cabd
I also tried the following on current master:
# now `service.replicas` field is 1
sky serve up -n minimal examples/serve/minimal.yaml
# change `service.replicas` field to 2
sky serve update minimal examples/serve/minimal.yaml
After the update, the replica with ID 1 has a version of 2, which means this replica's version was bumped and the replica was reused. Could you share more about your usage? If that does not work for you, there may be bugs in our system, and some related tests could help us find them ;)
Interesting, I had an experience recently where I increased |
Is the intended functionality mentioned in the docs?
Yes, please check the first hint in this doc: https://docs.skypilot.co/en/latest/serving/update.html
I'll have to see if I can tease the issue out more or replicate it. I thought reusing old resources was the intended functionality, so I'm glad you confirmed it. I'd need to verify that the AMI didn't change in addition to the min/max replicas. That is possibly what happened, but I thought it was the same.
Currently, when altering the `replica_policy`, update runs a pseudo blue-green deployment in the sense that it launches all new resources.
Preferably, if only the `replica_policy` is changing, it alters the policy itself without deploying or tearing down instances unless required by the new policy.
Example 1:
Init: Currently, 2 resources are running, but the `min_replica` is set to 3.
Result: Only 1 instance is launched.
Example 2:
Init: Currently, 2 resources are running, but the `min_replica` is set to 1 and qps would not be met if scaled down.
Result: 1 instance is torn down.
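The two examples can be read as a reconcile step: compare the running replica count with what the new policy asks for, and launch or tear down only the difference. Below is a minimal sketch of that idea, not SkyPilot's implementation. `ReplicaPolicy`, `replica_delta`, `load_target`, and the `max_replicas` value of 5 are hypothetical names and values chosen for illustration, and Example 2 is modeled under the assumption that current load calls for a single replica.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaPolicy:
    """Hypothetical stand-in for the `replica_policy` section of a service spec."""
    min_replicas: int
    max_replicas: int


def replica_delta(running: int, load_target: int, policy: ReplicaPolicy) -> int:
    """Return how many replicas to launch (> 0) or tear down (< 0).

    `load_target` is the replica count traffic alone would call for (an
    assumed input); it is clamped into the policy bounds before being
    compared with the number of replicas already running.
    """
    if policy.min_replicas > policy.max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")
    target = max(policy.min_replicas, min(load_target, policy.max_replicas))
    return target - running


# Example 1: 2 running, new minimum is 3, so launch 1 and keep the existing 2.
print(replica_delta(running=2, load_target=2, policy=ReplicaPolicy(3, 5)))  # 1

# Example 2 (assuming load needs 1 replica): 2 running, minimum 1, so tear down 1.
print(replica_delta(running=2, load_target=1, policy=ReplicaPolicy(1, 5)))  # -1
```

A delta of zero would mean the policy change needs no launches or teardowns at all, which is the case this request is mostly about.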
Solution Options:
Version & Commit info:
skypilot, version 0.7.0
skypilot, commit 3f62588