Improve dependencies versioning and ci #141

LeoGrin · 2025-01-16T16:22:31Z

Adds minimum (found on my mac M3 using py3.9 and py3.10, all <= 2022 expect torch which I didn't change) and maximum version to all the dependencies. This prevent failures when a dependency makes a breaking change
Add CI: dependabot to update the max dependency, and some actions to run the tests on the minimum and maximum versions of the dependencies (on mac and ubuntu).
Run the pre-commit on all files

LeoGrin · 2025-01-16T17:14:46Z

@noahho I'm pushing the maximum python version from 3.11 to 3.12 (the tests are passing). Is there any reason you set it to 3.11 instead of 3.12?

LennartPurucker

Two small comments, other LGTM

.github/workflows/pull_request.yml

LennartPurucker · 2025-01-21T12:30:56Z

pyproject.toml

+  "torch>=2.1,<=2.5.1",
+  "scikit-learn>=1.2.0,<=1.6.1",
+  "typing_extensions>=4.4.0,<=4.12.2",
+  "scipy>=1.7.3,<=1.15.1",
+  "pandas>=1.4.0,<=2.2.3",
+  "einops>=0.2.0,<=0.8.0",
+  "huggingface-hub>=0.0.1,<=0.27.1",


Are these the current max versions or the end of our support versions?
IMO, as long as we do not have an upper limit, we should not define one.

But I think this is up to the taste of the maintainer and boils to one of the following choices:

Update this file every time any of the unlimited packages are updated, or

update this file only when something breaks

It's the current max versions. My thinking was that it would prevent failures for users, and dependabot would update these max versions weekly (and then we could merge in one click if it works, otherwise investigate).
But yeah I'm not sure, not specifying the max version does reduce the maintainer work a little bit, and could be combined with some other github action which checks that things are working with the latest version.

Just to chime in:

Both options have tradeoffs and I want to be a bit more explicit about them:

No upperbound (Max compatibility with other libs, more immediate issues):

You effectively are at the whim of the lib to not break their API. This rarely happens with well maintained libraries except in X changes of a version (X.Y.Z, see semvar). However it will eventually happen, depending on how much functionality you use of a library. For example, with numpy, they're very unlikely to change how np.add works and if that's all you use, then you shouldn't need to upperbound it. However if you were to rely on numpy's C bindings, then they may change it as they did in the bump to 2.0.0. The upside of this is that if a user installs a higher version and it just works, then they don't have to come ask you to bump it.

Explicit Upperbound (Limited compatibility, less immediate issues):

You effectively specify a set of dependencies which guarantee you library works (with cave-eats such as pre-existing bugs in existing versions within bounds). The downside here is that if the user uses a library that needs a library version with a lower bound such as 2.6.0 and you upperbound on 2.5.0, then they can not use your library alongside their other one, and you will likely get a request to up it at some point.

So the tradeoff is when do you want to patch a version change? With no upperbound, it will happen immediately for a user when an issue occurs and they will raise an issue, at which point you recommend downgrading the conflict (if they can, same problem you get with Explicit upperbound but harder to debug.

With Explicit upperbound, you save the user from unknown dependancy hell, replacing it with known dependancy hell. You also save yourself immediate dependency management, but as things go with time, eventually your upperbounds become the lowerbounds of the eco-system, and you will have to solve them bit by bit. You also prevent perfectly compatible libraries from working when they otherwise would.

Given that, my recommendation and strategy is use your vibe-checks to know which libraries to upperbound and which to not. For example, I would very comfortably set numpy to have an upperbound on X, i.e. numpy<3, as they are unlikely to break anything anytime soon. Likewise with torch, but you need to consider that torch is usually a lot more unstable due to the hardware eco-system they build upon and the functionality you utilize from torch, such as their attention mechanisms. There are some libraries like typing_extensions which are developed by Python themselves and (almost) never backwards breaking, as the library is pure python and doesn't rely on CPython or otherwise system-level libraries. I would never upper-bound this.

If you were to use something like SMAC (sorry SMAC) or otherwise less production grade libraries, then they are unlikely to follow semvar strictly, and from version to version, they could break things unknowingly, usually not intentionally. These are often the hardest to judge as if several popular libraries depend on it, each of them distrusting it and setting a relatively strict upperbound, this will cause lots of conflicts.

Other considerations is the frequency of library updates. Things like pyyaml are basically complete, in that it is unlikely to update, meaning a stricter upperbound isn't going to cause as much dep maintanence as a package which updates extremely frequently.

---lock files

Recommendation: Upperbound libraries like numpy and torch to their next X. If you are unsure of the development practices of a dependancy, upperbound on their Y. Never upper bound on a Z unless there's a history of issues.

If you want to provide a this setup works assuming no other dependancies, such as in a Docker environment for deployment, then you provide lock files which specify exact dependancies.

Thanks a lot @eddiebergman, really interesting!! Based on your comments I made the max versions much laxer: no bound for typing-extension, bound on X for all the rest except scikit-learn and einops, for which bound on Y. Tell me if you disagree! I would say that using these lax upper bounds + dependabot is a good balance between making life easy for maintainers while being robust for users.

Very well put into words @eddiebergman!

I agree with going forward like this and like the reasoning.
One more comment: I think it is also useful to no upperbound to push ourselves to support higher versions.

LeoGrin added 7 commits January 16, 2025 13:37

xfail instead of fail for one sklearn test

e4865bd

add minimum and maximum (current) version for our package

587bea7

add CI and dependabot

afe8025

add miniumum versions even when very old to help the parsing in the ci

e39ec66

accept python3.12

ea33ed5

apply precommit

3ef122d

update ruff version to match precommit config

55aefe5

LeoGrin force-pushed the improve_dependencies_versioning_and_ci branch from 4b923a0 to 55aefe5 Compare January 16, 2025 17:07

LeoGrin added 2 commits January 16, 2025 18:09

no ruff on examples

4bb2abf

fix requirement install in ci

221e34e

fix requirement install in ci

9e0dc05

LeoGrin requested a review from eddiebergman January 16, 2025 17:35

LeoGrin mentioned this pull request Jan 20, 2025

Support Python >= 3.12 #145

Closed

LeoGrin requested review from LennartPurucker and removed request for eddiebergman January 21, 2025 12:20

LennartPurucker requested changes Jan 21, 2025

View reviewed changes

LeoGrin added 6 commits January 21, 2025 15:56

use uv

54b757e

fix ci test for macos13

a23800e

test fix

528ae4c

test fix

f269f73

make max versions more lax

5a7271c

fix github action

72f6405

LeoGrin requested a review from LennartPurucker January 21, 2025 16:01

LennartPurucker approved these changes Jan 21, 2025

View reviewed changes

LeoGrin merged commit 7529757 into main Jan 21, 2025
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve dependencies versioning and ci #141

Improve dependencies versioning and ci #141

LeoGrin commented Jan 16, 2025 •

edited

Loading

LeoGrin commented Jan 16, 2025

LennartPurucker left a comment

LennartPurucker Jan 21, 2025

LeoGrin Jan 21, 2025

eddiebergman Jan 21, 2025 •

edited

Loading

LeoGrin Jan 21, 2025

LennartPurucker Jan 21, 2025

Improve dependencies versioning and ci #141

Improve dependencies versioning and ci #141

Conversation

LeoGrin commented Jan 16, 2025 • edited Loading

LeoGrin commented Jan 16, 2025

LennartPurucker left a comment

Choose a reason for hiding this comment

LennartPurucker Jan 21, 2025

Choose a reason for hiding this comment

LeoGrin Jan 21, 2025

Choose a reason for hiding this comment

eddiebergman Jan 21, 2025 • edited Loading

Choose a reason for hiding this comment

LeoGrin Jan 21, 2025

Choose a reason for hiding this comment

LennartPurucker Jan 21, 2025

Choose a reason for hiding this comment

LeoGrin commented Jan 16, 2025 •

edited

Loading

eddiebergman Jan 21, 2025 •

edited

Loading