Question: Is ThreadPoolExecutor a known incompatible library? #1102

Open
schuylermartin45 opened this issue Dec 16, 2024 · 4 comments
@schuylermartin45

Describe the bug
Not sure if this is a bug yet, as much as a sanity check/question. I recently started using the ThreadPoolExecutor context from concurrent.futures to parallelize network requests.
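For context, the usage is along these lines (a simplified, hypothetical sketch, not the actual project code; the names are made up):

```python
# Hypothetical sketch of the usage pattern described above, not the real code:
# fan several network requests out to a thread pool and collect the results.
from concurrent.futures import ThreadPoolExecutor, as_completed

import requests  # mocked out in the problematic tests


def fetch(url: str) -> bytes:
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.content


def fetch_all(urls: list[str]) -> dict[str, bytes]:
    results: dict[str, bytes] = {}
    with ThreadPoolExecutor(max_workers=8) as executor:
        futures = {executor.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results
```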

I've had tests running using pyfakefs for about a month or so now. As soon as I added the thread pool, some tests started taking a significant amount of time. I'm talking about going from milliseconds to 30 seconds.

I thought it was a deadlock at first, but the tests always eventually resolve. There is also no such slowdown outside of the testing environment. I haven't ruled everything out, but if I disable pyfakefs, the issue disappears.

I assume this is caused by much the same reasons that subprocess and multiprocessing have known compatibility issues. I mean, we are talking threads and not processes, but they probably make similar low-level calls.

So here's my question: is concurrent.futures another library with known compatibility issues? Should it be added to the documentation I linked to above? Does the same go for the older ThreadPool library?

And is there a general recommendation for working around this kind of issue? I don't really want to disable the thread pool just for the tests and I also need pyfakefs to mock the file system.

Let me know if you would like more context or details. All of this is in an open conda project so I can grab some GitHub Action logs if that would be useful. Thanks!

@mrbean-bremen
Member

Interesting... I haven't seen this yet, probably because nobody has used ThreadPoolExecutor or concurrent.futures with pyfakefs before, but I would guess that it is not compatible, given that multithreading also has problems. pyfakefs is generally better suited to "standard" unit tests than to integration tests, in the sense of the complexity of the environment.

The behavior is strange though. So far, the problems have usually appeared as a faked file function trying to access a real file or vice versa, resulting in an exception, mostly FileNotFoundError or similar.
I have no idea what could cause that slowdown, but I will see if I can find something out later. I guess it is unrealistic to ask for a simple reproducible example...
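That said, even something as crude as this would help as a starting point (an untested sketch, just to show the shape I have in mind; the `fs` fixture comes from the pyfakefs pytest plugin):

```python
# Untested sketch of a possible reproducer: hit the fake filesystem from
# several worker threads and compare the runtime against a run without pyfakefs.
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def _write_and_read(base: Path, i: int) -> str:
    file_path = base / f"file_{i}.txt"
    file_path.write_text("x" * 1024)
    return file_path.read_text()


def test_threadpool_on_fake_fs(fs):  # "fs" is the pyfakefs pytest fixture
    base = Path("/fake")
    fs.create_dir(base)
    with ThreadPoolExecutor(max_workers=8) as executor:
        results = list(executor.map(lambda i: _write_and_read(base, i), range(200)))
    assert len(results) == 200
```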

@schuylermartin45
Author

schuylermartin45 commented Dec 17, 2024

If I find time this week, I could try to isolate this to a streamlined example. I guess this is further complicated by also using:

  • pytest w/ xdist to parallelize the tests
  • Python's tempfile library
  • The requests library (but this is mocked out in the problematic tests)
  • The click library's test CLI invocation interface

So far in my testing, those components don't appear to interfere or cause the slowdown.

If I don't find time before the holidays, I'll at least point you to the project code:

  • Thread pool usage
  • The test code (namely the cases that use the gsm-amzn2-aarch64_version_bump.yaml and libprotobuf_version_bump.yaml files)
  • Recent GHA run (note that the pytest-cov phase takes a minute to complete; this used to be on the order of a few seconds). I will point out that I haven't addressed the warnings related to weakref yet.

On my current working branch, I've added what should have been a simple test that now takes 7 minutes to complete.
Edit: The 7 minute test was a complete goof on my part that I conflated with the other issue.

Maybe I have some battle scars from my IoT days, but I suppose I could have some of these problematic tests write to actual disk. That's a PITA I could probably work around for local testing, and it shouldn't be an issue on the GitHub Actions runners.

@mrbean-bremen
Member

Thank you! I'm also not sure when I will tackle this - I have a couple of other things in my queue, and there are the holidays of course.
As for the libraries used:
xdist should not be a problem, as each process will have its own fake filesystem. This makes it more difficult to debug, of course, but I guess the problem will persist when running without xdist.
tempfile should not be a problem, as it is mostly patched like any other module (with the exception of some additional patching done under POSIX to handle a default argument that is a file system function).
requests would probably be a problem, but not if it is mocked out, of course. I'm not sure about click - I do remember some problem with it, I'll have to check.

If you are doing more integrated tests, sometimes you can't get around using the real filesystem. A good compromise is to use a RAM disk as the file system, if that is possible; that would make cleaning it up easy.
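Something along these lines could serve as a fixture, for example (just a sketch with assumed paths, not something pyfakefs provides; /dev/shm is usually a tmpfs mount on Linux):

```python
# Sketch of a RAM-backed working directory for tests (assumptions: /dev/shm is
# a tmpfs mount; the fixture name "ram_workdir" is made up for this example).
import tempfile
from pathlib import Path

import pytest


@pytest.fixture
def ram_workdir(tmp_path_factory):
    shm = Path("/dev/shm")
    if shm.is_dir():
        # Create a private directory on the tmpfs mount; it is removed on teardown.
        with tempfile.TemporaryDirectory(dir=shm) as ram_dir:
            yield Path(ram_dir)
    else:
        # Fall back to pytest's regular temporary directory on other platforms.
        yield tmp_path_factory.mktemp("ram_fallback")
```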

Anyway, I'm interested in understanding the problem, even if it cannot be fixed easily - it sounds a bit weird.

@schuylermartin45
Author

I've had some issues with mixing click and pathlib with pyfakefs. With how click uses decorators, you can't guarantee that a Path object will be constructed after pyfakefs mocks all the calls. We have a few notes in the code base about this and have had to use string file paths with the click decorators as a workaround.
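To illustrate the workaround (a made-up example, not our actual CLI):

```python
# Made-up example of the workaround: keep the click argument as a plain string
# and only construct the Path inside the command body, so it happens after
# pyfakefs has patched pathlib in the tests.
from pathlib import Path

import click


@click.command()
@click.argument("config_file", type=click.Path())  # plain string, not a Path
def bump_version(config_file: str) -> None:
    config_path = Path(config_file)  # built at call time, not at import time
    click.echo(config_path.read_text())
```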

I've thought about using a RAM disk or setting up a local container to run these in, but that's a lot of overhead and maintenance I'd like to try to avoid. The more steps I add to my setup process, the less likely folks will be willing to try to fork and contribute to the project. I'm sure I could find a way to streamline that, but the appeal to me to use pyfakefs is how easy it is to setup/use once it is installed.

Total props for maintaining this project btw, I've really enjoyed using it so far.
