📚 Docs: Document that inputs is still needed for cache:false and that the daemon checksums every file in the repo, potentially blowing up memory #9583

Open
gaberudy opened this issue Dec 6, 2024 · 1 comment
gaberudy commented Dec 6, 2024

What is the improvement or update you wish to see?

We have a repo with a few tasks in various sub-directories. We build bioinformatics software, and one of our sub-directories sometimes contains 100+GB of test datasets.

I expected to be able to run a dev task like this one:

    "auth:dev": {
      "cache": false,
      "persistent": true
    },

With cache: false set, I expected turbo not to scan the directory.

Instead, turbo tries to compute checksums of the entire repository's contents, including the test files, which blows up memory and crashes computers (usage quickly reaches ~64GB of RAM).

I can stop this by setting inputs for the dev server task:

    "auth:dev": {
      "cache": false,
      "inputs": ["src/**"],
      "persistent": true
    },

But this was not intuitive: if cache is false, why should I need to tell turbo to look only under src/ for cache files?

Secondly, even with the configuration above, running "npx turbo run auth:dev" from the root of the monorepo also blew up memory. After much troubleshooting, I fixed it by adding

"daemon": false,

to my root turbo.json, because the daemon appears to checksum every file in the repository regardless of the inputs defined on individual tasks and the value of globalDependencies.
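
For reference, here is a minimal sketch of how the two workarounds might look together in a root turbo.json. It assumes a Turborepo 2.x layout where tasks are declared under the "tasks" key (1.x releases used "pipeline" instead), and the task name and src/** glob are just the ones from my example above, not a recommendation:

```jsonc
{
  // Assumption: Turborepo 2.x root config; on 1.x the "tasks" key would be "pipeline".
  // Workaround 2: stop the background daemon from hashing the whole repository.
  "daemon": false,
  "tasks": {
    "auth:dev": {
      "cache": false,
      // Workaround 1: restrict hashing to src/ even though caching is off.
      "inputs": ["src/**"],
      "persistent": true
    }
  }
}
```

In my case neither setting was enough on its own: inputs stopped the task-level scan, and daemon: false stopped the repository-wide one.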

The documentation for the daemon parameter says "Turborepo runs a background process to pre-calculate some expensive operations." It would be helpful to be able to control which files the daemon checksums, because hashing every file in the repo is quite surprising, and in this case disruptive, when there is no way to limit it.

Is there any context that might help us understand?

You can recreate this by creating a monorepo with a sub-directory that has a persistent cache:false task. In some sub-directory, add 100+GB of files and try to run the dev task with the default settings. Memory explodes.

Does the docs page already exist? Please link to it.

No response

@anthonyshew
Contributor

This is a good finding. I'm wondering if this is something where we could make one or more outright changes to use fewer resources, rather than just documenting it.

Either way, I appreciate you bringing this to our attention. I'm going to bring this up with the team and see what we all think.


3 participants