Could you suggest how to run from shortest sequence to longest sequence? #241

Closed
sky1ove opened this issue Jan 6, 2025 · 1 comment
Labels
question Further information is requested

sky1ove commented Jan 6, 2025

I ran a number of proteins and I keep hitting an out-of-memory error, which shuts down the run while I sleep.

However, I'm totally OK with dropping the big proteins that run out of memory. Is there any way to sort the files by sequence length, from short to long, so that it runs as many proteins as possible? (I only have a single protein per JSON file, so this should be fine.) Could you point me to the lines of code where I can make adjustments? Thanks.

@Augustin-Zidek Augustin-Zidek added the question Further information is requested label Jan 7, 2025
@Augustin-Zidek
Collaborator

You have two options:

  1. No-code option: when you run with --input_dir=... and provide a directory of individual JSON files, AlphaFold 3 processes them in alphabetical order (since 948827f). So you can name your input JSON files so that they sort by sequence length, e.g. for sequences with lengths 100, 514, and 1560 you would name your inputs input_0100.json, input_0514.json, input_1560.json. Zero-pad the lengths to the same width so that alphabetical order matches numeric order.
  2. If you want to do this in code, the place to modify is https://github.com/google-deepmind/alphafold3/blob/main/src/alphafold3/common/folding_input.py#L1344. You would have to load all the inputs, determine their lengths, then yield them in order of sequence length. Note that if you load all inputs into memory at once, you might run out of RAM (if they contain MSAs/templates), so make sure to do that lazily.
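
For illustration, here is a minimal sketch of the renaming approach from option 1, written as a standalone script rather than a change to AlphaFold 3 itself. It assumes each input JSON has a top-level "sequences" list whose entries hold a "protein", "rna", or "dna" object with "id" and "sequence" fields; adjust the keys if your files differ. The helper names (total_residues, sorted_by_length, copy_with_length_prefix) are hypothetical and not part of the AlphaFold 3 codebase. Each file is parsed one at a time and only its length is kept, so large MSAs/templates are never all held in RAM together.

```python
import json
import shutil
from pathlib import Path

# Assumed entity keys for polymer chains in the input JSON.
POLYMER_KEYS = ("protein", "rna", "dna")


def total_residues(json_path: Path) -> int:
    """Return the summed polymer length of one input file.

    Assumed layout:
    {"sequences": [{"protein": {"id": ..., "sequence": ...}}, ...]}
    """
    with json_path.open() as f:
        data = json.load(f)
    total = 0
    for entity in data.get("sequences", []):
        for key in POLYMER_KEYS:
            chain = entity.get(key)
            if chain is None:
                continue
            # A list of ids is treated as several copies of the same chain.
            ids = chain.get("id", "A")
            copies = len(ids) if isinstance(ids, list) else 1
            total += copies * len(chain.get("sequence", ""))
    # The parsed file goes out of scope here; only the integer is kept.
    return total


def sorted_by_length(input_dir: Path) -> list[tuple[int, Path]]:
    """Return (length, path) pairs, shortest first; ties broken by file name."""
    pairs = [(total_residues(p), p) for p in input_dir.glob("*.json")]
    return sorted(pairs, key=lambda pair: (pair[0], pair[1].name))


def copy_with_length_prefix(
    input_dir: Path, output_dir: Path, width: int = 6
) -> list[Path]:
    """Copy inputs to output_dir with a zero-padded length prefix.

    The prefix makes alphabetical order equal to length order.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for length, path in sorted_by_length(input_dir):
        target = output_dir / f"{length:0{width}d}_{path.name}"
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

Calling `copy_with_length_prefix(Path("inputs"), Path("inputs_sorted"))` and then pointing --input_dir at `inputs_sorted` should give shortest-first processing without touching the AlphaFold 3 source. For option 2, the same (length, path) list could drive the loading loop, parsing each file again only when it is about to be yielded.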
