RuntimeWarning: overflow encountered in reduce #236

Open
wtni-gidle opened this issue Dec 26, 2024 · 0 comments
Labels
question Further information is requested

wtni-gidle commented Dec 26, 2024

I’ve been using AlphaFold3 to predict some relatively large proteins, and I’m amazed that on an A40 (48GB) it can successfully predict proteins longer than 6000 residues! Of course, this requires enabling unified memory and customizing pair_transition_shard_spec. However, during featurisation, the following warning occasionally appears:

I1224 05:16:32.210339 140683971744768 pipeline.py:263] Got bucket size 4608 for input with 4592 tokens, resulting in 16 padded tokens.
/path/to/python3.11/site-packages/numpy/_core/fromnumeric.py:86: RuntimeWarning: overflow encountered in reduce
  return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
Featurising protein_xxx with rng_seed 2093475194 took 128.02 seconds.
Featurising data for seeds (2093475194,) took  132.53 seconds.
Running model inference for seed 2093475194...
Running model inference for seed 2093475194 took  11610.44 seconds.

The warning appears with particularly large inputs (e.g., 5878 tokens), but not consistently: some even larger inputs, such as a 6879-residue protein, do not trigger it.
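For context, NumPy emits "overflow encountered in reduce" when a reduction such as a sum produces a result too large for the array's floating-point dtype. The sketch below reproduces that message in isolation; it does not identify which array in the AlphaFold3 featurisation code triggers it, and the helper name `reduce_overflows` is made up for illustration.

```python
import numpy as np


def reduce_overflows(values: np.ndarray) -> bool:
    """Return True if summing `values` in their own dtype overflows."""
    # Promote the overflow warning to an exception inside this block only.
    with np.errstate(over="raise"):
        try:
            values.sum()
        except FloatingPointError:
            return True
    return False


# Each element fits in float16 (max about 65504), but the sum does not.
small = np.full(4, 60000, dtype=np.float16)
print(reduce_overflows(small))                     # True
print(reduce_overflows(small.astype(np.float32)))  # False
```

To find the actual call site, one option is to run the pipeline with `python -W error::RuntimeWarning`, which turns the warning into an exception with a full traceback.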

I inspected the outputs of the runs that produced the warning (including ranking_scores.csv and the generated .cif files), and the results seem reasonable:

  • The ranking scores are positive.
  • No severe structural clashes are observed in the predicted models.
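Since an overflow usually shows up downstream as `inf` or `nan`, a slightly stricter check than "positive" is to confirm every score is finite. This is a minimal sketch; the column name `ranking_score` and the example rows are assumptions for illustration, not values taken from the runs above.

```python
import csv
import io
import math


def non_finite_rows(csv_text: str, column: str = "ranking_score") -> list[dict]:
    """Return the rows whose score is NaN or infinite."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [row for row in rows if not math.isfinite(float(row[column]))]


# Made-up example content, not real AlphaFold3 output.
example = "seed,sample,ranking_score\n1,0,0.71\n1,1,inf\n1,2,nan\n"
bad = non_finite_rows(example)
print(len(bad))  # 2
```

For a real run, the file contents could be passed in with `open(path).read()`.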

Environment Details

  • GPU: NVIDIA A40 (48GB)
  • Memory Settings: Unified memory was enabled, with XLA_CLIENT_MEM_FRACTION set to 3.64 for 128G CPU memory and 4.74 for 180G CPU memory.
  • Custom Configuration: pair_transition_shard_spec
    pair_transition_shard_spec: Sequence[_Shape2DType] = (
        (2048, None),
        (3072, 1024),
        (None, 512),
    )
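As a rough illustration of the memory settings above: with unified memory, the fraction is expressed relative to GPU memory, so values above 1 let XLA spill into host RAM. The sketch below uses an assumed rule of thumb, (GPU + host) / GPU, which gives values close to but slightly above those in the post; how 3.64 and 4.74 were actually chosen is not stated. The function names are hypothetical.

```python
GPU_MEMORY_GB = 48  # A40, as stated in the post


def unified_memory_fraction(host_gb: float, gpu_gb: float = GPU_MEMORY_GB) -> float:
    """Assumed rule of thumb: total addressable memory divided by GPU memory."""
    return (gpu_gb + host_gb) / gpu_gb


def unified_memory_env(host_gb: float) -> dict[str, str]:
    """Environment variables to export before JAX initialises the GPU."""
    return {
        "XLA_PYTHON_CLIENT_PREALLOCATE": "false",
        "TF_FORCE_UNIFIED_MEMORY": "true",
        "XLA_CLIENT_MEM_FRACTION": f"{unified_memory_fraction(host_gb):.2f}",
    }


print(unified_memory_env(128)["XLA_CLIENT_MEM_FRACTION"])  # 3.67
print(unified_memory_env(180)["XLA_CLIENT_MEM_FRACTION"])  # 4.75
```

Setting the fraction a little below this upper bound, as the post does, leaves some headroom for the operating system and other processes.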
    

Does this warning have any impact, or can I safely ignore it?

Another question: would allocating more CPU memory and using a smaller shard size in pair_transition_shard_spec (e.g., (None, 256)) help predict even larger proteins?
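To make the proposed change concrete, here is a sketch of how such a spec could be read. It assumes each entry is (token upper bound, shard size), that the first entry whose bound covers the input applies, and that a shard size of None means no sharding. The helper `pick_shard_size` is hypothetical and is not AlphaFold3's own function.

```python
from typing import Optional, Sequence, Tuple

ShardSpec = Sequence[Tuple[Optional[int], Optional[int]]]


def pick_shard_size(num_tokens: int, spec: ShardSpec) -> Optional[int]:
    """Return the shard size of the first entry whose bound covers num_tokens."""
    for upper_bound, shard_size in spec:
        if upper_bound is None or num_tokens <= upper_bound:
            return shard_size
    raise ValueError("spec has no entry covering this input size")


issue_spec = ((2048, None), (3072, 1024), (None, 512))
finer_spec = ((2048, None), (3072, 1024), (None, 256))

print(pick_shard_size(4608, issue_spec))  # 512
print(pick_shard_size(4608, finer_spec))  # 256
print(pick_shard_size(1024, issue_spec))  # None
```

Under this reading, changing the last entry to (None, 256) only affects inputs above 3072 tokens, halving the shard size used for them.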

Thank you for your amazing work on AlphaFold3! 🙌

Augustin-Zidek added the question label Dec 27, 2024