Packages downloaded from anaconda.org have unsupported filenames? #230

Closed
2 tasks done
bdice opened this issue Feb 22, 2024 · 7 comments
Labels
locked [bot] locked due to inactivity type::bug describes erroneous operation, use severity::* to classify the type

Comments

@bdice

bdice commented Feb 22, 2024

Checklist

  • I added a descriptive title
  • I searched open reports and couldn't find a duplicate

What happened?

I have a question that may be a bug in conda-package-handling. It might be intentional behavior.

I often download packages from anaconda.org to check their contents. I visited https://anaconda.org/conda-forge/nanoarrow/files and clicked a link to download the .conda package. This saved a file named linux-64_nanoarrow-0.4.0-py310h2372a71_0.conda. However, running cph x linux-64_nanoarrow-0.4.0-py310h2372a71_0.conda fails. It gives an error like:

LookupError: didn't find info-linux-64_nanoarrow-0.4.0-py310h2372a71_0 component in /mnt/c/Users/bdice/Downloads/linux-64_nanoarrow-0.4.0-py310h2372a71_0.conda

Full traceback:

$ cph x linux-64_nanoarrow-0.4.0-py310h2372a71_0.conda
Traceback (most recent call last):
  File "/home/bdice/miniforge3/bin/cph", line 10, in <module>
    sys.exit(main())
  File "/home/bdice/miniforge3/lib/python3.10/site-packages/conda_package_handling/cli.py", line 121, in main
    api.extract(args.archive_path, args.dest, prefix=args.prefix)
  File "/home/bdice/miniforge3/lib/python3.10/site-packages/conda_package_handling/api.py", line 77, in extract
    format.extract(fn, dest_dir, components=components)
  File "/home/bdice/miniforge3/lib/python3.10/site-packages/conda_package_handling/conda_fmt.py", line 46, in extract
    _extract(str(fn), str(dest_dir), components=components)
  File "/home/bdice/miniforge3/lib/python3.10/site-packages/conda_package_handling/streaming.py", line 35, in _extract
    stream = package_streaming.stream_conda_component(
  File "/home/bdice/miniforge3/lib/python3.10/site-packages/conda_package_streaming/package_streaming.py", line 133, in stream_conda_component
    raise LookupError(f"didn't find {component_name} component in {filename}")
LookupError: didn't find info-linux-64_nanoarrow-0.4.0-py310h2372a71_0 component in /mnt/c/Users/bdice/Downloads/linux-64_nanoarrow-0.4.0-py310h2372a71_0.conda

The problem is that the filename must be changed to nanoarrow-0.4.0-py310h2372a71_0.conda to match the component names, which are named like info-nanoarrow-0.4.0-py310h2372a71_0.tar.zst.
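The mismatch can be checked by listing what is actually stored in the archive. This is a minimal sketch, assuming only that a .conda file is a zip archive with exactly one member named info-<name>.tar.zst, as the component name above suggests. `list_components` and `canonical_filename` are hypothetical helpers, not part of conda-package-handling.

```python
import zipfile


def list_components(conda_path):
    """Return the member names stored in a .conda (zip) archive."""
    with zipfile.ZipFile(conda_path) as zf:
        return zf.namelist()


def canonical_filename(conda_path):
    """Derive '<name>.conda' from the archive's info-<name>.tar.zst member.

    Assumption: exactly one such member exists; otherwise refuse to guess.
    """
    prefix, suffix = "info-", ".tar.zst"
    infos = [
        name
        for name in list_components(conda_path)
        if name.startswith(prefix) and name.endswith(suffix)
    ]
    if len(infos) != 1:
        raise LookupError(f"expected one info component, found {infos}")
    return infos[0][len(prefix):-len(suffix)] + ".conda"
```

Renaming the downloaded file to the value returned by `canonical_filename` would give it the name that the component lookup expects.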

The line linked below is trying to find a component named the same way as the file.

_extract(str(fn), str(dest_dir), components=components)

Is it reasonable to require that the filename matches the names of the components? I am a bit surprised that renaming the file would make it impossible to extract.
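The name in the error message can be reproduced without the library. The sketch below assumes, based only on the LookupError text in the traceback, that the expected component name is the component kind joined to the file's basename with its extension removed. `expected_component_name` is an illustrative name, not the conda-package-streaming API.

```python
import os


def expected_component_name(path, component="info"):
    """Build '<component>-<basename without extension>' from a file path."""
    stem, _, _ = os.path.basename(path).rpartition(".")
    return f"{component}-{stem}"


# Name as saved by the browser download.
print(expected_component_name("linux-64_nanoarrow-0.4.0-py310h2372a71_0.conda"))
# Name after renaming the file by hand.
print(expected_component_name("nanoarrow-0.4.0-py310h2372a71_0.conda"))
```

Only the second result is a prefix of the info-nanoarrow-0.4.0-py310h2372a71_0.tar.zst member, which is consistent with the rename making extraction work.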

Conda Info

conda version : 23.11.0

Conda Config

channels:
  - conda-forge

Conda list

conda-package-handling    2.2.0              pyh38be061_0    conda-forge
conda-package-streaming   0.9.0              pyhd8ed1ab_0    conda-forge

Additional Context

No response

@bdice bdice added the type::bug describes erroneous operation, use severity::* to classify the type label Feb 22, 2024
@github-project-automation github-project-automation bot moved this to 🆕 New in 🧭 Planning Feb 22, 2024
@msarahan
Contributor

I think the relevant code is here: https://github.com/conda/conda-package-streaming/blob/main/conda_package_streaming/package_streaming.py#L127

It certainly could be a more flexible scheme, but just matching prefix (info-) might have some unanticipated edge cases.
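One way to picture a more flexible scheme, and the kind of edge case it would have to guard against, is to match on the component prefix and the .tar.zst suffix and refuse to guess when more than one member matches. This is a hypothetical sketch, not how conda-package-streaming behaves; `find_component` and its parameters are made up for illustration.

```python
def find_component(member_names, component="info", suffix=".tar.zst"):
    """Pick the single archive member for a component, ignoring the file name.

    Raises LookupError when no member matches or when the match is ambiguous.
    """
    prefix = f"{component}-"
    matches = [
        name
        for name in member_names
        if name.startswith(prefix) and name.endswith(suffix)
    ]
    if not matches:
        raise LookupError(f"no {component} component found")
    if len(matches) > 1:
        raise LookupError(f"ambiguous {component} component: {matches}")
    return matches[0]
```

The ambiguity check is the design choice that matters here: an archive holding two info-* members would otherwise be extracted from whichever one happened to be listed first.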

@bdice
Author

bdice commented Feb 22, 2024

> just matching prefix (info-) might have some unanticipated edge cases.

Right, that's why I wasn't sure if this was intended behavior. However, it seems like it's quite a stringent requirement for the file to have a particular name in order to be extracted properly. It's certainly not obvious that the file name should have any effect on extracting it (no other compressed format or package format has such a requirement that I am aware of).

@jakirkham
Member

It has to do with how the files are downloaded. The response headers have issues.

Clicking the download link and letting the browser handle the download doesn't work. Copying the link from Anaconda and using another download tool (like curl or wget) does work.
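A download that names the file after the URL path, rather than after anything the server sends in its headers, can be sketched as follows. This assumes the copied link ends in the package's real filename, which is an inference from the comment above and not something verified here. `name_from_url` and `download` are hypothetical helpers.

```python
import os
import shutil
import urllib.parse
import urllib.request


def name_from_url(url):
    """Return the last path segment of the URL, ignoring query and fragment."""
    path = urllib.parse.urlsplit(url).path
    return os.path.basename(urllib.parse.unquote(path))


def download(url, dest_dir="."):
    """Save the URL's content into dest_dir under the name from the URL path."""
    target = os.path.join(dest_dir, name_from_url(url))
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Calling `download` with the link copied from the files page would then save the package under the URL's own filename, in the same way the curl and wget approach is described as working.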

There's more context in issue: conda/infrastructure#868

@mfansler

> However, it seems like it's quite a stringent requirement for the file to have a particular name in order to be extracted properly. It's certainly not obvious that the file name should have any effect on extracting it (no other compressed format or package format has such a requirement that I am aware of).

I'd second the sentiment here. Coupling the ability to uncompress with the file name is unnecessarily fragile. Aside from getting the website to serve downloads without changing names, I'd really like to see the sensitivity to filename engineered out of the format.

@dholth
Contributor

dholth commented Jun 11, 2024

We will fix anaconda.org and will close this ticket when it is deployed.

@dholth
Contributor

dholth commented Jul 2, 2024

This should be fixed on anaconda.org. Let us know if it is working for you.

@dholth dholth closed this as completed Jul 2, 2024
@github-project-automation github-project-automation bot moved this from 🆕 New to 🏁 Done in 🧭 Planning Jul 2, 2024
@bdice
Author

bdice commented Jul 2, 2024

This appears to work now! Thank you @dholth.

@github-actions github-actions bot added the locked [bot] locked due to inactivity label Dec 30, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 30, 2024
Projects
Archived in project
5 participants