Enum used in map functions will raise a RecursionError with dill. #2643
Comments
I'm running into this as well. (Thank you so much for reporting @jorgeecardona; I was staring at this massive stack trace and unsure what exactly was wrong!)
Hi! Thanks for reporting :) Until this is fixed on Let me know if such a workaround could help, and feel free to open a PR if you want to contribute!
Hi! I think your RecursionError comes from a different issue @BitcoinNLPer, could you open a separate issue please? Also, which dataset are you using? I tried loading
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/quentinlhoest/Desktop/hf/datasets/src/datasets/load.py", line 1615, in load_dataset
    **config_kwargs,
  File "/Users/quentinlhoest/Desktop/hf/datasets/src/datasets/load.py", line 1446, in load_dataset_builder
    builder_cls = import_main_class(dataset_module.module_path)
  File "/Users/quentinlhoest/Desktop/hf/datasets/src/datasets/load.py", line 101, in import_main_class
    module = importlib.import_module(module_path)
  File "/Users/quentinlhoest/.virtualenvs/hf-datasets/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/Users/quentinlhoest/.cache/huggingface/modules/datasets_modules/datasets/CodedotAI___code_clippy/d332f69d036e8c80f47bc9a96d676c3fa30cb50af7bb81e2d4d12e80b83efc4d/code_clippy.py", line 66, in <module>
    url_elements = results.find_all("a")
AttributeError: 'NoneType' object has no attribute 'find_all'
Describe the bug
Enums used in functions passed to map fail at pickling with a maximum recursion exception, as described here: uqfoundation/dill#250 (comment)
In my particular case, I use an enum to define an argument with fixed options, using the TrainingArguments dataclass as base class together with the HfArgumentParser. In the same file I use a ds.map call that tries to pickle the content of the module, including the definition of the enum, and this runs into the dill bug described above.
Steps to reproduce the bug
Expected results
The known problem with dill could be prevented as explained in the link above (workaround). Since HfArgumentParser nicely uses the enum class for choices, it makes sense to also deal with this bug under the hood.
Actual results
Environment info
datasets version: 1.8.0
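For anyone hitting the same pattern, here is a minimal user-side sketch of one way to keep the enum out of the function handed to map. It is not the workaround from the linked dill comment, and I have not verified that it avoids the RecursionError on any particular dill or datasets version. The names Casing, ScriptArguments and apply_casing are hypothetical stand-ins for the enum, the TrainingArguments subclass and the mapped function in the report. The idea is simply to unwrap the enum member to its plain value before building the function, so the function body never refers to the enum class.

```python
from dataclasses import dataclass
from enum import Enum
from functools import partial


class Casing(Enum):
    # Hypothetical enum of fixed options, standing in for the one in the report.
    LOWER = "lower"
    UPPER = "upper"


@dataclass
class ScriptArguments:
    # Hypothetical stand-in for a TrainingArguments subclass with an enum field.
    casing: Casing = Casing.LOWER


def apply_casing(example, casing_value):
    # Only a plain string reaches this function, so its code never names Casing.
    text = example["text"]
    return {"text": text.upper() if casing_value == "upper" else text.lower()}


args = ScriptArguments(casing=Casing.UPPER)

# Unwrap the enum member once, outside the function that would be pickled.
map_fn = partial(apply_casing, casing_value=args.casing.value)

# With a datasets.Dataset named ds (not created here), the call would be ds.map(map_fn).
print(map_fn({"text": "Enum"}))
```

The trade-off is that the mapped function compares against raw strings instead of enum members, so typos in those strings are no longer caught by the enum.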