Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ABC and Enums #450

Closed
wants to merge 18 commits into from
Closed

Conversation

anivegesana
Copy link
Contributor

@anivegesana anivegesana commented Jan 27, 2022

  1. Support ABC classes (supersedes Issue 332 abc data #427)
  2. Support Enum classes
  3. Replace _dict_from_dictproxy with cloudpickle equivalent that doesn't save the members of the parent classes

Fixes #250, fixes #332, fixes #365


This PR was very large, so it was broken into pieces:

  1. Python 3 Metaclasses: Python 3 Metaclasses [Support ABC and Enums - Part 1] #577
  2. Abstract Base Classes: Abstract Base Classes [Support ABC and Enums - Part 2] #580
  3. Enums (minimal features): pending
  4. Enums (complicated features): pending

@gabyx
Copy link

gabyx commented May 5, 2022

Any progress on this?

@anivegesana
Copy link
Contributor Author

For most cases, the PR as it is works. But for a couple of edge cases (like ABC classes that redefine __new__ or __prepare__) some work needs to be done. I can guarantee that any class that is successfully pickled on this PR will continue to work when I fix those edge cases (and will likely still unpickle on master, but I didn't verify this), so if you need the feature urgently, you can pin dill to this commit and use it, but I do not guarantee that it will work for all ABCs. If you find out your ABC doesn't work, it will help me debug and it would be much appreciated.

@anivegesana
Copy link
Contributor Author

anivegesana commented May 5, 2022

@mmckerns Since demand is high, do you think that we should pull this PR (since it works in most cases) and open another PR where I tackle the remaining cases, so that people can use this feature in dill 0.3.5?

My only concern is that the issue concerning ABCs with __new__ produces the error on unpickling as opposed to pickling, so at minimum I want to detect the issue during pickling to prevent silent failures.

@mmckerns
Copy link
Member

mmckerns commented May 6, 2022

I am updating the way dill builds and installs, and not taking any new features for 0.3.5. The release is done, and just needs testing. I expect it to drop tomorrow. So, short answer, no, not for 0.3.5.

@mmckerns
Copy link
Member

mmckerns commented May 6, 2022

Also, I'd rather not make mistakes and then have to backtrack and break things backward. It's better to be careful and a bit slow.

@anivegesana
Copy link
Contributor Author

anivegesana commented May 6, 2022

In the process of trying to detect when it broke, I was able to find the bug. Now, the only case that I am aware of that this PR isn't working on is date enums. I think this will be ready for 0.3.6.

@anivegesana
Copy link
Contributor Author

@mmckerns I have reason to believe that, on the master branch, __new__ is broken for all classes, not just ABCs. I think we don't have any test cases that concern __new__, so I didn't catch it earlier. These few lines fix it: https://github.com/uqfoundation/dill/pull/450/files#diff-bdd40cc870892c410adce8c1b155daf5e6c5b5e7192e751f171e6dae6a578dc1R1622-R1628

If you aren't done making the release, can we add that to the release? Although it is not that common, I am sure that someone will pickle a class with __new__ in it and will cause an error. Tested it on mystic/examples3/test_lub_expected_error1.py and it doesn't break the other inner function test case.

@mmckerns
Copy link
Member

mmckerns commented May 6, 2022

These few lines fix it: https://github.com/uqfoundation/dill/pull/450/files#diff-bdd40cc870892c410adce8c1b155daf5e6c5b5e7192e751f171e6dae6a578dc1R1622-R1628

I'm done with changes and most of the testing, and will cut a PR to update the build before cutting the release. I haven't done the release yet, and may not until tomorrow. Go ahead and put the change into a PR, and I'll look at it. I can likely pull it before the release.

@anivegesana
Copy link
Contributor Author

anivegesana commented May 7, 2022

@mmckerns Not urgent. There is no way to pickle enums that look like this without additional information:

class AutoNumber(IntEnum):
    def __new__(cls):
        value = len(cls.__members__) + 1
        obj = int.__new__(cls, value)
        obj._value_ = value
        return obj

class Color(AutoNumber):
    red = ()
    green = ()
    blue = ()

The values assigned to the class attributes get updated with the enum value itself, so it is impossible to recover the constructor arguments. Should there be a hook or register function to define this new form of reduction, or should we just leave this feature as unsupported. cloudpickle doesn't support this either, but it generates the error during unpickling, which I find undesirable.

Since most people are looking for the core functionality of this PR and other libraries in the same domain don't consider this to be a problem, I will open this PR and let you take a look at it after dill 0.3.5 is out.

EDIT: This is now fixed.

@anivegesana anivegesana marked this pull request as ready for review May 7, 2022 04:09
@anivegesana
Copy link
Contributor Author

anivegesana commented May 7, 2022

Thoughts about making this part of the API public so that others can register hooks for pickling custom metaclasses that are not supported by this PR, like Django models?

https://github.com/uqfoundation/dill/pull/450/files#diff-bdd40cc870892c410adce8c1b155daf5e6c5b5e7192e751f171e6dae6a578dc1R1962-R1966

@anivegesana
Copy link
Contributor Author

I guess this PR has been "finished" for a while. The problem is that this PR tackles ABCs and Enums in the same PR because without a definite plan on how to incorporate both of them, it would have been hard to support both. But now, other PRs are leading to lots of merge conflicts and it might be better off to close this PR and reopen many smaller PRs that address smaller incomplete subsets of ABCs and Enums. It would help avoid bugs like #486 because it would be easier to merge individual pieces independent of others and not cause problems. If you haven't looked at this PR yet, that is the route I will take rather than creating another merge commit.

@leogama
Copy link
Contributor

leogama commented Aug 5, 2022

Do you want me to try to manually merge this PR with master?

@anivegesana
Copy link
Contributor Author

Sure. You can give it a shot.

@leogama
Copy link
Contributor

leogama commented Aug 6, 2022

It think I succeeded with the 3-way manual merge. It was easier than I thought. Opened a PR to your fork.

Merge branch 'master' into enums
@anivegesana
Copy link
Contributor Author

anivegesana commented Feb 13, 2023

There were many problems with this PR due to its size. It began to become too hefty and difficult to keep up with master to the point that I just gave up. Because this feature was demanded by many people, they started to fork this branch and stop updating dill in order to pickle ABCs and enums. This means that I probably should keep backwards compatibility with older commits on this branch, to prevent people from getting upset. New additions to the enum module in Python 3.11 make everything much more complicated. And it seems like other people who want to contribute fixes in order to add the functionality to master are blocked because they would be blocked by this unmaintainable monster.

Here is my plan. I will not keep this branch up to date with master. I will update it to work with Python 3.11 enums, while still supporting pickles from older versions of this branch. I can then ask how is this PR generally speaking (not a full review, just a general idea of if it is readable and is likely to be approved later). Then, I will break it up into smaller pieces for final approval.

@mmckerns Does this seem like a good plan? This will require a large amount of work and I probably won't be able to put any significant dent in it until the summer. I believe that this is a more important task than #534, so I will look into this instead.

There seem to be a lot of new additions in 3.11 and I need to read through them to understand what is possible and make sure I cover everything. A quick summary makes it seem like the only new features that need to be added (i.e., change how the enum class is pickled) are global_enum and FlagBoundary, but I need to make sure I am not missing anything. Maybe splitting it up first would be more realistic. Is this a feature that you are willing to prioritize?

@mmckerns
Copy link
Member

@anivegesana (this also a good note for @leogama): I've been hired in the past essentially to merge large unwieldy PRs. It can be a nasty task. Your instinct is correct in that small (the smaller the better) PRs are easy to merge. However, I strongly suggest you only pull off one or two small PRs at a time. Essentially, you want them as small as possible... and you want to also (1) propagate the changes back into this PR, and (2) not have too many small open PRs as it will become a management nightmare as each small PR has some small changes to it before it is merged. What you want ultimately is to get this PR to merge, incrementally through small PRs moving it closer to master. It will take time, so don't be in a rush. If you are most interested in this feature, then go for it.

@anivegesana
Copy link
Contributor Author

anivegesana commented Feb 14, 2023

I am thinking about breaking it into 3-4 pieces: "special metaclass framework utilities" that will allow for the customization of how metaclasses are pickled, ABCs, pre-3.11 enums, and 3.11 enums (if new changes are significant). Let me know if a different breakdown is preferable.

@mmckerns
Copy link
Member

"special metaclass framework utilities": this is a good place to start... and to do in one or more PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants