Descriptors not inherited on Windows #97

Closed
Naikless opened this issue Jan 27, 2022 · 9 comments
Comments

@Naikless

I define a class with a custom descriptor that works very similarly to a property, and a method that uses the multiprocessing module while accessing this descriptor:

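from pathos.multiprocessing import ProcessPool as Pool  # assumed import; the original post does not show it

class MyClass():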
    
    class Descriptor:
        def __init__(self, value=None):
            self.value = value
        def __get__(self, instance, owner=None):
            return self.value
   
    @classmethod
    def init_descriptor(cls,value):
        cls.descr = cls.Descriptor(value)
    
    def __init__(self):    
        self.init_descriptor('descriptor')

    def fun(self):
        def f(x):
            return self.descr
                
        with Pool(4) as p:
            collect = list(p.imap(f, range(5)))
            p.clear()
            
        return collect

The descriptor gets initialized once an instance of that class is built. Calling MyClass().fun() returns the expected result.

Now I define a subclass:

class MySubClass(MyClass):
    def __init__(self):
        super().__init__()

While running MySubClass().descr still works as expected, MySubClass().fun() results in

AttributeError: 'MySubClass' object has no attribute 'descr'

Mind you, this issue only comes up on a Windows machine!

I have tested it with a regular property instead and it works fine, so I wonder if I have to define any other special attributes for my descriptor to get it inherited properly.

Is this even an issue with pathos.multiprocess or a flaw in the descriptor protocol?
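
One detail of the snippet above can be checked without any multiprocessing. Because init_descriptor is a classmethod, cls is the class of the instance being built, so MySubClass() stores descr on MySubClass rather than on MyClass. The following is a minimal sketch that reuses the classes from this post; whether this detail is connected to the Windows failure is not established in this thread.

```python
class MyClass:

    class Descriptor:
        def __init__(self, value=None):
            self.value = value

        def __get__(self, instance, owner=None):
            return self.value

    @classmethod
    def init_descriptor(cls, value):
        # cls is the class the call was made through, not necessarily MyClass
        cls.descr = cls.Descriptor(value)

    def __init__(self):
        self.init_descriptor('descriptor')


class MySubClass(MyClass):
    def __init__(self):
        super().__init__()


sub = MySubClass()
value = sub.descr                          # lookup goes through the descriptor
on_subclass = 'descr' in vars(MySubClass)  # attribute was attached here at runtime
on_base = 'descr' in vars(MyClass)         # MyClass itself was never instantiated

print(value, on_subclass, on_base)
```

In both cases the attribute only exists after __init__ has run in the current process; it is not part of the class body.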

@mmckerns
Member

Well, multiprocessing works differently on Windows, as there's no fork on Windows. So, multiprocess uses the multiprocess.reduction.ForkingPickler derived class instead of the dill.Pickler.
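
As a rough illustration of the platform difference, the standard library's multiprocessing module (which multiprocess is a fork of) can report which start methods are available. This sketch only inspects them and starts no processes; on Windows the list is expected to contain only 'spawn'.

```python
import multiprocessing as mp
import sys

# Start methods supported on this platform, e.g. fork/spawn/forkserver on Linux
methods = mp.get_all_start_methods()
fork_available = 'fork' in methods

# A context pinned to 'spawn' mimics the Windows behaviour on any platform:
# workers start in a fresh interpreter and receive their inputs via pickling.
spawn_ctx = mp.get_context('spawn')

print(sys.platform, methods, fork_available, spawn_ctx.get_start_method())
```

With fork, a worker inherits the parent's memory, including class attributes set at runtime. With spawn, it only sees what is re-imported or explicitly serialized.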

Can you follow up with:

  1. let me know what OS and versions of (python, multiprocess, dill) you have tried
  2. try refactoring to move the Pool outside the class
  3. see what happens when passing f to dill.pickles and/or dill.check and/or dill.copy
  4. before running fun, do this: import dill; dill.settings['recurse'] = True
  5. before running fun, do this: import dill; dill.detect.trace(True)

That should help me gather a bit more information.
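
Steps 3 to 5 above can be sketched roughly as follows. The function f here is a hypothetical stand-in for the closure defined in fun, and the guarded import is only there so the snippet runs (as a no-op) where dill is not installed. dill.check is left out because it launches a separate interpreter.

```python
try:
    import dill  # third-party package; assumed to be installed alongside pathos
except ImportError:
    dill = None


def make_f(value):
    # Hypothetical stand-in for the closure `f` defined inside `fun`
    def f(x):
        return value
    return f


f = make_f('descriptor')
report = None

if dill is not None:
    dill.settings['recurse'] = True      # step 4: only pickle globals that f refers to
    report = {
        'pickles': dill.pickles(f),      # step 3: does f survive a round trip?
        'copy_result': dill.copy(f)(0),  # step 3: call the round-tripped copy
    }
    dill.detect.trace(True)              # step 5: log each object as it is pickled
    dill.dumps(f)
    dill.detect.trace(False)

print(report)
```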

@Naikless
Author

Naikless commented Jan 28, 2022

Thanks for the quick response!

  1. So far, I have tested it on Windows 10 using dill 0.3.4 and multiprocess 0.70.12.2 in Python 3.6.13, 3.7.11 and 3.8.12 environments. My Linux system, where everything works, has Python 3.7.11, dill 0.3.4 and multiprocess 0.70.12.2, although I suppose that is irrelevant.
  2. If I put fun at module level and just hand my class instance in as an argument, the same behavior (MyClass works, MySubClass does not) is observed. Was this what you had in mind?
  3. When passing f to dill.pickles, it returns True in any case. dill.check always returns None and dill.copy returns the function object.
  4. No change when setting this.
  5. I have compared the trace log for MyClass and MySubClass.

EDIT: The difference between the two logs is that in the case for the subclass, the log contains additional entries not present in the log for the parent class that are listed below (multiple entries omitted). ..... indicate passages of identical log entries for both cases.

..............

# D2
# T2
D2: <dict object at 0x000001A29FCA5600>
F1: <function MySubClass.__init__ at 0x000001A29FC7A5E0>
Co: <code object __init__ at 0x000001A29FB4A500, file "pathToScriptFile", line 64>
# Co
D3: <dict object at 0x000001A29FB12D80>
# D3
Ce: <cell at 0x000001A29FA75760: type object at 0x000001A29F215A70>
T5: <class '__main__.MySubClass'>
# T5
# Ce
D2: <dict object at 0x000001A29FCAFE80>
# D2
# F1
D2: <dict object at 0x000001A29FA7C3C0>

.............

F1: <function MyClass.Descriptor.__get__ at 0x000001A29FC7A310>
Co: <code object __get__ at 0x000001A29FB24EA0, file "pathToScriptFile", line 32>
F2: <function rebuild_exc at 0x00000206680D2A60>
# Co
# F2
D3: <dict object at 0x000001A29FB12D80>
T1: <class 'AttributeError'>
# D3
F2: <function _load_type at 0x0000020667D74DC0>
D2: <dict object at 0x000001A29FCA5900>
# F2
# D2
# T1
# F1
# D2
F2: <function rebuild_exc at 0x00000206680D2A60>
# F2
# T2
T1: <class 'AttributeError'>
Cm: <classmethod object at 0x000001A29F8FA7F0>
F2: <function _load_type at 0x0000020667D74DC0>
T1: <class 'classmethod'>
# F2
# T1
# T1
F1: <function MyClass.init_descriptor at 0x000001A29FC70CA0>
Co: <code object init_descriptor at 0x000001A29FB4A030, file "pathToScriptFile", line 35>
F2: <function rebuild_exc at 0x00000206680D2A60>
# Co
# F2
D3: <dict object at 0x000001A29FB12D80>
T1: <class 'AttributeError'>
# D3
F2: <function _load_type at 0x0000020667D74DC0>
# F2

....................

(p.imap(f, range(5)))
F2: <function rebuild_exc at 0x0000020A5A6F2A60>

@mmckerns
Member

Can you edit your last post to identify which trace belongs to which situation? It'd also be good to compare the trace for the failure case on Windows vs success on Linux.

@Naikless
Author

Naikless commented Jan 28, 2022

I tried to clarify this, but I could also attach the full logs.

On Linux, the first section of the log is different, probably due to the different implementation of multiprocess. The following lines are only present on Windows:

D2: <dict object at 0x000001A29FC88A00>
T4: <class 'multiprocess.process.AuthenticationString'>
# T4
# D2
T4: <class 'multiprocess.context.SpawnProcess'>
# T4
D2: <dict object at 0x000001A29FC85340>
D2: <dict object at 0x000001A29FC88940>
T4: <class 'multiprocess.process.AuthenticationString'>
# T4
# D2
F2: <function worker at 0x000001A29FC16430>
# F2
T4: <class 'multiprocess.queues.SimpleQueue'>
# T4
F2: <function rebuild_pipe_connection at 0x000001A29FC160D0>
# F2
T4: <class 'multiprocess.reduction.DupHandle'>
# T4
D2: <dict object at 0x000001A29FC8D800>
# D2
D2: <dict object at 0x000001A29FC8D580>
# D2
T4: <class 'multiprocess.synchronize.Lock'>
# T4
D2: <dict object at 0x000001A29FC9D180>
# D2
D2: <dict object at 0x000001A29FC9D440>
# D2
D2: <dict object at 0x000001A29FC88980>
# D2
# D2
D2: <dict object at 0x000001A29FCA31C0>
T4: <class 'multiprocess.process.AuthenticationString'>
# T4
# D2
T4: <class 'multiprocess.context.SpawnProcess'>
# T4
D2: <dict object at 0x000001A29FC57080>
D2: <dict object at 0x000001A29FC88A00>
T4: <class 'multiprocess.process.AuthenticationString'>
# T4
# D2
F2: <function worker at 0x000001A29FC16430>
# F2
T4: <class 'multiprocess.queues.SimpleQueue'>
# T4
F2: <function rebuild_pipe_connection at 0x000001A29FC160D0>
# F2
T4: <class 'multiprocess.reduction.DupHandle'>
# T4
D2: <dict object at 0x000001A29FC9D580>
# D2
D2: <dict object at 0x000001A29FC8D4C0>
# D2
T4: <class 'multiprocess.synchronize.Lock'>
# T4
D2: <dict object at 0x000001A29FC9D480>
# D2
D2: <dict object at 0x000001A29FC9D9C0>
# D2
D2: <dict object at 0x000001A29FC8D1C0>
# D2
# D2
D2: <dict object at 0x000001A29FCA3200>
T4: <class 'multiprocess.process.AuthenticationString'>
# T4
# D2
T4: <class 'multiprocess.context.SpawnProcess'>
# T4
D2: <dict object at 0x000001A29FC88980>
D2: <dict object at 0x000001A29FCA31C0>
T4: <class 'multiprocess.process.AuthenticationString'>
# T4
# D2
F2: <function worker at 0x000001A29FC16430>
# F2
T4: <class 'multiprocess.queues.SimpleQueue'>
# T4
F2: <function rebuild_pipe_connection at 0x000001A29FC160D0>
# F2
T4: <class 'multiprocess.reduction.DupHandle'>
# T4
D2: <dict object at 0x000001A29FC9DE80>
# D2
D2: <dict object at 0x000001A29FC8D3C0>
# D2
T4: <class 'multiprocess.synchronize.Lock'>
# T4
D2: <dict object at 0x000001A29FC9DD00>
# D2
D2: <dict object at 0x000001A29FCA3480>
# D2
D2: <dict object at 0x000001A29FCA3280>
# D2
# D2
D2: <dict object at 0x000001A29FCA3640>
T4: <class 'multiprocess.process.AuthenticationString'>
# T4
# D2
T4: <class 'multiprocess.context.SpawnProcess'>
# T4
D2: <dict object at 0x000001A29FC8D1C0>
D2: <dict object at 0x000001A29FCA3200>
T4: <class 'multiprocess.process.AuthenticationString'>
# T4
# D2
F2: <function worker at 0x000001A29FC16430>
# F2
T4: <class 'multiprocess.queues.SimpleQueue'>
# T4
F2: <function rebuild_pipe_connection at 0x000001A29FC160D0>
# F2
T4: <class 'multiprocess.reduction.DupHandle'>
# T4
D2: <dict object at 0x000001A29FCA35C0>
# D2
D2: <dict object at 0x000001A29FC9D180>
# D2
T4: <class 'multiprocess.synchronize.Lock'>
# T4
D2: <dict object at 0x000001A29FCA3440>
# D2
D2: <dict object at 0x000001A29FCA3900>
# D2
D2: <dict object at 0x000001A29FCA36C0>
# D2
# D2
F1: <function starargs.<locals>.<lambda> at 0x000001A29FC945E0>
F2: <function _create_function at 0x000001A29FB86820>
# F2
Co: <code object <lambda> at 0x000001A29FBCCC90, file "pathToMiniconda3\lib\site-packages\pathos\helpers\mp_helper.py", line 15>
F2: <function _create_code at 0x000001A29FB868B0>
# F2

After this, the log is almost the same, except for several occurrences of

F2: <function rebuild_exc at 0x00000206680D2A60>
# Co
# F2
D3: <dict object at 0x000001A29FB12D80>
T1: <class 'AttributeError'>
# D3
F2: <function _load_type at 0x0000020667D74DC0>

on the Windows system, which I assume are symptoms of the issue.

@Naikless
Author

Naikless commented Jan 30, 2022

I was able to narrow the issue down a little further. I believe the problem lies in the fact that descr is only created when __init__ is called. If I alter the first lines of MyClass like this:

class MyClass():
    
    class Descriptor:
        def __init__(self, value=None):
            self.value = value
        def __get__(self, instance, owner=None):
            return self.value
    descr = Descriptor('class default')

The above code will then run, although the result will be 'class default' instead of 'descriptor'.

Could you tell me how the objects are copied/reinitialized in the worker processes?

FWIW, you can reproduce the exact same behavior with IPython's autoreload feature, which only replaces each object's __class__ attribute with the updated class. On attribute lookup, only the default value already existing in the class definition can be found (or an AttributeError occurs if no such default is defined, as in the original example). Maybe something similar happens during multiprocessing?
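
The autoreload analogy can be imitated in a single process. This is a minimal sketch, not a reproduction of the Windows failure: define_class is a hypothetical helper that builds a fresh class object from the same source, and a plain value stands in for the descriptor to keep it short.

```python
def define_class():
    # Each call builds a brand-new class object from the same definition,
    # similar to what re-executing the class statement would do.
    class Thing:
        @classmethod
        def init_descriptor(cls, value):
            cls.descr = value  # attached to the class at runtime

        def __init__(self):
            self.init_descriptor('descriptor')

    return Thing


Original = define_class()
obj = Original()
before = obj.descr            # found on Original, set by __init__

Fresh = define_class()        # same source, but __init__ never ran for this class
obj.__class__ = Fresh         # swap the class without re-initializing the object

try:
    after = obj.descr
except AttributeError:
    after = 'AttributeError'

print(before, after)
```

The instance keeps its own __dict__ across the swap, but anything that lived only on the old class object is gone.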

@mmckerns
Member

I'm not sure what you mean about how objects are "copied/reinitialized in the worker processes". I think you are asking about how objects are passed to the worker process. They are serialized (i.e. "pickled") with dill, which turns them into unique strings. These are then passed across the process boundary, and the string is converted back to an object (i.e. "unpickled"). A common point of failure is that the serializer can't pickle (or sometimes unpickle) the object correctly, and you either get an error or, worse, the serialization "works" but doesn't do something correctly (like save the instance attribute value, and instead only holds the default). Basically, on the new process, a new class will be created, and then the values of the attributes and other state will be loaded. It can call __init__.
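
To make the round trip concrete, here is a small sketch using the standard library's pickle as a stand-in for dill (dill may serialize classes differently, so this only illustrates the general idea). The class name Worker and the attribute names are hypothetical. It checks which values actually end up in the serialized payload.

```python
import pickle


class Worker:
    def __init__(self):
        self.instance_value = 'stored on the instance'


obj = Worker()
Worker.class_value = 'stored on the class'  # class attribute added at runtime

payload = pickle.dumps(obj)    # what would cross the process boundary
clone = pickle.loads(payload)  # what the receiving side rebuilds

# Plain pickle stores the instance __dict__ plus a reference to the class by name,
# so the runtime class attribute is not part of the payload.
instance_in_payload = b'stored on the instance' in payload
class_in_payload = b'stored on the class' in payload

print(clone.instance_value, instance_in_payload, class_in_payload)
```

Within one process the clone can still see class_value, because both objects share the same class. A freshly started worker would only have it if the class were serialized by value or the attribute were set again there.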

@mmckerns
Member

With regard to descriptor serialization, see dill: uqfoundation/dill#450 and similar.

@Naikless
Author

Alright, playing around with this led me to the conclusion that my original implementation using the descriptor protocol was flawed anyway, because I actually want to store everything at instance level. I have changed my code accordingly and multiprocessing works without issue now.

While this solves my actual problem, it is still curious that the above works for a class but not a subclass.
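
The changed code is not shown in this thread. As one possible reading of "store everything at instance level", here is a hypothetical sketch of a descriptor that keeps its value in each instance's __dict__; the name InstanceDescriptor and the storage_name attribute are illustrative assumptions, not the author's code.

```python
class InstanceDescriptor:
    """Descriptor that stores its value per instance rather than on the class."""

    def __set_name__(self, owner, name):
        # Called once when the owning class body is executed
        self.storage_name = '_' + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        setattr(instance, self.storage_name, value)


class MyClass:
    descr = InstanceDescriptor()  # defined in the class body, so subclasses inherit it

    def __init__(self, value='descriptor'):
        self.descr = value        # lands in the instance __dict__ via __set__


class MySubClass(MyClass):
    pass


a = MySubClass('a')
b = MySubClass('b')
print(a.descr, b.descr, vars(a))
```

Because the value lives in the instance __dict__, it is part of the instance state that a serializer normally carries along, and the descriptor itself exists as soon as the class is defined.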

@mmckerns
Member

I'm going to close this as resolved. Feel free to reopen if there's more to discuss here.

@mmckerns added this to the multiprocess-0.70.15 milestone Jan 30, 2023