Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: Datahandler now functions as a class #156

Merged
merged 33 commits into from
Jun 9, 2021

Conversation

swkeemink
Copy link
Member

This involves quite a lot of changes.
This is classed as an API change as anyone using custom dataloaders in
our previous way will either have to stay with an older version of
FISSA, or migrate their datahandlers to this new method.

The tutorial notebook now extensively explains this at the end.

Another important change this brings is that for multiprocessing the datahandler class instance is now passed into the function for multiprocessing instead of defined as a global library, which will help with solving #147.

@swkeemink swkeemink mentioned this pull request Jun 3, 2021
@scottclowe scottclowe added the vX.Y.0 New Minor Release This should be included in a minor release label Jun 3, 2021
@scottclowe scottclowe self-requested a review June 3, 2021 17:29
Copy link
Member

@scottclowe scottclowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes requested. Main thing is I want to see the two DataHandler classes in the same file. When that's done, the unit tests should also be refactored into the same file. I think it would be better to use an abstract class, though this is not strictly necessary.

I've not read the notebook changes yet because they don't render on GitHub. Presumably the notebook changes are only to do with using a custom datahandler?

from .. import roitools

datahandler = DataHandler()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this. datahandler is the name of the module and DataHandler the name of the class. Should not make the name of the instance of DataHandler clash with the name of the module. Need to use a different name for the instance, such as

Suggested change
datahandler = DataHandler()
dh = DataHandler()

(which obviously has implications for later code in the file which needs datahandler changed to dh)

Maybe we should consider renaming the module instead though.

Copy link
Member Author

@swkeemink swkeemink Jun 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not read the notebook changes yet because they don't render on GitHub. Presumably the notebook changes are only to do with using a custom datahandler?

Yes. With an explanation of how to use it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should consider renaming the module instead though.
The case of datahandler = DataHandler() is actually consistent with how use core.Experiment() class (we call exp=fissa.Experiment() or experiment=fissa.Experiment() a few times in the examples). But I agree it is not the best naming. Now I import DataHandler from datahandler, and then make an DataHandler object called datahandler. Yuk.

We could rename the module to extraction (following Suite2p) and keep the class-name, as well as use your dh suggestion, so it becomes

Suggested change
datahandler = DataHandler()
from ..extraction import DataHandler()
dh = DataHandler()

from .. import roitools

datahandler = DataHandler()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the other file.



def image2array(image):
"""Open a given image file as a PIL.Image instance.
class DataHandler(datahandler.DataHandler):
Copy link
Member

@scottclowe scottclowe Jun 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see this being done as a subclass. But I'd actually like to see the two data handlers put in the same file.

class DataHandler():
    # existing code as in other file

class DataHandlerFrameByFrame(DataHandler):
    # code as you have in this file

That would be much tidier. Also, you've got this weird thing at the moment where you can do

import datahandler_framebyframe
x = datahandler_framebyframe.DataHandler()
y = datahandler_framebyframe.datahandler.DataHandler()
x.__class__ == y.__class__  # Returns False and x and y have different behaviours
# Not behaviour that a user of this API would expect.

Also, I think it might be better coding practice to use an abstract class (DataHandlerAbstract) and have both DataHandler and DataHandlerFrameByFrame inherit from the abstract class instead of having DataHandlerFrameByFrame inherit from DataHandler. Because they are more like siblings than parent/child. Also then we expect custom made data handler objects to inherit from DataHandlerAbstract, but they don't have to inherit from DataHandler.

You should write DataHandlerAbstract to have all the methods used it DataHandler defined as raise NotImplementedError(). Any methods which are common to all data handlers can be implemented for real and then the children don't have to overwrite those methods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Then we can also merge the datahandler test files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About an abstract class -- I will implement it (and it will have only NotImplementedErrors, because all the functions are different for framebyframe!), but might not use it in the tutorial. It makes a lot more sense for most users to only edit one or two of the functions used in the standard DataHandler. I'll just make a comment for advanced users that they can define an entirely new handler from DataHandlerAbstract.

@scottclowe
Copy link
Member

Also, commit f9efaff is off-topic and should be in a different PR concerning verbosity changes. Sorry I missed that before!

@swkeemink
Copy link
Member Author

Also, commit f9efaff is off-topic and should be in a different PR concerning verbosity changes. Sorry I missed that before!

Actually it does fit here because it fixes an error I introduced myself earlier in this PR.>.>

@scottclowe
Copy link
Member

Also, commit f9efaff is off-topic and should be in a different PR concerning verbosity changes. Sorry I missed that before!

Actually it does fit here because it fixes an error I introduced myself earlier in this PR.>.>

Oh, I see. I missed it because it's not part of the PR's diff. Could you rebase and squash f9efaff into the commit that did the thing that it is undoing?

@swkeemink
Copy link
Member Author

I updated this PR. The datahandler module has been renamed to the extraction module. In the end I opted for datahandler = DataHandler() for the class declarations though, as dh = DataHandler() was much less intuitive to read the code with generally. Personally I quite like it like this and it is consistent with how we declare FISSA experiments, but happy to hear name change suggestions.

Some possibilities
datahandler = DataReader()

datahandler = ExtractTool()

@swkeemink
Copy link
Member Author

Aha! I can also rename the current DataHandler() class to DataHandlerTifffile(). This is already like that for the pillow data loader and the custom data loader after all.

@swkeemink
Copy link
Member Author

Aha! I can also rename the current DataHandler() class to DataHandlerTifffile(). This is already like that for the pillow data loader and the custom data loader after all.

This is now done. I'm happy with this to be merged into master and then we can do the other PRs.

This involves quite a lot of changes.
This is classed as an API change as anyone using custom dataloaders in
our previous way will either have to stay with an older version of
FISSA, or migrate their datahandlers to this new method.
I will describe this in detail in the tutorial notebook that is to
follow in another commit.
@swkeemink swkeemink force-pushed the datahandler_class branch from fb0cc12 to a1d0c45 Compare June 7, 2021 15:21
@swkeemink swkeemink force-pushed the datahandler_class branch from a1d0c45 to f55c8f7 Compare June 7, 2021 15:28
fissa/core.py Outdated Show resolved Hide resolved
fissa/core.py Outdated Show resolved Hide resolved
fissa/core.py Outdated Show resolved Hide resolved
fissa/core.py Outdated Show resolved Hide resolved
Comment on lines 70 to 71
self.datahandler = DataHandlerTifffile()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't something that's a common dependency across tests, so it should not be defined here. If there's something wrong with this line of code, it should only be breaking the one or two tests that use a datahandler and not every test that's part of this class.

Suggested change
self.datahandler = DataHandlerTifffile()

self.assert_equal(actual, self.expected)


class TestRois2Masks(BaseTestCase):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class TestRois2Masks(BaseTestCase):
class TestRois2MasksTifffile(BaseTestCase):

from ..extraction import DataHandlerTifffile, DataHandlerPillow
from .. import roitools

class TestImage2Array(BaseTestCase):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to stop these classes colliding with the same name!

Suggested change
class TestImage2Array(BaseTestCase):
class TestImage2ArrayTifffile(BaseTestCase):

os.remove('test.tif')


class TestRois2Masks(BaseTestCase):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class TestRois2Masks(BaseTestCase):
class TestRois2MasksPillow(BaseTestCase):

self.datahandler.rois2masks(polys3d, self.data)


class TestImage2Array(BaseTestCase):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class TestImage2Array(BaseTestCase):
class TestImage2ArrayPillow(BaseTestCase):

fissa/extraction.py Outdated Show resolved Hide resolved
@swkeemink swkeemink force-pushed the datahandler_class branch from 7f5a4bf to 4e952e0 Compare June 7, 2021 17:34
@codecov
Copy link

codecov bot commented Jun 9, 2021

Codecov Report

Merging #156 (53af2ab) into master (50a0ca5) will increase coverage by 0.28%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #156      +/-   ##
==========================================
+ Coverage   92.68%   92.97%   +0.28%     
==========================================
  Files           9        8       -1     
  Lines         779      811      +32     
  Branches      158      161       +3     
==========================================
+ Hits          722      754      +32     
  Misses         29       29              
  Partials       28       28              
Flag Coverage Δ
unittests 92.84% <100.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
fissa/core.py 96.16% <100.00%> (+0.25%) ⬆️
fissa/extraction.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 50a0ca5...53af2ab. Read the comment docs.

@scottclowe scottclowe merged commit cbfffd2 into rochefort-lab:master Jun 9, 2021
@swkeemink swkeemink deleted the datahandler_class branch June 29, 2021 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
vX.Y.0 New Minor Release This should be included in a minor release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants