Specify storage usability #1
I didn't notice that you were editing the same file XD
What do you mean by this? |
I think we can merge yours first, then break the doc into two parts: (1) what objects are allowed to be stored? (what my PR concerns itself with) and (2) what do they look like/what can we do with them after they're stored? (what your PR concerns itself with) I should also be more precise here. I mean to say that the back-ends should leverage |
ok but doesn't it mean that my definition of Actually I'm gonna merge mine anyway and we can change it. |
Yes and also yes 😂 |
Can we start a bit more abstract?
|
I don't know what you mean. Feel free to append it to the text.
Yes, I stopped working today after my first PR but otherwise I would have added it. I guess you can safely add this part.
I would say that's a requirement for the tools using the storage interface and not part of the specs for the storage interface.
It's an excellent point, but we have to properly define what lazy loading is.
YES! I guess we can pretty much take over the same definition for json? I'm ok with almost copy-pasting the conversions from Overall, I think you can open a new PR with these points or add them to this PR. |
I would like to have one PR per topic, this hopefully simplifies the discussion a bit. So if Liam agrees he can either integrate my suggestions or he can merge his pull request and I open a new one with the changes discussed here. |
To me it is very important that we have both definitions somewhere, maybe even both in this file: what we promise to software that uses the storage interface, and what we require from software that wants to use the storage interface. |
ok that's true; I reformulate my point: It sounds to me like it's just a question of either: save(my_object) or save(my_object.to_dict()) Correct? I don't have a very very strong feeling, but to me it sounds like the addition of |
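To make the two call styles concrete, here is a minimal sketch. Everything in it is hypothetical: `Sample`, `save`, and `save_dict_only` are illustration names, JSON is just a stand-in back-end, and the functions return a string instead of writing a file to keep the example self-contained.

```python
import json
from typing import Any


class Sample:
    """Hypothetical user object that knows how to describe itself as a dict."""

    def __init__(self, name: str, values: list):
        self.name = name
        self.values = values

    def to_dict(self) -> dict:
        return {"name": self.name, "values": self.values}


def save(obj: Any) -> str:
    """Option 1: the interface converts the object itself (assumed behaviour)."""
    data = obj.to_dict() if hasattr(obj, "to_dict") else obj
    return json.dumps(data, sort_keys=True)


def save_dict_only(data: dict) -> str:
    """Option 2: the caller must convert first and pass a plain dict."""
    if not isinstance(data, dict):
        raise TypeError("expected a dict; call obj.to_dict() first")
    return json.dumps(data, sort_keys=True)


sample = Sample("run-1", [1, 2, 3])
via_object = save(sample)                    # save(my_object)
via_dict = save_dict_only(sample.to_dict())  # save(my_object.to_dict())
```

Under these assumptions both routes serialise the same content; the difference is only who is responsible for the conversion step.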
I am strongly in favor of the first option, the question is primarily what happens internally and do we call it Now we have to define two interfaces. First: how can a developer make sure that their objects can be stored using our storage interface? This should be very simple, with minimal requirements. The second interface is: what is required to implement a new file format in our storage interface? How do we define the mapping from the data types defined by the file format to the data types supported by Python? Both of these interfaces should be defined in our storage specs. |
ok then let's add About |
In other words: Feel free to open new PRs overwriting what I've already pushed. I don't take it personally. |
I am a strong advocate of the first option. To the extent that I don't want to use the |
The question of dump versus save is a question of target demographic: who do we envision interacting with the code? Will it be full-on software developers (imho, unlikely) or scientists who know a little bit about programming (which I think is the target audience for pyiron)? "save" is linguistically simple, and what it does should be easily understood by users. It doesn't require knowledge of Python serialisation, which usually uses the name "dump". So perhaps the easiest way to define syntax and expected interfaces is to decide which level of user is present at each level of the interfaces. Then defining things should be easier, since we can say that there is a certain level of expected knowledge for users interacting with specific parts of our code. |
Ok, I resolved the merge conflict and updated this with my dream specs.
I went a step further and broke this down to three levels: user, generic interface, a particular storage back-end. This gives us some nice flexibility, as the user-facing interface can implement things like
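One possible reading of the three levels, as a rough sketch rather than the actual spec: a user-facing `save`/`load` pair, a generic `StorageBackend` interface, and one concrete back-end. All class and function names here are assumptions made up for illustration, and JSON is used only because it ships with the standard library.

```python
import json
from abc import ABC, abstractmethod
from typing import Any


class StorageBackend(ABC):
    """Level 2 (hypothetical): the generic interface every back-end implements."""

    @abstractmethod
    def dump(self, data: dict) -> bytes:
        """Turn a plain dict into the back-end's on-disk representation."""

    @abstractmethod
    def load(self, raw: bytes) -> dict:
        """Turn the on-disk representation back into a plain dict."""


class JSONBackend(StorageBackend):
    """Level 3 (hypothetical): one particular storage back-end."""

    def dump(self, data: dict) -> bytes:
        return json.dumps(data).encode("utf-8")

    def load(self, raw: bytes) -> dict:
        return json.loads(raw.decode("utf-8"))


def save(obj: Any, backend: StorageBackend) -> bytes:
    """Level 1 (hypothetical): user-facing call; conversion happens here."""
    data = obj.to_dict() if hasattr(obj, "to_dict") else obj
    return backend.dump(data)


def load(raw: bytes, backend: StorageBackend) -> dict:
    """Level 1 (hypothetical): user-facing counterpart to save."""
    return backend.load(raw)


backend = JSONBackend()
raw = save({"a": [1, 2], "b": {"c": 1.5}}, backend)
restored = load(raw, backend)
```

With this split, conveniences for users live in level 1, while adding a new file format only means writing another `StorageBackend` subclass.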
IMO this is what kills storage in At the extrema, I could see how some clear subset of objects are blacklisted. E.g. a back-end might not be capable of handling "cyclic" objects ( |
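As a small illustration of a back-end that cannot handle cyclic objects, using only the standard library: `json.dumps` refuses a self-referencing dict, while `pickle` round-trips it. The helper name `json_can_store` is hypothetical and only stands in for whatever check a back-end might expose.

```python
import json
import pickle

# A dict that contains itself: the simplest "cyclic" object.
cyclic = {"name": "node"}
cyclic["self"] = cyclic


def json_can_store(obj) -> bool:
    """Hypothetical capability check: can the JSON back-end serialise obj?"""
    try:
        json.dumps(obj)
    except (ValueError, TypeError):
        # ValueError: circular reference; TypeError: unsupported type.
        return False
    return True


json_ok = json_can_store(cyclic)

# pickle tracks object identity, so the cycle survives the round trip.
restored = pickle.loads(pickle.dumps(cyclic))
```

So a per-back-end blacklist could be as simple as "whatever the underlying serialiser rejects", at least for cases like this one.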
added concept of default data types that automatically support (json, hdf5) storage
Looks good to me
Beyond the points included in this pull request:
|
What should be the requirement for something to be usable in our storage? Just like we follow `concurrent.futures.Executor`, I think we should follow `pickle` for requiring inputs. Of course, what the output should look like (hierarchical, etc.) is separate.
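If "follow `pickle`" is read as "anything `pickle` accepts is a valid input", the requirement could be checked roughly like this. This is only a sketch of that reading: `is_storable` is a hypothetical helper, not an existing function in the project.

```python
import pickle
from typing import Any


def is_storable(obj: Any) -> bool:
    """Hypothetical check: treat an object as storable if pickle accepts it."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        # pickle signals unsupported objects with one of these exceptions.
        return False
    return True


plain_data_ok = is_storable({"a": [1, 2, 3], "b": "text"})

# Generators hold live execution state, which pickle cannot serialise.
generator_ok = is_storable(x for x in range(3))
```

The practical upside of this reading is that developers can make their own classes storable through the hooks `pickle` already defines, such as `__getstate__`, `__setstate__`, and `__reduce__`.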