Segmenting the interface #177
Comments
Interesting. I hadn't thought of splitting it up like this, but that might work nicely with some preliminary work @ExpandingMan was doing on supporting more of a key-value store interface. Basically, right now we're assuming a filesystem interface includes two things:
I think the IO interface is a strict requirement, but the tree interface could easily be a hash table or some other associative data structure. Perhaps tangential to this particular issue, but I think it'd be kinda cool if you could map "filesystem" operations directly to data structure ops.
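To make the "tree interface as a hash table" idea concrete, here is a rough sketch of filesystem-style operations mapped onto a flat dictionary. It is written in Python purely for illustration. The class `DictTree` and its methods are hypothetical names and are not part of FilePathsBase or AWSS3.jl. The assumption is that keys are "/"-separated strings and that directories exist only implicitly as key prefixes, roughly how S3-style stores behave.

```python
# Hypothetical sketch: none of these names come from FilePathsBase.
class DictTree:
    """Tree-style operations backed by a flat dict of key -> bytes."""

    def __init__(self):
        self._store = {}  # e.g. {"data/a.txt": b"..."}

    def write(self, key, data):
        # "Writing a file" is plain dict assignment.
        self._store[key] = data

    def read(self, key):
        # "Reading a file" is a dict lookup.
        return self._store[key]

    def exists(self, key):
        # A "directory" exists if at least one key lives under its prefix.
        prefix = key.rstrip("/") + "/"
        return key in self._store or any(k.startswith(prefix) for k in self._store)

    def ls(self, prefix=""):
        # Immediate children of a prefix; directories are emulated from keys.
        prefix = prefix.rstrip("/") + "/" if prefix else ""
        children = set()
        for k in self._store:
            if k.startswith(prefix):
                children.add(k[len(prefix):].split("/", 1)[0])
        return sorted(children)


tree = DictTree()
tree.write("data/a.txt", b"1")
tree.write("data/sub/b.txt", b"2")
tree.write("top.txt", b"3")
print(tree.ls())        # top-level entries
print(tree.ls("data"))  # entries under the emulated "data" directory
```

Note that `ls` scans every key, which is fine for a sketch but is exactly the kind of thing that gets expensive on a remote store.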
It's been quite a while since I looked at this. Basically I was interested in supporting S3, which is a key-value store (see AWSS3.jl). It works as is, but is more than a little hacky. There's a whole bunch of things that go horribly wrong on remote filesystems that are not necessarily related to the posted issue. In a perfect world, I wouldn't think generalizing this package to things like key-value stores or HTTP makes much sense at all. It is built around a tree-like abstraction in which directories are nodes, and I think that's fine, especially since that's how actual file systems work. The problem is that in real life, for better or worse, S3 (and now S3-compatible key-value storage alternatives) is really important and arguably becoming even more important. So maybe it's worth doing? I don't know. I still see the S3 use case as important, but HTTP is probably stretching it much further.
FWIW, I have use cases in S3, HTTP, and S3 derivatives (Google Cloud, MinIO, etc). The main things I want to do are:
I'm not so concerned about e.g. … The idea about HTTPPath is mostly to support a similar interface to read from HTTP "file stores", which are somehow quite common in large geospatial datasets. In that case I might not have …
Overall, the idea is to make getting data from arbitrary filesystem-like data stores painless and easy.
There seem to be three principal modes with which people access data in files:
… (ls, joinpath, du, etc.)
… (mv, tempdir, etc.)
As far as I can tell, the FilePathsBase API doesn't currently make a formal distinction between these three APIs. Would it make sense to do so?
This way, things like HTTP paths can simply opt in to the pure-reading interface, whereas a local path could also implement the writing and manipulating interfaces. We can then also have nice interface tests for that, and it would probably make it conceptually easier to implement random filesystems, like zip files (which wouldn't support a tempdir, for example).
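As a rough illustration of the opt-in idea, the sketch below declares three small interfaces and lets each path type implement only the ones it supports. It is written in Python for illustration only. The names `Readable`, `Writable`, `Manipulable`, `ReadOnlyRemotePath`, `InMemoryPath` and `supported_interfaces` are all hypothetical, and the split into reading, writing and manipulating is an assumption based on the wording above, not an existing FilePathsBase design.

```python
from typing import Protocol, runtime_checkable


# Hypothetical interfaces; a path type opts in by defining the method.
@runtime_checkable
class Readable(Protocol):
    def read(self) -> bytes: ...


@runtime_checkable
class Writable(Protocol):
    def write(self, data: bytes) -> None: ...


@runtime_checkable
class Manipulable(Protocol):
    def mv(self, dest: str) -> None: ...


class ReadOnlyRemotePath:
    """Stands in for an HTTP-like path: content can be read, nothing else."""

    def __init__(self, content: bytes):
        self._content = content

    def read(self) -> bytes:
        return self._content


class InMemoryPath:
    """Stands in for a local path that opts in to all three interfaces."""

    def __init__(self, name: str):
        self.name = name
        self._content = b""

    def read(self) -> bytes:
        return self._content

    def write(self, data: bytes) -> None:
        self._content = data

    def mv(self, dest: str) -> None:
        self.name = dest


def supported_interfaces(path):
    """Names of the interfaces a path object has opted in to."""
    interfaces = (Readable, Writable, Manipulable)
    return [i.__name__ for i in interfaces if isinstance(path, i)]


print(supported_interfaces(ReadOnlyRemotePath(b"abc")))
print(supported_interfaces(InMemoryPath("a.txt")))
```

An interface test suite could then call `supported_interfaces` and run only the checks that apply, so a zip-file or HTTP path would never be asked for a tempdir. In Julia the same shape would more likely be expressed with traits or abstract types than with runtime protocol checks.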