[Feature Request] Mounting multiple archives to the same path #219
Comments
Hi! This sounds to me like what you want is very similar, if not identical, to "incremental backup" functionality, i.e. the ability to add a new snapshot of a directory to a DwarFS image, but only storing the changes relative to the previous snapshot. I'm not entirely sure, though, because I don't really understand how you achieve this with creating multiple archives and then merge-mounting them. It'd be good to have a more detailed example of exactly what you're doing. As for the "incremental backup" functionality, that's been requested before and it's definitely something I want to add. See #18, #208. |
Here is the unholy shell script I use:
dwarfs -o workers=16 -o allow_root -o readonly comp/collection1.dwarfs ./mount/collection1/
dwarfs -o workers=16 -o allow_root -o readonly comp/collection2.dwarfs ./mount/collection2/
dwarfs -o workers=16 -o allow_root -o readonly comp/blalange1.dwarfs ./mount/blalange1/
dwarfs -o workers=16 -o allow_root -o readonly comp/collection3.dwarfs ./mount/collection3/
dwarfs -o workers=16 -o allow_root -o readonly comp/rantonse1.dwarfs ./mount/rantonse1/
dwarfs -o workers=16 -o allow_root -o readonly comp/collection4.dwarfs ./mount/collection4/
dwarfs -o workers=16 -o allow_root -o readonly comp/collection5.dwarfs ./mount/collection5/
dwarfs -o workers=16 -o allow_root -o readonly comp/blalange2.dwarfs ./mount/blalange2/
dwarfs -o workers=16 -o allow_root -o readonly comp/collection6.dwarfs ./mount/collection6/
sudo mergerfs -o cache.files=partial,dropcacheonclose=true,allow_other \
./mount/collection1:./mount/rantonse1:./mount/collection2:./mount/blalange1:./mount/collection3:./mount/collection4:./mount/collection5:./mount/blalange2:./mount/collection6 \
./pywb/collections/main/archive/
I think you understand why it would be great to have this implemented in dwarfs. |
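The script above repeats the same dwarfs invocation once per image and then lists every mount point again for mergerfs. A minimal sketch of a loop-based variant follows. It reuses only the commands and options already shown in the script. The `DRY_RUN` switch and the function names `run` and `mount_all` are hypothetical helpers introduced here for illustration; they are not part of dwarfs or mergerfs.

```shell
#!/bin/sh
# Sketch only: loop-based variant of the script above.
# DRY_RUN, run and mount_all are hypothetical helpers, not dwarfs/mergerfs features.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# mount_all TARGET NAME...
# Mounts comp/NAME.dwarfs at ./mount/NAME/ for every NAME, then merges the
# mount points into TARGET. Branch order is the order of the NAME arguments.
mount_all() {
  target="$1"; shift
  branches=""
  for name in "$@"; do
    run dwarfs -o workers=16 -o allow_root -o readonly \
      "comp/${name}.dwarfs" "./mount/${name}/"
    # Append to the colon-separated branch list that mergerfs expects.
    branches="${branches:+${branches}:}./mount/${name}"
  done
  run sudo mergerfs -o cache.files=partial,dropcacheonclose=true,allow_other \
    "$branches" "$target"
}

# Example call (prints the commands while DRY_RUN=1):
# mount_all ./pywb/collections/main/archive/ collection1 rantonse1 collection2
```

Since the order of the names becomes the branch order, they should be passed in the order wanted for the mergerfs branch list, which in the original script differs from the order of the dwarfs calls.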
That part was clear from your description. I'm more interested in how you actually build the individual archives. I assume you're creating those from the writable layer in the merged file system? |
Ah, sorry about that. No, I create the archives separately. |
In my (irrelevant) opinion, this functionality should be left to specialized union filesystems, like the mergerfs you use in your script, or overlayfs. There is a whole list of special considerations when it comes to having multiple filesystems at one location, one of them being filename clashes. |
And that means? Assume I know nothing about your data (or exactly what mergerfs does in your use case). Is there any overlap between the individual images? Or are they completely separate sets of files? There's probably a dozen different ways to implement "mounting multiple archives to the same path". I've looked at the mergerfs README for the last 15 minutes and it's unclear to me what exactly it does. I understand the overlayfs/unionfs approach, but mergerfs is apparently different from that. How does it behave if the same path exists in multiple branches but with different contents? |
In my understanding, the traditional way (overlayfs/unionfs) is to have one bottom filesystem and one or more on top, whereas mergerfs uses a merge policy (similar to git merge) that creates a virtual combination of filesystems. |
Each dwarfs file contains its own set of files; there is only one copy of each file across all the dwarfs files.
One idea I have of how this might be implemented is this: For example, the user has two dwarfs files that look like this:
And the resulting mount directory would look like this
I'm horrible at explaining things. |
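The example listings for the comment above did not survive, so here is a rough sketch of the merged view being described, using plain directories in place of mounted images. It assumes a "first branch wins" rule when the same relative path exists in more than one branch. That rule is an assumption made for illustration, not a statement about what dwarfs would do. Since each archive here holds a disjoint set of files, the rule would only matter in the hypothetical clash case. The function name `union_view` is made up for this example.

```shell
#!/bin/sh
# Sketch only: list the files a merged view of several directories would show.
# union_view is a hypothetical helper; "first branch wins" is an assumed policy.

# union_view BRANCH...
# Prints "relative/path<TAB>branch" for every file in the merged view.
# Branches are searched in argument order; the first one holding a path wins.
# Paths containing tabs or newlines are not handled.
union_view() {
  for branch in "$@"; do
    # List regular files relative to the branch root.
    (cd "$branch" && find . -type f | sed 's|^\./||') |
      while IFS= read -r rel; do
        printf '%s\t%s\n' "$rel" "$branch"
      done
  done |
    awk -F '\t' '!seen[$1]++' |   # keep only the first branch seen per path
    LC_ALL=C sort
}

# Example call:
# union_view ./mount/collection1 ./mount/collection2
```

With disjoint archives the output is simply the combined file list. With overlapping paths, the branch listed first hides the others.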
Kind of like ratarmount's union mounting system |
Your reply definitely helps, though! :) The problem I'm having is the open questions this leaves. And I agree with @silentnoodlemaster that special/different cases should be left to special tools. The one thing that is definitely ugly about your use case is that you have a myriad of
So what I'd like to implement, and this is likely going to happen sooner than the incremental-backup feature, is a way to add to (and remove from) a running dwarfs process additional mounts that will share the same cache. I just haven't figured out all the details yet. And then I need to find the time to do it. So don't hold your breath just yet. |
Here's a quick brain dump, feel free to comment, I'd definitely appreciate feedback. None of these will implement any kind of "merging", though.
Mounting multiple DwarFS images
Single mount of multiple images
A single mount of multiple file systems (will show up as one FUSE mount) in a single process; shared cache
The contents of each DwarFS image would be accessible at
The
Multiple mounts sharing the same process/cache
Multiple mounts of multiple file systems (will show up as multiple FUSE mounts) in a single process; shared cache
Options that cannot be changed at run-time will report an error. I'm definitely open for suggestions regarding a name different than
Multiple mounts with distinct process/cache
Multiple mounts of multiple file systems in multiple processes (current behaviour); exclusive caches |
That implementation seems like the cleanest and most user friendly alternative. |
It would be really convenient if you could mount multiple archives to the same folder.
I use dwarfs to compress WARC files, and because they are quite large and take a long time to compress, decompressing and recompressing them is not really an option. So I create a new archive every few weeks, mount each one to its own folder, and then use mergerfs to mount all of them to a single folder. This is really impractical, and it would be great to see this feature implemented in the program.