
Is rmlint still maintained? #670

Open
StatusCode404 opened this issue Jun 14, 2024 · 21 comments

Comments

@StatusCode404

Hi All,
Just checking (again, but for this year): is rmlint still maintained and supported by anyone?

There aren't that many responses to issues and there hasn't been a new tagged release since August last year.

@RichLewis007

Good question. I see that @cebtenzzre (https://github.com/cebtenzzre) has approved pull requests in the last 6 months so he is one maintainer. Let's ask if he needs help with the backlog of issues and pull requests.

The develop branch is years out of date, but has had some useful features added to it in the past that I'd like to use. How can we get them into the main branch?

@yarikoptic
Collaborator

yarikoptic commented Jul 24, 2024

Ping on this, @sahib and @cebtenzzre: a number of PRs seem to be trivial and could be merged really fast, which would please all rmlint users. Got here after seeing @mih showing rmlint in action to deduplicate annex keys.

Cheers and let me know if we could be of some quick help ;-)

@RichLewis007

If the maintainers will not reply, and are not maintaining this project, what are our options? Should we make a fork of it, and try to get the community to move there to actively make the fixes and changes needed?

@sahib
Owner

sahib commented Oct 13, 2024

Original author here. I gave over maintenance to @cebtenzzre some time ago. As far as I can tell he's not responding either, which I can't blame him for. Open source work is seldom rewarding. I can give somebody else access to the repo to sift through the PRs, but this person should already have a track record of doing some open source work, as I don't want another xz incident. Ideally even more than one person.

This person won't be me though, since I don't possess the time and motivation to do so, and I would also write rmlint a lot differently today. Please answer in this issue if somebody steps up to do the job.

Forking is an option of course too, but many upstream packages have this repo as source.

@yarikoptic
Collaborator

I could probably give a hand with some trivial PRs and issue triage, maybe an occasional upload to Debian, but not much beyond that. I have some track record of FOSS development/maintenance (and unfortunately of abandoning projects as well ;) )

@sahib
Owner

sahib commented Oct 16, 2024

@yarikoptic Much appreciated. I would prefer if there's one additional person to reduce the risk of getting into the current situation.

@CodingWithAnxiety
Collaborator

Hey there.

Need another person? I'm an Arch Linux user and would be available to help maintain.

@sahib
Owner

sahib commented Dec 4, 2024

> Hey there.
>
> Need another person? I'm an Arch Linux user and would be available to help maintain.

Cool! Thanks for raising your hand. Do you have some experience maintaining C applications? I see some Python experience which will be helpful for the test suite and UI.

@yarikoptic Still in? If yes I could give you guys access.

@cebtenzzre Please raise your hand if you do not want your access to be revoked.

@yarikoptic
Collaborator

yes, but only to a very limited degree as described above.

@fermino
Collaborator

fermino commented Dec 4, 2024

Hey guys! Another Arch Linux user here (impressive what a deleted AUR package can do!).
I'm willing to put in some time helping maintain this, maybe reviewing PRs and helping keep the test suite running :)

As a side note, I think the sanest idea for now is to try to keep things working and focus on fixing known bugs rather than adding new features, mostly because we're all new to this project :)

@yarikoptic
Collaborator

It might also be worth @sahib establishing some "gatekeeping", e.g. that every PR must be approved by some other contributor before it can be merged. (Although such rules are likely not "hard enforced", or I am a super user everywhere... dang... but an example could be https://github.com/citeproc-py/citeproc-py)

@fermino
Collaborator

fermino commented Dec 4, 2024

It's a good idea! Requiring approval from 2-3 contributors plus a passing CI run in the repo should be a pretty robust filter.

@sahib
Owner

sahib commented Dec 4, 2024

I did set up those rules for master and develop. I would recommend, though, that there is only 1 required approval, as this can easily lead to stagnation otherwise. I also recommend requiring a passing CI check before merging, but this needs some work first, as the original CI site (TravisCI) vanished.

@yarikoptic @CodingWithAnxiety @fermino: You should have collaboration invites now.

@fermino
Collaborator

fermino commented Dec 5, 2024

Awesome, thank you!

Regarding CI, I will look into that. I'm guessing it shouldn't be too hard to migrate it from Travis. The free GitHub Actions minutes should probably be plenty for now!

@fermino
Collaborator

fermino commented Dec 5, 2024

@sahib probably a silly question but anyways: master is the latest branch, right? (Just making sure because I see the develop branch has other stuff but it's one year behind master).

@CodingWithAnxiety
Collaborator

> > Hey there.
> >
> > Need another person? I'm an Arch Linux user and would be available to help maintain.
>
> Cool! Thanks for raising your hand. Do you have some experience maintaining C applications? I see some Python experience which will be helpful for the test suite and UI.
>
> @yarikoptic Still in? If yes I could give you guys access.
>
> @cebtenzzre Please raise your hand if you do not want your access to be revoked.

Hi,

I mostly have Python experience under my belt, though I'm still learning C and C++. I'd mostly be interested in helping with testing and squashing bugs.

I'll keep my eyes on PRs and issues and see if I can occasionally lend a hand.

I will accept the invitation once I am home. <3

@sahib
Owner

sahib commented Dec 5, 2024

> @sahib probably a silly question but anyways: master is the latest branch, right? (Just making sure because I see the develop branch has other stuff but it's one year behind master).

develop is supposed to be the current working version with the newest features and fixes. master is usually the one with the latest stable, released software. PRs would go to develop first; on release you rebase or merge to master. You are of course free to use a different branching model, but I think it is worth reviving and streamlining the develop branch.

@fermino
Collaborator

fermino commented Dec 7, 2024

@sahib thanks for the info! I'm trying to figure out what to do with develop, mostly because I would not like to ship and release something not deemed stable (given user data is at stake :p).

I see that most of the commits (or at least the ones I looked up) are new features, am I right? So in that case maybe the best way would be to start off master (especially regarding some build fixes for rolling release distros I've been looking at) and then go about integrating the things from develop into master. Any thoughts?

@sahib
Owner

sahib commented Dec 11, 2024

@fermino Sorry, bit late. Yes, it seems like most features landed on develop, but some fixes are also on master, so the two need to be merged. The first step would be to put this merged state on a separate branch, as most people compiling from source will use master, but the docs mention develop. Once that new branch seems stable it can be moved to develop.

Upstream distros will not update until a new tag/release is pushed.

@vassilit

vassilit commented Jan 3, 2025

> would write rmlint a lot differently today

This may be off-topic here, but I would be very interested to have a bit more detail on what rmlint could have been if you had started it in 2025.

And thanks a lot for rmlint, this is a useful piece of software that I trust.
I have also written my own software from scratch several times that shares similar features. That is why I am interested in your feedback after several years of experience with a larger and well-written project like rmlint.

@sahib
Owner

sahib commented Jan 3, 2025

> > would write rmlint a lot differently today
>
> This may be off-topic here, but I would be very interested to have a bit more detail on what rmlint could have been if you had started it in 2025.
>
> And thanks a lot for rmlint, this is a useful piece of software that I trust. I have also written my own software from scratch several times that shares similar features. That is why I am interested in your feedback after several years of experience with a larger and well-written project like rmlint.

Hmm, this probably would deserve a longer post, but here's what comes to mind:

  • Do one thing well: Remove all the other lint stuff, like empty files, nonstripped etc. This appeared useful to me back then, but is more annoying in hindsight, especially since it makes the implementation harder.
  • Also allow integration of tools that find similar images, or other specialized use cases (i.e. make the hashing core exchangeable in the architecture)
  • Do not offer a UI: That was only because I got interested in GTK and cairo if I'm honest. I do not mind it as separate project, but this attracts the wrong user base to a power tool.
  • Only offer one checksum implementation and remove paranoia mode. Modern checksums like blake are fast enough and very sturdy. There is really no need in having an additional paranoia mode, considering it made things rather complex.
  • Remove xattr support. That stuff never worked right.
  • Heavily reduce the amount of knobs in the command line interface. More is not better. Especially if you run out of letters.
  • Base the new implementation on io_uring. The current multi-threaded hasher has some needless complexity.
  • Write it in a modern language that allows easy cross-compilation. This would reduce bugs, make developer life less miserable during debugging and speed would be pretty much the same with Go/Rust/Zig. I would pick Go. We had so many bugs because we made the memory management awkward and I want that lifetime back. I only knew C back then and Go was not yet available.
  • Fewer outputs. The progress bar should be a proper ncurses or bubbletea one. No Python output or CSV. There are tools for that.
  • Write the benchmarks before optimizing. ;-) This has too often turned into an exercise in optimizing uncritical paths. Also do not care so much about rotational disks anymore. They are already an edge case now, but were the norm back then.
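
To make the "exchangeable hashing core" point concrete, here is a minimal Python sketch. It is not rmlint's code or architecture. The names `Hasher`, `blake2b_digest` and `find_duplicates` are hypothetical, and BLAKE2b from Python's standard `hashlib` merely stands in for "one modern checksum".

```python
import hashlib
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

# Hypothetical interface: a "hasher" is any function that maps a file path
# to a comparable key. Swapping it out changes what "duplicate" means.
Hasher = Callable[[str], str]


def blake2b_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Default hasher: BLAKE2b over the file contents, read in chunks."""
    digest = hashlib.blake2b()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(
    paths: Iterable[str], hasher: Hasher = blake2b_digest
) -> List[List[str]]:
    """Group paths by the key the hasher returns; keep groups of two or more."""
    by_key: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        by_key[hasher(path)].append(path)
    return sorted(sorted(group) for group in by_key.values() if len(group) > 1)
```

A tool for similar images could then pass its own function as `hasher` (for example one returning a perceptual hash) without touching the grouping logic.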

There were good ideas though:

  • Keeping it non-interactive.
  • Merging directories (should be default)
  • Have a JSON output for scripting
  • Replay as an idea is actually not bad.
  • The test suite using black box tests of the actual binary is nice.
  • Distinction between original and duplicate and tooling to decide.

Maybe this can also serve as inspiration for the current maintainers.
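
As an illustration of why a JSON output and the original/duplicate distinction work well for non-interactive scripting, here is a small Python sketch. The report shape in `SAMPLE` and the field names (`type`, `path`, `checksum`, `is_original`) are assumptions made for this example and should be checked against the real output of your rmlint version. `plan_removals` is a hypothetical helper that only builds a plan and deletes nothing.

```python
import json
from collections import defaultdict

# Assumed report shape, for illustration only: a JSON array with a header
# object, one object per finding, and a footer object.
SAMPLE = """
[
  {"description": "example header"},
  {"type": "duplicate_file", "path": "/data/a.txt", "checksum": "abc", "is_original": true},
  {"type": "duplicate_file", "path": "/data/b.txt", "checksum": "abc", "is_original": false},
  {"type": "emptyfile", "path": "/data/empty", "is_original": false},
  {"total_files": 3}
]
"""


def plan_removals(report_text):
    """Map each checksum to its original and the duplicates that could go."""
    groups = defaultdict(lambda: {"original": None, "duplicates": []})
    for entry in json.loads(report_text):
        if entry.get("type") != "duplicate_file":
            continue  # skip header, footer and other lint types
        group = groups[entry["checksum"]]
        if entry.get("is_original"):
            group["original"] = entry["path"]
        else:
            group["duplicates"].append(entry["path"])
    # Never plan a removal for a group that has no known original.
    return {k: v for k, v in groups.items() if v["original"] is not None}


if __name__ == "__main__":
    for checksum, group in plan_removals(SAMPLE).items():
        print(checksum, "keep:", group["original"], "remove:", group["duplicates"])
```

Keeping the decision step separate from the deletion step means the plan can be reviewed or logged before anything touches user data.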


7 participants