Database support #134
Hi @a18090

Thank you for using HomeGallery and sharing your issues here. And congratulations: you are the first person I know of who is using 600k media files. That is incredible! I am using only 100k and a friend is using 400k, so you are leading the list!

So you report that your database is corrupt? You can rebuild it by running the importer via the CLI (see the sketch at the end of this comment). The rebuild rescans your files, extracts all meta data and creates previews (skipping those that already exist), and then rebuilds the database. See the internals section of the docs for details on the building blocks. Since I have no experience with 600k files, that is the answer in theory; whether it will work on your machine I do not know.

Regarding another database backend like MySQL etc.: the design decision was not to use a standard RDBMS but to load all the information into the browser for a snappy user experience. This decision comes with a tradeoff: it can not scale endlessly.

So I am curious how your experience is with 600k media. I would appreciate it if you could share this information:

1. What is your general setup (OS or Docker)?
2. What is your main workflow (desktop or mobile)?
3. How is the loading time of your database?
4. How is the general user experience when searching in the UI (are the results shown "fast" enough)?
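For reference, the rebuild described above is triggered through the CLI. A minimal sketch, assuming the gallery.js entry point from the HomeGallery documentation (adjust the path and invocation to your installation):

```sh
# Re-run the importer: files are rescanned, existing previews and meta
# data are skipped, and the database is rebuilt at the end.
./gallery.js run import
```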
Hope that helps |
Hi @xemle

Regarding video conversion, I am wondering if I can load the source file directly when I am in a LAN environment, with the program only processing the EXIF data. The main problems I encountered are as follows:

Thanks again @xemle. I don't know JavaScript; I tried using AI to help me add database support, but I found the results ridiculous since I only know Python programming. |
Hi @a18090 thank you for the detailed answer. Were you able to fix your database issue?
I am surprised that the tag image works but the others do not. It needs some investigation to figure out why it is only partially working.
Currently, serving original files such as videos is not supported, even if the browser could play them back. This issue has been raised several times, e.g. in #96 and #25. Since I develop this gallery for my own needs, I have many old video formats that are not natively supported by the browser, and I prefer to keep the original files separated from the gallery files. Maybe some day there will be a plugin system which makes it easy to extend and customize this functionality.
Please try to reduce the api.server.concurrent setting.
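For illustration, this is roughly what such a setting could look like in gallery.config.yml. The key name api.server.concurrent is taken from the reply below; the exact nesting is an assumption, so please verify it against the configuration reference:

```yaml
# Assumed gallery.config.yml excerpt; key name from this thread,
# nesting not verified against the configuration reference.
api:
  server:
    concurrent: 1
```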
There is a search term for it:
Have you tried to run
So what is your biggest pain with HomeGallery currently? Can you work with it? Did you try other self-hosted galleries? How do they perform on your large media set? Which features do you prefer in HomeGallery? Which features do you like in the others? |
Hi @xemle

My database doesn't seem to have been repaired, but it's not a serious problem. I'll probably exclude the video files the next time I rebuild so it works faster (since I have almost 110,000 videos), haha.

I am curious about the two entry points "year" and "tags". I will try these two pages again the next time I re-import, and then take a look at the log and the Chrome responses.

The video problem is not very serious; after all, I will reduce the bit rate and preset when converting to increase the conversion speed.

I'm going to try reducing api.server.concurrent to 1 and test again.

I like HomeGallery's face search and similar-image search functions; these two are very helpful.

I will try to restart HomeGallery again in the next days. This time I may regularly back up the database file to reduce the risk. I've tried other apps and there are some issues; I feel like several of the self-hosted galleries I've used may have database problems. |
Hi @xemle I was rebuilding the database and tested it directly. The server has 64 GB of memory, of which 18 GB is currently in use. |
Hi @a18090 Thank you for your report. From the logs I can not tell much. You can try to set 8 GB for the database creation in your gallery config; maybe that helps (see the sketch below).
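Two possible ways to raise the limit are sketched below. The database.maxMemory key is my assumption about the gallery config option and should be verified against the configuration reference; NODE_OPTIONS is the standard Node.js mechanism for enlarging the heap:

```yaml
# Assumed gallery.config.yml excerpt; verify the key name in the docs.
database:
  maxMemory: 8192   # heap in MB for the database creation
```

```sh
# Generic Node.js alternative: allow up to 8 GB of heap for the process.
NODE_OPTIONS=--max-old-space-size=8192 ./gallery.js run import
```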
To fix this properly on my side I need to investigate Node's memory management further and check what can be improved to keep the memory consumption low. |
Hi @xemle Thank you for your reply.

It seems to be a problem during operation? During operation, the cached data goes directly into memory and is not written to database.db.
gallery.config.yml
|
Hi @a18090 Thank you for your logs. It seems that the memory consumption on the heap grows too much at the end of your logs. As I wrote earlier: I can not currently explain why such high consumption is required. The time to investigate cases like this is also very limited right now, since I am using my spare time to work on the plugin system to open the gallery up to custom functions. I am very sorry, but I can not help here at the moment. I keep it in mind because it bugs me that the consumption is that high while in theory it should be slim. |
I did some analysis regarding memory consumption. IMHO I was not able to detect a memory leak or a bigger memory issue, except that the database building process needs a lot of memory, since it loads all the data into memory. My database with 100k entries requires 200 MB of uncompressed JSON data, and the database creation succeeds with a 750 MB heap. The process could be optimized in a streaming fashion, which should require less memory since it would not need to load the whole database into memory while creating it (see the sketch below). This would enable building larger galleries with more than 400k images on less memory. The current workaround is to give the database building process more heap space. The server and the client still need to load the database into memory, but I guess that is more efficient since it is read-only. @a18090 Is this memory issue still relevant for you? |
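A minimal sketch of the streaming idea described above (illustrative only, not HomeGallery's actual code; all names are made up). Instead of serializing the whole entry array with one JSON.stringify call, each entry is serialized on its own and pushed through a gzip stream, so only one entry at a time lives in memory as a string:

```js
const fs = require('fs')
const zlib = require('zlib')

// Write { type, data: [...] } as gzipped JSON without ever building
// the full serialized document in memory. Backpressure is ignored
// here for brevity.
function writeDatabaseStream(filename, type, entries) {
  return new Promise((resolve, reject) => {
    const gzip = zlib.createGzip()
    const out = fs.createWriteStream(filename)
    gzip.pipe(out)
    out.on('finish', resolve)
    out.on('error', reject)
    gzip.on('error', reject)

    gzip.write(`{"type":${JSON.stringify(type)},"data":[`)
    entries.forEach((entry, i) => {
      // Serialize one entry at a time into the compressed stream.
      gzip.write((i ? ',' : '') + JSON.stringify(entry))
    })
    gzip.end(']}')
  })
}
```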
Hi @xemle Memory is not very critical for me, because my server has 128 GB of memory and my computers and phones generally have 12-16 GB, so most of the time it is fine. When accessing it with Chrome on the PC, I noticed that the heap gets very large. But I recently hit a problem that prevents me from using home-gallery: my database.db crashes after running for a while and can no longer be read, as shown below.
db.bak is the database.db I backed up some time ago.
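A simple way to automate such a backup (a sketch; the config path is taken from the shell prompt later in this thread, adjust to your setup):

```sh
# Keep a dated copy of the database before each re-run.
cp /data/ssd/glh/config/database.db \
   "/data/ssd/glh/config/db.bak-$(date +%F)"
```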
I will resume the test after the update; I will clear the log and pay attention to |
Hi @a18090 Thank you for the update. I guess you identified a bug, and it seems that the bug is quite old and my tests were not covering it. I will provide a fix in the next days. So the error |
Hi @xemle I found an interesting situation when I ran it again. When I open the site with Chrome on my laptop and enter the year tab, it gets stuck in a waiting/loading state. This is the command output log.

If I use Chrome on a desktop computer to access the site, this problem does not occur: there is no continuous log output, clicks work normally, and Chrome does not become unresponsive. I suspect this may be a page problem. I'll try to check it out and see if I can help you (of course I'm not sure if I can, haha) |
Hi @a18090 thank you for your update. From your latest console logs I read that the database can be loaded with 259386 entries, which sounds good. A friend of mine has 400k images, so that should work, too.

Regarding your previous error: I found the issue. The error happens when the database can not be read. Unfortunately, the error message with the cause is swallowed by this bug, so I can not tell why the database can not be read. A following fix will change that. My best guess is that the database can not be read due to memory issues in the server component, especially while you are importing a new and even larger database: then the server has to keep two large versions in memory for a short period of time. This could be fixed with the environment variable mentioned above to enlarge the heap.

I doubt that the database itself is corrupt, because on database creation the new database is written to a temporary file which is then renamed to the target database filename. This is a common way to provide a kind of atomic file creation, and the rename should only happen in no-error cases. Would you mind checking your database with |
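The temp-file-plus-rename pattern described above, as a standalone sketch (illustrative names, not HomeGallery's actual code):

```js
const fs = require('fs')

// Write to a temporary file first, then rename it over the target.
// On POSIX systems the rename is atomic within one filesystem, so a
// reader either sees the old complete file or the new complete file,
// never a half-written one.
function writeFileAtomic(target, data) {
  const tmp = `${target}.tmp-${process.pid}`
  fs.writeFileSync(tmp, data)
  fs.renameSync(tmp, target) // only reached if the write succeeded
}
```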
Hi @xemle I've removed the problematic database, restored the backup, and rebuilt the database. But the resulting number doesn't seem right: listing the files, I find about 462626 JPG files, and the full file count including videos is roughly 597817. |
Can you also check the file index with |
```sh
(base) root@cdn:/data/ssd/glh/config# zcat tdl.idx | jq '.data[].filename' | wc -l
(base) root@cdn:/data/ssd/glh/config# zcat database.db | jq '.data[].id' | wc -l
(base) root@cdn:/data/hdd/tdl# find . | wc -l
(base) root@cdn:/data/hdd/tdl# find . -type f -name '*.jpg' | wc -l
```
|
Hi @a18090 thank you for the numbers. To summarize: the diff between the JPG files and the indexed files is expected, since the gallery also indexes other files like meta files. The more important question is why there is such a big gap between the media files and the index files, since there should not be that many meta files. My best guess is that the import was done in multiple steps. Currently, the algorithm does not recover the files correctly after the import process has been restarted (I did not find the time/interest to implement that yet). Therefore it is recommended to rerun a full import after all media files have been processed (see the sketch below for a way to quantify the gap). Would you mind rerunning a full import via |
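One rough way to quantify that gap on disk is to count media files against everything else, such as sidecar and meta files (a sketch; the path and extensions are examples from this thread):

```sh
# Media files (example extensions) vs. all remaining files.
find /data/hdd/tdl -type f \( -iname '*.jpg' -o -iname '*.mp4' \) | wc -l
find /data/hdd/tdl -type f ! \( -iname '*.jpg' -o -iname '*.mp4' \) | wc -l
```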
Thanks, no problem, I'll try it. |
Hi @a18090 I've updated the master branch with a stream-based database creation. This should be less memory demanding, and reading and updating your 400,000 images should work better. Please try it out and report if you have any issues with the newest version. |
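After updating and re-importing, the entry count can be re-checked in one step with the tools already used above (database.db is gzipped JSON, so zcat and jq work on it):

```sh
# Total number of entries in the rebuilt database.
zcat database.db | jq '.data | length'
```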
Hi @xemle This is the resource usage. I noticed an error occurred:

I will also try multi-platform testing later. Thanks for your efforts!
|
Many thanks for providing this project,
I encountered a very difficult problem when using it: my data set is very large, about 600,000 images and videos, and I frequently press CTRL+C to stop the import. After re-running, it tells me that the local database file is damaged.
This is a nightmare. I'm wondering if I could put the database into MySQL or MariaDB, so that re-running would not affect my data.
I suspect the interruption happened while it was writing, which caused this problem, but I have so many images that each run may take a week.
Thanks again