
Database support #134

Open · a18090 opened this issue Apr 20, 2024 · 20 comments

a18090 commented Apr 20, 2024

Many thanks for providing this project,

I encountered a very difficult problem when using it: my collection is very large, about 600,000 images and videos, and I frequently interrupt the process with Ctrl+C. After re-running, it tells me the local database file is corrupted.
This is a nightmare. I'm wondering if I can put the database into MySQL or MariaDB so that re-running won't affect my data.
My guess is that the interruption happened while the database was being written, which is why this problem occurred, but with this many images each run may take a week.

Thanks again

xemle (Owner) commented Apr 20, 2024

Hi @a18090

Thank you for using HomeGallery and sharing your issue here. And congratulations: you are the first person I know of who is using 600k media files. That is incredible! I am using only 100k and a friend is using 400k, so you are leading the list!

So you report that your database is corrupt? You can rebuild your database via the CLI by running the importer: ./gallery.js run import. The exact invocation of the CLI depends on your platform.

The rebuild will rescan your files, extract all meta data, create previews (skipping them if they already exist) and then rebuild the database. See the internals section of the docs for details of the building blocks.

Since I do not have experience with 600k, that is the answer in theory; whether it will work on your machine I do not know.

Regarding another database backend like MySQL etc.: the design decision was not to use a standard RDBMS but to load the whole database into the browser in order to have a snappy user experience. This decision comes with a tradeoff: it cannot scale endlessly. So I am curious how your experience is with 600k media. I would appreciate it if you could share the following information:

  • How is your general setup (OS or docker)
  • Your main work flow (desktop or mobile)
  • How is the loading time of your database?
  • General user experience doing searches in the UI (are the results shown "fast" enough)?

Hope that helps

a18090 (Author) commented Apr 21, 2024

Hi @xemle

1. How is your general setup (OS or docker)
I tried Docker, but there were some problems: my network traffic gets fragmented when using WireGuard in my environment, so I deployed directly on the OS (binary). I use an SSD for caching, and the images and videos are on an HDD.
I also tried setting the video output to H.265 encoding in the conf file (trying to reduce the video file size): my source files are about 4 TB, and the converted files are about 1.6 TB. When indexing such a large number of files, the CPU and memory consumption is not too high, but the ffmpeg CPU usage is very high during video conversion.

2. Your main work flow (desktop or mobile)
My Android phone could still handle the "Year" page until the data volume exceeded 200,000; after that, I can only enter the site through "tags". On the desktop this problem appears at around 400,000: I cannot enter the "Year" page directly, but I can enter through "tags". I have tried Apple too, but the results are mixed and I can't judge.
If I enter the "Year" tab directly in the desktop environment, Chrome occupies more than 2 GB of RAM and is stuck for about 20 minutes, unable to be operated. However, entering "tags" first, letting the data load, and then going to "Year" works normally.
Android: 8G RAM + Snapdragon 8G3 + Chrome
Desktop: 16G RAM + Intel 1260P + Chrome
Apple: iPhone 13Pro + Chrome

3. How is the loading time of your database?
My database is on an SSD and loads very quickly, probably in less than a minute before it starts processing new media, as my data keeps growing.

4. General user experience doing searches in the UI (are the results shown "fast" enough)?
Searches basically complete within seconds; most of the time I hardly notice them. For example, similar-face searches and file-name searches take about the same time.
I remember my database file was about 200 MB when it was damaged, and the index was about 200-400 MB.

Regarding video conversion, I am wondering if I could load the source files directly in a LAN environment and have the program only process the EXIF data. Most of my videos are H.264 or newer, so Chrome should be able to play them directly (there may be audio playback issues).

The main problems I encountered are:
1. The API server may disappear suddenly, but I don't know when it crashes. I am running it in a Docker environment. (I guess this may have something to do with my concurrency being too high.)
2. When I enter the site through a non-"tags" page (I added all entries whose EXIF contains GPS to tag:geo), the map cannot load all the data properly, and sometimes items are missing. I regularly run search/latitude%20in%20[0.0000:90.0000]%20and%20longitude%20in%20[90.0000:180.0000] and then manually add tags:geo.
3. Usually I run it directly through ./home-gallery -c home-gallery.conf and then choose "start".

Thanks again xemle. I don't know JavaScript; I tried using AI to help me add database support, but the result was ridiculous since I only know Python.

xemle (Owner) commented Apr 22, 2024

Hi @a18090

Thank you for the detailed answer. Were you able to fix your database issue?

> 2. Your main work flow (desktop or mobile)
> My Android phone could still handle the "Year" page until the data volume exceeded 200,000; after that, I can only enter the site through "tags". On the desktop this problem appears at around 400,000: I cannot enter the "Year" page directly, but I can enter through "tags". I have tried Apple too, but the results are mixed and I can't judge. If I enter the "Year" tab directly in the desktop environment, Chrome occupies more than 2 GB of RAM and is stuck for about 20 minutes. However, entering "tags" first, letting the data load, and then going to "Year" works normally.
> Android: 8G RAM + Snapdragon 8G3 + Chrome; Desktop: 16G RAM + Intel 1260P + Chrome; Apple: iPhone 13Pro + Chrome

I am surprised that the tags entry point works but the others do not. It needs some investigation why it is only partially working.

> Regarding video conversion, I am wondering if I could load the source files directly in a LAN environment and have the program only process the EXIF data. Most of my videos are H.264 or newer, so Chrome should be able to play them directly (there may be audio playback issues).

Currently, serving original files such as videos is not supported, even if the browser could play them back. This has been requested several times, e.g. in #96 and #25. Since I develop this gallery for my own needs, I have many old video formats that browsers cannot play natively, and I prefer to keep the original files separate from the gallery files.

Maybe some day there will be a plugin system where it will be easy to extend and customize this functionality.

> The main problems I encountered are:
>
> 1. The API server may disappear suddenly, but I don't know when it crashes. I am running it in a Docker environment. (I guess this may have something to do with my concurrency being too high.)

Please try to reduce the extractor.apiServer.concurrent setting to 1 if you face problems.

> 2. When I enter the site through a non-"tags" page (I added all entries whose EXIF contains GPS to tag:geo), the map cannot load all the data properly, and sometimes items are missing. I regularly run search/latitude%20in%20[0.0000:90.0000]%20and%20longitude%20in%20[90.0000:180.0000] and then manually add tags:geo.

There is a search term for this: exists(geo), which matches only media with geo information, or not exists(geo) for media without geo information.

> 3. Usually I run it directly through ./home-gallery -c home-gallery.conf and then choose "start".

Have you tried running ./home-gallery -c home-gallery.conf run server? It does not make a functional difference, but it will start just the server.

> Thanks again xemle. I don't know JavaScript; I tried using AI to help me add database support, but the result was ridiculous since I only know Python.

So what is currently your biggest pain point with HomeGallery? Can you work with it? Did you try other self-hosted galleries? How do they perform on your large media set? What features do you prefer in HomeGallery? Which features do you like in others?

a18090 (Author) commented Apr 23, 2024

Hi @xemle

My database doesn't seem to have been repaired, but it's not a serious problem. I'll probably exclude the video files the next time I rebuild so it runs faster (since I have almost 110,000 videos), haha.

I am also curious about the two entry points, "Year" and "tags". I will try both pages again the next time I re-import and take a look at the logs and Chrome's behavior.

The video problem is not very serious; after all, I can reduce the bit rate and lower the preset when converting to increase conversion speed.

I'm going to try reducing extractor.apiServer.concurrent to 1 and test again.

I like HomeGallery's face search and similar-image search; these two functions are very helpful. I will try restarting HomeGallery in the next few days. This time I may back up the database file regularly to reduce the risk.

I've tried other apps and there are some issues.
PhotoPrism: It is very smooth when managing many files. The cache occupies about ~180 GB and videos are transcoded on click, but its face recognition accuracy is a big problem (for me), and its database file is already 4 GB, which I expect to grow in the future.

I feel like several of the self-hosted galleries I've used have database issues.
With PhotoPrism I rebuilt the database nearly three times before I found a solution, because when it is started with docker-compose, every Ctrl+C interruption can corrupt the database at any time, irreversibly.
This is perhaps my biggest problem with self-hosted galleries, but I'm also trying various ways to work around it, such as regularly backing up the database file or avoiding the temptation of Ctrl+C.

a18090 (Author) commented May 5, 2024

{"level":20,"time":"2024-05-04T17:36:54.060Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":41646,"msg":"Processed entry dir cache 211 (+1)"} {"level":20,"time":"2024-05-04T17:36:54.060Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1589.9 MB, heapTotal: 1067.0 MB, heapUsed: 986.9 MB, external: 42.8 MB, arrayBuffers: 40.6 MB"} {"level":20,"time":"2024-05-04T17:37:11.586Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":17526,"msg":"Processed entry dir cache 214 (+3)"} {"level":20,"time":"2024-05-04T17:37:21.964Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":10378,"msg":"Processed entry dir cache 215 (+1)"} {"level":20,"time":"2024-05-04T17:37:36.514Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":14551,"msg":"Processed entry dir cache 216 (+1)"} {"level":20,"time":"2024-05-04T17:37:36.514Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 2731.6 MB, heapTotal: 2087.8 MB, heapUsed: 2030.5 MB, external: 153.8 MB, arrayBuffers: 151.6 MB"} {"level":20,"time":"2024-05-04T17:37:56.699Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":20185,"msg":"Processed entry dir cache 217 (+1)"} {"level":20,"time":"2024-05-04T17:38:13.818Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":17119,"msg":"Processed entry dir cache 219 (+2)"} {"level":20,"time":"2024-05-04T17:38:13.818Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 3906.7 MB, heapTotal: 3241.7 MB, heapUsed: 3163.1 MB, external: 90.8 MB, arrayBuffers: 88.6 MB"} {"level":20,"time":"2024-05-04T17:38:31.090Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":17272,"msg":"Processed entry dir cache 220 (+1)"} {"level":20,"time":"2024-05-04T17:39:10.659Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":39569,"msg":"Processed entry dir cache 226 (+6)"} {"level":20,"time":"2024-05-04T17:39:10.659Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 4555.3 MB, heapTotal: 3800.8 MB, heapUsed: 3705.3 MB, external: 185.8 MB, arrayBuffers: 183.6 MB"} {"level":30,"time":"2024-05-04T17:39:10.660Z","pid":1358080,"hostname":"cdn","module":"extractor","levelName":"info","duration":144649,"msg":"Processed 146217 entries (#146217, +23441, processing 115142 and queued 62 entries)"} {"level":20,"time":"2024-05-04T17:39:35.163Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":24504,"msg":"Processed entry dir cache 227 (+1)"} {"level":20,"time":"2024-05-04T17:39:51.912Z","pid":1357985,"hostname":"cdn","module":"cli.spawn","levelName":"debug","spawn":{"env":{"GALLERY_LOG_LEVEL":"trace","GALLERY_LOG_JSON_FORMAT":"true"},"command":"/tmp/caxa/home-gallery/master/home-gallery/node/bin/node","args":["/tmp/caxa/home-gallery/master/home-gallery/gallery.js","extract","--index","/data/ssd/glh/config/tdl.idx"],"pid":1358080,"code":null,"signal":"SIGABRT","cmd":"GALLERY_LOG_LEVEL=trace GALLERY_LOG_JSON_FORMAT=true /tmp/caxa/home-gallery/master/home-gallery/node/bin/node /tmp/caxa/home-gallery/master/home-gallery/gallery.js extract --index /data/ssd/glh/config/tdl.idx"},"duration":29365830,"msg":"Executed cmd GALLERY_LOG_LEVEL=trace 
GALLERY_LOG_JSON_FORMAT=true /tmp/caxa/home-gallery/master/home-gallery/node/bin/node /tmp/caxa/home-gallery/master/home-gallery/gallery.js extract --index /data/ssd/glh/config/tdl.idx"} {"level":30,"time":"2024-05-04T17:39:51.913Z","pid":1357985,"hostname":"cdn","module":"cli.spawn","levelName":"info","msg":"Cli extract --index /data/ssd/glh/config/tdl.idx exited by signal SIGABRT"}

Hi @xemle

I was rebuilding the database and I tested directly
./gallery -c gallery.config.yml run
But the data was incomplete: the web page only showed ~270,000 images (I excluded videos). Later I tried re-indexing the smaller files, but it reported an error. It seems the memory overflowed?

The server has 64 GB of memory, and 18 GB is currently in use.

xemle (Owner) commented May 6, 2024

Hi @a18090

Thank you for your report.

From the logs I cannot say a lot. The rebuild command ./gallery -c gallery.config.yml run is missing the actual command. Are you running server or import?

I can see that heapUsed is quite high, at about 4 GB (heapUsed: 3705.3 MB). To be honest I do not know why the heap consumption is so high (it is also high on my side). My assumption is that the database creation step should not need that much memory, since the end product is only a few hundred MB, but in the end it does. I was able to fix the issue on my side by increasing the --max-old-space-size node option for the database creation. This might not be enough on your side...

You can try setting 8 GB in your gallery config for the database creation. Maybe it helps:

database:
  maxMemory: 8192

To fix this properly on your side I need to investigate Node's memory management further and check what can be improved to keep the memory consumption low.

a18090 (Author) commented May 8, 2024

Hi @xemle

Thank you for your reply.
I have configured it in the settings file "gallery.config.yml":

database:
  file: '{configDir}/database.db'
  maxMemory: 40960

It seems to be a problem that occurs during operation?

I tried cat gallery.log | grep heapTotal:

During operation, the cached data seems to go directly into memory and is not written to database.db.

{"level":20,"time":"2024-05-04T16:56:10.906Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1569.5 MB, heapTotal: 1077.9 MB, heapUsed: 1036.1 MB, external: 27.6 MB, arrayBuffers: 25.4 MB"}
{"level":20,"time":"2024-05-04T16:57:52.387Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1461.6 MB, heapTotal: 968.8 MB, heapUsed: 930.8 MB, external: 28.7 MB, arrayBuffers: 26.5 MB"}
{"level":20,"time":"2024-05-04T16:58:33.444Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1591.6 MB, heapTotal: 1098.9 MB, heapUsed: 1057.1 MB, external: 28.3 MB, arrayBuffers: 26.1 MB"}
{"level":20,"time":"2024-05-04T16:59:40.639Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1567.6 MB, heapTotal: 1074.5 MB, heapUsed: 1032.9 MB, external: 27.0 MB, arrayBuffers: 24.8 MB"}
{"level":20,"time":"2024-05-04T17:02:46.419Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1594.4 MB, heapTotal: 1100.1 MB, heapUsed: 1060.8 MB, external: 28.4 MB, arrayBuffers: 26.2 MB"}
{"level":20,"time":"2024-05-04T17:05:10.364Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1516.5 MB, heapTotal: 1021.8 MB, heapUsed: 970.3 MB, external: 20.4 MB, arrayBuffers: 18.2 MB"}
{"level":20,"time":"2024-05-04T17:09:11.229Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1533.8 MB, heapTotal: 1037.5 MB, heapUsed: 986.1 MB, external: 21.8 MB, arrayBuffers: 19.6 MB"}
{"level":20,"time":"2024-05-04T17:13:21.719Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1517.0 MB, heapTotal: 1020.7 MB, heapUsed: 974.6 MB, external: 21.3 MB, arrayBuffers: 19.1 MB"}
{"level":20,"time":"2024-05-04T17:16:34.564Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1623.2 MB, heapTotal: 1126.8 MB, heapUsed: 1073.8 MB, external: 22.5 MB, arrayBuffers: 20.3 MB"}
{"level":20,"time":"2024-05-04T17:17:10.270Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1488.8 MB, heapTotal: 991.7 MB, heapUsed: 943.5 MB, external: 20.1 MB, arrayBuffers: 17.9 MB"}
{"level":20,"time":"2024-05-04T17:17:54.771Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1625.5 MB, heapTotal: 1128.3 MB, heapUsed: 1080.7 MB, external: 22.7 MB, arrayBuffers: 20.5 MB"}
{"level":20,"time":"2024-05-04T17:18:44.165Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1542.6 MB, heapTotal: 1045.3 MB, heapUsed: 1005.2 MB, external: 28.1 MB, arrayBuffers: 25.9 MB"}
{"level":20,"time":"2024-05-04T17:19:59.286Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1537.4 MB, heapTotal: 1040.1 MB, heapUsed: 999.7 MB, external: 27.7 MB, arrayBuffers: 25.5 MB"}
{"level":20,"time":"2024-05-04T17:21:13.734Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1559.5 MB, heapTotal: 1062.2 MB, heapUsed: 1018.5 MB, external: 24.9 MB, arrayBuffers: 22.7 MB"}
{"level":20,"time":"2024-05-04T17:22:55.906Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1451.1 MB, heapTotal: 953.5 MB, heapUsed: 884.3 MB, external: 20.6 MB, arrayBuffers: 18.4 MB"}
{"level":20,"time":"2024-05-04T17:27:09.221Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1508.3 MB, heapTotal: 1012.0 MB, heapUsed: 966.9 MB, external: 22.0 MB, arrayBuffers: 19.8 MB"}
{"level":20,"time":"2024-05-04T17:30:00.773Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1571.2 MB, heapTotal: 1057.0 MB, heapUsed: 1005.2 MB, external: 20.7 MB, arrayBuffers: 18.5 MB"}
{"level":20,"time":"2024-05-04T17:31:24.513Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1557.7 MB, heapTotal: 1042.8 MB, heapUsed: 946.9 MB, external: 20.2 MB, arrayBuffers: 18.0 MB"}
{"level":20,"time":"2024-05-04T17:32:22.989Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1719.9 MB, heapTotal: 1204.9 MB, heapUsed: 1158.5 MB, external: 25.1 MB, arrayBuffers: 22.9 MB"}
{"level":20,"time":"2024-05-04T17:34:29.942Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1503.8 MB, heapTotal: 1024.0 MB, heapUsed: 948.0 MB, external: 21.9 MB, arrayBuffers: 19.6 MB"}
{"level":20,"time":"2024-05-04T17:36:12.414Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1748.3 MB, heapTotal: 1223.9 MB, heapUsed: 1195.1 MB, external: 21.2 MB, arrayBuffers: 19.0 MB"}
{"level":20,"time":"2024-05-04T17:36:54.060Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1589.9 MB, heapTotal: 1067.0 MB, heapUsed: 986.9 MB, external: 42.8 MB, arrayBuffers: 40.6 MB"}
{"level":20,"time":"2024-05-04T17:37:36.514Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 2731.6 MB, heapTotal: 2087.8 MB, heapUsed: 2030.5 MB, external: 153.8 MB, arrayBuffers: 151.6 MB"}
{"level":20,"time":"2024-05-04T17:38:13.818Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 3906.7 MB, heapTotal: 3241.7 MB, heapUsed: 3163.1 MB, external: 90.8 MB, arrayBuffers: 88.6 MB"}
{"level":20,"time":"2024-05-04T17:39:10.659Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 4555.3 MB, heapTotal: 3800.8 MB, heapUsed: 3705.3 MB, external: 185.8 MB, arrayBuffers: 183.6 MB"}

gallery.config.yml

configDir: '/data/ssd/glh/config'
cacheDir: '/data/ssd/glh/cache'

sources:
  - dir: '/data/hdd/tdl'
    excludes:
      - '*.mp4'
      - '*.m4a'
      - '*.mov'
    index: '{configDir}/{basename(dir)}.idx'
    maxFilesize: 20M

extractor:
  apiServer:
    url: http://127.0.0.1:3001
    timeout: 30
    concurrent: 1
  geoReverse:
    url: https://nominatim.openstreetmap.org
  useNative:
    - ffprobe
    - ffmpeg

storage:
  dir: '{cacheDir}'

database:
  file: '{configDir}/database.db'
  maxMemory: 40960
events:
  file: '{configDir}/events.db'

server:
  port: 3000
  host: '0.0.0.0'
  openBrowser: false
  auth:
    users:
      - xxx: '{SHA}xxxx'
  basePath: /
  watchSources: true

logger:
  - type: console
    level: info
  - type: file
    level: debug
    file: '/data/ssd/glh/config/gallery.log'

xemle (Owner) commented May 13, 2024

Hi @a18090

Thank you for your logs. It seems that the heap memory consumption grows too much at the end of your logs.

As I wrote earlier: I currently cannot explain why such high consumption is required. My time to investigate such a case is also very limited at the moment, since I am using my spare time to work on the plugin system to open the gallery up to custom functions.

I am very sorry, but I cannot help here at the moment. I will keep it in mind, because it bugs me that the consumption is that high while in theory it should be slim.

xemle (Owner) commented Jun 30, 2024

I did some analysis regarding memory consumption. I was not able to detect a memory leak or a larger memory issue, except that the database building process needs a lot of memory since it loads all the data into memory.

My database with 100k media requires 200 MB of uncompressed JSON data, and the database creation succeeds with a 750 MB heap.

The process can be optimized in a streaming fashion, which should require less memory since it would not need to load the whole database into memory while creating it. This would enable building larger galleries with more than 400k images on less memory; see the sketch below.
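Roughly, the idea is something like the following Node.js sketch. This is only an illustration of stream-based processing in general, not the actual HomeGallery code; the NDJSON input file, the entry shape and the function name are made up for the example.

// build-database-stream.js: map entries one by one and gzip the output,
// so only a single entry is held in memory at a time.
const fs = require('fs')
const readline = require('readline')
const zlib = require('zlib')

async function buildDatabase(entryFile, databaseFile) {
  // assume entryFile holds one JSON entry per line (NDJSON)
  const lines = readline.createInterface({ input: fs.createReadStream(entryFile) })
  const gzip = zlib.createGzip()
  gzip.pipe(fs.createWriteStream(databaseFile))

  gzip.write('{"data":[')
  let first = true
  for await (const line of lines) {
    if (!line.trim()) continue
    const entry = JSON.parse(line)                    // one entry at a time
    const media = { id: entry.id, date: entry.date }  // map to the slim media form
    gzip.write((first ? '' : ',') + JSON.stringify(media))
    first = false
  }
  gzip.end(']}')
}

buildDatabase('entries.ndjson', 'database.db').catch(console.error)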

The current workaround is to provide more heap space to the database building process.

The server and client side still need to load the database into memory, but I guess that is more efficient since it is read-only.

@a18090 Is this memory issue still relevant for you?

a18090 (Author) commented Jul 10, 2024

Hi @xemle
First of all, thank you for your efforts.

Memory is not very critical for me, because my server has 128 GB of memory, and my computers and mobile phones generally have 12-16 GB. Most of the time it is fine. When accessing it with Chrome on the PC I noticed the heap gets very large, but recently I ran into a problem that has prevented me from using HomeGallery.

My database.db breaks after running for a while and can no longer be read, as shown below:

(base) root@cdn:/data/ssd/glh/config# ls -la
total 1181468
drwxr-xr-x 1 root root       204 Jun 20 16:11 .
drwxr-xr-x 1 root root       102 Jun 18 06:12 ..
-rw-r--r-- 1 root root 140766901 Jun 20 14:19 database.db
-rw-r--r-- 1 root root 136943966 Jun  1 07:48 db.bak
-rw-r--r-- 1 root root     38402 May  8 15:12 dump.log
-rw-r--r-- 1 root root      4890 Apr 29 01:09 events.db
-rw-r--r-- 1 root root 889740155 Jun 20 16:09 gallery.log
-rw-r--r-- 1 root root  42275624 Jun 20 16:11 tdl.idx
-rw-r--r-- 1 root root     20407 Jun 18 06:01 tdl.idx.0618-3XYu.journal
-rw-r--r-- 1 root root     18330 Jun 20 16:11 tdl.idx.0620-JHz2.journal

(base) root@cdn:/data/ssd/glh/config# file database.db
database.db: gzip compressed data, from Unix, original size modulo 2^32 537025063

(base) root@cdn:/data/ssd/glh/config# file db.bak
db.bak: gzip compressed data, from Unix, original size modulo 2^32 521730731 gzip compressed data, unknown method, ASCII, extra field, has comment, encrypted, from FAT filesystem (MS-DOS, OS/2, NT), original size modulo 2^32 521730731

db.bak is the database.db I backed up some time ago.
I tried restoring it to database.db, but it continues to crash and report errors after running for a while.

(base) root@cdn:/data/ssd/glh# ./gallery -c gallery.config.yml
✔ Gallery main menu · server
[2024-07-10 12:55:47.553]: cli.spawn Run cli with server
[2024-07-10 12:55:47.863]: server.auth Set basic auth for users: xxx
[2024-07-10 12:55:47.867]: server Your own Home Gallery is running at http://localhost:3000
[2024-07-10 12:55:47.868]: server.cli Run cli with run import --initial --update --watch...
[2024-07-10 12:55:48.064]: cli.run Import online sources: /data/hdd/tdl
[2024-07-10 12:55:48.064]: cli.task.import Run import in watch mode. Start watching source dirs for file changes: /data/hdd/
/tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:78
        return cb(err);
               ^

TypeError: cb is not a function
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:78:16
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:13:14
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/database/dist/database/read-database.js:9:14
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/common/dist/fs/read-json-gzip.js:14:12
    at node:internal/util:519:12
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

Node.js v20.13.1
Error: Error: Server exited with code 1 and signal null

I will run the test again after the update, clear the log, and keep an eye on it.

xemle (Owner) commented Jul 10, 2024

Hi @a18090

Thank you for the update. I guess you identified a bug, and it seems the bug is quite old and my tests were not covering it. I will provide a fix in the next few days. Then the error cb is not a function should vanish and you can retest.

a18090 (Author) commented Jul 12, 2024

Hi @xemle

I found an interesting situation when I ran it again.

When I open the site with Chrome on my laptop and enter the "Year" tab, it gets stuck in a waiting state while loading.

This is the console output:

Connected to server events
event-store.ts:44 Applied 13 events and updated 0 entries in 0ms
ApiService.ts:127 Syncing database entries with offline database...
ApiService.ts:119 Found 1 from 1 trees out of sync in 1ms
ApiService.ts:121 Fetching 1 missing tree objects
ApiService.ts:95 Fetched 1 trees and 0 entries for offline database
ApiService.ts:95 Fetched 40 trees and 0 entries for offline database
ApiService.ts:95 Fetched 50 trees and 265 entries for offline database
search-store.ts:54 update query to 1 entries from 1 entries
ApiService.ts:95 Fetched 25 trees and 6244 entries for offline database
search-store.ts:54 update query to 261 entries from 261 entries
ApiService.ts:95 Fetched 57 trees and 9789 entries for offline database
search-store.ts:54 update query to 6473 entries from 6473 entries
ApiService.ts:95 Fetched 21 trees and 6915 entries for offline database
search-store.ts:54 update query to 13170 entries from 13170 entries
ApiService.ts:95 Fetched 0 trees and 9800 entries for offline database
search-store.ts:54 update query to 19572 entries from 19572 entries
ApiService.ts:95 Fetched 0 trees and 9323 entries for offline database
search-store.ts:54 update query to 28157 entries from 28157 entries
ApiService.ts:95 Fetched 0 trees and 1408 entries for offline database
ApiService.ts:95 Fetched 0 trees and 1456 entries for offline database
ApiService.ts:95 Fetched 0 trees and 134 entries for offline database
search-store.ts:54 update query to 39129 entries from 39129 entries
search-store.ts:54 update query to 45404 entries from 45404 entries
search-store.ts:54 update query to 48686 entries from 48686 entries
search-store.ts:54 update query to 70302 entries from 70302 entries
search-store.ts:54 update query to 80169 entries from 80169 entries
search-store.ts:54 update query to 113196 entries from 113196 entries
search-store.ts:54 update query to 124086 entries from 124086 entries
search-store.ts:54 update query to 138204 entries from 138204 entries
search-store.ts:54 update query to 153505 entries from 153505 entries
search-store.ts:54 update query to 158739 entries from 158739 entries
search-store.ts:54 update query to 169621 entries from 169621 entries
search-store.ts:54 update query to 178623 entries from 178623 entries
search-store.ts:54 update query to 186852 entries from 186852 entries
search-store.ts:54 update query to 195096 entries from 195096 entries
search-store.ts:54 update query to 203734 entries from 203734 entries
search-store.ts:54 update query to 213179 entries from 213179 entries
search-store.ts:54 update query to 218025 entries from 218025 entries
search-store.ts:54 update query to 224955 entries from 224955 entries
search-store.ts:54 update query to 227904 entries from 227904 entries
search-store.ts:54 update query to 231714 entries from 231714 entries
search-store.ts:54 update query to 234353 entries from 234353 entries
search-store.ts:54 update query to 236388 entries from 236388 entries
search-store.ts:54 update query to 239641 entries from 239641 entries
search-store.ts:54 update query to 241315 entries from 241315 entries
search-store.ts:54 update query to 243367 entries from 243367 entries
search-store.ts:54 update query to 246433 entries from 246433 entries
search-store.ts:54 update query to 249929 entries from 249929 entries
search-store.ts:54 update query to 251537 entries from 251537 entries
search-store.ts:54 update query to 253763 entries from 253763 entries
search-store.ts:54 update query to 255617 entries from 255617 entries
**search-store.ts:54 update query to 257054 entries from 257054 entries**
faces.ts:35 Face search: Took 19ms to select, 103ms to calculate, to sort 86ms, to map 0ms
faces.ts:35 Face search: Took 21ms to select, 88ms to calculate, to sort 88ms, to map 0ms
faces.ts:35 Face search: Took 20ms to select, 96ms to calculate, to sort 88ms, to map 0ms
faces.ts:35 Face search: Took 19ms to select, 96ms to calculate, to sort 86ms, to map 0ms
faces.ts:35 Face search: Took 21ms to select, 97ms to calculate, to sort 88ms, to map 0ms
...

update query to 92212 entries from 258659 entries
search-store.ts:54 update query to 92212 entries from 258660 entries
search-store.ts:54 update query to 92213 entries from 258661 entries
search-store.ts:54 update query to 92214 entries from 258662 entries
search-store.ts:54 update query to 92214 entries from 258663 entries
search-store.ts:54 update query to 92215 entries from 258664 entries
...

update query to 92523 entries from 259386 entries
faces.ts:35 Face search: Took 21ms to select, 101ms to calculate, to sort 93ms, to map 0ms
faces.ts:35 Face search: Took 21ms to select, 93ms to calculate, to sort 96ms, to map 0ms
...
more...

If I use Chrome on a desktop computer to access the site, this problem does not occur: there is no continuous log output, clicks are normal, and Chrome does not become unresponsive.
Meanwhile Chrome's memory usage enters a continuous cycle: it increases by ~3 GB, drops by ~1.4 GB, increases by ~3 GB again, and drops by ~1.x GB.

(screenshot: Chrome)

I suspect this may be a front-end page problem. I'll try to look into it and see if I can help you (of course I'm not sure whether I can, haha).

xemle (Owner) commented Jul 12, 2024

Hi @a18090 thank you for your update.

From your latest console logs I read that the database can be loaded with 259386 entries, which sounds good. A friend of mine has 400k images, which should work, too.

Regarding your previous error

[2024-07-10 12:55:48.064]: cli.task.import Run import in watch mode. Start watching source dirs for file changes: /data/hdd/
/tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:78
        return cb(err);
               ^

TypeError: cb is not a function
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:78:16
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:13:14
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/database/dist/database/read-database.js:9:14
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/common/dist/fs/read-json-gzip.js:14:12
    at node:internal/util:519:12
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

Node.js v20.13.1
Error: Error: Server exited with code 1 and signal null

I found the issue. The error happens when the database could not be read. Unfortunately, the underlying error message with the cause is swallowed by this bug, so I cannot tell why the database cannot be read. A following fix will change that.
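For what it is worth, the kind of callback code that swallows the original error looks roughly like this. This is a generic Node.js illustration, not the actual read-database.js source:

// If the caller forgets to pass cb (or expects a Promise instead), the
// error branch crashes with "cb is not a function" and the real cause
// (err) is lost.
const fs = require('fs')

function readDatabase(filename, cb) {
  fs.readFile(filename, (err, buffer) => {
    if (err) {
      return cb(err)                  // TypeError if cb was never passed
    }
    cb(null, JSON.parse(buffer))
  })
}

readDatabase('database.db')           // -> TypeError: cb is not a function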

My best guess is that the database cannot be read due to memory issues in the server component, especially while you are importing a new and even larger database: the server then has to keep two large versions in memory for a short period of time. This could be mitigated with the environment variable NODE_OPTIONS=--max-old-space-size=4096 to run all node processes with a maximum of 4 GB heap, e.g. NODE_OPTIONS=--max-old-space-size=4096 ./gallery.js run server for the server.

I doubt that the database itself is corrupt, because on database creation the new database is written to a temporary file which is then renamed to the target database filename. This is a common way to provide a kind of atomic file creation; the rename should only happen in the non-error case. A minimal sketch of the pattern follows below.
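The write-then-rename pattern looks roughly like this minimal Node.js sketch (an illustration of the general technique, not the gallery's actual code):

const fs = require('fs/promises')
const zlib = require('zlib')

async function writeDatabaseAtomic(databaseFile, data) {
  const tmpFile = `${databaseFile}.tmp-${process.pid}`
  // write the complete gzipped JSON to a temporary file first...
  await fs.writeFile(tmpFile, zlib.gzipSync(JSON.stringify(data)))
  // ...and only rename it over the target after the write succeeded.
  // rename() on the same filesystem is atomic, so readers see either the
  // old complete file or the new complete file, never a partial one.
  await fs.rename(tmpFile, databaseFile)
}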

Would you mind checking your database with zcat database.db | jq .data[].id | wc -l? This should print the number of database entries and prove that the data is intact.

a18090 (Author) commented Jul 14, 2024

Hi @xemle

I removed the problematic database and restored the backup, then rebuilt the database. Now zcat database.db | jq .data[].id | wc -l reports 357198.

But that number doesn't seem right: listing the JPG files on disk gives about 462626 files.

The full file count including videos is roughly 597817.

xemle (Owner) commented Jul 14, 2024

Can you also check the file index with zcat files.idx | jq .data[].filename | wc -l? How many files are indexed by the gallery?

a18090 (Author) commented Jul 15, 2024

(base) root@cdn:/data/ssd/glh/config# zcat tdl.idx | jq .data[].filename | wc -l
461589

(base) root@cdn:/data/ssd/glh/config# zcat database.db | jq .data[].id | wc -l
357198

(base) root@cdn:/data/hdd/tdl# find . |wc -l
599809

(base) root@cdn:/data/hdd/tdl# find . -type f -name *.jpg |wc -l
460337

xemle (Owner) commented Jul 17, 2024

Hi @a18090

Thank you for the numbers. To summarize:

  • 599809 Dir and files in total
  • 460337 jpg files
  • 461589 files are indexed (and seen) by the gallery
  • 357198 media files are created

The difference between JPG files and indexed files is that the gallery also indexes other files, like meta files.

The more important question is why there is such a big gap between media files and indexed files, since there should not be that many meta files.

My best guess is that the import was done in multiple steps. Currently the algorithm does not correctly recover the files after the import process is restarted (I did not find the time/interest to implement that yet). Therefore it is recommended to rerun the full import after all media files have been processed.

Would you mind rerunning a full import via ./gallery.js run import and checking the numbers again?

a18090 (Author) commented Jul 18, 2024

Thanks, no problem, I'll try it.

xemle (Owner) commented Jul 30, 2024

Hi @a18090 I've updated master with a stream-based database creation. This should be less memory demanding, and your 400,000 images should be easier to read and to update.

Please try it out and report if you have any issues with the newest version.

a18090 (Author) commented Aug 5, 2024

Hi @xemle
I solved the file count issue, and database.db grew to ~491236 entries, maybe due to PNG or other formats (my file count also grew to ~490629 according to find . -type f -name *.jpg | wc -l).
My server didn't have Node.js, so I used your binary (the "Jul 25 14:14" version).

This is the resource usage:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16951 root 20 0 12.2g 1.0g 43708 S 0.0 0.8 16:02.33 node ....../gallery.js server
16962 root 20 0 14.0g 2.6g 42752 S 0.0 2.1 2:17.11 node ....../gallery.js run import --initial --update --watch

I noticed an error occurred:

{"level":50,"time":"2024-08-05T11:23:06.096Z","pid":11579,"hostname":"cdn","module":"database.media.date","levelName":"error","msg":"Could not create valid ISO8601 date '2023-01-21T00:11:77' from '\"2023:01:21 00:11:77\"' of entry e799fe8:tdl:xxx/File/IMG_2406.JPEG: RangeError: Invalid time value"}

I will also try multi-platform testing later

Thanks for your efforts

