Incorrect date added to database when GPS data is set incorrectly #143
Comments
Hi @jgyprime, thank you for using HomeGallery, and I am glad that you like it. Also, thank you for reporting your issue with the date. You did a great job nailing down the problem and provided a test picture. Awesome. Yes, my assumption was: if there is a date provided by GPS, it should be quite accurate. However, your picture has 1) no further GPS coordinates and 2) the date ... Do you think it would be sufficient to allow the GPS date only if GPS coordinates are available? This would keep the basic assumption but would check it in more detail.
After removing the GPS date info (as I said above), the indexation restarted. My NAS is a TerraMaster F4-421.
Sure.
Alright. Please push me if you run into problems. It bugs me that there is a problem which should not be there in theory. Since I do not face the problem myself, I need an external push and someone who really wants to have it solved. Thank you for the details of your system. It helps to know the target systems.
Awesome. Currently I am implementing a plugin system. When I come across this part, I will ensure that the GPS date is only taken if there is also a GPS position. In the meantime, if you find a better strategy to identify the date, please let me know.
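The rule proposed above (use the GPS date only when a GPS position is present) could look roughly like the following. This is a minimal sketch, not the actual HomeGallery code: `hasGpsPosition` and `pickDateKey` are hypothetical helpers, the tag names follow exiftool conventions, and the sample values are made up for illustration.

```javascript
// Hypothetical sketch, not the real date.js implementation.
const dateKeys = ['GPSDateTime', 'SubSecDateTimeOriginal', 'DateTimeOriginal', 'CreateDate']

// Assumption: a usable GPS position means both latitude and longitude are set.
function hasGpsPosition(exif) {
  return exif.GPSLatitude != null && exif.GPSLongitude != null
}

// Return the first date key that is present; skip GPSDateTime without a position.
function pickDateKey(exif) {
  return dateKeys.find(key => {
    if (exif[key] == null) return false
    if (key === 'GPSDateTime' && !hasGpsPosition(exif)) return false
    return true
  })
}

// Made-up sample data: a bogus GPS date without coordinates, and a plausible one with coordinates.
const withoutPosition = { GPSDateTime: '1970:01:01 00:00:00Z', DateTimeOriginal: '2019:06:01 12:30:00' }
const withPosition = { GPSDateTime: '2019:06:01 10:30:00Z', GPSLatitude: 48.1, GPSLongitude: 11.5, DateTimeOriginal: '2019:06:01 12:30:00' }

console.log(pickDateKey(withoutPosition)) // DateTimeOriginal
console.log(pickDateKey(withPosition))    // GPSDateTime
```

With this check, a picture that carries only a default GPS timestamp falls back to the next date tag instead of ending up in 1970.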
In the meantime, the indexation finished. I have a few questions:
Do you have lots of binary duplicates? Do you have files which lead to the same SHA1 checksum?
No, there are no limits. Neither in file count nor in folder depth. All files should be considered. Do you use any file filter which excludes some of the files?
The file needs to be unique by OS filename for the file indexer and unique by SHA1 for the database. Files with the same SHA1 are handled as duplicates and their file data are merged. There are corner cases with sidecars of duplicate files; I can go into depth on that if requested. But basically, if you just copy an image or folder byte-by-byte from one place to another OS path, these files are duplicates, even if they are renamed later, since their file content is unchanged and contains the same data. This is a design decision with the goal of showing only unique media, on the assumption that most people have no clue how many duplicates they are storing, and IMHO it does not add any value to show pictures twice. To identify the files which are indexed you can dump information from index files
This should print the count of your files which should be about 400k according to your provided information. To identify the entries from the database you can run
To identify unique database entries you can run
The latter should then print about 100k according to your provided information. Maybe it is worth reading the internals of the gallery to gain further insights and to clarify further questions. Thank you for reporting your experience and questions.
One more thing: HomeGallery imports the files in chunks to deal with internal limitations and to provide early feedback (show images in the browser). So the media import might also be in an intermediate state, with not all of your files imported yet? This import process can be restarted and does not need to be completed in one single run.
Hi @jgyprime, I would like to inform you that the newest master contains stream-based database creation, which requires less memory. So your 400K files should now be fine to process and update.
@jgyprime Further, I am happy to announce the first experimental plugin feature in the current master! See docs.home-gallery.org/plugin for further details! With plugins you can easily "fix" the geo data issue with your own database mapper.
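The difference between indexed files and unique database entries described above can be illustrated with a small sketch. It is an illustration under assumptions, not HomeGallery's code: the `sha1` helper, the sample file list, and the in-memory `Map` are hypothetical; only `crypto.createHash` is the standard Node.js API.

```javascript
const crypto = require('crypto')

// Hypothetical helper: SHA-1 checksum of the file content as a hex string.
function sha1(content) {
  return crypto.createHash('sha1').update(content).digest('hex')
}

// Made-up index entries: unique by filename, content held inline for the demo.
const files = [
  { filename: '2019/IMG_0001.jpg', content: Buffer.from('photo A') },
  { filename: 'backup/IMG_0001.jpg', content: Buffer.from('photo A') }, // byte-by-byte copy
  { filename: 'backup/renamed.jpg', content: Buffer.from('photo A') },  // renamed copy, same bytes
  { filename: '2019/IMG_0002.jpg', content: Buffer.from('photo B') }
]

// Group by checksum: files with the same SHA-1 collapse into one entry.
const entries = new Map()
for (const file of files) {
  const id = sha1(file.content)
  if (!entries.has(id)) entries.set(id, { id, files: [] })
  entries.get(id).files.push(file.filename)
}

console.log(files.length) // 4 indexed files
console.log(entries.size) // 2 unique entries
```

The same effect at a larger scale would explain an index of about 400k files that maps to about 100k unique entries.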
Thank you for this great and amazing software.
I've been using it with almost 4 TB of personal photos (only photos).
I think that I have more than 500k photos there...
But I think I found something that can be improved.
After the initial indexation of photos finished (it took several days on my low powered Celeron NAS), I observed that a lot of my photos were added to the database incorrectly, with 1970 as year...
And I started investigating the reason.
For example, in the photo I uploaded, the GPS data is set incorrectly in the photo's EXIF:
When added to the gallery, the date is set to 1970 (the date is taken from the GPS info in the EXIF)...
I do not know how that GPS date got there, but I can assume that the phone tried to get the GPS date and time, but because the GPS on the phone was disabled, it got back to a default value of something from 1970...
I also found the source of the problem in the source code here:
https://github.com/xemle/home-gallery/blob/master/packages/database/src/media/date.js#L44
const dateKeys = ['GPSDateTime', 'SubSecDateTimeOriginal', 'DateTimeOriginal', 'CreateDate']
If I remove the 'GPSDateTime' item from line 44, then everything works correctly after rebuilding and re-indexing the database.
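A minimal sketch of why this helps, assuming the first key in the list that is present in the EXIF data wins (the `firstDate` lookup and the sample values are hypothetical and not taken from date.js):

```javascript
// Hypothetical lookup: return the value of the first key that is present.
function firstDate(exif, keys) {
  const key = keys.find(k => exif[k] != null)
  return key ? exif[key] : undefined
}

// Made-up EXIF values resembling the reported case.
const exif = { GPSDateTime: '1970:01:01 00:00:00Z', DateTimeOriginal: '2018:08:15 09:12:44' }

const originalKeys = ['GPSDateTime', 'SubSecDateTimeOriginal', 'DateTimeOriginal', 'CreateDate']
const patchedKeys = originalKeys.filter(k => k !== 'GPSDateTime')

console.log(firstDate(exif, originalKeys)) // 1970:01:01 00:00:00Z
console.log(firstDate(exif, patchedKeys))  // 2018:08:15 09:12:44
```

The drawback of dropping the key entirely is that a correct GPS date is ignored as well.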
What do you think?
Is an improvement possible in this case?
For example:
Unfortunately, my knowledge of the js language is very close to 0, so I would prefer for someone with enough knowledge to find a potential implementation here.
Thank you for reading my very long post.
Thank you for creating such nice software.