If the image to be "OCRed" has more than one '.' in its filename, parts of the resulting filename are truncated.
E.g.:
something.else.png --> something.pred.txt instead of something.else.pred.txt
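The reported result is what you would get if the output name were derived by cutting the filename at its first dot. The sketch below only illustrates that assumption and is not taken from the project's code; both function names are hypothetical.

```python
import os

def stem_first_dot(path):
    # Hypothetical: keep only what precedes the first '.' in the basename.
    return os.path.basename(path).split(".")[0]

def stem_last_ext(path):
    # Hypothetical alternative: drop only the final extension.
    return os.path.splitext(os.path.basename(path))[0]

print(stem_first_dot("something.else.png") + ".pred.txt")  # something.pred.txt
print(stem_last_ext("something.else.png") + ".pred.txt")   # something.else.pred.txt
```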
Right, that's a little bit annoying; I've struggled with that myself before. In ocropus, the image file names contain information on preprocessing (e.g. 001.bin.png) that has to be ignored. If we change the current behaviour, we might break support for legacy datasets. I don't know if ocr4all needs this - @chreul ?
Maybe we could either implement a command line switch to toggle file extension handling or just ignore a specific set of strings (bin, raw, nrm, maybe col?).
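A rough sketch of how the two ideas could be combined, as an illustration only: the helper name `prediction_name`, the `strip_tags` toggle, and the `PREPROC_TAGS` set are hypothetical and not part of any existing command line interface. It drops only the final extension and then removes trailing preprocessing markers when the toggle is on.

```python
import os

# Assumed set of preprocessing markers, taken from the suggestion above.
PREPROC_TAGS = {"bin", "raw", "nrm", "col"}

def prediction_name(image_path, strip_tags=True):
    """Hypothetical helper: build the .pred.txt name for an image path."""
    directory, filename = os.path.split(image_path)
    stem, _ext = os.path.splitext(filename)  # drop only the last extension
    if strip_tags:
        parts = stem.split(".")
        # Remove trailing preprocessing markers, but never the whole stem.
        while len(parts) > 1 and parts[-1] in PREPROC_TAGS:
            parts.pop()
        stem = ".".join(parts)
    return os.path.join(directory, stem + ".pred.txt")
```

With the toggle on, legacy names such as 001.bin.png would still map to 001.pred.txt, while something.else.png would keep its full stem. With the toggle off, every dot-separated part before the last extension is preserved.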
OCR4all does currently rely on this, but we could handle it with a small wrapper or postprocessing script (and the newly written back end manages files differently anyway), so changing it wouldn't really be a problem for OCR4all.
Well, in my opinion the current behaviour is unexpected for newcomers like myself.
I (and I assume any other newcomer) like the idea of changing this - an additional command line switch would be ok, of course.