Source of scripts/Fraktur etc. #39
Comments
Also see tesseract-ocr/tessdata#65
Is langdata obsolete as langdata_lstm exists?
langdata files are appropriate for tesseract 3 or for legacy/base versions using tesseract 4. They can also be used for finetuning which requires a smaller input training text.
As @Shreeshrii already said, The I fixed the description for 4.00
Fraktur OCR with Tesseract is what I am looking for. I installed VietOCR v5.5.2 and Tesseract 4.1.0 on my Mac, and now I am trying to find help on how to train it better, because there are too many OCR errors. How would I go about training the software? Can anyone help? I am a complete beginner and do not even know how I managed to install the two components so far, and this training step is not explained anywhere. Any help in the right direction would be greatly appreciated.
In the meantime newer Fraktur models are available. There is a description of the training process for those models in the Wiki. As soon as the training is finished, I'll add the results to tessdata_contrib.
@mikegerber, can we close this issue?
While the files in the top directory seem to come from the sources in the langdata repository, the source for some of the files in scripts/ is unclear: scripts/Fraktur.traineddata has no matching file in langdata, nor does scripts/Japanese.traineddata, etc.
The Data-Files wiki article does not mention scripts/Fraktur.
This adds to the confusion among the frk language (not actually Frankish, but Fraktur), the Fraktur script and the legacy model deu_frak in the tessdata repository.