calamari/1.0: Fix 1.0.x models for Python 3.11 #348
Comments
Related issue in ocrd_calamari is here: OCR-D/ocrd_calamari#91 |
Hi @mikegerber, I've just made two commits to the 1.0 branch: the first is trying to fix the regex problem and the second to make all the tests run without warning. Could you please test if this works with ocrd_calamari? |
Unfortunately this only got called for the default parameters, not the ones read from the model on disk. I've had another look and opened PR #349. That PR fixes the issue for me! |
(I have 2 other issues with 1.0.x - more NumPy noise and another small issue with noise in the output. If you want to release another 1.0.x version maybe wait a little bit, I still need to investigate if it's Calamari or ocrd_calamari.) |
* Revert "Move global flags to the start of regular expressions (fix for issue #348)". This reverts commit 13b544b.
* Move global flags to the start of regular expressions. Fix regex replacements for models where global flags were put at the end of the pattern strings. These patterns are invalid as of Python 3.11.
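As an illustration of what "moving global flags to the start" means, here is a minimal sketch in plain Python. It is not the code from the commit; the helper name `move_global_flags` is hypothetical, and it only handles the case described in this thread, where a global flag group such as `(?u)` sits at the very end of the pattern string.

```python
import re

# Matches a pattern whose last characters are a global inline flag group,
# e.g. r"\s+(?u)". Only trailing flag groups are handled in this sketch.
_TRAILING_FLAGS = re.compile(r"(?P<body>.*)(?P<flags>\(\?[aiLmsux]+\))", re.DOTALL)


def move_global_flags(pattern: str) -> str:
    """Return `pattern` with a trailing global flag group moved to the front.

    Hypothetical helper for illustration only. Patterns that do not end in
    a flag group (including already-fixed ones) are returned unchanged.
    """
    m = _TRAILING_FLAGS.fullmatch(pattern)
    if m is None or not m.group("body"):
        return pattern
    return m.group("flags") + m.group("body")


# On Python 3.11+, re.compile(r"\s+(?u)") raises re.error because global
# flags are only allowed at the start of the expression. The rewritten
# pattern compiles on old and new Python versions alike.
print(move_global_flags(r"\s+(?u)"))  # (?u)\s+
```

The function is idempotent, so running it over a model that has already been fixed should leave its patterns untouched.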
Thanks a lot – I didn't have an old model so I was just guessing where to fix the regexes... If you have any other suggestions, I'll gladly include them in the 1.0.7 release! |
Yeah I should have linked our historic model so you can reproduce :) If you need a working model for 1.0 in the future: https://qurator-data.de/calamari-models/GT4HistOCR/2019-12-11T11_10+0100/model.tar.xz (only for old prints/Fraktur) |
I'll try to debug today! |
The other issues:
So I think you could release 1.0.7 when #350 is merged and this issue can be closed too :) |
Never mind, there are still lots of DeprecationWarnings I'd like to take a look at first (other than the ones @andbue thankfully already fixed) |
Sorry @mikegerber – I did not see #350 when releasing 1.0.7. I'll merge that soon. So perhaps we should do another 1.0.8 ... More than anything else we urgently need to backport the recently added support for TF SavedModel format to all the older branches – because HDF5 models stop working across the Python 3.8 / 3.9 boundary IIRC. The main problem with that is we cannot just increase the version tag of the older models retroactively (as was done with 5→6 in master). I have discussed this with @andbue and he is inclined to implement the auto-conversion without version update there. |
done. I suggest we keep this issue open to track progress with the SavedModel format conversion in 1.x (and the other older branches). |
Since Calamari 2.3 (or rather, the initial implementation by @andbue for SavedModel format) already includes a new model/checkpoint version 6 (rather than an unversioned optional variant), there is no chance of supporting this in 1.x – as there is no way to increase the model version with an integer between 2 (Calamari 1.x) and 3 (Calamari 2.0). But the good news is that I made a lot of progress:
The only problem is that I cannot migrate @mikegerber's model qurator-gt4histocr due to #362. So we are stuck with 1.x for that, where none of this is of any help. In 2.x on the other hand, there might even be other (non-public) models out there which have the same problem – even amongst newer model versions (I did see this in calamari_models). So we could really use a So if we want that, we should rename this issue to cover the calamari/2.x case exclusively – otherwise close (as @mikegerber's solution has already been merged in 1.0.7). |
We have old 1.0.x models that wouldn't run using the Calamari 1.0.x branch on Python 3.11, as the `replacements` use regexen now considered invalid in Python 3.11:
E.g. in our `0.ckpt.json`:
The global `(?u)` regex flag needs to go in front. This script fixes it: https://github.com/OCR-D/ocrd_calamari/blob/master/ocrd_calamari/fix_calamari1_model.py
The question is whether you want this "upgrading" procedure to go into the 1.0 branch's model loading code?
(I haven't checked any other 1.0 models, but I am somewhat sure that these replacements weren't customized by us and came from Calamari itself.)
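A rough sketch of how such an "upgrade" could be applied to a checkpoint that has already been parsed from JSON. This is not the linked script. The layout is an assumption: it presumes the patterns live in lists stored under a `replacements` key, each entry being a dict with the pattern in an `old` field. Both names are parameters so they can be adjusted to the real checkpoint structure, and `fix_replacements` and `_move_flags` are hypothetical helpers.

```python
import re
from typing import Any

_TRAILING_FLAGS = re.compile(r"(?P<body>.*)(?P<flags>\(\?[aiLmsux]+\))", re.DOTALL)


def _move_flags(pattern: str) -> str:
    # Move a trailing global flag group such as "(?u)" to the front.
    m = _TRAILING_FLAGS.fullmatch(pattern)
    if m is None or not m.group("body"):
        return pattern
    return m.group("flags") + m.group("body")


def fix_replacements(node: Any, key: str = "replacements", field: str = "old") -> int:
    """Fix patterns in-place anywhere in a parsed JSON tree.

    ASSUMPTION: patterns are stored as node[key][i][field]; the default
    names are guesses, not a documented Calamari format.
    Returns the number of patterns that were changed.
    """
    changed = 0
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key and isinstance(v, list):
                for item in v:
                    if isinstance(item, dict) and isinstance(item.get(field), str):
                        fixed = _move_flags(item[field])
                        if fixed != item[field]:
                            item[field] = fixed
                            changed += 1
            else:
                changed += fix_replacements(v, key, field)
    elif isinstance(node, list):
        for v in node:
            changed += fix_replacements(v, key, field)
    return changed
```

In use, one would `json.load` the checkpoint file, call `fix_replacements` on the result, and write the file back only if the returned count is non-zero. Doing the same at model loading time, rather than on disk, would avoid modifying users' model files.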