Failed precondition: Python interpreter state is not initialized. #369

Open
MGJamJam opened this issue Nov 3, 2024 · 2 comments
MGJamJam commented Nov 3, 2024

Hello!
When training a model I get the following error:

INFO     2024-11-03 19:13:30,783                         FOLD 0: INFO     2024-11-03 19:13:30,684 calamari_ocr.ocr.training.trai: Training finished
INFO     2024-11-03 19:13:30,884                         FOLD 0: 2024-11-03 19:13:30.833537: W tensorflow/core/kernels/data/generator_dataset_op.cc:107] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated.
INFO     2024-11-03 19:13:30,884                         FOLD 0: 	 [[{{node PyFunc}}]]
CRITICAL 2024-11-03 19:13:39,935             tfaip.util.logging: Uncaught exception
Traceback (most recent call last):
  File "/home/fablab/miniconda3/envs/test_gpu/bin/calamari-cross-fold-train", line 8, in <module>
    sys.exit(run())
  File "/home/fablab/miniconda3/envs/test_gpu/lib/python3.7/site-packages/calamari_ocr/scripts/cross_fold_train.py", line 13, in run
    return main(parse_args())
  File "/home/fablab/miniconda3/envs/test_gpu/lib/python3.7/site-packages/calamari_ocr/scripts/cross_fold_train.py", line 31, in main
    trainer.run()
  File "/home/fablab/miniconda3/envs/test_gpu/lib/python3.7/site-packages/calamari_ocr/ocr/training/cross_fold_trainer.py", line 321, in run
    pool.map_async(train_individual_model, run_args).get()
  File "/home/fablab/miniconda3/envs/test_gpu/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
  File "/home/fablab/miniconda3/envs/test_gpu/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/fablab/miniconda3/envs/test_gpu/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/fablab/miniconda3/envs/test_gpu/lib/python3.7/site-packages/calamari_ocr/ocr/training/cross_fold_trainer.py", line 53, in train_individual_model
    verbose=run_args.get("verbose", False),
  File "/home/fablab/miniconda3/envs/test_gpu/lib/python3.7/site-packages/calamari_ocr/utils/multiprocessing.py", line 83, in run
    raise Exception("Error: Process finished with code {}".format(process.returncode))
Exception: Error: Process finished with code -9
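For what it's worth, the -9 at the end follows the general CPython convention rather than anything Calamari-specific: multiprocessing (and subprocess) report a child that was terminated by a signal as a negative return code, so -9 means the child process received SIGKILL. A minimal standalone sketch of that convention, unrelated to Calamari itself:

import signal
import subprocess

# A shell that SIGKILLs itself; CPython reports the signal as a negative return code.
proc = subprocess.run(["sh", "-c", "kill -9 $$"])
print(proc.returncode)                       # -9
print(proc.returncode == -signal.SIGKILL)    # True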

Can you help me understand what might cause this error and what its impact is? Is this error even relevant, given that it is printed after the "Training finished" message?

I am using:

  • WSL with Ubuntu 22.04.3 LTS
  • Python 3.7
  • TensorFlow 2.6.0
  • CUDA 11.2
  • cuDNN 8
  • Calamari 2.2.2

The training command I used was:

CUDA_VISIBLE_DEVICES=0 calamari-cross-fold-train \
    --train PageXML \
    --train.images "training_data_senat_reduced/*.png" \
    --temporary_dir calamari_cd_training_output_warmstart_gothic_03_11 \
    --keep_temporary_files True \
    --scenario.tensorboard_logger_history_size 50 \
    --device.gpus 0 \
    --codec.include {string.digits + string.ascii_letters} \
    --best_models_dir "calamari_cf_training_03_11" \
    --weights "calamari_models_experimental/deep3_htr-gothic/0.ckpt.json" \
              "calamari_models_experimental/deep3_htr-gothic/1.ckpt.json" \
              "calamari_models_experimental/deep3_htr-gothic/2.ckpt.json" \
              "calamari_models_experimental/deep3_htr-gothic/3.ckpt.json" \
              "calamari_models_experimental/deep3_htr-gothic/4.ckpt.json" \
    --n_augmentations=5 \
    --network deep3 \
    |& tee output_cf_03_11.txt

andbue commented Nov 4, 2024

I have the same error; it does not affect training, as it occurs in the training subprocess after training has finished. So far I have not been able to pin down the cause of the error within Calamari; it may be fixed in newer TensorFlow versions (cf. tensorflow/tensorflow#24570).
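If you want to double-check that a run still produced usable models despite the warning, one quick sanity check is to look for the fold checkpoints under the directory given to --best_models_dir. A rough sketch, assuming the checkpoints end up as *.ckpt.json files somewhere below that directory (adjust the path and glob to your actual layout):

from pathlib import Path

# Directory passed via --best_models_dir in the command above.
best_models_dir = Path("calamari_cf_training_03_11")

# Assumption: each fold's best model is saved as a *.ckpt.json below this directory.
checkpoints = sorted(best_models_dir.glob("**/*.ckpt.json"))
print(f"Found {len(checkpoints)} checkpoint(s)")
for ckpt in checkpoints:
    print("  ", ckpt)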


MGJamJam commented Nov 4, 2024

Thanks for the quick answer 😄
