I had a few questions/clarifications regarding the hdf5 dataset that was linked in the notebook:
I ran the notebook for training from scratch using the existing hdf5 and obtained a CER of ~0.09 using just a single model (not an ensemble).
When creating the hdf5 from scratch and running the same training procedure, my CER is similar to the best/second-best models (~0.16–0.18).
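For clarity, the CER figures above are character error rate: the character-level edit distance between hypothesis and reference, normalized by the reference length. A minimal self-contained sketch (my own implementation, not the notebook's evaluation code):

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance over characters.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    # Character error rate: edits to turn the hypothesis into the
    # reference, divided by the reference length.
    return levenshtein(reference, hypothesis) / len(reference)

print(round(cer("hello world", "helo wurld"), 2))  # 0.18
```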
So, as far as I can see, the main difference would be in the dataset generation/preprocessing steps or the tokenizer:
a. In the notebook there's a comment that the pretrained models used a vocab size of 100 as opposed to 99 (95 characters + SOS/EOS/PAD/UNK tokens). Is there an additional token used here?
b. Was the generation procedure for the hdf5 that was linked / hosted on Google Drive slightly different?
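To make the count in (a) concrete, here is a sketch of the vocab I'm assuming: the 95 printable ASCII characters plus four special tokens, giving 99. (The special-token names below are illustrative; the notebook's actual identifiers may differ.)

```python
# 95 printable ASCII characters: space (32) through '~' (126).
chars = [chr(c) for c in range(32, 127)]
assert len(chars) == 95

# Four special tokens -- names here are placeholders, not the notebook's.
special = ["<sos>", "<eos>", "<pad>", "<unk>"]

vocab = special + chars
print(len(vocab))  # 99 -- one short of the 100 the pretrained models reportedly use
```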
Thank you!
I also don't remember the exact details of the first iteration, but I am working on a paper covering the different preprocessing experiments, which should help the community. I will update you once it is finalized.