Testing, bad results even on training sample after convergence #28
Comments
I solved it with the following modification in model.py, line 352, though without any explanation of why it should be like that... I searched a lot... The guard is commented out: `#if not forward_only:`
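(Presumably the modification looks something like the following sketch of the Keras 1.x learning-phase call; `forward_only` is the repo's test-mode flag, and the exact surrounding code is a guess rather than a quote from model.py.)

```python
from keras import backend as K

# Original guard at model.py line 352: only force training mode while training.
# if not forward_only:
#     K.set_learning_phase(1)

# Workaround: drop the guard so the learning phase is always 1 (training mode),
# even when running the model for testing.
K.set_learning_phase(1)
```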
Yeah, it would work on the training data, but I don't think it can be considered a fix: setting the learning phase to 1 always means we are in training mode, so any layer that behaves differently in train/test will be set to train even when we are testing.
Yes. If you are setting that flag to 1 during the test phase, it basically means that when you receive a test batch you are doing the same thing as in training: subtracting some mean computed over the test batch. While that's not inconsistent between training and testing, doing it is kind of unfair, since presumably we should only use a test point's own information to classify it, without looking at statistics over a batch of test examples. Sorry, I'm busy with a deadline; I will look into the code later.
Because of the difference in BatchNormalization behavior between training and testing.
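For context, a small NumPy sketch (not from the repo; the learned scale/shift parameters are omitted) of why BatchNormalization behaves differently in the two phases: in training it normalizes with the current batch's statistics and updates running averages, while in testing it normalizes each example with those stored averages.

```python
import numpy as np

def batchnorm_train(x, running_mean, running_var, momentum=0.99, eps=1e-5):
    # Training mode: normalize with the batch's own statistics and
    # update the (non-trainable) running averages as a side effect.
    mu, var = x.mean(axis=0), x.var(axis=0)
    running_mean = momentum * running_mean + (1.0 - momentum) * mu
    running_var = momentum * running_var + (1.0 - momentum) * var
    return (x - mu) / np.sqrt(var + eps), running_mean, running_var

def batchnorm_test(x, running_mean, running_var, eps=1e-5):
    # Test mode: normalize with the stored running statistics only;
    # no information from the rest of the test batch is used.
    return (x - running_mean) / np.sqrt(running_var + eps)
```

Forcing the learning phase to 1 at test time sends every test batch through `batchnorm_train`, which is why the workaround above "works" on training data while still being statistically questionable.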
I trained the model to step perplexity = 1.006652, error = 0.0082, then tried to test using the SVT and IIIT5K datasets. But for both datasets I got 100% incorrect results, which is totally unexpected. So I used the provided trained model instead, but still got the same results. I use Keras 1.1.1 and TF 0.12.1. I used distance as well and tried other datasets too. Any help? This is an important project for me, please help.
Remove `tf.gfile.Exists(ckpt.model_checkpoint_path)` from model.py.
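A sketch of the suggested change, assuming the usual `tf.train.get_checkpoint_state` loading pattern (`checkpoint_dir`, `saver`, and `sess` are stand-ins for the repo's actual names). With the V2 checkpoint format, `ckpt.model_checkpoint_path` is a file prefix rather than a single file on disk, so the `tf.gfile.Exists` test can fail even when a valid checkpoint is present:

```python
import tensorflow as tf

ckpt = tf.train.get_checkpoint_state(checkpoint_dir)

# Before: the extra existence check silently skips restoring the model.
# if ckpt and tf.gfile.Exists(ckpt.model_checkpoint_path):
# After: trust the checkpoint state file.
if ckpt and ckpt.model_checkpoint_path:
    saver.restore(sess, ckpt.model_checkpoint_path)
else:
    print("No checkpoint found; starting from freshly initialized weights.")
```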
I'm hitting the same issue as Alexjap. Has anyone found the root cause? Train result: [screenshot not captured] Test result: [screenshot not captured]
I think what seed93 said might make sense; maybe it is related to the batch normalization behavior, but I didn't have time to test without it to see if things change.
@Alexjap I used the pull request's code and found this bug. Change
@seed93 I quickly checked the code you mentioned. If we change it like that, we set the CNN model to testing (frozen weights) while we are training, and vice versa; that looks a bit strange to me.
While debugging, I looked into the code and found a 'false' argument in model.py (lines 204-211). The problem was actually that the system was unable to load the trained model. So I edited the code a little, and now the model I trained is loaded and working. But the accuracy is still low (12-15% on both the SVT and IIIT5K test datasets). The problem is with these variables: `batchnormalization_3_running_mean:0 NOT trainable` and `batchnormalization_3_running_std:0 NOT trainable`. This happened because newer TF and Keras versions can't recover the mean and standard deviation from these two variables, and the same goes for the pre-trained model. And since the models are binary files, there is no room to change them. Also, in the test phase the system gives accurate results for the first input of a mini-batch but not for the rest of the data, which was strange to me.
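To check which BatchNormalization weights are gradient-trained versus maintained as running statistics, something along these lines should work in Keras 1.x (a hypothetical inspection snippet, not from the repo):

```python
# running_mean / running_std are updated by moving averages during training,
# not by gradients, which is why they are reported as NOT trainable.
for layer in model.layers:
    if 'batchnormalization' in layer.name:
        trainable_ids = set(id(w) for w in layer.trainable_weights)
        for w in layer.weights:
            status = 'trainable' if id(w) in trainable_ids else 'NOT trainable'
            print(w.name, status)
```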
@raoweijin, I faced the same problem and somehow solved it this way: remove `tf.gfile.Exists(ckpt.model_checkpoint_path)` from model.py. @shraju024 is right.
Solved this problem with SivanKe#1 |
@NourozR I now face the same problem. Can removing `tf.gfile.Exists(ckpt.model_checkpoint_path)` solve it? That method just loads the model.
Hi guys, please help me. While training with the test data, I only get "generating first batch"; it never goes on to show the step train and step loss :(. I gave all the parameters mentioned in the training steps. The output stops at: Epoch ........ 0
Original issue:
Right now, by following the instructions in the Readme, the training procedure seems to converge (perplexity around 1 on the toy example),
but when we test on the same data (the toy example itself), the results are quite bad. Is anyone experiencing this behavior as well? I tried to look into the bucketing part of the code; I'm not sure why the bucketing in evaluation and in training differs, but that doesn't seem to be the cause anyway (I tried with the same bucketing and still got bad results).
The versions of Keras and TensorFlow are the recommended ones (Keras 1.1.1 and TF 0.11.0).
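For completeness, the commonly recommended pattern for this class of Keras-on-TensorFlow problems (not a confirmed fix for this repo) is to pin the learning phase to test mode before the model is built, so BatchNormalization uses its stored running statistics instead of batch statistics:

```python
from keras import backend as K

K.set_learning_phase(0)   # 0 = test mode; must be set before building the model
model = build_model()     # hypothetical constructor standing in for the repo's model setup
predictions = model.predict(test_batch)
```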