Overfitting with CNN model? #13
Hi @datistiquo, sorry for the late response. Looking into the model configuration, you can see that dropout is disabled by default for CNNs. The second important thing that comes to mind is the maximum length of the training sequences. Imagine a situation where you have a small training dataset and only one or a few sentences are very long, say 50 tokens, while the rest (including those from the test set) are short. In that case the short sentences are padded with a lot of placeholder tokens, and that padding can become a strong signal in the final decision. This area is also worth investigating. I hope this helps.
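To make the padding point concrete, here is a minimal sketch (generic NumPy code with a hypothetical helper name, not code from this repository) of how masking padded positions before pooling removes the padding signal entirely:

```python
import numpy as np

def masked_mean_pool(embeddings, lengths):
    """Mean-pool token embeddings while ignoring padded positions.

    embeddings: (batch, max_len, dim) array, zero-padded at the end
    lengths:    (batch,) true sequence lengths
    """
    max_len = embeddings.shape[1]
    # mask[i, t] is 1.0 for real tokens, 0.0 for padding
    mask = (np.arange(max_len)[None, :] < lengths[:, None]).astype(embeddings.dtype)
    summed = (embeddings * mask[:, :, None]).sum(axis=1)
    # Divide by the true length, not max_len, so padding cannot dilute the result
    return summed / np.maximum(lengths[:, None], 1)
```

With this kind of pooling, two sentences of very different lengths no longer look similar just because they share long runs of placeholder tokens.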
I will check this. I also think that the margin plays a huge role with contrastive loss. Actually, have you normalized your word vectors before feeding them in? Maybe that is my issue too, since I have not normalized them; I'll try that out.
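For reference, L2-normalizing pretrained word vectors is a one-liner (a generic sketch, not tied to this repository):

```python
import numpy as np

def l2_normalize(vectors, eps=1e-8):
    """Scale each word vector to unit length so that dot products become
    cosine similarities and distances live on a comparable scale."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.maximum(norms, eps)
```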
Right now I use a simple MSE or plain contrastive loss, but I feel that I need a pairwise, triplet, or even a listwise loss to do better. Also, the metric I evaluate with is plain precision, but a ranking metric like precision at k seems more reasonable for IR, I think!
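As a reference point, the textbook versions of a triplet loss and precision at k look roughly like this (a generic sketch with made-up argument names, not code from this project):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Push the positive closer to the anchor than the negative by `margin`.

    anchor, positive, negative: (batch, dim) embedding arrays
    """
    pos_dist = np.linalg.norm(anchor - positive, axis=1)
    neg_dist = np.linalg.norm(anchor - negative, axis=1)
    return np.maximum(pos_dist - neg_dist + margin, 0.0).mean()

def precision_at_k(ranked_labels, k):
    """Fraction of relevant items among the top-k results.

    ranked_labels: list of 0/1 relevance labels sorted by model score
    """
    return sum(ranked_labels[:k]) / k
```

Note that, unlike MSE on a distance, the triplet loss only cares about the relative ordering of candidates, which matches what a ranking metric like precision at k actually measures.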
Hey @tlatkowski, why are you using just the distance as the output of your CNN network? Have you tried feeding the distance into a sigmoid layer, or using a sigmoid layer directly instead of the distance?
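For illustration, the two variants being asked about could look like this in Keras (a hypothetical sketch with an invented helper name, not the layout of this repository's model):

```python
import tensorflow as tf

def similarity_head(h1, h2, use_sigmoid=True):
    """Combine two sentence encodings into a match score in (0, 1).

    h1, h2: (batch, dim) tensors from the shared (Siamese) encoder
    """
    if use_sigmoid:
        # Learn the decision boundary from the element-wise difference
        features = tf.abs(h1 - h2)
        return tf.keras.layers.Dense(1, activation="sigmoid")(features)
    # Raw distance-based score: 1.0 means identical encodings
    distance = tf.norm(h1 - h2, axis=1, keepdims=True)
    return tf.exp(-distance)
```

A learned sigmoid head can calibrate the output as a probability, whereas a raw distance score has no trained notion of where "similar" ends, which may be one reason unrelated inputs still get confidences near 1.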
Hey,
I'm trying the CNN model on my own data and I don't know what is going on. I really hope you can give me some advice.
I use the model for sentence matching in IR. I get good results on the training data, but for out-of-scope input I get very high confidences on unrelated sentences. Even for an empty string I get confidences of 1 for several sentences!
I don't have much data, so I do augmentation. Do you have any recipe for the augmentation?
Thank you!