Epochs and loss during training #20
Hi wywhu, if you look at the code, masks are created during the training phase, so a mismatch between window_size and the actual sequence length shouldn't be a problem. However, I wrote this code four years ago, so this is just speculation. There is no fixed answer for how many epochs work best, since your dataset differs from the one I used. You can try splitting the cost into visit_cost and emb_cost (see line 133 of the source code), see how each one behaves, and then pick the epoch you like. This of course involves some coding. Hope this helps.
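For anyone trying this, here is a rough sketch of the bookkeeping side only. It assumes you have already changed the training loop to record visit_cost and emb_cost separately at the end of each epoch, and that the total cost is simply their sum (that may not hold exactly if a regularization term is added). The function name `summarize_costs` and the numbers in the usage lines are made up for illustration; they are not from the repository or from any real run.

```python
import numpy as np

def summarize_costs(visit_costs, emb_costs):
    """Report which epoch minimizes each cost component.

    visit_costs, emb_costs: per-epoch mean costs, one value per epoch,
    collected by your own modified training loop (hypothetical inputs).
    Epoch indices are 0-based.
    """
    visit = np.asarray(visit_costs, dtype=float)
    emb = np.asarray(emb_costs, dtype=float)
    if visit.shape != emb.shape:
        raise ValueError("both cost lists must cover the same epochs")
    total = visit + emb  # assumption: total cost is the plain sum
    return {
        "best_visit_epoch": int(np.argmin(visit)),
        "best_emb_epoch": int(np.argmin(emb)),
        "best_total_epoch": int(np.argmin(total)),
    }

# Illustrative placeholder values only, not measured results.
summary = summarize_costs([3.0, 2.0, 2.5], [1.0, 1.2, 0.5])
print(summary)
```

If the two components reach their minimum at different epochs, that tells you which part of the objective is driving the rise in mean_cost, and you can pick the epoch based on the component that matters more for your use case.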
Thanks Ed. I have another question about interpreting the code representations. Your paper says that "we trained ReLU(W_c), a non-negative matrix, to represent the meaning of .......", and "we can find the top k code that have the largest values for the i-th coordinates by argsort(W_c[i, :])[1, k]". I am confused: should I apply the argsort operation to W_c or to ReLU(W_c)?
Actually, you are correct. You should apply the argsort operation to ReLU(W_c), which guarantees non-negativity.
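A minimal NumPy sketch of that lookup, in case it helps. It assumes W_c has been loaded as an array of shape (embedding_dim, num_codes), following the paper's notation; if your saved matrix is stored the other way round, transpose it first. The helper name `top_k_codes` is made up for this example and is not part of the repository.

```python
import numpy as np

def top_k_codes(W_c, i, k):
    """Return indices of the k codes with the largest ReLU(W_c)[i, :] values.

    W_c: array of shape (embedding_dim, num_codes) (assumed layout).
    i:   index of the embedding coordinate to interpret.
    k:   number of codes to return, largest value first.
    """
    relu_W = np.maximum(W_c, 0.0)          # ReLU: clip negatives to zero
    ascending = np.argsort(relu_W[i, :])   # argsort sorts smallest first
    return ascending[::-1][:k]             # reverse, then keep the top k

# Toy matrix for illustration: 2 coordinates, 4 codes.
W_c = np.array([[0.1, -2.0, 0.5, 0.3],
                [-1.0, 2.0, 0.0, 1.0]])
print(top_k_codes(W_c, 0, 2))
```

Because ReLU is monotone, the ranking among positive entries is the same for W_c and ReLU(W_c). The two differ only for negative entries, which all become ties at zero after ReLU, so the order among them is arbitrary and should not be interpreted.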
Hi Ed,
I am training embeddings using your default hyperparameters, except for window_size. The minimum number of visits in my dataset is 2, but I set window_size=3 because I assume your code can handle a mismatch between window_size and the actual sequence length. Am I right?
I also noticed that mean_cost reached its minimum at the 2nd epoch and then started increasing. I read in your paper that the number of epochs does not affect the code representations very much, but I am not sure which epoch I should choose once training has finished. Should I use the one with the minimum cost, or the one from the last epoch?