
VAE part of model #61

Open
fgvbrt opened this issue Mar 17, 2017 · 6 comments


fgvbrt commented Mar 17, 2017

Hi, it looks like this code actually trains not a VAE but a plain autoencoder. Here are the reasons:

  1. epsilon_std is 0.01 (https://github.com/maxhodak/keras-molecules/blob/master/molecules/model.py#L58) when it should be 1. With a value that small, it is safe to say there is almost no sampling.
  2. The KL loss comes out far too small because of the mean operation (https://github.com/maxhodak/keras-molecules/blob/master/molecules/model.py#L78): the mean is taken over both the feature and sequence axes, but these should be summed so the KL loss has the right scale relative to the cross-entropy loss (see the sketch below).
  3. The picture in the readme also indicates this, because not all regions of the latent space are covered by points. The authors wrote in the paper that they observed exactly this when they trained a plain autoencoder.

Maybe it makes sense to simply train an autoencoder model and compare the results.
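
To make it concrete, here is a rough sketch of the two fixes (Keras 2 syntax; `latent_dim` stands in for `latent_rep_size`, so treat it as an illustration rather than a drop-in patch):

```python
import keras.backend as K

latent_dim = 292   # latent_rep_size in this repo; adjust to your config
epsilon_std = 1.0  # was 0.01 at model.py#L58

def sampling(args):
    z_mean, z_log_var = args
    # reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim),
                              mean=0.0, stddev=epsilon_std)
    return z_mean + K.exp(z_log_var / 2) * epsilon

def kl_loss(z_mean, z_log_var):
    # sum over the latent dimensions (not mean), so the KL term keeps
    # the right scale relative to the per-sequence cross-entropy loss
    return -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var),
                        axis=-1)
```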

@sbaurdlp

Hi,

I'm working on a similar problem, but with protein sequences rather than molecules.

You mention that epsilon_std is not 1, which also seems quite strange to me. Yet I found this is often the case in other codebases (for example, the Keras tutorial on VAEs). When I changed mine from 1.0 to 1e-3 some months ago, it allowed the model to learn (it didn't before).

Would you say VAEs aren't suited to this kind of problem?

Regards,
Sebastien


larry0x commented Oct 23, 2018

Hi Sebastien, do you have any update on the issue regarding epsilon_std?

I am trying to implement the same model in PyTorch and encountered the same problem. If I set epsilon_std to 1, the model refuses to learn anything (the loss stagnates at a very high value).

If I change this value to 0, the VAE effectively degenerates to a plain AE. It learns very fast, recovering input sequences almost perfectly. But just like any other plain AE, the latent space it produces is sparse, and it generates garbage when interpolating or decoding randomly sampled latent variables.

If I pick a small, non-zero epsilon_std, the result is between the two scenarios: the model learns better than when epsilon_std is set to 1, but not as well as when it is set to zero. In none of these cases does the model work as well as described in Aspuru-Guzik's paper.
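
For reference, the reparameterization step in my PyTorch code looks roughly like this (simplified, with epsilon_std exposed as a knob):

```python
import torch

def reparameterize(mu, log_var, epsilon_std=1.0):
    # z = mu + sigma * (epsilon_std * eps), eps ~ N(0, I).
    # epsilon_std = 1.0 is the standard VAE; epsilon_std = 0.0 removes
    # the noise entirely and the model degenerates to a plain AE.
    std = torch.exp(0.5 * log_var)
    eps = torch.randn_like(std)
    return mu + std * epsilon_std * eps
```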

@chaoyan1037

@lyu18
I encountered the same problem as you. I looked into the code from the original paper and found that they anneal epsilon_std. Maybe that helps the model train. I will try it shortly.
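
Something like this (a hypothetical linear ramp; the exact schedule in their code may differ):

```python
def epsilon_std_schedule(epoch, start=0.01, end=1.0, anneal_epochs=20):
    # ramp epsilon_std from a small value up to 1.0 over the first
    # `anneal_epochs` epochs, then hold it at `end`
    t = min(epoch / float(anneal_epochs), 1.0)
    return start + t * (end - start)
```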


larry0x commented Apr 23, 2019

@allenallen1037 That makes a lot of sense! Please let me know if you get any results. Thanks

@chaoyan1037

@lyu18 It helps improve the reconstruction accuracy during training. This is expected, since it amounts to a tradeoff between an AE and a VAE. But the KL divergence loss is quite large, which suggests the latent space may not be smooth. I will investigate further once training finishes.
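
To see this, I log the two loss terms separately rather than only the total (a PyTorch-style sketch with per-batch sums; the names are mine):

```python
import torch
import torch.nn.functional as F

def vae_loss_terms(x, x_recon, mu, log_var):
    # track these two separately: a low total loss can hide a large KL
    # term, a sign the latent space is not well regularized
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon, kl
```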

@maxime-langevin

@allenallen1037 I've encountered the same problem as you. Did you find a workaround that helped you solve it? Thanks
