Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add random seed of training for reproducibility #51

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

hsiaoyi0504
Copy link
Contributor

As title

@hsiaoyi0504
Copy link
Contributor Author

For issue #50

@pechersky
Copy link
Collaborator

Could you also change the generator-based approach? Specifically, there is a random.shuffle call here that can be seeded: https://github.com/maxhodak/keras-molecules/blob/master/molecules/vectorizer.py. Additionally, perhaps seed should be a flag that can be passed into train.py and train_gen.py.

@hsiaoyi0504
Copy link
Contributor Author

Sure, I will work on it.

@hsiaoyi0504
Copy link
Contributor Author

I don't think it's a good thing to do this (I mean add flag) , according to the comment here, by default Keras's model.compile() sets the shuffle argument as True. You should the set numpy seed before importing keras. Then, adding a flag would make code messy.

@hsiaoyi0504
Copy link
Contributor Author

Oh, I got it. However, I think it's two different things. Random seed for numpy and random seed for random.

@hsiaoyi0504
Copy link
Contributor Author

Already done (only one line code lol)

@pechersky
Copy link
Collaborator

I meant something like that SmilesDataGenerator.init could take a seed kwarg, which would be passed in at train_gen.py, using some sort of flag. In that case, it's ok to pass it in as acquired from a flag. Regarding train.py, you could move the keras import statements into the main function, with argparsing of whether there is a seed flag or not.

Let's also pull the requirements commit out of this PR, and I'll accept it in the other PR.

@hsiaoyi0504
Copy link
Contributor Author

Is it ok?

This way, someone can pass in the seed. This also moves the keras loading into main.
This way, someone can pass in the seed. This also moves the keras loading into main, and seeds the SmilesDataGenerator.
@pechersky
Copy link
Collaborator

I've committed a couple changes to train and train_gen to take the seed as a cli parameter. Could you test that they work as you expect?

@hsiaoyi0504
Copy link
Contributor Author

I test using tensorflow backend, and I found it doesn't work. After a quick search of this, it seems still open issue. Besides, I failed to execute using theano backend. I am still checking what happened.

@pechersky
Copy link
Collaborator

pechersky commented Dec 23, 2016 via email

@hsiaoyi0504
Copy link
Contributor Author

It looks like the train.py works fine using theano backend, so I think we can wait the keras update in the future. I will now start checking the train_gen.py part.

@hsiaoyi0504
Copy link
Contributor Author

I am still unable to execute the train_gen.py. What is the input of this file?

@hsiaoyi0504
Copy link
Contributor Author

Oh, I find that I used wrong input file. The train_gen.py doesn't need preprocessing.py.

@pylang
Copy link

pylang commented Apr 1, 2017

#2743 is still an issue for me, despite all the suggestions. Restarting the notebook gives different results as well.

keras 2.0.2
numpy 1.11.2
tensorflow 1.0.0

@hsiaoyi0504
Copy link
Contributor Author

It turns out that it is due to keras-team/keras#2280. I gave up trying these things long time ago. It seems that it can't be fixed in an easy manner.

@pylang
Copy link

pylang commented Apr 1, 2017

Thanks for the information. Non-reproducible results is a serious issue in keras imo.

@@ -5,14 +5,11 @@
import h5py
import numpy as np

from molecules.model import MoleculeVAE
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't working for mw

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants