Add random seed of training for reproducibility #51

hsiaoyi0504 · 2016-12-21T10:09:55Z

As title

hsiaoyi0504 · 2016-12-21T10:11:20Z

For issue #50

pechersky · 2016-12-21T13:58:52Z

Could you also change the generator-based approach? Specifically, there is a random.shuffle call here that can be seeded: https://github.com/maxhodak/keras-molecules/blob/master/molecules/vectorizer.py. Additionally, perhaps seed should be a flag that can be passed into train.py and train_gen.py.
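For illustration, seeding a shuffle might look like the minimal sketch below. It uses a private `random.Random` instance rather than the module-level generator. The helper name `shuffled_indices` is hypothetical and is not taken from vectorizer.py.

```python
import random


def shuffled_indices(n, seed=None):
    """Return the indices 0..n-1 in shuffled order.

    Hypothetical helper: with a fixed seed the order is the same on every
    call; with seed=None the generator is seeded from system entropy.
    """
    rng = random.Random(seed)  # private generator, independent of the global one
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


first = shuffled_indices(10, seed=42)
second = shuffled_indices(10, seed=42)
print(first == second)  # same seed, same order
```

A private generator keeps the shuffle reproducible even if other code draws from the global `random` state in between.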

hsiaoyi0504 · 2016-12-21T14:03:22Z

Sure, I will work on it.

hsiaoyi0504 · 2016-12-21T14:07:41Z

I don't think adding a flag is a good idea. According to the comment here, Keras shuffles by default (the shuffle argument of model.fit() defaults to True), so you should set the numpy seed before importing keras. Adding a flag would therefore make the code messy.

hsiaoyi0504 · 2016-12-21T14:13:17Z

Oh, I get it. However, I think these are two different things: the random seed for numpy and the random seed for the random module.
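The distinction can be sketched as follows: Python's `random` module and `numpy.random` keep separate state, so each needs its own seed. `seed_everything` is a hypothetical helper name, and the numpy import is guarded so that the sketch also runs where numpy is not installed.

```python
import random

try:
    import numpy as np
except ImportError:  # numpy is optional for this sketch
    np = None


def seed_everything(seed):
    """Seed both global generators; seeding one does not affect the other."""
    random.seed(seed)  # stdlib generator, used by random.shuffle
    if np is not None:
        np.random.seed(seed)  # numpy's legacy global generator


seed_everything(1337)
a = random.random()
seed_everything(1337)
b = random.random()
print(a == b)  # the stdlib draws repeat after re-seeding
```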

hsiaoyi0504 · 2016-12-21T14:15:59Z

Already done (only one line of code, lol).

pechersky · 2016-12-21T15:50:42Z

I meant something like this: SmilesDataGenerator.__init__ could take a seed kwarg, which would be passed in from train_gen.py using some sort of flag. In that case, it's OK to pass it in as acquired from a flag. Regarding train.py, you could move the keras import statements into the main function, after argument parsing has determined whether a seed flag was given.

Let's also pull the requirements commit out of this PR, and I'll accept it in the other PR.
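The seed kwarg suggested above could be sketched roughly like this. `SeededBatchGenerator` is a hypothetical stand-in, not the actual SmilesDataGenerator; it only shows a per-instance generator created from the kwarg.

```python
import random


class SeededBatchGenerator:
    """Hypothetical stand-in for a data generator that accepts a seed kwarg."""

    def __init__(self, items, batch_size=2, seed=None):
        self.items = list(items)
        self.batch_size = batch_size
        self.rng = random.Random(seed)  # per-instance generator

    def batches(self):
        """Yield one epoch of batches in a freshly shuffled order."""
        order = list(self.items)
        self.rng.shuffle(order)
        for start in range(0, len(order), self.batch_size):
            yield order[start:start + self.batch_size]


gen_a = SeededBatchGenerator(["C", "CC", "CCO", "CCN"], seed=0)
gen_b = SeededBatchGenerator(["C", "CC", "CCO", "CCN"], seed=0)
print(list(gen_a.batches()) == list(gen_b.batches()))  # identical epochs
```

Keeping the generator on the instance means the seed travels with the object instead of relying on global state being set at import time.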

hsiaoyi0504 · 2016-12-22T05:55:41Z

Is it ok?


pechersky · 2016-12-22T13:57:57Z

I've committed a couple of changes to train and train_gen to take the seed as a CLI parameter. Could you test that they work as you expect?
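As a rough sketch of that pattern (not the committed code), a script can parse the seed first, seed the generator, and only then import the heavy library inside `main`. The flag name `--random_seed` and its default are assumptions, and `json` is only a placeholder for the deferred import.

```python
import argparse
import random


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Seeded training sketch")
    # Hypothetical flag name and default; the real scripts may differ.
    parser.add_argument("--random_seed", type=int, default=1337,
                        help="seed applied before any heavy imports")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    random.seed(args.random_seed)
    # Deferred import: a library that consumes random state at import time
    # would be imported here, after seeding. `json` is only a placeholder.
    import json
    sample = [random.random() for _ in range(3)]
    return json.dumps(sample)


# Two runs with the same seed produce the same draws.
print(main(["--random_seed", "42"]) == main(["--random_seed", "42"]))
```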

hsiaoyi0504 · 2016-12-23T14:21:06Z

I tested with the tensorflow backend and found that it doesn't work. After a quick search, it seems this is still an open issue (keras-team/keras#2280). Besides, I failed to run it with the theano backend. I am still checking what happened.

pechersky · 2016-12-23T14:57:55Z

Do you at least get consistent shuffling? Is it just the weight initialization that is not seeded?


hsiaoyi0504 · 2016-12-24T04:43:36Z

It looks like train.py works fine with the theano backend, so I think we can wait for a future keras update. I will now start checking the train_gen.py part.

hsiaoyi0504 · 2016-12-24T06:44:44Z

I am still unable to run train_gen.py. What input does this script expect?

hsiaoyi0504 · 2016-12-24T06:47:53Z

Oh, I found that I used the wrong input file. train_gen.py doesn't need preprocessing.py.

pylang · 2017-04-01T00:21:57Z

#2743 is still an issue for me, despite all the suggestions. Restarting the notebook gives different results as well.

keras 2.0.2
numpy 1.11.2
tensorflow 1.0.0

hsiaoyi0504 · 2017-04-01T02:48:35Z

It turns out that this is due to keras-team/keras#2280. I gave up on this a long time ago. It seems that it can't be fixed easily.

pylang · 2017-04-01T03:31:53Z

Thanks for the information. Non-reproducible results are a serious issue in keras, imo.

ghost · 2020-08-26T11:18:04Z

train.py

@@ -5,14 +5,11 @@
 import h5py
 import numpy as np

-from molecules.model import MoleculeVAE


This isn't working for me

hsiaoyi0504 added 4 commits November 9, 2016 20:15

use https rather than git://

1ea9e5f

update scikit-learn import

ca26f24

fixed conflict

859401a

Add random seed for training

d6c6b8a

Also set a seed for vectorizer.py

1b5d581

pechersky and others added 3 commits December 21, 2016 10:51

Merge branch 'master' into master

4037bde

add random seed kwarg in class SmilesDataGenerator

b5a0669

remove random seed global initialization

b78c965

pechersky added 2 commits December 22, 2016 08:52

Make train.py seed random from args

4193e5c

This way, someone can pass in the seed. This also moves the keras loading into main.

Make train_gen.py seed random from args

d312987

This way, someone can pass in the seed. This also moves the keras loading into main, and seeds the SmilesDataGenerator.

pylang mentioned this pull request Apr 1, 2017

No reproducible using tensorflow backend keras-team/keras#2280

Closed

