Stochastic depth example #67
Thanks! I'm not so sure about the way the implementation uses the layer names... I'd start from the residual networks example and either modify the ElemwiseSumLayer to include a
Possible, but not required for reproducing their paper. I'd suggest concentrating on that, if possible.
Actually, the Stochastic Depth code is very clean and easy to read, whether you know Torch or not. I'd suggest starting from this, rather than from the paper, and writing a Python script that reproduces the results for CIFAR-10 (rather than a notebook). It would be cool to compare performance of
Thanks! I'll have a look and try to simplify the code a bit. I didn't realize the same layers were dropped for all examples in a mini-batch.
Thanks! Don't have time to review this week... anybody else?
It's more complicated to get any speed savings when dropping different layers per example.
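To illustrate the point about speed savings, here is a minimal pure-Python sketch, not the code from this PR or from the Torch implementation. The function names and the `residual_fn` argument are hypothetical, and the nonlinearity that usually follows the addition is left out. With a single draw per mini-batch, a dropped block never evaluates its residual branch. With a draw per example, a batched implementation would typically still compute the branch for the whole batch and then mask it.

```python
import random


def residual_block_per_batch(batch, residual_fn, survival_prob, rng, training=True):
    """One Bernoulli draw decides the fate of the block for the whole mini-batch."""
    if not training:
        # Test time: always keep the block, scale the residual by its survival probability.
        return [x + survival_prob * residual_fn(x) for x in batch]
    if rng.random() < survival_prob:
        return [x + residual_fn(x) for x in batch]
    # Block dropped: identity shortcut only, residual_fn is never called.
    return list(batch)


def residual_block_per_example(batch, residual_fn, survival_prob, rng, training=True):
    """One Bernoulli draw per example; the residual is computed for everyone, then masked."""
    if not training:
        return [x + survival_prob * residual_fn(x) for x in batch]
    out = []
    for x in batch:
        keep = 1.0 if rng.random() < survival_prob else 0.0
        # Mimics a vectorised implementation: the branch is evaluated even when masked out.
        out.append(x + keep * residual_fn(x))
    return out


if __name__ == "__main__":
    calls = []

    def toy_residual(x):
        # Hypothetical stand-in for the convolutional branch; records each evaluation.
        calls.append(x)
        return 2.0 * x

    rng = random.Random(0)
    residual_block_per_batch([1.0, 2.0], toy_residual, 0.0, rng)
    print("per-batch evaluations when dropped:", len(calls))
    residual_block_per_example([1.0, 2.0], toy_residual, 0.0, rng)
    print("per-example evaluations when dropped:", len(calls))
```

The call counter in the usage part is only there to make the difference in work visible; in a real network the saving comes from skipping the convolutions of the dropped block.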
Hi, I have made some big changes (and more commits than necessary =/)! I decided to be a bit more faithful to the paper: they use residual blocks to evaluate their algorithm, so I used some of the code you wrote here for the residual blocks: I have also reproduced their architecture for CIFAR-10, which is described here: https://github.com/yueatsprograms/Stochastic_Depth/blob/master/ResidualDrop.lua I have implemented both kinds of dropout layers, one that uses Three epochs on the Things I need to do:
Don't worry, this can be cleaned up in retrospect if needed. Sounds like you've made some good progress, cool!
Be sure to compare to https://github.com/yueatsprograms/Stochastic_Depth/blob/master/ResidualDrop.lua. I think it will be easy to just port their code to Lasagne.
I should have some experiments to post here in the near future. I'm about to run some experiments on CIFAR-10 with their linear decay schedule.
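For reference, the linear decay schedule in the paper (https://arxiv.org/abs/1603.09382) sets the survival probability of block l out of L to p_l = 1 - (l / L) * (1 - p_L), so early blocks are almost always kept and the last block survives with probability p_L (the paper uses p_L = 0.5). A minimal sketch of that formula follows; the function and parameter names are my own and are not taken from this PR or from the Torch code.

```python
def linear_decay_survival_probs(num_blocks, final_survival_prob=0.5):
    """Survival probability for blocks 1..num_blocks under linear decay.

    p_l = 1 - (l / L) * (1 - p_L), with L = num_blocks and p_L = final_survival_prob.
    """
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    if not 0.0 <= final_survival_prob <= 1.0:
        raise ValueError("final_survival_prob must lie in [0, 1]")
    return [
        1.0 - (float(l) / num_blocks) * (1.0 - final_survival_prob)
        for l in range(1, num_blocks + 1)
    ]


if __name__ == "__main__":
    # Illustrative block count only, not the depth used in the paper.
    probs = linear_decay_survival_probs(4)
    print(probs)
    # The sum of the survival probabilities is the expected number of active blocks.
    print(sum(probs))
```

Each value would be passed as the survival probability of the corresponding residual block, and the sum gives the expected number of blocks that are active during a training step.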
Hi,
https://arxiv.org/abs/1603.09382
This is in reference to an issue I opened recently, #66. I have sent this PR mainly to get feedback (there are some rough edges) and to check whether I've made any mistakes in my implementation; it is not yet ready to be accepted. Also, because I cannot access a GPU at the moment, I'm unable to run a proper experiment, e.g. with a very deep network, which is what this method is designed for.
I'm aware that currently I have written the code so that it is only applicable to convnets -- I will need to make this applicable to dense layers as well.
Thanks