I combine the GAN-CLS algorithm by Reed et al. [1] with the mode-seeking (MSGAN) regularization term by Mao et al. [2] and experiment on the Caltech-UCSD Birds dataset. The generator and discriminator architectures are based on the GAN proposed by Ledig et al. [3].
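To make the combination concrete, here is a minimal PyTorch-style sketch of the resulting generator objective. The notebook's actual implementation may differ; `generator`, `discriminator`, `z_dim`, and `lambda_ms` are illustrative names, not its real API:

```python
import torch
import torch.nn.functional as F

def generator_loss(generator, discriminator, captions, z_dim=100,
                   lambda_ms=1.0, eps=1e-5):
    """GAN-CLS generator loss [1] plus the mode-seeking term of Mao et al. [2].
    All argument names here are illustrative, not the notebook's actual API."""
    batch = captions.size(0)
    device = captions.device

    # Two draws of the latent code for the same captions, as required by the
    # mode-seeking regularizer.
    z1 = torch.randn(batch, z_dim, device=device)
    z2 = torch.randn(batch, z_dim, device=device)
    fake1 = generator(z1, captions)
    fake2 = generator(z2, captions)

    # GAN-CLS: the generator tries to make (fake image, matching caption)
    # pairs look real to the discriminator.
    real_labels = torch.ones(batch, 1, device=device)
    adv = F.binary_cross_entropy_with_logits(discriminator(fake1, captions),
                                             real_labels)

    # Mode-seeking term: maximize the ratio of image-space distance to
    # latent-space distance, implemented as minimizing its inverse.
    ratio = torch.mean(torch.abs(fake1 - fake2)) / torch.mean(torch.abs(z1 - z2))
    ms = 1.0 / (ratio + eps)

    return adv + lambda_ms * ms
```

On the discriminator side, GAN-CLS additionally scores (real image, mismatched caption) pairs as fake, so the discriminator must check image-text correspondence rather than realism alone; the mode-seeking term rewards the generator for mapping distinct latent codes to distinct images, which counteracts mode collapse.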
- Please refer to the READMEs in the folders `images`, `captions`, and `word2vec_pretrained_model` to obtain the necessary data.
- Run `python process_images.py` to resize and normalize the images and save them as NumPy arrays (a hypothetical sketch of this step appears after this list).
- Run `python process_captions.py` to generate sentence embeddings for the captions (see the second sketch after this list).
- Upload the generated image vectors, sentence vectors, and the pretrained word2vec model to a Google Drive account.
- Import the Jupyter notebook `Text2Image_GAN_MS.ipynb` into Google Colab and load the data.
- Run the code cells in Google Colab.
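For reference, here is a hypothetical sketch of the image-preprocessing step in `process_images.py`; the directory layout, output file name, and target resolution are assumptions, not necessarily what the script actually uses:

```python
# Sketch: resize every image to a fixed resolution, scale pixels to [-1, 1],
# and stack the results into a single NumPy array.
# IMG_DIR, OUT_FILE, and IMG_SIZE are illustrative names, not the script's.
import os
import numpy as np
from PIL import Image

IMG_DIR = "images"
OUT_FILE = "image_vectors.npy"
IMG_SIZE = 64  # assumed target resolution

arrays = []
for name in sorted(os.listdir(IMG_DIR)):
    if not name.lower().endswith((".jpg", ".jpeg", ".png")):
        continue
    img = Image.open(os.path.join(IMG_DIR, name)).convert("RGB")
    img = img.resize((IMG_SIZE, IMG_SIZE), Image.BILINEAR)
    # Map uint8 pixels in [0, 255] to floats in [-1, 1] (matching a tanh
    # generator output).
    arrays.append(np.asarray(img, dtype=np.float32) / 127.5 - 1.0)

np.save(OUT_FILE, np.stack(arrays))
```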
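Similarly, a sketch of how `process_captions.py` could build sentence embeddings from the pretrained word2vec model by averaging word vectors; the file paths and the averaging scheme are assumptions:

```python
# Sketch: embed each caption as the mean of the pretrained word2vec vectors
# of its in-vocabulary words. File paths below are illustrative.
import numpy as np
from gensim.models import KeyedVectors

w2v = KeyedVectors.load_word2vec_format(
    "word2vec_pretrained_model/word2vec.bin", binary=True)  # assumed path

def embed(caption: str) -> np.ndarray:
    """Average the vectors of all words the model knows; zeros otherwise."""
    vecs = [w2v[w] for w in caption.lower().split() if w in w2v]
    if not vecs:
        return np.zeros(w2v.vector_size, dtype=np.float32)
    return np.mean(vecs, axis=0)

with open("captions/captions.txt") as f:  # assumed caption file
    sentence_vectors = np.stack([embed(line.strip()) for line in f])

np.save("sentence_vectors.npy", sentence_vectors)
```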
I trained the GAN model for 960 epochs, using the Adam optimizer [4] for both the discriminator and the generator with a learning rate of 0.000035 and beta_1 = 0.5. Most of the synthesized images depict plausible colors and shapes of birds, and the outputs are reasonably diverse; however, the GAN showed minor mode collapse when generating images from made-up captions, as seen below.
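As a concrete reference, the optimizer configuration described above would look like the following PyTorch-style sketch; beta_2 is assumed to stay at its default of 0.999, and `generator`/`discriminator` stand for the two networks:

```python
import torch
import torch.nn as nn

def make_optimizers(generator: nn.Module, discriminator: nn.Module):
    """Adam [4] with the hyperparameters reported above: lr = 0.000035 and
    beta_1 = 0.5 for both networks (beta_2 = 0.999 is an assumed default)."""
    lr, betas = 0.000035, (0.5, 0.999)
    opt_g = torch.optim.Adam(generator.parameters(), lr=lr, betas=betas)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr, betas=betas)
    return opt_g, opt_d
```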
[1] Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee. Generative adversarial text-to-image synthesis. In Proceedings of The 33rd International Conference on Machine Learning, 2016.
[2] Qi Mao, Hsin-Ying Lee, Hung-Yu Tseng, Siwei Ma, and Ming-Hsuan Yang. Mode seeking generative adversarial networks for diverse image synthesis. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.
[3] Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, et al. Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802, 2016.
[4] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.