CNN-XLA

A collection of CNN models trained on Cloud TPU using PyTorch/XLA. Because of limited computational resources, the models have been evaluated only on the CIFAR-10 dataset, but they are easy to adapt to more complex datasets (e.g., the ImageNet 2012 classification dataset).

Get Started

Train on GPU
Train on TPU

CNN Models

Model	Input Resolution	Params (M)	MACs (G)	Accuracy
AlexNet	32x32	46.76	0.91	84.9%
VGG-11	32x32	28.14	0.17	69.2%
Inception	32x32	-	-	-
ResNet-18	32x32	11.17	0.56	88.3%
DenseNet-121 (k = 12)	32x32	1.0	0.13	90.5%
SE-ResNet-50 (r = 16)	32x32	26.05	1.31	91.4%
MobileNet-V1	32x32	3.22	0.05	85.1%
MobileNet-V2	32x32	2.3	0.1	88.5%
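
To make the Params and MACs columns concrete, here is a minimal sketch of how these quantities are commonly counted for a single layer. The README does not say which tool produced the numbers in the table, so the helper names `conv2d_cost` and `linear_cost` are hypothetical and the formulas are the usual textbook ones (bias included in the parameter count, one multiply-accumulate per weight per output position).

```python
def conv2d_cost(in_channels, out_channels, kernel_size, out_height, out_width):
    """Return (params, macs) for a standard 2D convolution with bias.

    Hypothetical helper for illustration; not taken from the repository.
    """
    weights = kernel_size * kernel_size * in_channels * out_channels
    params = weights + out_channels          # one bias per output channel
    macs = weights * out_height * out_width  # each weight is used once per output position
    return params, macs


def linear_cost(in_features, out_features):
    """Return (params, macs) for a fully connected layer with bias."""
    weights = in_features * out_features
    return weights + out_features, weights


# Example: a 3x3 convolution mapping a 3-channel 32x32 CIFAR-10 image
# to 64 channels at the same spatial resolution (padding of 1 assumed).
params, macs = conv2d_cost(3, 64, 3, 32, 32)
print(f"params: {params / 1e6:.4f} M, MACs: {macs / 1e9:.4f} G")
```

Summing these per-layer values over a whole network gives totals in the same units as the table (millions of parameters, billions of MACs).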

All of the above models are trained for only 20 epochs with a mini-batch size of 256, a learning rate of 0.001, and standard data augmentation. In addition, the Mish activation function is used for better performance.
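
For readers unfamiliar with Mish, it is defined as mish(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x). The sketch below is a plain-Python illustration of that formula, not the code used in this repository; in a PyTorch model one would more likely use the built-in `torch.nn.Mish` module. The function names here are chosen for illustration only.

```python
import math


def softplus(x):
    """Numerically stable ln(1 + e^x)."""
    # max(x, 0) + ln(1 + e^-|x|) avoids overflow for large positive x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


# Mish is smooth, close to the identity for large positive inputs,
# and close to zero (slightly negative) for large negative inputs.
for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"mish({value:+.1f}) = {mish(value):+.4f}")
```

Unlike ReLU, Mish lets small negative values pass through, which is one reason it is sometimes reported to train better.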

The goal of this repository is to implement the core concepts of a variety of CNN models, so no fancy tricks are used.

Related Repositories

Dive-into-DL-PyTorch
pytorch-cifar
PyTorch/XLA

License

MIT License
