Every year, millions of people are involved in car accidents caused by distracted driving. This report explains how Deep Learning techniques can help prevent this problem through an image classification task, mainly by applying Transfer Learning with the Fine-Tuning method. Several architectures are compared: a basic CNN trained from scratch and architectures based on pre-trained models, namely MobileNet, VGG16 and VGG19. Using ImageNet weights as the initial weights is fundamental to developing a high-performance model. Moreover, to adapt the model to the problem, a Sequential Model-Based Optimization (SMBO) is performed, which yields satisfying results on new images and demonstrates a high level of generalization.
The data come from a Kaggle challenge published on 5th April 2016 by State Farm and consist of one CSV file, a directory containing the training images and a directory containing the test images. They can be found here.
- 1_create_k_folds: loads the data and creates 5 folds so that each driver appears once in the validation set and four times in the training set.
- 2_CNN_scratch: development of a convolutional neural network from scratch for the classification of drivers' actions.
- 3_vgg16_fine_tuning: two different fine-tuning strategies from VGG16 for the classification of drivers' actions. In the first, the last block of VGG16 and the extra_layers are trained; in the second, the last two blocks of VGG16 and the extra_layers are trained.
- 4_vgg19_fine_tuning: two different fine-tuning strategies from VGG19 for the classification of drivers' actions. In the first, the last block of VGG19 and the extra_layers are trained; in the second, the last two blocks of VGG19 and the extra_layers are trained.
- 5_mobilenet_fine_tuning: the entire MobileNet, including the extra_layers, is trained for the classification of drivers' actions.
- 6_optimization_5Fold_mobilenet: optimization of the best model using SMBO.
- 7_best_config_mobilenet_fine_tuning: training of the model with the best configuration of parameters selected through SMBO.
- 8_test_submission: prediction on the test images and creation of the CSV file for the Kaggle submission to obtain the Log Loss score.
- 9_predict_demo: prediction on the images of two drivers to create a small demo of the project.
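The driver-wise split described for 1_create_k_folds can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function name `make_driver_folds`, the `(driver_id, class_name, image_name)` row layout and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def make_driver_folds(rows, n_folds=5):
    """Build n_folds (train, validation) pairs grouped by driver.

    rows: list of (driver_id, class_name, image_name) tuples (assumed layout).
    Every driver ends up in the validation set of exactly one fold and in
    the training set of the other n_folds - 1 folds.
    """
    by_driver = defaultdict(list)
    for row in rows:
        by_driver[row[0]].append(row)

    # Simple heuristic: sort drivers by image count, then deal them out
    # round-robin so that no fold receives only the largest drivers.
    drivers = sorted(by_driver, key=lambda d: (-len(by_driver[d]), d))
    fold_drivers = [drivers[i::n_folds] for i in range(n_folds)]

    folds = []
    for held_out in fold_drivers:
        held = set(held_out)
        val = [r for d in held_out for r in by_driver[d]]
        train = [r for d in drivers if d not in held for r in by_driver[d]]
        folds.append((train, val))
    return folds


# Toy data (hypothetical): 10 drivers, 10 classes, one image per pair.
rows = [(f"p{d:03d}", f"c{c}", f"img_{d}_{c}.jpg")
        for d in range(10) for c in range(10)]
folds = make_driver_folds(rows)
```

Grouping by driver rather than by image keeps frames of the same person out of both sides of a split, so the validation score reflects performance on unseen drivers.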
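The Log Loss score mentioned for 8_test_submission is the multi-class logarithmic loss. The sketch below shows one way to compute it locally; the function name, the row normalization and the clipping constant `eps` are assumptions for illustration and may differ from the exact scoring performed by Kaggle.

```python
import math


def multiclass_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of integer class indices.
    y_prob: list of per-class probability lists, one per sample.
    """
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        # Rescale the row so it sums to 1, then clip away exact 0 and 1
        # so that the logarithm stays finite.
        row_sum = sum(probs)
        p = probs[label] / row_sum
        p = min(max(p, eps), 1 - eps)
        total += -math.log(p)
    return total / len(y_true)
```

A model that assigns the same probability to each of K classes scores log(K), which is a convenient baseline when reading a leaderboard value.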
This section presents an application of the project to two new drivers. Each demo consists of 60 randomly selected images with a constant number of frames per class. For each frame, the predicted class is reported together with its probability.
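The selection of demo frames can be sketched as below. This is an illustrative sketch only: the helper names `sample_demo` and `top_prediction`, the dictionary layout and the fixed seed are assumptions, and the class count is taken from whatever the input contains (with 10 classes, 60 images means 6 frames per class).

```python
import random


def sample_demo(images_by_class, total=60, seed=0):
    """Pick `total` images at random with the same number from each class.

    images_by_class: dict mapping class name -> list of image names.
    Returns a shuffled list of (class_name, image_name) pairs.
    """
    classes = sorted(images_by_class)
    per_class, remainder = divmod(total, len(classes))
    if remainder:
        raise ValueError("total must be divisible by the number of classes")

    rng = random.Random(seed)
    demo = []
    for c in classes:
        # Sample without replacement inside each class.
        demo.extend((c, img) for img in rng.sample(images_by_class[c], per_class))
    rng.shuffle(demo)
    return demo


def top_prediction(probs, class_names):
    """Return the most likely class and its probability for one frame."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Toy input (hypothetical): 10 classes with 20 images each.
images_by_class = {f"c{c}": [f"c{c}_{i}.jpg" for i in range(20)] for c in range(10)}
demo = sample_demo(images_by_class)
```

Calling `top_prediction` on the model output for each sampled frame gives the class label and probability shown in the demo.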
The full video can be found at this link.
The full video can be found at this link.
[1] NHTSA, “Distracted driving,” in https://www.nhtsa.gov/risky-driving/distracted-driving, 2018.
[2] Y. LeCun, “LeNet-5, convolutional neural networks,” 2013.
[3] State-Farm, “Distracted driver detection,” in https://www.kaggle.com/c/state-farm-distracted-driver-detection/data, 2016.
[4] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” 2015.
[5] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “MobileNets: Efficient convolutional neural networks for mobile vision applications,” 2017.
[6] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, “Optuna: A next-generation hyperparameter optimization framework,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2019.