Information about the focal length with which a photo was taken may be stripped (internet photos) or unavailable (vintage photos). Inferring the focal length of a photo solely from a monocular view is an ill-posed task that requires knowledge about the scale of objects and their distance to the camera, i.e. scene understanding. I trained a deep learning model to acquire such scene understanding and predict the focal length, and I open-source the model in this repository.
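To illustrate why the task is ill-posed, here is a minimal sketch using the ideal pinhole camera model, which is an assumption for illustration and not part of this repository. The function name and the example numbers are hypothetical.

```python
import math


def projected_height_mm(focal_mm, object_height_m, distance_m):
    """Ideal pinhole model: height on the sensor = f * H / Z."""
    return focal_mm * object_height_m / distance_m


# Hypothetical example: a 1.8 m tall person photographed twice.
near = projected_height_mm(35.0, 1.8, 3.0)  # 35 mm lens, 3 m away
far = projected_height_mm(70.0, 1.8, 6.0)   # 70 mm lens, 6 m away

# Doubling both focal length and distance leaves the subject size unchanged.
print(math.isclose(near, far))
```

The subject occupies the same height in both photos, so its size alone cannot reveal the focal length. Only secondary cues, such as the distortion between near and far parts of the scene, differ, and these are what a model has to pick up on.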
Focal length influences the distortion of an image. Source image credit: Reddit user u/scyshc.
I preprocessed the focal lengths of ~15k images from my personal image database, converting them to 35mm equivalents using Jeffrey Friedl's LR Plugin. The images were cropped to a square shape and resampled to 256x256. Using that data, I trained an EfficientNet B4 with log-transformed labels and L1 loss, which achieved a mean absolute error of 16mm on the hold-out set.
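The steps above can be sketched in plain Python. This is not the repository's code: the crop-factor conversion stands in for the plugin, and the center crop, the natural logarithm, and all function names are assumptions made for illustration.

```python
import math

# Diagonal of a 36 x 24 mm full-frame sensor, the 35mm reference format.
FULL_FRAME_DIAGONAL_MM = math.hypot(36.0, 24.0)


def to_35mm_equivalent(focal_mm, sensor_w_mm, sensor_h_mm):
    """Scale a focal length by the crop factor (assumed conversion method)."""
    crop_factor = FULL_FRAME_DIAGONAL_MM / math.hypot(sensor_w_mm, sensor_h_mm)
    return focal_mm * crop_factor


def center_square_box(width, height):
    """Return (left, top, right, bottom) of a centered square crop (assumed)."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def encode_label(focal_35mm):
    """Log-transform a focal length; natural log is an assumption."""
    return math.log(focal_35mm)


def decode_prediction(log_focal):
    """Invert encode_label to get millimetres back."""
    return math.exp(log_focal)


def l1_loss(predictions, targets):
    """Mean absolute difference between two equally long sequences."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


def mae_mm(log_predictions, focal_targets_mm):
    """Mean absolute error in mm, computed after undoing the log transform."""
    decoded = [decode_prediction(p) for p in log_predictions]
    return l1_loss(decoded, focal_targets_mm)
```

Training on log-transformed labels means the L1 loss penalizes relative rather than absolute errors, so a 10mm miss at 24mm counts for more than a 10mm miss at 200mm. An error metric reported in mm, like the one above, has to be computed after mapping predictions back with the exponential.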
Set up a Python environment using requirements.txt. Commands for dataset creation, training, and prediction are provided in the launch.json file.
Training data is available upon request.
The pretrained model can be accessed here.
@misc{Metzger2023MLFocalLengths,
author = {Nando Metzger},
title = {MLFocalLengths: Estimating the Focal Length of a Single Image},
year = {2023},
url = {https://github.com/nandometzger/MLFocalLengths},
note = {GitHub repository}
}