These notebooks are one of two ways the data for this project is processed and manipulated. The other is via Processing sketchbooks (in the sketchbooks/ folder).
The overall process is:
- Prepare and clean the data, including resizing images, generating missing images, and basic analysis.
- Create an embedding of the features (using UMAP).
- Create a grid from the embedding.
Create Canonical File Order: The original file structure for the photos is Box_Name/Photo_Number.png. Not all the box names are numbers, and not all the photo numbers are purely numeric. This code uses a natural sort function to define a canonical order across all the files.
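A natural sort compares runs of digits as numbers rather than character by character, so box 2 comes before box 10. The sketch below is one minimal way to build such a sort key; the function name `natural_key`, the example paths, and the choice to place numeric chunks before text chunks are assumptions for illustration, not necessarily what the notebook does.

```python
import re

def natural_key(path):
    """Split a path into digit and non-digit chunks for natural sorting."""
    # Each chunk becomes a tagged tuple so integers and strings are never
    # compared directly: numeric chunks sort first, by value; text chunks
    # sort afterwards, case-insensitively.
    return [
        (0, int(chunk), "") if chunk.isdecimal() else (1, 0, chunk.lower())
        for chunk in re.split(r"(\d+)", path)
        if chunk
    ]

# Hypothetical Box_Name/Photo_Number.png paths, including non-numeric names.
files = ["10/3.png", "2/10.png", "2/9a.png", "A/1.png", "2/9.png"]
canonical = sorted(files, key=natural_key)
print(canonical)
```

A plain `sorted(files)` would put `10/3.png` ahead of `2/9.png`; with the key above, the numeric boxes come out in counting order and the lettered box follows them.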
Convert csv to sorted npy: Some of the raw data is stored as a .csv and filenames.txt pair, where the .csv contains features (like inceptionv3 or vgg features) and the filenames.txt describes which files the features belong to. This notebook uses the canonical image order and sorts the features to follow the same order.
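The reordering step can be sketched as a lookup from filename to row index. This assumes one feature row per line of filenames.txt, in the same order; the function name `reorder_features` and the toy data are hypothetical.

```python
import numpy as np

def reorder_features(features, filenames, canonical):
    """Return the feature rows rearranged to follow the canonical file order."""
    # Assumption: row i of `features` belongs to filenames[i].
    row_of = {name: i for i, name in enumerate(filenames)}
    # A file that is in `canonical` but not in `filenames` raises KeyError.
    order = [row_of[name] for name in canonical]
    return features[order]

# Toy stand-ins for a .csv / filenames.txt pair. In practice these might be
# loaded with np.loadtxt(csv_path, delimiter=",") and a plain read of the
# text file.
features = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
filenames = ["10/1.png", "2/1.png", "2/3.png"]
canonical = ["2/1.png", "2/3.png", "10/1.png"]

sorted_features = reorder_features(features, filenames, canonical)
# np.save("features.npy", sorted_features) would then write the sorted array.
```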
Create Missing Detectron Contour Images: Some of the images in the detectron dataset were missing .json files. This notebook makes up for that by finding images without any Detectron json and generating black contour images for them.
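A rough sketch of that idea follows. It assumes the json files mirror the image paths with a different extension and that a placeholder is a single-channel array of zeros; the function names, paths, and the 256x256 size are all hypothetical.

```python
from pathlib import Path

import numpy as np

def images_missing_json(image_paths, json_paths):
    """Return the images that have no json file at the same relative path."""
    # Compare paths with the extension stripped, e.g. "1/2.png" vs "1/2.json".
    have_json = {Path(p).with_suffix("") for p in json_paths}
    return [p for p in image_paths if Path(p).with_suffix("") not in have_json]

def black_contour_image(height, width):
    """An all-black, single-channel placeholder image."""
    return np.zeros((height, width), dtype=np.uint8)

images = ["1/1.png", "1/2.png", "2/1.png"]  # hypothetical image paths
jsons = ["1/1.json", "2/1.json"]            # hypothetical Detectron output
missing = images_missing_json(images, jsons)
placeholders = {p: black_contour_image(256, 256) for p in missing}
print(missing)
```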
Get Total People: Uses the people_in_images data to generate metadata useful for supervision, namely the categorical total number of faces in each photo.
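As an illustration only, the counting could look like the sketch below. The layout of the people_in_images data is not described here, so this assumes it can be reduced to one filename entry per detected face; the cap that merges large counts into one category is also an assumption.

```python
from collections import Counter

def face_count_labels(canonical, face_records, cap=3):
    """Return one categorical face-count label per photo, in canonical order."""
    # Assumption: face_records holds one filename entry per detected face.
    counts = Counter(face_records)
    # Photos with no records get 0; counts at or above `cap` share one label.
    return [min(counts[name], cap) for name in canonical]

canonical = ["1/1.png", "1/2.png", "2/1.png"]  # hypothetical paths
face_records = ["1/1.png", "2/1.png", "2/1.png", "2/1.png", "2/1.png", "1/1.png"]
labels = face_count_labels(canonical, face_records)
print(labels)
```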
Create Resized Images: A very flexible notebook that takes a folder of images and generates either a cropped, resized set of images or a .npy file containing all the images.
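The crop-and-resize step can be sketched with NumPy alone. The notebook most likely relies on an imaging library with proper resampling; the nearest-neighbour resize here is a simplified stand-in, and the function names and the 8-pixel target size are hypothetical.

```python
import numpy as np

def center_crop_square(img):
    """Crop the largest centered square out of an (H, W[, C]) array."""
    h, w = img.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    return img[top:top + side, left:left + side]

def resize_nearest(img, size):
    """Resize to (size, size) by nearest-neighbour sampling (no smoothing)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return img[rows[:, None], cols]

# Two blank stand-in photos with different aspect ratios.
images = [np.zeros((40, 60, 3), dtype=np.uint8),
          np.zeros((50, 30, 3), dtype=np.uint8)]
batch = np.stack([resize_nearest(center_crop_square(im), 8) for im in images])
# np.save("images.npy", batch) would store every image in a single file.
```

Once every image has the same shape, the whole set stacks into one array, which is what makes the single .npy output possible.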
Create Cropped Faces: Generates cropped face photos based on the OpenFace data and original images, along with some metadata linking them back to the original OpenFace data.
Create Embeddings: Ingests all the different data products and outputs embeddings using nine different combinations of UMAP parameters.
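One way to get nine combinations is a 3 x 3 grid over two UMAP parameters. Which parameters and values the notebook actually varies is not stated here, so the use of `n_neighbors` and `min_dist` and the values below are assumptions. The `umap` import (from the umap-learn package) is kept inside the function so the parameter grid can be inspected without it installed.

```python
from itertools import product

# Assumed parameter values; the notebook's real choices may differ.
N_NEIGHBORS = (5, 15, 50)
MIN_DIST = (0.0, 0.1, 0.5)

param_grid = [
    {"n_neighbors": n, "min_dist": d}
    for n, d in product(N_NEIGHBORS, MIN_DIST)
]

def embed_all(features, grid=param_grid):
    """Fit one 2D UMAP embedding per parameter combination."""
    import umap  # provided by umap-learn; imported lazily

    return [
        umap.UMAP(n_components=2, **params).fit_transform(features)
        for params in grid
    ]

print(len(param_grid))
```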
Create Grid from Embedding: Snaps the point cloud embeddings to a grid. Somewhat experimental and can be fairly slow (10 minutes for 60k points).
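To show what snapping to a grid means, here is a deliberately simple sort-based sketch. It is not the notebook's method: it slices the points into columns by x and orders each column by y. That is fast, but it usually preserves neighbourhoods less well than assignment-based approaches. The function name `snap_to_grid` is hypothetical.

```python
import numpy as np

def snap_to_grid(points, n_cols):
    """Give each 2D point a unique (col, row) cell on an n_cols-wide grid."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    n_rows = n // n_cols
    if n_rows * n_cols != n:
        raise ValueError("number of points must be a multiple of n_cols")

    cells = np.empty((n, 2), dtype=int)
    by_x = np.argsort(points[:, 0], kind="stable")
    for col in range(n_cols):
        # The next n_rows points from left to right form this column...
        column = by_x[col * n_rows:(col + 1) * n_rows]
        # ...and are ordered by y within it.
        ordered = column[np.argsort(points[column, 1], kind="stable")]
        cells[ordered, 0] = col
        cells[ordered, 1] = np.arange(n_rows)
    return cells

points = [[0.1, 0.9], [0.2, 0.1], [0.8, 0.2], [0.9, 0.8]]
cells = snap_to_grid(points, n_cols=2)
print(cells.tolist())
```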
Create Mosaic: Useful for generating a single plot of the nine different UMAP outputs.
Convert npy to tsv: Demo of converting .npy files to .tsv for use in Processing sketchbooks.
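The conversion itself is short with NumPy. This sketch assumes a 1D or 2D numeric array; the helper name `npy_to_tsv`, the file names, and the six-decimal format are hypothetical choices.

```python
import os
import tempfile

import numpy as np

def npy_to_tsv(npy_path, tsv_path, fmt="%.6f"):
    """Load a 1D or 2D .npy array and write it as tab-separated text."""
    data = np.load(npy_path)
    np.savetxt(tsv_path, data, delimiter="\t", fmt=fmt)

# Round trip through a temporary directory to show the output format.
with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "embedding.npy")
    tsv_path = os.path.join(tmp, "embedding.tsv")
    np.save(npy_path, np.array([[0.5, 1.0], [2.0, 3.25]]))
    npy_to_tsv(npy_path, tsv_path)
    with open(tsv_path) as f:
        lines = f.read().splitlines()

print(lines)
```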
Combine Depth Saliency and Detectron: Using small images from the resizing notebook, this shows a single grid/mosaic of detectron contours, saliency, and depth overlaid on the original photos.