Implement Vision Transformers and cyclic refine/retrain endpoint #67
Labels
esmero-nlp
Natural Language Processing as API
ML Xperiments
Distrust through research and (in)validation
What?
Since we began our ML explorations, some models and approaches have matured. Our existing image models (MobileNet + YOLO) do a decent job of finding "similarities" (and YOLO is not bad at image segmentation), but honestly the results are not good enough for actual Field-specific matching.
That said, Vision Transformers (ViT) seem to offer better zero-shot, semantics-based similarity embedding generation, and I want to give them a try.
The Google ViT, which can also be refined by re-training on a few extra images (hence the "cyclic" idea), generates embeddings of dimension 768, which also matches the dimension our Archipelago Strawberryfield code for SBFlavors already uses.
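To make that concrete, here is a minimal sketch of pulling a 768-dim embedding from an image. Assumptions (not our endpoint code): the `google/vit-base-patch16-224-in21k` checkpoint via Hugging Face transformers, CLS-token pooling, and a placeholder image path.

```python
from PIL import Image
import torch
from transformers import ViTImageProcessor, ViTModel

processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
model = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")

image = Image.open("some_ado_image.jpg").convert("RGB")  # placeholder path
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# CLS-token hidden state: one vector per image, hidden size 768,
# the same dimension SBFlavors already handles.
embedding = outputs.last_hidden_state[:, 0]
print(embedding.shape)  # torch.Size([1, 768])
```

And the "cyclic" refine step could look roughly like this: bolt a small classification head on top and re-train on a handful of curated images. Dummy tensors stand in for a real batch; the label count and learning rate are placeholders, not decisions.

```python
import torch
from transformers import ViTForImageClassification

model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k", num_labels=3  # hypothetical label count
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Stand-in batch: 4 already-preprocessed images and their labels
pixel_values = torch.randn(4, 3, 224, 224)
labels = torch.tensor([0, 1, 2, 0])

model.train()
outputs = model(pixel_values=pixel_values, labels=labels)
outputs.loss.backward()  # one gradient step per "cycle" of new images
optimizer.step()
optimizer.zero_grad()
```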
I will also open one tomorrow for CLIP, which uses the same vector space for text and images (an idea I had when we started, but these people might be smarter than me). This allows a phrase like "Has a red Car and a blue one" and an image of a "red Car" to be encoded as compatible vectors, so dot products can be computed between "textual representations" and "images", but also "image to image". The Apple one, trained on 5 billion images, might be a good experiment!
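For reference, a hedged sketch of that shared vector space, assuming the openly available `openai/clip-vit-base-patch32` checkpoint (not necessarily the Apple-trained one) and a placeholder image:

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("red_car.jpg").convert("RGB")  # placeholder path
texts = ["Has a red Car and a blue one", "a photo of a cat"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    image_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
    text_emb = model.get_text_features(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    )

# L2-normalize so the dot product is a cosine similarity; the same math
# works for text-to-image and image-to-image comparisons.
image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
print(text_emb @ image_emb.T)  # higher score = closer match
```

Because both modalities land in the same normalized space, a single dot product serves "text finds image" and "image finds image" queries alike.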
All of this is to be evaluated and follows the same rules as before: no data is shared with the outside, and all vectors are indexed internally.
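As a toy illustration of "indexed internally": once embeddings exist, the whole similarity step reduces to normalized dot products that can run entirely on our own infrastructure. Random vectors stand in for real embeddings here; the actual index backend is out of scope.

```python
import numpy as np

rng = np.random.default_rng(0)
index = rng.normal(size=(1000, 768)).astype(np.float32)  # stored embeddings
index /= np.linalg.norm(index, axis=1, keepdims=True)

query = rng.normal(size=768).astype(np.float32)  # a new image's embedding
query /= np.linalg.norm(query)

scores = index @ query                 # cosine similarities, all local
top5 = np.argsort(scores)[::-1][:5]    # best matches, never leaving our servers
print(top5, scores[top5])
```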