Implement Vision transformers and cyclic re-fine/retrain endpoint #67

Open
DiegoPino opened this issue Jan 15, 2025 · 0 comments
Labels
esmero-nlp Natural Language Processing as API ML Xperiments Distrust through research and (in)validation
DiegoPino commented Jan 15, 2025

What?

Since we began our ML explorations, some models and approaches have matured. Our existing image models (MobileNet + YOLO) do a decent job of finding "similarities" (and YOLO is not bad at image segmentation), but honestly the results are not good enough for actual field-specific matching.

That said, Vision Transformers (ViT) seem to generate better zero-shot, semantics-based similarity embeddings, and I want to give them a try.

The Google ViT, which can also be refined by re-training on a few extra images (thus the "cyclic" idea), generates embeddings of dimension 768, which also matches our Archipelago Strawberryfield code for SBFlavors.
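To make the dimension requirement concrete, here is a minimal sketch of a guard that could sit between a model and the index. It only assumes the 768 figure mentioned above; the function name `prepare_embedding`, the constant `EXPECTED_DIM`, and the choice to L2-normalize before indexing are hypothetical illustrations, not existing code in this project.

```python
import math

# Assumption: 768 is the embedding size stated in this issue.
EXPECTED_DIM = 768


def prepare_embedding(vector, expected_dim=EXPECTED_DIM):
    """Validate the length of an embedding and return an L2-normalized copy.

    Hypothetical helper: rejects vectors whose size does not match the
    dimension the index expects, so a model swap cannot silently
    produce incompatible vectors.
    """
    if len(vector) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} dimensions, got {len(vector)}"
        )
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero embedding")
    # Unit-length vectors make dot product and cosine similarity equivalent.
    return [x / norm for x in vector]
```

A refined (re-trained) model would go through the same guard, so each cycle of the "cyclic" idea keeps producing vectors that remain comparable in size to the ones already indexed.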

I will also open one tomorrow for CLIP, which uses the same vector space for text and images (an idea I had when we started, but these people might be smarter than me). That allows a phrase like "Has a red Car and a blue one" and an image of a "red Car" to be encoded as compatible vectors, so dot products can be computed between textual representations and images, but also image to image. The Apple one, trained on 5 billion images, might be a good experiment.
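The shared vector space idea can be sketched with plain dot products. The vectors below are made-up 3-dimensional toy values standing in for real text and image embeddings, and the names `cosine_similarity`, `rank_by_similarity`, and the file names are hypothetical; no actual model is called.

```python
import math


def cosine_similarity(a, b):
    """Dot product of two vectors after scaling each to unit length."""
    if len(a) != len(b):
        raise ValueError("vectors must share the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare an all-zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(query, vector))
        for name, vector in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins (assumption): a text embedding and two image embeddings
# that live in the same space because one model family produced both.
text_query = [0.9, 0.1, 0.0]  # e.g. the phrase about a red car
image_vectors = {
    "red_car.jpg": [0.8, 0.2, 0.1],
    "blue_boat.jpg": [0.1, 0.2, 0.9],
}

# Text-to-image: the query is a text vector.
text_to_image = rank_by_similarity(text_query, image_vectors)

# Image-to-image: the same function works with an image vector as query.
image_to_image = rank_by_similarity(
    image_vectors["red_car.jpg"],
    {"blue_boat.jpg": image_vectors["blue_boat.jpg"]},
)
```

Because both kinds of query go through the same scoring function, a single internal index can serve text-to-image and image-to-image lookups.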

All of this is to be evaluated and follows the same rules as before: no data is shared with the outside, and all vectors are indexed internally.

@DiegoPino DiegoPino added esmero-nlp Natural Language Processing as API ML Xperiments Distrust through research and (in)validation labels Jan 15, 2025
@DiegoPino DiegoPino added this to the 1.5.0 milestone Jan 15, 2025
@DiegoPino DiegoPino self-assigned this Jan 15, 2025