(WIP) CLIP-PO

Sample/compute-efficient CLIP tuning via preference optimization.

Method

Generative

Text -> Image

  1. Sample images using a prompt
  2. Generate/augment captions for the sampled images that contain descriptive differences (spatial locations, orientations, object summaries, etc.)
  3. Rank the sampled image and generated caption pairs for each prompt (see the sketch below)
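
A minimal sketch of the ranking step, assuming open_clip ViT-H-14 (the model benchmarked below) as the scorer; the image paths, candidate captions, and pretrained tag are placeholders rather than the repo's actual configuration.

```python
import torch
import open_clip
from PIL import Image

# Hypothetical scorer: open_clip ViT-H-14; the pretrained tag is an assumption.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-H-14", pretrained="laion2b_s32b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-H-14")
model.eval()

def rank_pairs(image_paths, captions):
    """Rank (image, caption) pairs for one prompt by CLIP similarity, best first."""
    images = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in image_paths])
    tokens = tokenizer(captions)
    with torch.no_grad():
        img_feats = model.encode_image(images)
        txt_feats = model.encode_text(tokens)
    img_feats = img_feats / img_feats.norm(dim=-1, keepdim=True)
    txt_feats = txt_feats / txt_feats.norm(dim=-1, keepdim=True)
    scores = img_feats @ txt_feats.T                  # [n_images, n_captions]
    flat = scores.flatten().argsort(descending=True)
    return [(int(i) // len(captions), int(i) % len(captions)) for i in flat]
```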

Human-Feedback

  • TODO

Goals:

  • Collect a ranking dataset
    • Source textual prompts for image retrieval (use the captions themselves?)
    • Source visual prompts for textual retrieval (use the captions themselves?)
      • If tuning with a locked encoder, this can apply to either encoder
    • Retrieve 4-9 samples for each prompt (see the retrieval sketch after this list)
      • Get a few using a similarity threshold
      • Add more samples via augmentations
    • Text -> Image: dense caption samples
  • Collect/create rankings
  • Train a reward model
  • Implement an RL trainer
    • Formulate as a DPO problem? (see the loss sketch below)
    • IPO?
  • Finetune CLIP variants and benchmark
    • Tune on VSR and benchmark -> near human-level accuracy
    • ARO
    • SugarCrepe
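
A rough sketch of the retrieval step above, assuming a pool of precomputed, L2-normalized image embeddings; `retrieve_candidates`, the threshold value, and the pool layout are illustrative assumptions, not taken from the repo.

```python
import torch

def retrieve_candidates(text_feat, image_feats, max_k=9, threshold=0.25):
    """text_feat: [d], image_feats: [n, d]; both assumed L2-normalized."""
    sims = image_feats @ text_feat                # cosine similarities, shape [n]
    order = sims.argsort(descending=True)
    # Keep at most max_k images that clear the similarity threshold; any shortfall
    # (the target is 4-9 samples per prompt) would be filled with augmented views.
    return [int(i) for i in order[:max_k] if sims[i] >= threshold]
```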

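For the trainer items above, here is a hedged sketch of how the DPO and IPO objectives could be adapted to CLIP preference pairs: the sequence log-probabilities from the language-model setting are replaced with image-text similarity logits from the tuned model and a frozen reference model. The function, its `beta` default, and the use of raw similarity logits are assumptions, not the repo's exact formulation.

```python
import torch.nn.functional as F

def clip_preference_loss(policy_chosen, policy_rejected,
                         ref_chosen, ref_rejected, beta=0.1, ipo=False):
    """All inputs are per-pair image-text similarity logits, shape [batch]."""
    # Margin of the policy's chosen-vs-rejected gap over the frozen reference.
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    if ipo:
        # IPO regresses the margin toward 1 / (2 * beta) instead of saturating it.
        return ((margin - 1.0 / (2.0 * beta)) ** 2).mean()
    # DPO: maximize the log-sigmoid of the scaled margin.
    return -F.logsigmoid(beta * margin).mean()
```
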
VSR Random Results

| Model Name                    | Accuracy | Macro Average Recall | Macro Average Precision | Macro Average F1 |
|-------------------------------|----------|----------------------|--------------------------|------------------|
| open_clip ViT-H-14            | 0.364    | 0.357                | 0.354                    | 0.356            |
| open_clip ViT-H-14 w/ VSR DPO | 0.883    | 0.891                | 0.899                    | 0.895            |
| open_clip ViT-H-14 w/ VSR IPO | 0.877    | 0.886                | 0.895                    | 0.891            |

Citations

@software{Ilharco_OpenCLIP_2021,
    author = {Ilharco, Gabriel and Wortsman, Mitchell and Wightman, Ross and Gordon, Cade and Carlini, Nicholas and Taori, Rohan and Dave, Achal and Shankar, Vaishaal and Namkoong, Hongseok and Miller, John and Hajishirzi, Hannaneh and Farhadi, Ali and Schmidt, Ludwig},
    doi = {10.5281/zenodo.5143773},
    month = jul,
    title = {{OpenCLIP}},
    version = {v0.1},
    year = {2021}
}

@article{Liu2022VisualSR,
    title={Visual Spatial Reasoning},
    author={Fangyu Liu and Guy Edward Toh Emerson and Nigel Collier},
    journal={Transactions of the Association for Computational Linguistics},
    year={2023},
}

@inproceedings{yuksekgonul2023when,
    title={When and why Vision-Language Models behave like Bags-of-Words, and what to do about it?},
    author={Mert Yuksekgonul and Federico Bianchi and Pratyusha Kalluri and Dan Jurafsky and James Zou},
    booktitle={International Conference on Learning Representations},
    year={2023},
    url={https://openreview.net/forum?id=KRLUvxh8uaX}
}

@inproceedings{hsieh2023sugarcrepe,
    title={SugarCrepe: Fixing Hackable Benchmarks for Vision-Language Compositionality},
    author={Hsieh, Cheng-Yu and Zhang, Jieyu and Ma, Zixian and Kembhavi, Aniruddha and Krishna, Ranjay},
    booktitle={Thirty-Seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
    year={2023}
}
