Official implementation of COFAR: Commonsense and Factual Reasoning in Image Search (AACL-IJCNLP 2022 paper).
Use Python >= 3.8.5. Conda is recommended: https://docs.anaconda.com/anaconda/install/linux/
Use PyTorch 1.9.0 with CUDA 11.1.
To set up the environment:
conda env create -n kmmt --file kmmt.yml
conda activate kmmt
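Optionally, verify that the installed versions match the requirements above:
python -c "import torch; print(torch.__version__, torch.version.cuda)"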
Images: link.
Image format: Readme.
Training and testing data can be downloaded from the "Dataset Downloads" section on this page.
Both the oracle and wikified knowledge bases for all categories can also be downloaded from the same link.
Image feature extraction: Script.
Create a train_obj_frcn_features/ folder inside data/cofar_{category}/ for each category, and copy the extracted image features into that folder.
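For example, for the brand category (the source path and the .npy extension are illustrative assumptions; adjust them to wherever the extraction script writes its output):
# illustrative paths; .npy is an assumed output format of the extraction script
mkdir -p data/cofar_brand/train_obj_frcn_features/
cp /path/to/extracted_features/*.npy data/cofar_brand/train_obj_frcn_features/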
The MS-COCO pretraining checkpoint can be downloaded from here.
Place the downloaded kmmt_pretrain_checkpoint.pt in the working_checkpoints/ folder.
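For example, assuming the checkpoint was downloaded to the current directory:
mkdir -p working_checkpoints
mv kmmt_pretrain_checkpoint.pt working_checkpoints/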
The respective config files are in the config/ folder and are loaded automatically.
To train with the masked language modeling (MLM) objective:
python main.py --do_train --mode mlm
To train with the image-text matching (ITM) objective on two GPUs via distributed launch:
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node 2 --nnodes 1 --node_rank 0 main.py --do_train --mode itm
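The launcher arguments can be scaled to a different GPU count; for example, a single-GPU ITM run would look like this (a sketch, assuming main.py does not require a specific world size):
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node 1 --nnodes 1 --node_rank 0 main.py --do_train --mode itm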
Download our COFAR-finetuned checkpoint from here.
Copy the downloaded cofar_itm_final_checkpoint.pt to the working_checkpoints/ folder, then run the evaluation, e.g. for the brand category:
python cofar_eval.py --category brand
Other settings can be changed in config/test_config.yaml.
This code and data are released under the MIT license.
If you find this data/code/paper useful for your research, please consider citing:
@inproceedings{cofar2022,
  author    = "Gatti, Prajwal and
               Penamakuri, Abhirama Subramanyam and
               Teotia, Revant and
               Mishra, Anand and
               Sengupta, Shubhashis and
               Ramnani, Roshni",
  title     = "COFAR: Commonsense and Factual Reasoning in Image Search",
  booktitle = "AACL-IJCNLP",
  year      = "2022",
}