You might find that OpenAI's code produces around 59% accuracy for zero-shot CLIP (`vision_model=RN50`) on ImageNet with prompt ensembling, while CoOp's code gives only 57.81% for the same model (see Table 7 in the paper).
This difference comes from the image transforms: OpenAI's code applies `Resize(224)` to an image while CoOp's code (the previous version) used `Resize((224, 224))`; the former preserves the image's aspect ratio while the latter does not. To make the results produced by CoOp's code comparable to OpenAI's, we have made our transforms consistent with theirs, so the transforms in the config files have been changed from `["random_flip", "random_translation", "center_crop", "normalize"]` to `["random_resized_crop", "random_flip", "normalize"]`.
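For reference, here is a minimal sketch of the two test-time pipelines in torchvision. The interpolation mode and the CLIP normalization constants follow OpenAI's released preprocessing; the Dassl config names above map only roughly onto these ops, so treat this as an illustration rather than the exact code path.

```python
from torchvision import transforms

# CLIP's normalization constants, as used in OpenAI's released code.
CLIP_MEAN = (0.48145466, 0.4578275, 0.40821073)
CLIP_STD = (0.26862954, 0.26130258, 0.27577711)

# OpenAI-style test-time pipeline: Resize(224) keeps the aspect ratio
# (only the shorter side becomes 224), then CenterCrop(224) takes the
# central 224x224 patch.
openai_style = transforms.Compose([
    transforms.Resize(224, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(CLIP_MEAN, CLIP_STD),
])

# CoOp's previous pipeline: Resize((224, 224)) stretches the image into a
# square, distorting the aspect ratio and lowering zero-shot accuracy.
previous_coop_style = transforms.Compose([
    transforms.Resize((224, 224), interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize(CLIP_MEAN, CLIP_STD),
])
```

At training time, the new config's `random_resized_crop` and `random_flip` correspond roughly to `transforms.RandomResizedCrop(224)` followed by `transforms.RandomHorizontalFlip()`.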
If you are using our Dassl-based CoOp code, please update the code to the latest version. If you want to use your own code, you can simply copy CoOp's model code (i.e., `CustomCLIP`) and do the comparison on the same ground with whatever pipelines you are using.
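For completeness, below is a rough sketch of the zero-shot prompt-ensembling evaluation referenced above, using OpenAI's `clip` package. The three templates are only illustrative; OpenAI's ImageNet evaluation uses a much longer template list.

```python
import torch
import clip  # OpenAI's CLIP package: https://github.com/openai/CLIP

# Illustrative templates only; not the full set used for ImageNet.
templates = [
    "a photo of a {}.",
    "a bad photo of a {}.",
    "a sculpture of a {}.",
]

@torch.no_grad()
def build_zeroshot_classifier(model, classnames, device="cuda"):
    weights = []
    for name in classnames:
        # Encode every template for this class and average the embeddings.
        texts = clip.tokenize([t.format(name) for t in templates]).to(device)
        embeddings = model.encode_text(texts)
        embeddings = embeddings / embeddings.norm(dim=-1, keepdim=True)
        mean_embedding = embeddings.mean(dim=0)
        weights.append(mean_embedding / mean_embedding.norm())
    return torch.stack(weights, dim=1)  # (embed_dim, num_classes)

@torch.no_grad()
def classify(model, images, classifier):
    image_features = model.encode_image(images)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    return 100.0 * image_features @ classifier  # logits over classes
```

With `model, preprocess = clip.load("RN50")`, the classifier only needs to be built once and can then be reused over the whole validation set.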
For your reference, we have rerun CoOp with the new config files; the comparison with Table 7's results is shown below.
**Previous version**

| Method | RN50 | RN101 | ViT-B/32 | ViT-B/16 |
| --- | --- | --- | --- | --- |
| Prompt engineering | 55.41 | 58.72 | 59.88 | 64.71 |
| Prompt ensembling | 57.81 | 60.49 | 62.01 | 67.31 |
| CoOp | 60.46 | 64.39 | 64.92 | 70.13 |
**Current version**

| Method | RN50 | RN101 | ViT-B/32 | ViT-B/16 |
| --- | --- | --- | --- | --- |
| Prompt engineering | 58.18 | 61.26 | 62.05 | 66.73 |
| Prompt ensembling | 60.41 | 62.54 | 63.71 | 68.74 |
| CoOp | 62.95 | 66.60 | 66.85 | 71.92 |