question about data breach #1

Open
hansensamjohn opened this issue Jan 2, 2025 · 1 comment

Comments

@hansensamjohn

An easy question! How can we ensure that there is no data leakage when using a pretrained model that was trained on a series of large datasets, including the COCO dataset with 80 classes and other datasets? Additionally, it seems that U-Recall may be misleading, since all objects in the M-OWODB and S-OWODB experiments appear to be known classes to a pretrained model such as CLIP.

@leonnil
Contributor

leonnil commented Jan 2, 2025

Hi @hansensamjohn, thank you for your interest. During pretraining, we ensured that all data from the COCO dataset was removed. As you rightly pointed out, however, we cannot guarantee that similar data does not appear in some form in the other pretraining datasets. This complicates the definition of "unknown" classes and is a common challenge when using pretrained models. We conducted additional tests on a recent real-world benchmark and on web images, and our observations indicate that our method still generalizes well. While we recognize the concerns about the evaluation metrics and datasets, we ultimately chose to use the widely adopted Open-World dataset in our paper, supplemented with the nuScenes dataset for autonomous driving scenarios. Selecting more appropriate benchmarks is an important direction for our future work. Please feel free to share any additional thoughts or suggestions!
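
For readers following the thread, it may help to see what U-Recall measures: the fraction of ground-truth unknown objects that are matched by a prediction labeled "unknown". The snippet below is only a minimal sketch of that idea, not this repository's or the benchmarks' evaluation code. The function names `iou` and `unknown_recall`, the `(x1, y1, x2, y2)` box format, the 0.5 IoU threshold, and the greedy one-to-one matching are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def unknown_recall(
    gt_unknown: List[Box],
    pred_unknown: List[Box],
    iou_thresh: float = 0.5,  # assumed threshold, not taken from the thread
) -> float:
    """Fraction of ground-truth unknown boxes matched by an 'unknown' prediction.

    Each prediction may match at most one ground-truth box (greedy matching).
    """
    if not gt_unknown:
        return 0.0
    used = set()  # indices of predictions already matched
    hits = 0
    for gt in gt_unknown:
        best_j, best_iou = -1, iou_thresh
        for j, pred in enumerate(pred_unknown):
            if j in used:
                continue
            overlap = iou(gt, pred)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            used.add(best_j)
            hits += 1
    return hits / len(gt_unknown)
```

Note that a metric of this shape only checks whether boxes annotated as unknown in the benchmark are recovered. It cannot tell whether those categories were already seen during pretraining, which is the concern raised in the question.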
