Abstract
This paper studies the challenging task of cross-domain few-shot object detection (CD-FSOD), which aims to develop an accurate object detector for novel domains with minimal labeled examples. While transformer-based open-set detectors such as DE-ViT show promise in traditional few-shot object detection, their generalization to CD-FSOD remains unclear: (1) can such open-set detection methods easily generalize to CD-FSOD? (2) if not, how can models be enhanced when facing huge domain gaps? To answer the first question, we employ measures including style, inter-class variance (ICV), and indefinable boundaries (IB) to understand the domain gap. Based on these measures, we establish a new CD-FSOD benchmark to evaluate object detection methods, revealing that most current approaches fail to generalize across domains. Technically, we observe that the performance decline is associated with our proposed measures: style, ICV, and IB. Consequently, we propose several novel modules to address these issues. First, the learnable instance features align initially fixed instances with target categories, enhancing feature distinctiveness. Second, the instance reweighting module assigns higher importance to high-quality instances with slight IB. Third, the domain prompter encourages features resilient to different styles by synthesizing imaginary domains without altering semantic content. These techniques collectively contribute to the proposed Cross-Domain Vision Transformer for CD-FSOD (CD-ViTO), which significantly improves upon the base DE-ViT. Experimental results validate the efficacy of our model. Datasets and code are available at http://yuqianfu.com/CDFSOD-benchmark.
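To make the three modules above concrete, here is a minimal PyTorch-style sketch. Everything in it is an illustrative assumption rather than the authors' implementation: the class names (`LearnableInstances`, `InstanceReweighting`, `apply_domain_prompts`), tensor shapes, the two-layer MLP scorer, and the additive style perturbation are our own guesses at one plausible realization; the actual design is in the code release linked above.

```python
# Hedged sketch of the three CD-ViTO modules described in the abstract.
# All names, shapes, and formulations are illustrative assumptions.
import torch
import torch.nn as nn


class LearnableInstances(nn.Module):
    """Support-instance features initialized from frozen backbone embeddings
    but kept trainable, so finetuning can align them with target categories."""
    def __init__(self, init_feats: torch.Tensor):
        super().__init__()
        # init_feats: (num_classes, num_instances, dim)
        self.feats = nn.Parameter(init_feats.clone())

    def forward(self) -> torch.Tensor:
        return self.feats


class InstanceReweighting(nn.Module):
    """Scores each support instance so high-quality ones (slight IB)
    dominate the weighted class prototype."""
    def __init__(self, dim: int):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(dim, dim // 2), nn.ReLU(), nn.Linear(dim // 2, 1))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (num_classes, num_instances, dim)
        weights = self.scorer(feats).softmax(dim=1)  # weights over instances
        return (weights * feats).sum(dim=1)          # (num_classes, dim) prototypes


def apply_domain_prompts(feats: torch.Tensor, prompts: torch.Tensor) -> torch.Tensor:
    """Additively perturb features with learnable 'imaginary domain' vectors;
    a consistency objective (not shown) would keep semantic content unchanged."""
    idx = torch.randint(prompts.size(0), (feats.size(0),))  # sample one domain per class
    return feats + prompts[idx].unsqueeze(1)


# Toy usage: 5 classes, 10 support instances each, 256-d features, 8 imaginary domains.
instances = LearnableInstances(torch.randn(5, 10, 256))
reweight = InstanceReweighting(256)
domain_prompts = nn.Parameter(0.1 * torch.randn(8, 256))
protos = reweight(apply_domain_prompts(instances(), domain_prompts))
print(protos.shape)  # torch.Size([5, 256])
```

The sketch mirrors the abstract's division of labor: trainable instance features supply adaptable category representations, the reweighting MLP downweights instances suffering from severe IB, and the domain prompts simulate style shifts during finetuning.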
Y. Wang and Y. Pan—Equal contributions.
Part of this work commenced during Dr. Yuqian Fu’s PhD at Fudan University.
Notes
- 1. Despite varying definitions, we adopt that of [52]: "both open-vocabulary and few-shot belong to open-set except their category representations".
References
Dosovitskiy, A., et al.: An image is worth \(16 \times 16\) words: transformers for image recognition at scale. In: ICLR (2021)
Drange, G.: Arthropod taxonomy orders object detection dataset (2019). https://doi.org/10.34740/kaggle/dsv/1240192
Fan, D.P., Ji, G.P., Cheng, M.M., Shao, L.: Concealed object detection. TPAMI (2021)
Fan, Z., Ma, Y., Li, Z., Sun, J.: Generalized few-shot object detection without forgetting. In: CVPR (2021)
Fu, Y., Fu, Y., Jiang, Y.G.: Meta-FDMixup: cross-domain few-shot learning guided by labeled target data. In: ACM MM (2021)
Fu, Y., Xie, Y., Fu, Y., Jiang, Y.G.: StyleAdv: meta style adversarial training for cross-domain few-shot learning. In: CVPR (2023)
Fu, Y., Zhang, L., Wang, J., Fu, Y., Jiang, Y.G.: Depth guided adaptive meta-fusion network for few-shot video recognition. In: ACM MM (2020)
Gao, Y., Lin, K.Y., Yan, J., Wang, Y., Zheng, W.S.: AsyFOD: an asymmetric adaptation paradigm for few-shot domain adaptive object detection. In: CVPR (2023)
Gao, Y., Yang, L., Huang, Y., Xie, S., Li, S., Zheng, W.S.: AcroFOD: an adaptive method for cross-domain few-shot object detection. In: ECCV (2022)
Girshick, R.: Fast R-CNN. In: ICCV (2015)
Guirguis, K., Meier, J., Eskandar, G., Kayser, M., Yang, B., Beyerer, J.: NIFF: alleviating forgetting in generalized few-shot object detection via neural instance feature forging. In: CVPR (2023)
Guo, Y., et al.: A broader study of cross-domain few-shot learning. In: ECCV (2020)
Han, G., He, Y., Huang, S., Ma, J., Chang, S.F.: Query adaptive few-shot object detection with heterogeneous graph convolutional networks. In: ICCV (2021)
Han, G., Huang, S., Ma, J., He, Y., Chang, S.F.: Meta faster R-CNN: towards accurate few-shot object detection with attentive feature alignment. In: AAAI (2022)
Han, G., Ma, J., Huang, S., Chen, L., Chang, S.F.: Few-shot object detection with fully cross-transformer. In: CVPR (2022)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
Hu, S.X., Li, D., Stühmer, J., Kim, M., Hospedales, T.M.: Pushing the limits of simple pipelines for few-shot learning: external data and fine-tuning make a difference. In: CVPR (2022)
Inoue, N., Furuta, R., Yamasaki, T., Aizawa, K.: Cross-domain weakly-supervised object detection through progressive domain adaptation. In: CVPR (2018)
Jia, M., et al.: Visual prompt tuning. In: ECCV (2022)
Jiang, L., et al.: Underwater species detection using channel sharpening attention. In: ACM MM (2021)
Kang, B., Liu, Z., Wang, X., Yu, F., Feng, J., Darrell, T.: Few-shot object detection via feature reweighting. In: ICCV (2019)
Kaul, P., Xie, W., Zisserman, A.: Label, verify, correct: a simple few shot object detection method. In: CVPR (2022)
Köhler, M., Eisenbach, M., Gross, H.M.: Few-shot object detection: a comprehensive survey. arXiv preprint arXiv:2112.11699 (2021)
Lee, K., et al.: Rethinking few-shot object detection on a multi-domain benchmark. In: ECCV (2022)
Lester, B., Al-Rfou, R., Constant, N.: The power of scale for parameter-efficient prompt tuning. arXiv preprint arXiv:2104.08691 (2021)
Li, K., Wan, G., Cheng, G., Meng, L., Han, J.: Object detection in optical remote sensing images: a survey and a new benchmark. ISPRS J. Photogramm. Remote Sens. (2020)
Li, Y., Mao, H., Girshick, R., He, K.: Exploring plain vision transformer backbones for object detection. In: ECCV (2022)
Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: ECCV (2014)
Luo, X., Wu, H., Zhang, J., Gao, L., Xu, J., Song, J.: A closer look at few-shot classification again. In: ICML (2023)
Luo, Y., Liu, P., Guan, T., Yu, J., Yang, Y.: Adversarial style mining for one-shot unsupervised domain adaptation. In: NeurIPS (2020)
Ma, J., Niu, Y., Xu, J., Huang, S., Han, G., Chang, S.F.: DiGeo: discriminative geometry-aware learning for generalized few-shot object detection. In: CVPR (2023)
Oquab, M., et al.: DINOv2: learning robust visual features without supervision. arXiv preprint arXiv:2304.07193 (2023)
Qiao, L., Zhao, Y., Li, Z., Qiu, X., Wu, J., Zhang, C.: DeFRCN: decoupled faster R-CNN for few-shot object detection. In: ICCV (2021)
Radford, A., et al.: Learning transferable visual models from natural language supervision. In: ICML (2021)
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: CVPR (2016)
Saleh, A., Laradji, I.H., Konovalov, D.A., Bradley, M., Vazquez, D., Sheaves, M.: A realistic fish-habitat dataset to evaluate algorithms for underwater visual analysis. Sci. Rep. (2020)
Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. In: NeurIPS (2017)
Song, K., Yan, Y.: A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects. Appl. Surface Sci. (2013)
Sun, B., Li, B., Cai, S., Yuan, Y., Zhang, C.: FSCE: few-shot object detection via contrastive proposal encoding. In: CVPR (2021)
Tang, H., Yuan, C., Li, Z., Tang, J.: Learning attention-guided pyramidal features for few-shot fine-grained recognition. Pattern Recogn. (2022)
Tseng, H.Y., Lee, H.Y., Huang, J.B., Yang, M.H.: Cross-domain few-shot classification via learned feature-wise transformation. In: ICLR (2020)
Vinyals, O., Blundell, C., Lillicrap, T., Wierstra, D., et al.: Matching networks for one shot learning. In: NeurIPS (2016)
Wang, H., Deng, Z.H.: Cross-domain few-shot classification via adversarial task augmentation. arXiv preprint (2021)
Wang, K., Liew, J.H., Zou, Y., Zhou, D., Feng, J.: PANet: few-shot image semantic segmentation with prototype alignment. In: ICCV (2019)
Wang, X., Huang, T.E., Darrell, T., Gonzalez, J.E., Yu, F.: Frustratingly simple few-shot object detection. arXiv preprint arXiv:2003.06957 (2020)
Xie, G.S., Xiong, H., Liu, J., Yao, Y., Shao, L.: Few-shot semantic segmentation with cyclic memory network. In: ICCV (2021)
Xiong, W.: CD-FSOD: a benchmark for cross-domain few-shot object detection. In: ICASSP (2023)
Yan, X., Chen, Z., Xu, A., Wang, X., Liang, X., Lin, L.: Meta R-CNN: towards general solver for instance-level low-shot learning. In: ICCV (2019)
Zareian, A., Rosa, K.D., Hu, D.H., Chang, S.F.: Open-vocabulary object detection using captions. In: CVPR (2021)
Zhang, H., Zhang, L., Qi, X., Li, H., Torr, P.H., Koniusz, P.: Few-shot action recognition with permutation-invariant attention. In: ECCV (2020)
Zhang, J., Gao, L., Luo, X., Shen, H., Song, J.: DETA: denoised task adaptation for few-shot learning. In: ICCV (2023)
Zhang, X., Wang, Y., Boularias, A.: Detect every thing with few examples. arXiv preprint arXiv:2309.12969 (2023)
Zhao, S., et al.: Exploiting unlabeled data with vision and language models for object detection. In: ECCV (2022)
Zhong, Y., et al.: RegionCLIP: region-based language-image pretraining. In: CVPR (2022)
Zhou, K., Yang, Y., Qiao, Y., Xiang, T.: Domain generalization with MixStyle. In: ICLR (2021)
Zhou, X., Girdhar, R., Joulin, A., Krähenbühl, P., Misra, I.: Detecting twenty-thousand classes using image-level supervision. In: ECCV (2022)
Zhuo, L., Fu, Y., Chen, J., Cao, Y., Jiang, Y.G.: TGDM: target guided dynamic mixup for cross-domain few-shot learning. In: ACM MM (2022)
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Fu, Y. et al. (2025). Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15116. Springer, Cham. https://doi.org/10.1007/978-3-031-73636-0_15
DOI: https://doi.org/10.1007/978-3-031-73636-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73635-3
Online ISBN: 978-3-031-73636-0