Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector

  • Conference paper in Computer Vision – ECCV 2024 (ECCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15116)

Abstract

This paper studies the challenging cross-domain few-shot object detection (CD-FSOD) task, aiming to develop an accurate object detector for novel domains with minimal labeled examples. While transformer-based open-set detectors, such as DE-ViT, show promise in traditional few-shot object detection, their generalization to CD-FSOD remains unclear: 1) Can such open-set detection methods easily generalize to CD-FSOD? 2) If not, how can models be enhanced when facing huge domain gaps? To answer the first question, we employ measures including style, inter-class variance (ICV), and indefinable boundaries (IB) to understand the domain gap. Based on these measures, we establish a new benchmark named CD-FSOD to evaluate object detection methods, revealing that most current approaches fail to generalize across domains. Technically, we observe that the performance decline is associated with our proposed measures: style, ICV, and IB. Consequently, we propose several novel modules to address these issues. First, learnable instance features align the initially fixed instance features with target categories, enhancing feature distinctiveness. Second, an instance reweighting module assigns higher importance to high-quality instances with slight IB. Third, a domain prompter encourages features resilient to different styles by synthesizing imaginary domains without altering semantic content. These techniques collectively contribute to our Cross-Domain Vision Transformer for CD-FSOD (CD-ViTO), which significantly improves upon the base DE-ViT. Experimental results validate the efficacy of our model. Datasets and code are available at http://yuqianfu.com/CDFSOD-benchmark.
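To make the three modules concrete, the sketch below gives a minimal PyTorch rendering of how they could be wired together. All class names, tensor shapes, the MLP scorer, and the consistency loss are illustrative assumptions made for this sketch, not the authors' released implementation (see the project page above for the official code).

```python
# Minimal, assumption-laden sketch of the three CD-ViTO modules from the
# abstract. Shapes and wiring are hypothetical, not the official code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnableInstanceFeatures(nn.Module):
    """Turns the fixed support-instance features into trainable
    parameters so finetuning can align them with target categories."""

    def __init__(self, init_feats: torch.Tensor):
        super().__init__()
        # init_feats: (num_classes, k_shot, dim), precomputed by a
        # frozen backbone and copied into a learnable parameter.
        self.feats = nn.Parameter(init_feats.clone())

    def forward(self) -> torch.Tensor:
        return self.feats


class InstanceReweighting(nn.Module):
    """Scores each support instance and aggregates a weighted class
    prototype, down-weighting low-quality instances (e.g. heavy IB)."""

    def __init__(self, dim: int):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(dim, dim // 2), nn.ReLU(),
                                    nn.Linear(dim // 2, 1))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (num_classes, k_shot, dim) -> softmax weights over shots.
        w = F.softmax(self.scorer(feats), dim=1)   # (C, K, 1)
        return (w * feats).sum(dim=1)              # (C, dim) prototypes


class DomainPrompter(nn.Module):
    """Learnable 'virtual domain' vectors added to class prototypes to
    simulate style shifts without changing semantic content."""

    def __init__(self, num_domains: int, dim: int):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(num_domains, dim) * 0.02)

    def forward(self, protos: torch.Tensor) -> torch.Tensor:
        # protos: (C, dim) -> (num_domains, C, dim), one copy per domain.
        return protos.unsqueeze(0) + self.prompts.unsqueeze(1)


# Toy wiring: 5 classes, 10 shots, 256-dim features.
init = torch.randn(5, 10, 256)
feats = LearnableInstanceFeatures(init)
reweight = InstanceReweighting(256)
prompter = DomainPrompter(num_domains=4, dim=256)

protos = reweight(feats())       # (5, 256) class prototypes
perturbed = prompter(protos)     # (4, 5, 256) imaginary-domain variants
# A consistency loss (one plausible choice) pulls each perturbed
# prototype back toward its original so synthesized domains keep
# class semantics while diversifying style.
target = protos.detach().unsqueeze(0).expand_as(perturbed)
loss = F.mse_loss(perturbed, target)
loss.backward()
```

In this reading, the reweighted prototypes would replace DE-ViT's fixed class prototypes at detection time, while the domain perturbations act only as a training-time regularizer; the paper should be consulted for the exact losses and placement.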

Y. Wang and Y. Pan contributed equally.

Part of this work commenced during Dr. Yuqian Fu’s PhD at Fudan University.

Notes

  1. Despite varying definitions, we adopt that of [52]: "both open-vocabulary and few-shot belong to open-set except their category representations".

References

  1. Dosovitskiy, A., et al.: An image is worth 16 × 16 words: transformers for image recognition at scale. In: ICLR (2021)

  2. Drange, G.: Arthropod taxonomy orders object detection dataset (2019). https://doi.org/10.34740/kaggle/dsv/1240192

  3. Fan, D.P., Ji, G.P., Cheng, M.M., Shao, L.: Concealed object detection. TPAMI (2021)

  4. Fan, Z., Ma, Y., Li, Z., Sun, J.: Generalized few-shot object detection without forgetting. In: CVPR (2021)

  5. Fu, Y., Fu, Y., Jiang, Y.G.: Meta-FDMixup: cross-domain few-shot learning guided by labeled target data. In: ACM MM (2021)

  6. Fu, Y., Xie, Y., Fu, Y., Jiang, Y.G.: StyleAdv: meta style adversarial training for cross-domain few-shot learning. In: CVPR (2023)

  7. Fu, Y., Zhang, L., Wang, J., Fu, Y., Jiang, Y.G.: Depth guided adaptive meta-fusion network for few-shot video recognition. In: ACM MM (2020)

  8. Gao, Y., Lin, K.Y., Yan, J., Wang, Y., Zheng, W.S.: AsyFOD: an asymmetric adaptation paradigm for few-shot domain adaptive object detection. In: CVPR (2023)

  9. Gao, Y., Yang, L., Huang, Y., Xie, S., Li, S., Zheng, W.S.: AcroFOD: an adaptive method for cross-domain few-shot object detection. In: ECCV (2022)

  10. Girshick, R.: Fast R-CNN. In: ICCV (2015)

  11. Guirguis, K., Meier, J., Eskandar, G., Kayser, M., Yang, B., Beyerer, J.: NIFF: alleviating forgetting in generalized few-shot object detection via neural instance feature forging. In: CVPR (2023)

  12. Guo, Y., et al.: A broader study of cross-domain few-shot learning. In: ECCV (2020)

  13. Han, G., He, Y., Huang, S., Ma, J., Chang, S.F.: Query adaptive few-shot object detection with heterogeneous graph convolutional networks. In: ICCV (2021)

  14. Han, G., Huang, S., Ma, J., He, Y., Chang, S.F.: Meta Faster R-CNN: towards accurate few-shot object detection with attentive feature alignment. In: AAAI (2022)

  15. Han, G., Ma, J., Huang, S., Chen, L., Chang, S.F.: Few-shot object detection with fully cross-transformer. In: CVPR (2022)

  16. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)

  17. Hu, S.X., Li, D., Stühmer, J., Kim, M., Hospedales, T.M.: Pushing the limits of simple pipelines for few-shot learning: external data and fine-tuning make a difference. In: CVPR (2022)

  18. Inoue, N., Furuta, R., Yamasaki, T., Aizawa, K.: Cross-domain weakly-supervised object detection through progressive domain adaptation. In: CVPR (2018)

  19. Jia, M., et al.: Visual prompt tuning. In: ECCV (2022)

  20. Jiang, L., et al.: Underwater species detection using channel sharpening attention. In: ACM MM (2021)

  21. Kang, B., Liu, Z., Wang, X., Yu, F., Feng, J., Darrell, T.: Few-shot object detection via feature reweighting. In: ICCV (2019)

  22. Kaul, P., Xie, W., Zisserman, A.: Label, verify, correct: a simple few-shot object detection method. In: CVPR (2022)

  23. Köhler, M., Eisenbach, M., Gross, H.M.: Few-shot object detection: a comprehensive survey. arXiv preprint arXiv:2112.11699 (2021)

  24. Lee, K., et al.: Rethinking few-shot object detection on a multi-domain benchmark. In: ECCV (2022)

  25. Lester, B., Al-Rfou, R., Constant, N.: The power of scale for parameter-efficient prompt tuning. arXiv preprint arXiv:2104.08691 (2021)

  26. Li, K., Wan, G., Cheng, G., Meng, L., Han, J.: Object detection in optical remote sensing images: a survey and a new benchmark. ISPRS J. Photogramm. Remote Sens. (2020)

  27. Li, Y., Mao, H., Girshick, R., He, K.: Exploring plain vision transformer backbones for object detection. In: ECCV (2022)

  28. Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: ECCV (2014)

  29. Luo, X., Wu, H., Zhang, J., Gao, L., Xu, J., Song, J.: A closer look at few-shot classification again. In: ICML (2023)

  30. Luo, Y., Liu, P., Guan, T., Yu, J., Yang, Y.: Adversarial style mining for one-shot unsupervised domain adaptation. In: NeurIPS (2020)

  31. Ma, J., Niu, Y., Xu, J., Huang, S., Han, G., Chang, S.F.: DiGeo: discriminative geometry-aware learning for generalized few-shot object detection. In: CVPR (2023)

  32. Oquab, M., et al.: DINOv2: learning robust visual features without supervision. arXiv preprint arXiv:2304.07193 (2023)

  33. Qiao, L., Zhao, Y., Li, Z., Qiu, X., Wu, J., Zhang, C.: DeFRCN: decoupled Faster R-CNN for few-shot object detection. In: ICCV (2021)

  34. Radford, A., et al.: Learning transferable visual models from natural language supervision. In: ICML (2021)

  35. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: CVPR (2016)

  36. Saleh, A., Laradji, I.H., Konovalov, D.A., Bradley, M., Vazquez, D., Sheaves, M.: A realistic fish-habitat dataset to evaluate algorithms for underwater visual analysis. Sci. Rep. (2020)

  37. Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. In: NeurIPS (2017)

  38. Song, K., Yan, Y.: A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects. Appl. Surf. Sci. (2013)

  39. Sun, B., Li, B., Cai, S., Yuan, Y., Zhang, C.: FSCE: few-shot object detection via contrastive proposal encoding. In: CVPR (2021)

  40. Tang, H., Yuan, C., Li, Z., Tang, J.: Learning attention-guided pyramidal features for few-shot fine-grained recognition. Pattern Recogn. (2022)

  41. Tseng, H.Y., Lee, H.Y., Huang, J.B., Yang, M.H.: Cross-domain few-shot classification via learned feature-wise transformation. In: ICLR (2020)

  42. Vinyals, O., Blundell, C., Lillicrap, T., Wierstra, D., et al.: Matching networks for one shot learning. In: NeurIPS (2016)

  43. Wang, H., Deng, Z.H.: Cross-domain few-shot classification via adversarial task augmentation. arXiv preprint (2021)

  44. Wang, K., Liew, J.H., Zou, Y., Zhou, D., Feng, J.: PANet: few-shot image semantic segmentation with prototype alignment. In: ICCV (2019)

  45. Wang, X., Huang, T.E., Darrell, T., Gonzalez, J.E., Yu, F.: Frustratingly simple few-shot object detection. arXiv preprint arXiv:2003.06957 (2020)

  46. Xie, G.S., Xiong, H., Liu, J., Yao, Y., Shao, L.: Few-shot semantic segmentation with cyclic memory network. In: ICCV (2021)

  47. Xiong, W.: CD-FSOD: a benchmark for cross-domain few-shot object detection. In: ICASSP (2023)

  48. Yan, X., Chen, Z., Xu, A., Wang, X., Liang, X., Lin, L.: Meta R-CNN: towards general solver for instance-level low-shot learning. In: ICCV (2019)

  49. Zareian, A., Rosa, K.D., Hu, D.H., Chang, S.F.: Open-vocabulary object detection using captions. In: CVPR (2021)

  50. Zhang, H., Zhang, L., Qi, X., Li, H., Torr, P.H., Koniusz, P.: Few-shot action recognition with permutation-invariant attention. In: ECCV (2020)

  51. Zhang, J., Gao, L., Luo, X., Shen, H., Song, J.: DETA: denoised task adaptation for few-shot learning. In: ICCV (2023)

  52. Zhang, X., Wang, Y., Boularias, A.: Detect every thing with few examples. arXiv preprint arXiv:2309.12969 (2023)

  53. Zhao, S., et al.: Exploiting unlabeled data with vision and language models for object detection. In: ECCV (2022)

  54. Zhong, Y., et al.: RegionCLIP: region-based language-image pretraining. In: CVPR (2022)

  55. Zhou, K., Yang, Y., Qiao, Y., Xiang, T.: Domain generalization with MixStyle. In: ICLR (2021)

  56. Zhou, X., Girdhar, R., Joulin, A., Krähenbühl, P., Misra, I.: Detecting twenty-thousand classes using image-level supervision. In: ECCV (2022)

  57. Zhuo, L., Fu, Y., Chen, J., Cao, Y., Jiang, Y.G.: TGDM: target guided dynamic mixup for cross-domain few-shot learning. In: ACM MM (2022)

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 7716 KB)

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Fu, Y. et al. (2025). Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15116. Springer, Cham. https://doi.org/10.1007/978-3-031-73636-0_15

  • DOI: https://doi.org/10.1007/978-3-031-73636-0_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73635-3

  • Online ISBN: 978-3-031-73636-0

  • eBook Packages: Computer Science, Computer Science (R0)
