Open-World Few-Shot Object Detection

  • Conference paper
Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14086)

Abstract

General object detection has made significant progress under the closed-set setting. However, such a detector supports only a fixed set of categories and fails to identify unknown objects in real-world scenarios. Class-agnostic object detection (CAOD), which aims to localize both known and unknown objects in an image, has therefore attracted much attention recently. Because CAOD trains the detector with binary labels, it lacks multi-class classification information and cannot quickly generalize to unknown objects of interest, which limits its scalability to downstream applications. In this paper, we propose a new task termed Open-World Few-Shot Object Detection (OFOD), which extends class-agnostic object detection with few-shot learning ability. Compared with CAOD, OFOD can accurately detect unknown objects from only a few examples. We also propose a new model termed OFDet, built upon a class-agnostic object detector under the two-stage fine-tuning paradigm. OFDet consists of three key components: a Class-agnostic Localization Module (CALM) that generates class-agnostic proposals, a Base Classification Module (BCM) that classifies objects from class features, and a Novel Detection Module (NDM) that learns to detect novel objects. OFDet detects the novel classes in NDM and localizes potential unknown proposals in CALM. Furthermore, an Unknown Proposals Selection algorithm is proposed to select more accurate unknown objects. Extensive experiments are conducted on the PASCAL VOC and COCO datasets under multiple tasks: CAOD, few-shot object detection (FSOD), and OFOD. The results show that OFDet performs well on the traditional FSOD and CAOD settings as well as on the proposed OFOD setting. Specifically, for OFOD under the 30-shot setting of COCO's unknown set 1, OFDet achieves state-of-the-art average recall on unknown classes (32.5%) and high average precision on novel classes (15.7%).
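The abstract does not specify how the Unknown Proposals Selection algorithm works, so the following is only an illustrative sketch of one plausible reading, not the authors' method. It assumes that class-agnostic proposals carry an objectness score, that a proposal overlapping an already-detected known or novel box is not "unknown", and that overlap is measured by IoU. The function names (`iou`, `select_unknown`, `recall_at_iou`) and both thresholds are hypothetical. The recall helper uses a single IoU threshold, which is simpler than the average recall the abstract reports.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_unknown(
    proposals: Sequence[Box],
    scores: Sequence[float],
    detected_boxes: Sequence[Box],
    score_thr: float = 0.5,  # assumed objectness cutoff
    iou_thr: float = 0.5,    # assumed overlap cutoff
) -> List[Box]:
    """Keep confident proposals that do not overlap any detected box."""
    kept = []
    for box, score in zip(proposals, scores):
        if score < score_thr:
            continue  # low objectness: likely background
        if any(iou(box, d) >= iou_thr for d in detected_boxes):
            continue  # already explained by a known or novel detection
        kept.append(box)
    return kept


def recall_at_iou(pred: Sequence[Box], gt: Sequence[Box], iou_thr: float = 0.5) -> float:
    """Fraction of ground-truth boxes matched by at least one prediction."""
    if not gt:
        return 0.0
    hits = sum(1 for g in gt if any(iou(p, g) >= iou_thr for p in pred))
    return hits / len(gt)


proposals = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.2]
detected = [(0, 0, 10, 10)]
unknown = select_unknown(proposals, scores, detected)
unknown_gt = [(20, 20, 30, 30), (50, 50, 60, 60)]
unknown_recall = recall_at_iou(unknown, unknown_gt)
```

In this toy input the first proposal is discarded because it coincides with a detected box, and the third because its score falls below the assumed cutoff, so only the middle proposal is treated as unknown.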

Acknowledgement

This work was supported by the National Key R&D Program of China (No. 2022ZD0118202), the National Science Fund for Distinguished Young Scholars (No. 62025603), the National Natural Science Foundation of China (No. U21B2037, No. U22B2051, No. 62176222, No. 62176223, No. 62176226, No. 62072386, No. 62072387, No. 62072389, No. 62002305 and No. 62272401), and the Natural Science Foundation of Fujian Province of China (No. 2021J01002, No. 2022J06001).

Author information

Corresponding author

Correspondence to Shengchuan Zhang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chen, W., Zhang, S. (2023). Open-World Few-Shot Object Detection. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14086. Springer, Singapore. https://doi.org/10.1007/978-981-99-4755-3_48

  • DOI: https://doi.org/10.1007/978-981-99-4755-3_48

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4754-6

  • Online ISBN: 978-981-99-4755-3

  • eBook Packages: Computer Science, Computer Science (R0)
