Open-World Few-Shot Object Detection

Chen, Wei; Zhang, Shengchuan

doi:10.1007/978-981-99-4755-3_48

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14086)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

General object detection has made significant progress under the closed-set setting. However, such a detector supports only a fixed set of categories and fails to identify unknown objects in real-world scenarios. Therefore, class-agnostic object detection (CAOD), which aims to localize both known and unknown objects in an image, has recently attracted much attention. Because CAOD trains the detector with binary labels, it lacks multi-class classification information and cannot generalize quickly to unknown objects of interest, which limits its scalability to downstream applications. In this paper, we propose a new task termed Open-World Few-Shot Object Detection (OFOD), which extends class-agnostic object detection with few-shot learning ability. Compared with CAOD, OFOD can accurately detect unknown objects with only a few examples. We also propose a new model termed OFDet, built upon a class-agnostic object detector under the two-stage fine-tuning paradigm. OFDet consists of three key components: a Class-agnostic Localization Module (CALM) that generates class-agnostic proposals, a Base Classification Module (BCM) that classifies objects from class features, and a Novel Detection Module (NDM) that learns to detect novel objects. OFDet detects the novel classes in NDM and localizes the potential unknown proposals in CALM. Furthermore, an Unknown Proposals Selection algorithm is proposed to select more accurate unknown objects. Extensive experiments are conducted on the PASCAL VOC and COCO datasets under three tasks: CAOD, few-shot object detection (FSOD), and OFOD. The results show that OFDet performs well on the traditional FSOD and CAOD settings as well as on the proposed OFOD setting. Specifically for OFOD, OFDet achieves state-of-the-art average recall on unknown classes (32.5%) and high average precision on novel classes (15.7%) under the 30-shot setting of COCO's unknown set 1.
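The abstract reports average recall on unknown classes as its headline OFOD metric. The sketch below illustrates how a class-agnostic average recall can be computed from ground-truth boxes and proposals. It is not the paper's evaluation code: the function names, the `(x1, y1, x2, y2)` box format, and the COCO-style IoU thresholds from 0.50 to 0.95 are assumptions, and the matching is simplified so that one proposal may cover several ground-truth boxes.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at(gt: Sequence[Box], proposals: Sequence[Box], thr: float) -> float:
    """Fraction of ground-truth boxes covered by any proposal at IoU >= thr."""
    if not gt:
        return 0.0
    hits = sum(1 for g in gt if any(iou(g, p) >= thr for p in proposals))
    return hits / len(gt)


def average_recall(
    gt: Sequence[Box],
    proposals: Sequence[Box],
    thresholds: Optional[List[float]] = None,
) -> float:
    """Mean recall over IoU thresholds (assumed default: 0.50, 0.55, ..., 0.95)."""
    if thresholds is None:
        thresholds = [0.5 + 0.05 * i for i in range(10)]
    return sum(recall_at(gt, proposals, t) for t in thresholds) / len(thresholds)


if __name__ == "__main__":
    unknown_gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
    proposals = [(0, 0, 10, 10)]
    print(average_recall(unknown_gt, proposals))
```

In this usage example only one of the two unknown ground-truth boxes is matched, so the recall is one half at every threshold. A benchmark-grade evaluation would typically enforce one-to-one matching and cap the number of proposals per image.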

Acknowledgement

This work was supported by the National Key R&D Program of China (No. 2022ZD0118202), the National Science Fund for Distinguished Young Scholars (No. 62025603), the National Natural Science Foundation of China (No. U21B2037, No. U22B2051, No. 62176222, No. 62176223, No. 62176226, No. 62072386, No. 62072387, No. 62072389, No. 62002305 and No. 62272401), and the Natural Science Foundation of Fujian Province of China (No. 2021J01002, No. 2022J06001).

Author information

Authors and Affiliations

Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, Xiamen, 361005, People’s Republic of China
Wei Chen & Shengchuan Zhang

Corresponding author

Correspondence to Shengchuan Zhang.

Editor information

Editors and Affiliations

Eastern Institute of Technology, Zhejiang, China
De-Shuang Huang
University of Wollongong, North Wollongong, NSW, Australia
Prashan Premaratne
Zhengzhou University of Light Industry, Zhengzhou, China
Baohua Jin
Zhong Yuan University of Technology, Zhengzhou, China
Boyang Qu
University of Ulsan, Ulsan, Korea (Republic of)
Kang-Hyun Jo
Department of Computer Science, Liverpool John Moores University, Liverpool, UK
Abir Hussain

About this paper

Cite this paper

Chen, W., Zhang, S. (2023). Open-World Few-Shot Object Detection. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14086. Springer, Singapore. https://doi.org/10.1007/978-981-99-4755-3_48

DOI: https://doi.org/10.1007/978-981-99-4755-3_48
Published: 30 July 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4754-6
Online ISBN: 978-981-99-4755-3
eBook Packages: Computer Science, Computer Science (R0)
