MPF-Net: multi-projection filtering network for few-shot object detection

Chen, Han; Wang, Qi; Xie, Kailin; Lei, Liang; Wu, Xue

doi:10.1007/s10489-024-05556-1

Published: 14 June 2024

Volume 54, pages 7777–7792, (2024)

Applied Intelligence

Han Chen^1,2, Qi Wang^1,2, Kailin Xie^2, Liang Lei^2 & Xue Wu^1

Abstract

Deep learning-based object detection has made tremendous progress in intelligent vision systems. However, one of its major drawbacks is its heavy demand for large amounts of training data. Few-shot object detection (FSOD) aims to identify novel objects from only a few training samples. Existing techniques do not fully explore the potential mapping relationships between support and query features because they are limited to global matching contrast. In this work, we propose a multi-projection filtering network (MPF-Net) that exploits feature relevance and aggregates information across multiple scales to ensure an optimal global representation. Furthermore, we are the first to propose a feature contrast filtering paradigm for the classification and regression subtasks, which fully utilizes fine-grained features for contrastive matching. The multi-visual contrast approach enables our model to handle a variety of difficult detection challenges, such as scale discrepancies, occlusions, and feature confusion. MPF-Net accurately perceives features at various scales and adaptively extracts category information. Extensive experiments on the PASCAL VOC and MS COCO datasets demonstrate that our detectors significantly improve upon baseline detectors, especially in extremely low-shot settings (the average accuracy improvement is up to 3.5% in 1-shot scenarios and 2.5% in 2-shot scenarios). Overall, we propose a novel strategy for constructing the few-shot feature space and achieve remarkable results.
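The abstract does not spell out the architecture, so the following is only a generic sketch of the kind of support-query matching it alludes to, not the MPF-Net implementation. It assumes that each scale provides a support feature map and a query feature map of shape (C, H, W), pools the support map into a prototype, uses cosine similarity to reweight query locations, and averages the pooled results across scales. The function names `cosine_map` and `filter_and_aggregate` are hypothetical.

```python
import numpy as np


def cosine_map(prototype, query, eps=1e-8):
    """Cosine similarity between a (C,) prototype and each location of a (C, H, W) map."""
    p = prototype / (np.linalg.norm(prototype) + eps)
    q = query / (np.linalg.norm(query, axis=0, keepdims=True) + eps)
    return np.einsum("c,chw->hw", p, q)


def filter_and_aggregate(support_feats, query_feats):
    """Hypothetical helper: similarity-based filtering per scale, averaged over scales.

    support_feats, query_feats: lists of (C, H, W) arrays, one pair per scale.
    Spatial sizes may differ between scales and between support and query.
    """
    pooled = []
    for support, query in zip(support_feats, query_feats):
        # Global average pooling turns the support map into a class prototype.
        prototype = support.mean(axis=(1, 2))
        # Keep only positively matching locations (assumed filtering rule).
        sim = np.clip(cosine_map(prototype, query), 0.0, None)
        # Reweight query locations by similarity, then pool to a (C,) vector.
        filtered = query * sim[None, :, :]
        pooled.append(filtered.mean(axis=(1, 2)))
    # Aggregate the per-scale descriptors into one representation.
    return np.mean(pooled, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    support = [rng.normal(size=(8, 4, 4)), rng.normal(size=(8, 2, 2))]
    query = [rng.normal(size=(8, 16, 16)), rng.normal(size=(8, 8, 8))]
    print(filter_and_aggregate(support, query).shape)
```

Query locations whose features point away from the support prototype receive zero weight, which is one simple way to read "filtering"; the paper's actual projection and contrast modules may differ substantially.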

Data availability and access

The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Acknowledgements

This research was supported by the National Natural Science Foundation of China (No. 62162008, 62006046, 32125033 and 31960548), the Innovation and Entrepreneurship Project for Overseas Educated Talents in Guizhou Province (2022)-04, the Guizhou Provincial Basic Research Program (ZK[2022]-108), and the Program of Introducing Talents of Discipline to Universities of China (111 Program, D20023).

Author information

Authors and Affiliations

State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, 550025, Guizhou, China
Han Chen, Qi Wang & Xue Wu
School of Physics and Optoelectronic Engineering, Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
Han Chen, Qi Wang, Kailin Xie & Liang Lei

Contributions

All authors contributed to the conceptualization and design of the study. Material preparation, data collection, and analysis were carried out by Han Chen, Kailin Xie, Qi Wang, Xue Wu, and Liang Lei. The corresponding author is Qi Wang. The first draft of the manuscript was written by Han Chen and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Qi Wang or Liang Lei.

Ethics declarations

Ethics approval and informed consent for data used

Informed consent.

Competing interests

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, H., Wang, Q., Xie, K. et al. MPF-Net: multi-projection filtering network for few-shot object detection. Appl Intell 54, 7777–7792 (2024). https://doi.org/10.1007/s10489-024-05556-1

Accepted: 23 May 2024
Published: 14 June 2024
Issue Date: September 2024
DOI: https://doi.org/10.1007/s10489-024-05556-1

Keywords
