Feature enhancement modules applied to a feature pyramid network for object detection

Liu, Min; Lin, Kun; Huo, Wujie; Hu, Lanlan; He, Zhizi

doi:10.1007/s10044-023-01152-0

Theoretical Advances
Published: 16 February 2023

Pattern Analysis and Applications, volume 26, pages 617–629 (2023)

Min Liu¹, Kun Lin¹ (ORCID: orcid.org/0000-0002-5991-083X), Wujie Huo¹, Lanlan Hu¹ & Zhizi He¹

Abstract

A feature pyramid network (FPN) improves the ability of an object detection model to detect multiscale targets. However, the simple upsampling used in an FPN is not conducive to the propagation of deep semantic information, and redundant background information is not conducive to object detection. In this paper, we propose two plug-and-play modules for preexisting FPN-based architectures: a channel filtering module (CFM) and a spatial filtering module (SFM). The CFM learns the correlations between channels to improve the feature maps obtained via upsampling. The SFM introduces global information to improve the detection performance of the network. With the CFM and SFM, we improve the average precision (AP) of Faster R-CNN with an FPN by 0.9% to 1.3% on COCO, and we boost the AP of YOLOv5s with PANet by 2.8%.
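
The channel-filtering idea described above can be illustrated with a small sketch. The block below is not the authors' CFM. It assumes a squeeze-and-excitation style gate (global average pooling, a two-layer bottleneck, a sigmoid) applied to a nearest-neighbour upsampled deep feature map before that map is added to the lateral map, as in a standard FPN top-down step. The function names, the reduction ratio r, and the random weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def upsample_nearest_2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_gate(x, w1, w2):
    """Reweight channels with an SE-style gate (assumed stand-in for a CFM).

    x  : (C, H, W) feature map
    w1 : (C // r, C) bottleneck weights (hypothetical, normally learned)
    w2 : (C, C // r) expansion weights (hypothetical, normally learned)
    """
    squeeze = x.mean(axis=(1, 2))            # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)   # ReLU bottleneck -> (C // r,)
    weights = sigmoid(w2 @ hidden)           # per-channel gate in (0, 1)
    return x * weights[:, None, None], weights


def top_down_merge(deep, lateral, w1, w2):
    """One FPN top-down step with channel gating on the upsampled map."""
    up = upsample_nearest_2x(deep)
    gated, weights = channel_gate(up, w1, w2)
    return lateral + gated, weights


# Toy inputs: 8 channels, deep map 4x4, lateral map 8x8, reduction ratio 4.
rng = np.random.default_rng(0)
C, r = 8, 4
deep = rng.standard_normal((C, 4, 4))
lateral = rng.standard_normal((C, 8, 8))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

merged, weights = top_down_merge(deep, lateral, w1, w2)
print(merged.shape)
```

With all-zero gate weights every channel is scaled by 0.5, so the gate reduces to a uniform scaling of the upsampled map; learned weights are what would let the gate emphasise some channels over others. A spatial counterpart would follow the same pattern with a gate of shape (H, W) instead of (C,).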

Code or data availability

The code and data are available.

Funding

This work was supported in part by a Special Project of the Central Government for Local Science and Technology Development of Hubei Province (No. 2019ZYYD020), the Science and Technology Research Program of the Hubei Provincial Department of Education (No. T201805), and two projects of the Hubei University of Technology Ph.D. Research Startup Fund (No. BSQD2020015 and No. BSQD2020014).

Author information

Authors and Affiliations

School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, 430068, China
Min Liu, Kun Lin, Wujie Huo, Lanlan Hu & Zhizi He

Corresponding author

Correspondence to Kun Lin.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest with any individual or organization.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, M., Lin, K., Huo, W. et al. Feature enhancement modules applied to a feature pyramid network for object detection. Pattern Anal Applic 26, 617–629 (2023). https://doi.org/10.1007/s10044-023-01152-0

Received: 05 August 2022
Accepted: 24 January 2023
Published: 16 February 2023
Issue Date: May 2023
DOI: https://doi.org/10.1007/s10044-023-01152-0
