Attention-based fusion factor in FPN for object detection

Li, Yuancheng; Zhou, Shenglong; Chen, Hui

doi:10.1007/s10489-022-03220-0

Published: 16 March 2022

Volume 52, pages 15547–15556, (2022)

Applied Intelligence

Abstract

Most state-of-the-art detectors use a feature pyramid to detect objects at different scales. The feature pyramid network (FPN) is a representative approach that builds the pyramid by summing multi-scale features. However, existing FPN-based feature extraction networks focus on capturing effective semantic information and ignore how the scale distribution of the dataset influences FPN feature fusion. To address this, we propose a novel attention structure that can be applied to any FPN-based model. Unlike general attention, which derives its attention weights from the feature being attended to, our method uses the lower-layer feature of each adjacent pair of layers to guide the filtering of the upper-layer feature. By taking into account how the feature information of the same sample differs across feature maps, it more effectively filters out upper-layer sample features that are invalid for the lower layer. The method learns the degree to which deep features should participate in shallow-layer learning, so that each FPN layer focuses more on its own learning while features are still transferred effectively. Experimental results show that our method significantly improves the multi-scale object detection performance of the model.
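
The idea of letting the lower layer decide how much of the upper layer enters the fusion can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the channel-wise gate, the global average pooling, the per-channel `weights` and `biases`, and every function name below are assumptions made for illustration. Plain Python lists stand in for tensors to keep the example dependency-free; a real model would use a tensor library and learn the gate parameters.

```python
import math


def global_avg_pool(fmap):
    """Mean of each channel; fmap is a [C][H][W] nested list."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] nested list."""
    out = []
    for ch in fmap:
        rows = []
        for row in ch:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_fusion(lower, upper, weights, biases):
    """Fuse `upper` into `lower` with a gate computed from `lower` only.

    Assumed (hypothetical) form of the fusion factor:
        gate_c  = sigmoid(weights[c] * mean(lower[c]) + biases[c])
        fused_c = lower[c] + gate_c * upsample2x(upper)[c]
    Standard FPN summation corresponds to gate_c = 1 for every channel.
    """
    pooled = global_avg_pool(lower)
    gate = [sigmoid(w * p + b) for w, p, b in zip(weights, pooled, biases)]
    up = upsample2x(upper)
    fused = [
        [[l + g * u for l, u in zip(lrow, urow)] for lrow, urow in zip(lch, uch)]
        for lch, uch, g in zip(lower, up, gate)
    ]
    return fused, gate


# Two channels: the lower map is 2x2, the upper map is 1x1.
lower = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
upper = [[[2.0]], [[2.0]]]
fused, gate = attention_fusion(lower, upper, weights=[0.0, 0.0], biases=[0.0, 0.0])
```

With zero weights and biases every gate equals 0.5, so half of the upsampled upper feature is added to the lower one. Pushing a bias strongly negative drives that channel's gate toward 0 and leaves the lower layer almost untouched, which is the "filtering" behaviour the abstract describes in general terms.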

Acknowledgments

This work was supported in part by the State Grid Jiangxi Information & Telecommunication Company Project under Grant 52183520007V.

Author information

Authors and Affiliations

School of Control and Computer Engineering, North China Electric Power University, Beijing, China
Yuancheng Li, Shenglong Zhou & Hui Chen

Corresponding author

Correspondence to Yuancheng Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, Y., Zhou, S. & Chen, H. Attention-based fusion factor in FPN for object detection. Appl Intell 52, 15547–15556 (2022). https://doi.org/10.1007/s10489-022-03220-0

Accepted: 10 January 2022
Published: 16 March 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s10489-022-03220-0
