DOI: 10.1145/3373509.3373529
research-article

Attention-Based Feature Pyramid Network for Object Detection

Published: 25 March 2020

Abstract

Attention mechanisms and feature pyramids have been widely used across deep learning in recent years. In particular, the Feature Pyramid Network (FPN) has become a popular object detection architecture since it was proposed in 2017, and it is embedded in many well-known networks. However, FPN fuses features in a suboptimal way: it detects small objects on low-level features that have been fused with high-level features containing redundant information. Very few articles discuss how features should be fused. In this paper we therefore propose a novel Attention-based Feature Pyramid Network (AFPN), which not only integrates high-level and low-level feature maps more effectively but also enriches the low-level features with more accurate semantic information. The AFPN consists of two modules: the Feature Fusion Module (FFM) and the Feature Enhance Module (FEM). Because our model is a lightweight, general module, it is end-to-end trainable along with base CNNs. We validate AFPN through extensive experiments on the VOC and COCO detection datasets, and the experiments show consistent improvements in detection performance.
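The abstract does not describe the internals of the FFM or the FEM, so the following is only a rough illustration of the general idea it names: weighting high-level features with an attention signal before they are merged into a low-level map. Everything here is an assumption made for illustration. The function names (`upsample2x`, `channel_gate`, `attention_fuse`) are hypothetical, the gate is a parameter-free stand-in for a learned channel attention in the spirit of Squeeze-and-Excitation, and the convolutions a real FPN applies around the merge are omitted. It is not the authors' module.

```python
import math

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a feature map stored as [C][H][W] lists."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column twice
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row twice
        out.append(rows)
    return out

def channel_gate(fmap):
    """Hypothetical attention gate: sigmoid of each channel's global average.

    A learned module would pass the pooled vector through trainable layers;
    this parameter-free version only shows where the gate enters the fusion.
    """
    gates = []
    for channel in fmap:
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        gates.append(1.0 / (1.0 + math.exp(-mean)))
    return gates

def attention_fuse(low, high):
    """Add the upsampled high-level map, scaled per channel by its gate, to the low-level map."""
    up = upsample2x(high)
    gates = channel_gate(high)
    return [
        [[l + g * u for l, u in zip(low_row, up_row)]
         for low_row, up_row in zip(low_ch, up_ch)]
        for low_ch, up_ch, g in zip(low, up, gates)
    ]

# Toy inputs: one channel, a 2x2 low-level map and a 1x1 high-level map.
low = [[[1.0, 1.0], [1.0, 1.0]]]
high = [[[2.0]]]
fused = attention_fuse(low, high)
```

With every gate fixed at 1, `attention_fuse` reduces to the plain element-wise addition used in the standard FPN top-down pathway, which makes the role of the attention term easy to isolate.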

Cited By

  • (2023) Robust Image Inpainting Forensics by Using an Attention-Based Feature Pyramid Network. Applied Sciences 13:16 (9196). DOI: 10.3390/app13169196. Online publication date: 12-Aug-2023
  • (2023) SAFPN: a full semantic feature pyramid network for object detection. Pattern Analysis and Applications 26:4 (1729-1739). DOI: 10.1007/s10044-023-01200-9. Online publication date: 28-Sep-2023
  • (2021) Integrating Gate and Attention Modules for High-Resolution Image Semantic Segmentation. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 14 (4530-4546). DOI: 10.1109/JSTARS.2021.3071353. Online publication date: 2021


    Published In

    ICCPR '19: Proceedings of the 2019 8th International Conference on Computing and Pattern Recognition
    October 2019
    522 pages
    ISBN:9781450376570
    DOI:10.1145/3373509

    In-Cooperation

    • Hebei University of Technology
    • Beijing University of Technology

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Attention mechanism
    2. FPN
    3. feature fusion
    4. object detection

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICCPR '19

