DOI: 10.1145/3488933.3488939
research-article

A Longitudinal Dense Feature Pyramid Network for Object Detection

Published: 25 February 2022

Abstract

The Feature Pyramid Network (FPN) is one of the most popular subnetworks in current object detection models; it addresses scale variation in object detection through feature fusion. However, merging feature maps from different layers into a new feature map inevitably increases noise and loses information, which leaves the resulting feature pyramid suboptimal. To address these challenges, this paper proposes a new feature fusion network, Longitudinal Dense BiFPN (LD-BiFPN), based on the bi-directional feature pyramid network (BiFPN). LD-BiFPN supplements the missing feature information by reusing existing feature maps, which improves the accuracy of the detector. The focus of this work is to explore a more efficient feature pyramid network that improves object detection accuracy without adding a large number of parameters. Experiments show that an EfficientDet-D0 detector using LD-BiFPN improves Average Precision (AP) by 0.9 points over our re-implemented EfficientDet-D0 detector on the COCO val-2017 benchmark, without bells and whistles.
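
The fusion step the abstract refers to can be illustrated with a minimal sketch. It assumes the fast normalized fusion rule used by BiFPN in EfficientDet, out = sum(w_i * x_i) / (eps + sum(w_i)) with non-negative weights, and it is not the LD-BiFPN connection pattern, which the abstract does not specify. Feature maps are single-channel nested lists for readability, and the names `upsample2x`, `fast_normalized_fusion`, and `p3_reused` are hypothetical.

```python
from typing import List, Sequence

FeatureMap = List[List[float]]  # single channel, H x W (illustration only)


def upsample2x(x: FeatureMap) -> FeatureMap:
    """Nearest-neighbour 2x upsampling so a coarser level matches a finer one."""
    out: FeatureMap = []
    for row in x:
        wide = [v for v in row for _ in range(2)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fast_normalized_fusion(
    inputs: Sequence[FeatureMap],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> FeatureMap:
    """Weighted fusion: sum(w_i * x_i) / (eps + sum(w_i)), with w_i clamped to >= 0."""
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input feature map")
    w = [max(0.0, wi) for wi in weights]  # ReLU keeps the weights non-negative
    norm = eps + sum(w)
    height, width = len(inputs[0]), len(inputs[0][0])
    return [
        [sum(wi * x[r][c] for wi, x in zip(w, inputs)) / norm for c in range(width)]
        for r in range(height)
    ]


# Fuse a fine level with an upsampled coarse level and one extra reused map.
# Treating "reuse of existing feature maps" as a third fusion input is an
# assumption made for illustration, not the paper's confirmed design.
p3_in = [[1.0] * 4 for _ in range(4)]      # fine level, 4 x 4
p4_td = [[1.0, 2.0], [3.0, 4.0]]           # coarse level, 2 x 2
p3_reused = [[2.0] * 4 for _ in range(4)]  # hypothetical reused map, 4 x 4

fused = fast_normalized_fusion(
    [p3_in, upsample2x(p4_td), p3_reused],
    weights=[1.0, 1.0, 0.5],
)
```

In a real detector the weights would be learned per fusion node and the inputs would be multi-channel tensors; adding a reused map as one more input adds only one scalar weight, which is in line with the abstract's aim of keeping the parameter count low.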

        Published In

        AIPR '21: Proceedings of the 2021 4th International Conference on Artificial Intelligence and Pattern Recognition
        September 2021
        715 pages
        ISBN:9781450384087
        DOI:10.1145/3488933
        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Feature Pyramid Network
        2. Longitudinal Dense BiFPN
        3. Object detection
        4. Scale change

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        AIPR 2021
