DOI: 10.1145/3484274.3484284
research-article

An Object Detection Algorithm Combining FPN Structure With DETR

Published: 23 November 2021

ABSTRACT

To address the low detection accuracy of the DETR model on small and medium objects, this paper proposes an object detection algorithm that combines improved feature extraction and an FPN structure with DETR. The method first extracts features from the original image with an improved Darknet53 network. In this step, the 104×104 feature map produced after the first residual block of the second stage is additionally output as a fourth-scale feature map. This map is combined with the feature maps output by the original three stages, giving four feature maps at different scales. Next, an FPN down-samples and up-samples the four feature maps and merges them into a single 52×52 output. The fused feature map is then combined with the positional encoding and fed into the Transformer, and feed-forward networks (FFNs) output the category and position of each predicted object. On the COCO2017 dataset, the method improves accuracy over the other models compared.
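The scale arithmetic in the abstract can be sketched as follows. This is only an illustration, not the authors' code: it assumes a 416×416 input (which is consistent with the 104×104 and 52×52 sizes quoted above) and backbone strides of 4, 8, 16 and 32. The names `feature_map_sizes`, `resample_plan` and the `scale_*` keys are hypothetical.

```python
# Assumption: 416x416 input; the abstract only states the 104x104 and 52x52 sizes.
INPUT_SIZE = 416
TARGET_SIZE = 52  # fused FPN output scale named in the abstract

# Assumed strides of the four backbone outputs (hypothetical key names).
STRIDES = {"scale_104": 4, "scale_52": 8, "scale_26": 16, "scale_13": 32}


def feature_map_sizes(input_size, strides):
    """Spatial side length of each feature map for a square input."""
    return {name: input_size // stride for name, stride in strides.items()}


def resample_plan(sizes, target):
    """Direction and integer factor needed to bring each map to the target size."""
    plan = {}
    for name, size in sizes.items():
        if size > target:
            plan[name] = ("down", size // target)
        elif size < target:
            plan[name] = ("up", target // size)
        else:
            plan[name] = ("keep", 1)
    return plan


sizes = feature_map_sizes(INPUT_SIZE, STRIDES)
plan = resample_plan(sizes, TARGET_SIZE)

# Flattening a 52x52 map, as DETR does with its backbone output,
# gives one token per spatial location.
num_tokens = TARGET_SIZE * TARGET_SIZE

for name in sizes:
    print(name, sizes[name], plan[name])
print("tokens:", num_tokens)
```

Under these assumptions only the 104×104 map needs down-sampling, while the 26×26 and 13×13 maps are up-sampled. The fused map keeps the finer 52×52 resolution, which is the scale the abstract says is passed on to the Transformer.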


      • Published in

        ICCCV '21: Proceedings of the 4th International Conference on Control and Computer Vision
        August 2021
        207 pages
        ISBN:9781450390477
        DOI:10.1145/3484274

        Copyright © 2021 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited
