DOI: 10.1145/3577117.3577135
research-article

Improving the Detection Performance of Sparse R-CNN with Different Necks

Published: 25 February 2023

ABSTRACT

Sparse R-CNN detects objects with a purely sparse method and achieves good results. However, it does not make full use of the features extracted from the image, so its detection performance can be improved further. We therefore propose Sparse R-CNNv1 and Sparse R-CNNv2. In both algorithms, we replace the ResNet backbone of the original Sparse R-CNN with VOVNet equipped with an attention mechanism. In addition, we use two different improved neck networks, Augpan and FPNencoder, which improve detection performance through feature fusion and through enlarging the receptive field of each layer, respectively. Our algorithms are trained and evaluated on COCO2017. The experimental results show that Sparse R-CNNv1 achieves 45.0 AP and Sparse R-CNNv2 achieves 45.3 AP, both higher than the 43.0 AP of the original Sparse R-CNN under the standard 3× training schedule.
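To put the reported numbers side by side, the following minimal sketch computes the absolute and relative AP gain of each variant over the baseline. Only the three AP values come from the abstract; the helper name `ap_gain`, the dictionary layout, and the rounding to one decimal are illustrative choices and not part of the paper.

```python
# AP values as reported in the abstract (COCO2017, standard 3x schedule).
BASELINE_AP = 43.0  # original Sparse R-CNN
REPORTED_AP = {
    "Sparse R-CNNv1": 45.0,
    "Sparse R-CNNv2": 45.3,
}


def ap_gain(ap, baseline=BASELINE_AP):
    """Hypothetical helper: (absolute gain in AP points, relative gain in %)."""
    absolute = ap - baseline
    relative = 100.0 * absolute / baseline
    # Round to one decimal, matching the precision of the reported AP values.
    return round(absolute, 1), round(relative, 1)


gains = {name: ap_gain(ap) for name, ap in REPORTED_AP.items()}

for name, (absolute, relative) in gains.items():
    print(f"{name}: +{absolute} AP ({relative}% relative to baseline)")
```

Under these figures, the two variants gain 2.0 and 2.3 AP points respectively, roughly a 5% relative improvement over the baseline.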

    • Published in

      ICAIP '22: Proceedings of the 6th International Conference on Advances in Image Processing
      November 2022
      202 pages
      ISBN: 9781450397155
      DOI: 10.1145/3577117

      Copyright © 2022 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited