Improving the Detection Performance of Sparse R-CNN with Different Necks

Authors:
Zhaodong Zheng, Harbin Engineering University, China (ORCID: 0000-0003-2716-6310)
Zefeng Zhang, Harbin Engineering University, China (ORCID: 0000-0002-1056-0824)
Miao Fan, Harbin Engineering University, China (ORCID: 0000-0002-1624-5753)
Lilian Huang, Harbin Engineering University, China (ORCID: 0000-0002-3589-285X)

ICAIP '22: Proceedings of the 6th International Conference on Advances in Image Processing, November 2022, Pages 7–12. https://doi.org/10.1145/3577117.3577135

Published: 25 February 2023

ABSTRACT

Sparse R-CNN detects objects with a purely sparse method and achieves good results. However, it does not make full use of the features extracted from the image, so its detection performance can be further improved. We propose Sparse R-CNNv1 and Sparse R-CNNv2. In both algorithms, we replace the ResNet backbone of the original Sparse R-CNN with VoVNet equipped with an attention mechanism. In addition, we use two different improved neck networks, Augpan and FPNencoder, to further improve detection performance from the perspective of feature fusion and of increasing the receptive field of each layer, respectively. Our algorithms are trained and validated on COCO2017. The experimental results show that Sparse R-CNNv1 achieves 45.0 AP and Sparse R-CNNv2 achieves 45.3 AP, both higher than the 43.0 AP of the original Sparse R-CNN under the standard 3× training schedule.

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object detection


Published in

ICAIP '22: Proceedings of the 6th International Conference on Advances in Image Processing
November 2022
202 pages
ISBN:9781450397155
DOI:10.1145/3577117

Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Object detection
backbone
feature fusion
neck
Qualifiers
- research-article
- Research
- Refereed limited