DOI: 10.1145/3573942.3574029
research-article

SN-YOLO: Improved YOLOv5 with Softer-NMS and SIOU for Object Detection

Published: 16 May 2023

Abstract

YOLOv5 is a lightweight object detection network that is popular in industry for its speed and small model size, but its detection accuracy is limited. To address this problem, we propose SN-YOLO, an improved model based on YOLOv5. First, we adopt Softer-NMS as the post-processing method of the model, which makes the predicted bounding boxes more accurate. Second, we improve the loss function of the original algorithm by introducing the SIoU loss, which optimizes the model and improves its accuracy. Finally, to strengthen the feature extraction ability of the backbone, we embed the Convolutional Block Attention Module (CBAM) into the network. We validate the model on the PASCAL VOC 2007 and 2012 datasets. The experimental results show that SN-YOLO improves markedly on the original model in all aspects evaluated, which verifies the effectiveness of the algorithm.
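The abstract does not give implementation details for the post-processing step. As a rough illustration only, the sketch below shows the variance-voting idea commonly associated with Softer-NMS: once a box has been selected, each of its coordinates is replaced by a weighted average of the overlapping candidate boxes, where closer boxes and boxes with lower predicted variance get more weight. The function names `iou` and `variance_vote`, the `(x1, y1, x2, y2)` box layout, and the default `sigma_t=0.02` are assumptions made for this example, not the authors' code.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def variance_vote(selected, boxes, variances, sigma_t=0.02):
    """Refine `selected` using overlapping candidates (hypothetical helper).

    boxes:     candidate boxes, each (x1, y1, x2, y2)
    variances: predicted per-coordinate variances, one 4-tuple per box
    sigma_t:   assumed tunable parameter controlling how quickly the
               weight decays as the overlap with `selected` shrinks
    """
    num = [0.0] * 4
    den = [0.0] * 4
    for box, var in zip(boxes, variances):
        overlap = iou(selected, box)
        if overlap <= 0.0:
            continue  # boxes that do not overlap take no part in the vote
        closeness = math.exp(-((1.0 - overlap) ** 2) / sigma_t)
        for k in range(4):
            weight = closeness / var[k]  # low variance means high confidence
            num[k] += weight * box[k]
            den[k] += weight
    # Fall back to the original coordinate if nothing overlapped.
    return tuple(
        num[k] / den[k] if den[k] > 0 else selected[k] for k in range(4)
    )


if __name__ == "__main__":
    selected = (0.0, 0.0, 10.0, 10.0)
    candidates = [selected, (1.0, 0.0, 11.0, 10.0)]
    predicted_var = [(1.0,) * 4, (0.01,) * 4]
    print(variance_vote(selected, candidates, predicted_var))
```

In a full pipeline this refinement would run inside the suppression loop, after the highest-scoring box is picked and before overlapping boxes are suppressed or down-weighted; the predicted variances would come from an extra regression output of the detector.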


Cited By

  • (2024) Traffic Sign Detection Algorithm Based on Improved YOLOv5. 2024 17th International Conference on Advanced Computer Theory and Engineering (ICACTE), 150-158. DOI: 10.1109/ICACTE62428.2024.10871684. Online publication date: 13-Sep-2024
  • (2024) Rapid Detection of PCB Defects Based on YOLOx-Plus and FPGA. IEEE Access 12, 61343-61358. DOI: 10.1109/ACCESS.2024.3387947. Online publication date: 2024
  • (2024) Lightweight coal and gangue detection algorithm based on improved Yolov7-tiny. International Journal of Coal Preparation and Utilization 44:11, 1773-1792. DOI: 10.1080/19392699.2023.2301304. Online publication date: 4-Jan-2024

    Published In

    AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition
    September 2022
    1221 pages
    ISBN:9781450396899
    DOI:10.1145/3573942
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Attention mechanism
    2. Deep learning
    3. Loss function
    4. Object detection
    5. YOLOv5

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Science Research Plan of the Shaanxi Provincial Department of Education

    Conference

    AIPR 2022
