DOI: 10.1145/3650215.3650367
research-article

MDCYOLO: Improved YOLOv5 Algorithm with Modified Deformable Convolution

Published: 16 April 2024

Abstract

Convolution is a fundamental operation for feature extraction in deep neural networks. However, traditional convolution uses fixed-shape kernels, so every point on the output feature map has a fixed receptive field, which limits adaptability to irregularly shaped objects. Because mainstream deep learning models rely on convolution as their basic feature extractor, they also perform poorly on data sets whose objects differ widely in shape and size. The Deformable Convolutional Networks (DCN) line of work computes an offset for each feature map point that participates in the convolution, thereby changing the shape of the receptive field. However, DCN does not consider adjusting the feature map weights. We therefore propose a modified deformable convolution (MDC) that builds on Deformable ConvNets v2 (DCNv2) and adds a mask to adjust the feature map weights, so that the shape and the weight of the feature map participating in the convolution are adjusted simultaneously. We integrate MDC into YOLOv5 and name the improved model MDCYOLO. Experimental results show that the detection accuracy of MDC is significantly higher than that of DCNv2, with accuracy ultimately increasing by 1.9% on the Pascal VOC data set and 3.1% on the COCO data set.
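To make the idea of adjusting both the sampling shape and the sampling weight concrete, the following is a minimal pure-Python sketch of a generic mask-modulated deformable convolution evaluated at a single output location, y(p0) = Σ_k w_k · m_k · x(p0 + p_k + Δp_k). It illustrates the general technique only and is not the authors' MDC implementation: the function names `bilinear` and `modulated_deform_conv_at`, the 3x3 kernel, and the zero padding outside the feature map are all assumptions made for illustration. In a real network the offsets and masks would be predicted by additional learned layers rather than passed in by hand.

```python
import math


def bilinear(x, r, c):
    """Sample the 2-D list x at fractional position (r, c).

    Positions outside the feature map contribute 0 (assumed zero padding).
    """
    h, w = len(x), len(x[0])
    r0, c0 = math.floor(r), math.floor(c)
    total = 0.0
    for rr in (r0, r0 + 1):
        for cc in (c0, c0 + 1):
            if 0 <= rr < h and 0 <= cc < w:
                # Bilinear weight falls off linearly with distance.
                weight = (1 - abs(r - rr)) * (1 - abs(c - cc))
                total += weight * x[rr][cc]
    return total


def modulated_deform_conv_at(x, weights, offsets, masks, r, c):
    """Hypothetical 3x3 mask-modulated deformable convolution at (r, c).

    weights: 9 kernel weights in row-major order
    offsets: 9 (d_row, d_col) pairs that shift each sampling point
    masks:   9 scalars that rescale each sampled value
    """
    grid = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    out = 0.0
    for k, (g_r, g_c) in enumerate(grid):
        d_r, d_c = offsets[k]
        sample = bilinear(x, r + g_r + d_r, c + g_c + d_c)
        out += weights[k] * masks[k] * sample
    return out
```

With all offsets set to zero and all masks set to one, the function reduces to an ordinary 3x3 convolution at that location; nonzero offsets move the sampling points (shape), and masks below one damp individual samples (weight).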


    Published In

    ICMLCA '23: Proceedings of the 2023 4th International Conference on Machine Learning and Computer Application
    October 2023
    1065 pages
    ISBN:9798400709449
    DOI:10.1145/3650215
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • Research-article
    • Research
    • Refereed limited
