DDH-YOLOv5: improved YOLOv5 based on Double IoU-aware Decoupled Head for object detection

Wang, Hui; Jin, Yang; Ke, Hongchang; Zhang, Xinping

doi:10.1007/s11554-022-01241-z


Original Research Paper
Published: 13 August 2022

Volume 19, pages 1023–1033, (2022)

Journal of Real-Time Image Processing

3518 Accesses
34 Citations

Abstract

YOLOv5 is a high-performance real-time object detector and an important member of the one-stage detector family. However, the design of the YOLOv5 head has two problems: the branch shared by the classification and regression tasks hurts the training process, and the correlation between classification score and localization accuracy is low. We propose a Double IoU-aware Decoupled Head (DDH) and apply it to YOLOv5. The improved model, named DDH-YOLOv5, substantially improves localization accuracy without significantly increasing FLOPs or parameters. Extensive experiments on the PASCAL VOC2007 dataset show that DDH-YOLOv5 performs well. Compared with YOLOv5, the DDH-YOLOv5m and DDH-YOLOv5l models proposed in this paper improve Average Precision (AP) by 2.4% and 1.3%, respectively. Compared with Deformable DETR, which is known for its fast convergence, DDH-YOLOv5 outperforms Deformable DETR on COCO2017 val with half the FLOPs and only a quarter of the training epochs.
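
The abstract does not spell out how DDH couples classification score with localization accuracy, so the following is only a minimal sketch of the general IoU-aware idea, not the authors' head. It assumes a detector that outputs a predicted IoU alongside each classification score and blends the two geometrically; the function names and the weight `alpha` are hypothetical and chosen for illustration.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_aware_score(cls_score, pred_iou, alpha=0.5):
    """Blend classification confidence with predicted localization quality.

    alpha is a hypothetical weight: 0 keeps the plain classification
    score, 1 ranks detections by predicted IoU alone.
    """
    return cls_score ** (1.0 - alpha) * pred_iou ** alpha


# Two hypothetical detections: (classification score, predicted IoU).
confident_but_loose = (0.9, 0.4)
modest_but_tight = (0.7, 0.9)

# Ranking by classification score alone prefers the loosely localized box;
# the blended score prefers the tightly localized one.
loose = iou_aware_score(*confident_but_loose)
tight = iou_aware_score(*modest_but_tight)
```

With a blended score of this kind, a well-localized box can outrank a confidently classified but poorly localized one during non-maximum suppression, which is the low-correlation problem the abstract describes.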



Acknowledgements

This work was supported by the Jilin Province Science and Technology Department Science and Technology Development Planning Project of China (YDZJ202201ZYTS556), and the Jilin Province Education Department Scientific Research Planning Project of China (JJKH20210753KJ).

Author information

Authors and Affiliations

College of Computer Science and Engineering, Changchun University of Technology, Changchun, 130051, Jilin, China
Hui Wang, Yang Jin & Xinping Zhang
School of Computer Technology and Engineering, Changchun Institute of Technology, Changchun, 130012, Jilin, China
Hongchang Ke

Corresponding author

Correspondence to Hongchang Ke.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Wang, H., Jin, Y., Ke, H. et al. DDH-YOLOv5: improved YOLOv5 based on Double IoU-aware Decoupled Head for object detection. J Real-Time Image Proc 19, 1023–1033 (2022). https://doi.org/10.1007/s11554-022-01241-z


Received: 02 May 2022
Accepted: 04 August 2022
Published: 13 August 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s11554-022-01241-z
