A modified YOLOv4 detection method for a vision-based underwater garbage cleaning robot

Tian, Manjun; Li, Xiali; Kong, Shihan; Wu, Licheng; Yu, Junzhi

doi:10.1631/FITEE.2100473

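Chinese title (translated): A vision-based detection algorithm for an underwater garbage cleaning robot based on an improved YOLOv4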

Research Article
Published: 24 August 2022

Volume 23, pages 1217–1228 (2022)

Frontiers of Information Technology & Electronic Engineering

Manjun Tian (田满军)^1,2,
Xiali Li (李霞丽)^2,
Shihan Kong (孔诗涵)^3,
Licheng Wu (吴立成)^2 &
Junzhi Yu (喻俊志)^3,4 (ORCID: orcid.org/0000-0002-6347-572X)

Abstract

To tackle aquatic pollution, our laboratory has developed a vision-based autonomous underwater garbage cleaning robot. We propose a garbage detection method based on a modified YOLOv4 that achieves both high-speed and high-precision object detection. Specifically, YOLOv4 is chosen as the base neural network framework for object detection. To further improve detection accuracy, YOLOv4 is extended to a four-scale detection method. To improve detection speed, model pruning is applied to the new model. With the improved detection method, the robot can collect garbage autonomously. Experimental results show a detection speed of up to 66.67 frames/s with a mean average precision (mAP) of 95.099%, indicating that the improved YOLOv4 performs well in both speed and accuracy.
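The effect of moving from three to four detection scales can be illustrated with simple grid arithmetic. The sketch below is only an illustration under stated assumptions: the abstract does not give the input resolution or the stride of the added scale, so the 416-pixel input and the extra stride of 4 are hypothetical. The strides 8, 16, and 32 are those of the standard YOLOv4 heads.

```python
# Illustrative sketch only. INPUT_SIZE and the extra stride of 4 are
# assumptions for this example, not values stated in the abstract.

def grid_sizes(input_size, strides):
    """Return the side length of the square detection grid at each stride."""
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError(
                f"input size {input_size} is not divisible by stride {stride}"
            )
    return [input_size // stride for stride in strides]


def total_cells(input_size, strides):
    """Return the total number of grid cells summed over all scales."""
    return sum(side * side for side in grid_sizes(input_size, strides))


INPUT_SIZE = 416                  # assumed input resolution (hypothetical)
THREE_SCALE = [8, 16, 32]         # strides of the standard YOLOv4 heads
FOUR_SCALE = [4, 8, 16, 32]       # assumed: one extra, finer scale

print(grid_sizes(INPUT_SIZE, THREE_SCALE))  # [52, 26, 13]
print(grid_sizes(INPUT_SIZE, FOUR_SCALE))   # [104, 52, 26, 13]
print(total_cells(INPUT_SIZE, THREE_SCALE), total_cells(INPUT_SIZE, FOUR_SCALE))
```

Under these assumptions the finer grid adds many more candidate cells, which is one reason an extra scale can help with small objects while increasing computation. That extra cost is consistent with the abstract pairing the four-scale change with pruning.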

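Abstract (translated from the Chinese)

To address water pollution, and building on a vision-based autonomous underwater garbage cleaning robot, this paper proposes a garbage detection method based on an improved YOLOv4 that achieves high-speed, high-precision object detection. Specifically, the YOLOv4 algorithm is chosen as the basic neural network framework for object detection. To further improve detection accuracy, the conventional YOLOv4 is modified into a four-scale detection algorithm; to improve detection speed, model pruning is applied to the new model. The proposed method is also deployed on the underwater robot, enabling autonomous garbage collection. The detection speed reaches 66.67 frames/s and the mean average precision reaches 95.099%; experimental results show that the improved YOLOv4 algorithm performs well in both detection speed and accuracy.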
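The abstract does not say which pruning criterion is applied to the four-scale model. As a minimal sketch of one common approach, the code below assumes channel pruning by the magnitude of batch-normalization scale factors with a single global threshold. The function name `select_channels`, the example values, and the 50% ratio are all hypothetical and are not taken from the paper.

```python
# Minimal sketch of channel selection for pruning. The criterion (magnitude of
# batch-norm scale factors with a global threshold) is an assumption; the
# abstract does not state how the model is pruned.

def select_channels(gammas_per_layer, prune_ratio):
    """Return, for each layer, the indices of the channels to keep.

    gammas_per_layer: list of lists of batch-norm scale factors, one per layer.
    prune_ratio: fraction of all channels to remove, in [0, 1).
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")

    # Rank every channel in the network by |gamma| and pick a global cut-off.
    magnitudes = sorted(abs(g) for layer in gammas_per_layer for g in layer)
    n_prune = int(len(magnitudes) * prune_ratio)
    threshold = magnitudes[n_prune - 1] if n_prune > 0 else float("-inf")

    kept = []
    for layer in gammas_per_layer:
        if not layer:
            kept.append([])
            continue
        # Channels tied with the threshold are pruned as well.
        keep = [i for i, g in enumerate(layer) if abs(g) > threshold]
        if not keep:
            # Keep the strongest channel so the layer is never removed entirely.
            keep = [max(range(len(layer)), key=lambda i: abs(layer[i]))]
        kept.append(keep)
    return kept


# Hypothetical scale factors for three small layers.
example = [[0.9, 0.01, 0.5, 0.02], [0.03, 0.04], [0.7, 0.6, 0.05, 0.8]]
print(select_channels(example, 0.5))  # [[0, 2], [1], [0, 1, 3]]
```

In a real pipeline the kept indices would be used to rebuild thinner convolution layers, usually followed by fine-tuning to recover accuracy. The sketch stops at channel selection because the rest depends on framework details not given here.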

Author information

Authors and Affiliations

First Research Institute of the Ministry of Public Security of PRC, Beijing, 100048, China
Manjun Tian (田满军)
School of Information Engineering, Minzu University of China, Beijing, 100081, China
Manjun Tian (田满军), Xiali Li (李霞丽) & Licheng Wu (吴立成)
Department of Advanced Manufacturing and Robotics, College of Engineering, Peking University, Beijing, 100871, China
Shihan Kong (孔诗涵) & Junzhi Yu (喻俊志)
State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Junzhi Yu (喻俊志)

Contributions

Manjun TIAN designed the research. Manjun TIAN, Xiali LI, and Shihan KONG proposed the methods. Manjun TIAN and Shihan KONG conducted the experiments. Licheng WU and Junzhi YU processed the data. Manjun TIAN and Shihan KONG drafted the paper. Xiali LI, Licheng WU, and Junzhi YU helped organize the paper. Shihan KONG and Junzhi YU revised and finalized the paper.

Corresponding author

Correspondence to Junzhi Yu (喻俊志).

Ethics declarations

Manjun TIAN, Xiali LI, Shihan KONG, Licheng WU, and Junzhi YU declare that they have no conflict of interest.

Additional information

Project supported by the National Natural Science Foundation of China (Nos. 61725305, U1909206, T2121002, and 62073196), the Postdoctoral Innovative Talent Support Program (No. BX2021010), and the S&T Program of Hebei Province, China (No. F2020203037).

About this article

Cite this article

Tian, M., Li, X., Kong, S. et al. A modified YOLOv4 detection method for a vision-based underwater garbage cleaning robot. Front Inform Technol Electron Eng 23, 1217–1228 (2022). https://doi.org/10.1631/FITEE.2100473

Received: 01 October 2021
Accepted: 07 March 2022
Published: 24 August 2022
Issue Date: August 2022
DOI: https://doi.org/10.1631/FITEE.2100473

CLC number

TP242
