Detection and localization for lake floating objects based on CA-faster R-CNN

Yi, Zeren; Yao, Dongyi; Li, Guojin; Ai, Jiaoyan; Xie, Wei

doi:10.1007/s11042-022-12686-6

Published: 05 March 2022

Multimedia Tools and Applications, volume 81, pages 17263–17281 (2022)

Zeren Yi¹,², Dongyi Yao¹, Guojin Li¹ (ORCID: orcid.org/0000-0001-7464-8561), Jiaoyan Ai¹ & Wei Xie²

Abstract

Unmanned ships are gradually replacing human workers in lake cleaning. To work properly, these ships must detect and localize the floating objects they are meant to collect. Compared with objects in conventional image datasets, lake floating objects are small and therefore hard to detect. Moreover, most conventional algorithms rely on bounding boxes to detect an object, so it is hard to obtain the accurate location of a floating object from their results. To this end, this paper proposes a detection and localization algorithm based on CA-Faster R-CNN (Class Activation-Faster Regions with Convolutional Neural Network). Specifically, for an image containing objects, the proposed algorithm detects and classifies the objects with Faster R-CNN and localizes them with the CA network. Experimental results show that, compared with the Faster R-CNN algorithm, the proposed algorithm reduces the positioning error without affecting recognition accuracy, and can therefore be used for the detection and localization of floating objects on the water surface. Compared with Faster R-CNN, the positioning accuracy of CA-Faster R-CNN is improved by 6.29 pixels. The proposed algorithm also has great potential for other objects that share similar challenges with lake floating objects.
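The abstract does not say how the CA network turns a class activation map into a position, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes an activation map is already available as a 2D array and takes the activation-weighted centroid inside a detected box as the object location, then compares it with the plain box center using a Euclidean pixel error. The function names `box_center`, `activation_centroid`, and `pixel_error`, and the toy data, are hypothetical.

```python
import numpy as np


def box_center(box):
    """Center (x, y) of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return np.array([(x_min + x_max) / 2.0, (y_min + y_max) / 2.0])


def activation_centroid(activation, box):
    """Activation-weighted centroid (x, y) inside the box, in image coordinates.

    Assumption: `activation` is a 2D class activation map indexed [row, col],
    and the box bounds are integer pixel indices with exclusive upper bounds.
    """
    x_min, y_min, x_max, y_max = box
    # Negative activations are clipped so they cannot pull the centroid away.
    patch = np.clip(activation[y_min:y_max, x_min:x_max], 0.0, None)
    total = patch.sum()
    if total == 0:
        # No activation inside the box: fall back to the box center.
        return box_center(box)
    ys, xs = np.mgrid[y_min:y_max, x_min:x_max]
    return np.array([(xs * patch).sum() / total, (ys * patch).sum() / total])


def pixel_error(predicted, truth):
    """Euclidean distance in pixels between two (x, y) positions."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.linalg.norm(diff))


# Toy data: a small object at (x=30, y=40) inside a loose 60x60 box.
heat = np.zeros((100, 100))
heat[40, 30] = 1.0
box = (20, 20, 80, 80)
truth = (30, 40)

error_box = pixel_error(box_center(box), truth)
error_cam = pixel_error(activation_centroid(heat, box), truth)
print(f"box-center error: {error_box:.2f} px, centroid error: {error_cam:.2f} px")
```

In this toy case the box center sits far from the small object while the activation centroid lands on it, which is the kind of gap a pixel-level positioning error is meant to capture. The numbers here are illustrative only and are unrelated to the 6.29-pixel result reported in the abstract.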

Data availability

All images used in this article can be provided by the corresponding author.

Code availability

Available upon request.

Funding

This work was supported in part by the Guangxi Innovation-driven Development Special Project of China under grant no. AA17202032-2, in part by the Key-Area Research and Development Program of Guangdong Province of China under grant no. 2018B010108001, and in part by the Key-Area Research and Development Program of Foshan City under grant no. 2020001006812.

Author information

Authors and Affiliations

School of Electrical Engineering, Guangxi University, Nanning, 530004, China
Zeren Yi, Dongyi Yao, Guojin Li & Jiaoyan Ai
School of Automation Science and Engineering, South China University of Technology, Guangzhou, 510641, China
Zeren Yi & Wei Xie

Corresponding author

Correspondence to Guojin Li.

Ethics declarations

Conflict of interest

None.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yi, Z., Yao, D., Li, G. et al. Detection and localization for lake floating objects based on CA-faster R-CNN. Multimed Tools Appl 81, 17263–17281 (2022). https://doi.org/10.1007/s11042-022-12686-6

Received: 11 October 2020
Revised: 03 March 2021
Accepted: 21 February 2022
Published: 05 March 2022
Issue Date: May 2022
DOI: https://doi.org/10.1007/s11042-022-12686-6
