Abstract
Unmanned ships are gradually replacing humans in cleaning lakes. To work properly, these ships must detect and localize the floating objects that need to be collected. Compared with the objects in conventional image datasets, lake floating objects are small and therefore hard to detect. Moreover, because most conventional algorithms rely on bounding boxes to detect objects, it is hard to obtain the accurate location of a floating object from their results. To this end, this paper proposes a detection and localization algorithm based on CA-Faster R-CNN (Class Activation-Faster Regions with Convolutional Neural Network). Specifically, for an image containing objects, the proposed algorithm detects and classifies the objects with Faster R-CNN and localizes them with a CA network. The experimental results show that, compared with the Faster R-CNN algorithm, the proposed algorithm reduces the positioning error by 6.29 pixels without affecting recognition accuracy, and can therefore be used for the detection and localization of floating objects on the water surface. The proposed algorithm also has great potential for other objects that share similar challenges with lake floating objects.
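The abstract reports the improvement as a positioning error in pixels but does not define the metric here. The sketch below illustrates one plausible reading. It assumes the error is the Euclidean distance between a predicted point and a ground-truth point, and that a class activation map can be reduced to a single location by taking its activation-weighted centroid inside the detected box. Both are assumptions made for illustration, not the authors' stated method. The function names `box_center`, `activation_centroid`, and `positioning_error` are hypothetical.

```python
import numpy as np


def box_center(box):
    """Center (x, y) of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return np.array([(x_min + x_max) / 2.0, (y_min + y_max) / 2.0])


def activation_centroid(cam, box):
    """Activation-weighted centroid (x, y) of `cam` restricted to `box`.

    `cam` is a 2-D, non-negative class activation map at image resolution
    (an assumption of this sketch). Falls back to the box center when the
    map is all zero inside the box.
    """
    x_min, y_min, x_max, y_max = (int(v) for v in box)
    patch = cam[y_min:y_max, x_min:x_max]
    total = patch.sum()
    if total <= 0:
        return box_center(box)
    # Pixel coordinate grids covering the same region as `patch`.
    ys, xs = np.mgrid[y_min:y_max, x_min:x_max]
    return np.array([(xs * patch).sum() / total, (ys * patch).sum() / total])


def positioning_error(pred_xy, true_xy):
    """Euclidean distance in pixels between predicted and true points."""
    return float(np.linalg.norm(np.asarray(pred_xy) - np.asarray(true_xy)))


# Synthetic example: a small object sitting off-center in a loose box.
cam = np.zeros((100, 100))
cam[60:70, 20:30] = 1.0            # activation concentrated on the object
box = (10, 40, 50, 80)             # detector box, much larger than the object
true_xy = (24.5, 64.5)             # true object center in this toy image

err_box = positioning_error(box_center(box), true_xy)
err_cam = positioning_error(activation_centroid(cam, box), true_xy)
```

In this toy case the box center lies about 7.1 pixels from the object, while the activation centroid lands on it. This is the kind of gap a class-activation localizer could close for small objects inside loose bounding boxes.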
Data availability
All images used in this article can be provided by the corresponding author.
Code availability
Available upon request.
Funding
This work was supported in part by the Guangxi Innovation-driven Development Special Project of China under grant no. AA17202032-2, in part by the Key-Area Research and Development Program of Guangdong Province of China under grant no. 2018B010108001, and in part by the Key-Area Research and Development Program of Foshan City under grant no. 2020001006812.
Ethics declarations
Conflict of interest
None.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Yi, Z., Yao, D., Li, G. et al. Detection and localization for lake floating objects based on CA-faster R-CNN. Multimed Tools Appl 81, 17263–17281 (2022). https://doi.org/10.1007/s11042-022-12686-6
DOI: https://doi.org/10.1007/s11042-022-12686-6