Detection and localization for lake floating objects based on CA-Faster R-CNN

Multimedia Tools and Applications

Abstract

Unmanned ships are gradually replacing humans as lake cleaners. To work properly, these ships must detect and localize the floating objects they are meant to collect. Compared with the objects in conventional images, lake floating objects are small and therefore hard to detect. Moreover, because most conventional algorithms rely on bounding boxes to detect objects, it is hard to obtain the accurate location of a floating object from their results. To this end, this paper proposes a detection and localization algorithm based on CA-Faster R-CNN (Class Activation-Faster Regions with Convolutional Neural Network). Specifically, for an image containing objects, the proposed algorithm detects and classifies the objects with Faster R-CNN and localizes them with a CA network. The experimental results show that, compared with the Faster R-CNN algorithm, the proposed algorithm reduces the positioning error without affecting the recognition accuracy, and can therefore be used for the detection and localization of floating objects on the water surface. Compared with Faster R-CNN, CA-Faster R-CNN improves the positioning accuracy by 6.29 pixels. The proposed algorithm also shows great potential for other objects that share similar challenges with lake floating objects.
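The abstract does not say how the CA network turns its output into a position, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a class activation map is already available as a 2D grid of scores, takes the activation-weighted centroid inside a detected bounding box as the object location, and compares that with the plain box center using a pixel-distance positioning error. All function names and the toy data are hypothetical.

```python
import math


def box_center(box):
    """Center (x, y) of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def activation_centroid(activation, box):
    """Activation-weighted centroid (x, y) of the pixels inside the box.

    `activation` is a 2D list indexed as activation[y][x], and the box
    bounds are inclusive pixel indices. Negative scores are clipped to
    zero. If nothing inside the box is active, fall back to the box center.
    """
    x_min, y_min, x_max, y_max = box
    total = sum_x = sum_y = 0.0
    for y in range(y_min, y_max + 1):
        for x in range(x_min, x_max + 1):
            weight = max(activation[y][x], 0.0)
            total += weight
            sum_x += weight * x
            sum_y += weight * y
    if total == 0.0:
        return box_center(box)
    return (sum_x / total, sum_y / total)


def positioning_error(predicted, truth):
    """Euclidean distance in pixels between two (x, y) points."""
    return math.hypot(predicted[0] - truth[0], predicted[1] - truth[1])


# Toy 6x6 activation map (assumed data): the object sits off-center
# inside a loose bounding box that covers the whole map.
activation = [[0.0] * 6 for _ in range(6)]
activation[3][4] = 3.0
activation[3][5] = 1.0
box = (0, 0, 5, 5)
truth = (4.0, 3.0)

center_error = positioning_error(box_center(box), truth)
centroid_error = positioning_error(activation_centroid(activation, box), truth)
print(center_error, centroid_error)
```

In this toy case the box center lands well away from the object because the box is loose, while the weighted centroid stays close to the active pixels. That is the kind of gap a pixel-level positioning error is meant to expose.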

Data availability

All images used in this article can be provided by the corresponding author.

Code availability

Available upon request.

Funding

This work was supported in part by the Guangxi Innovation-driven Development Special Project of China under grant no. AA17202032-2, in part by the Key-Area Research and Development Program of Guangdong Province of China under grant no. 2018B010108001, and in part by the Key-Area Research and Development Program of Foshan City under grant no. 2020001006812.

Author information

Corresponding author

Correspondence to Guojin Li.

Ethics declarations

Conflict of interest

None.

About this article

Cite this article

Yi, Z., Yao, D., Li, G. et al. Detection and localization for lake floating objects based on CA-faster R-CNN. Multimed Tools Appl 81, 17263–17281 (2022). https://doi.org/10.1007/s11042-022-12686-6

  • DOI: https://doi.org/10.1007/s11042-022-12686-6
