Coordinate Attention and Transformer Neck-Based Marine Organism Detection

  • Conference paper
Sensor Systems and Software (S-Cube 2022)

Abstract

Marine organism detection is crucial for the intelligent construction of open-sea farms. To cope with underwater environments characterized by low contrast, color deviation, and blurred details, a coordinate attention and transformer neck-based benthonic organism detection (CATNBOD) scheme is devised. The main contributions are as follows: 1) A coordinate attention (CA) module is designed into the feature extraction network to obtain meaningful features, so that small-scale benthonic organisms can be accurately detected. 2) To efficiently address the challenge arising from intra- and inter-class occlusions of benthonic organisms, a rotation window-based Swin Transformer (ST) module is devised in the neck structure. Combining the CA and ST modules yields the proposed CATNBOD scheme. Its effectiveness and superiority are demonstrated on the publicly available UDD dataset.
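
As a rough illustration of the coordinate attention idea named in the abstract, the sketch below pools a feature map separately along its width and its height, turns each directional descriptor into a gate, and reweights every position by the product of its row gate and column gate. This is a simplified, parameter-free sketch and not the authors' implementation: the function name `coordinate_attention`, the nested-list `[C][H][W]` layout, and the omission of the learned 1x1 convolutions that a real CA block would apply before the sigmoid are all assumptions made for brevity.

```python
import math


def sigmoid(x):
    """Logistic function used to squash descriptors into (0, 1) gates."""
    return 1.0 / (1.0 + math.exp(-x))


def coordinate_attention(feat):
    """Reweight a feature map with direction-aware gates.

    feat: nested list shaped [C][H][W].
    Returns a new nested list of the same shape.
    """
    out = []
    for channel in feat:
        h = len(channel)
        w = len(channel[0])
        # Average over the width: one descriptor per row (encodes vertical position).
        row_desc = [sum(row) / w for row in channel]
        # Average over the height: one descriptor per column (encodes horizontal position).
        col_desc = [sum(channel[i][j] for i in range(h)) / h for j in range(w)]
        # ASSUMPTION: a trained CA block would mix channels with learned 1x1
        # convolutions here; this sketch applies the sigmoid directly instead.
        row_gate = [sigmoid(v) for v in row_desc]
        col_gate = [sigmoid(v) for v in col_desc]
        # Each position is scaled by the gate of its row and the gate of its column.
        out.append([[channel[i][j] * row_gate[i] * col_gate[j] for j in range(w)]
                    for i in range(h)])
    return out


# Hypothetical 1-channel, 3x3 map with a single small "object" at the center.
feat = [[[0.0, 0.0, 0.0],
         [0.0, 4.0, 0.0],
         [0.0, 0.0, 0.0]]]
out = coordinate_attention(feat)
```

Because the gates are indexed by row and by column rather than collapsed into one value per channel, the response keeps positional information, which is the property the abstract relies on for small-scale targets.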

This work is supported by the National Natural Science Foundation of China (Grant 52271306), Innovative Research Foundation of Ship General Performance (Grant 31422120), and the Cultivation Program for the Excellent Doctoral Dissertation of Dalian Maritime University (Grant 2022YBPY004).

Author information

Corresponding author

Correspondence to Ning Wang.

Copyright information

© 2023 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Kong, X., Wang, N., Chen, T., Chen, Y. (2023). Coordinate Attention and Transformer Neck-Based Marine Organism Detection. In: Karimi, H.R., Wang, N. (eds) Sensor Systems and Software. S-Cube 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 487. Springer, Cham. https://doi.org/10.1007/978-3-031-34899-0_3

  • DOI: https://doi.org/10.1007/978-3-031-34899-0_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34898-3

  • Online ISBN: 978-3-031-34899-0

  • eBook Packages: Computer Science, Computer Science (R0)
