Abstract
Marine organism detection is crucial for the intelligent construction of open-sea farms. To cope with the low contrast, color deviation, and blurred detail of underwater imagery, a coordinate attention and transformer neck-based benthonic organism detection (CATNBOD) scheme is devised. The main contributions are as follows: 1) A coordinate attention (CA) module is designed into the feature extraction network to obtain meaningful features, so that small-scale benthonic organisms can be accurately detected. 2) To efficiently address the challenge posed by intra- and inter-class occlusions of benthonic organisms, a rotation window-based swin transformer (ST) module is devised in the neck structure. Together, the CA and ST modules constitute the proposed CATNBOD scheme. Its effectiveness and superiority are demonstrated on the publicly available UDD dataset.
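The abstract does not give the internals of the CA module, so the following is only a minimal sketch of the general coordinate attention idea: pool the feature map along each spatial direction, mix channels with a shared reduction, and re-weight every position with one height gate and one width gate. The function names, the weight arguments (stand-ins for learned 1x1 convolutions), the use of ReLU, and the omission of batch normalization are all assumptions made for illustration, not the authors' implementation.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def conv1x1(vectors, weight):
    """Channel mixing: apply weight [C_out][C_in] to each channel vector."""
    return [[sum(w * v for w, v in zip(row, vec)) for row in weight]
            for vec in vectors]


def coordinate_attention(x, w_reduce, w_h, w_w):
    """Re-weight a feature map x of shape [C][H][W] (nested lists).

    w_reduce: [M][C] shared reduction weights (hypothetical placeholder)
    w_h, w_w: [C][M] expansion weights for the height and width gates
              (hypothetical placeholders)
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])

    # Directional average pooling: one channel vector per row and per column.
    pooled_h = [[sum(x[c][i]) / W for c in range(C)] for i in range(H)]
    pooled_w = [[sum(x[c][i][j] for i in range(H)) / H for c in range(C)]
                for j in range(W)]

    # Shared channel reduction over the concatenated H + W positions,
    # followed by ReLU (assumed nonlinearity; normalization omitted).
    mixed = [[max(0.0, v) for v in vec]
             for vec in conv1x1(pooled_h + pooled_w, w_reduce)]

    # Split back into the two directions and expand to C channels.
    gate_h = [[sigmoid(v) for v in vec] for vec in conv1x1(mixed[:H], w_h)]
    gate_w = [[sigmoid(v) for v in vec] for vec in conv1x1(mixed[H:], w_w)]

    # Each position (i, j) is scaled by its row gate and its column gate.
    return [[[x[c][i][j] * gate_h[i][c] * gate_w[j][c] for j in range(W)]
             for i in range(H)]
            for c in range(C)]
```

Because the two gates are indexed by row and by column separately, the re-weighting keeps positional information along both axes, which is the property usually cited as helpful for small objects. With all weights set to zero, every gate equals 0.5 and the output is simply the input scaled by 0.25, which gives a quick sanity check.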
This work is supported by the National Natural Science Foundation of China (Grant 52271306), the Innovative Research Foundation of Ship General Performance (Grant 31422120), and the Cultivation Program for the Excellent Doctoral Dissertation of Dalian Maritime University (Grant 2022YBPY004).
Copyright information
© 2023 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Kong, X., Wang, N., Chen, T., Chen, Y. (2023). Coordinate Attention and Transformer Neck-Based Marine Organism Detection. In: Karimi, H.R., Wang, N. (eds) Sensor Systems and Software. S-Cube 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 487. Springer, Cham. https://doi.org/10.1007/978-3-031-34899-0_3
DOI: https://doi.org/10.1007/978-3-031-34899-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34898-3
Online ISBN: 978-3-031-34899-0
eBook Packages: Computer Science, Computer Science (R0)