Abstract
Marine target detection is challenging because degraded underwater images make targets unclear. Moreover, marine targets are small and tend to cluster together, so popular object detection methods perform poorly on them. This paper therefore proposes a novel multiple attentional path aggregation network, named APAN, to improve marine object detection. First, we design a path aggregation network structure that brings features from the backbone network into the bottom-up path augmentation. Each feature map is enhanced by the lower layer through the bottom-up downsampling pathway and incorporates features from the top-down upsampling layers. In particular, the last layer fuses the feature map from the backbone network, which enhances the semantic features and improves feature extraction. Second, a multi-attention module, which combines coordinate competing attention and spatial supplement attention, is applied to the proposed path aggregation network; it further improves the accuracy of detecting multiple marine objects. Finally, a double transmission underwater image enhancement algorithm is proposed to enhance the underwater image datasets. Experiments show that our method achieves 79.6% mAP on the underwater image datasets and 79.03% mAP on the enhanced underwater image datasets, as well as 81.5% mAP on the PASCAL VOC datasets. We also apply the method to an underwater robot. The experiments show that our method performs well compared with popular object detection methods. The source code is publicly available at https://github.com/yhf2022/APAN.
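The feature flow described in the abstract can be illustrated with a minimal sketch. This is not the APAN implementation (see the repository above for that); it only shows the order in which features are combined. It assumes single-channel feature maps stored as nested lists, element-wise addition as the fusion operator, nearest-neighbour upsampling, and 2x2 average pooling for downsampling, with no convolutions or attention. All function names are hypothetical.

```python
def upsample2x(fm):
    """Nearest-neighbour 2x upsampling of a 2D map (list of rows)."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(fm):
    """2x downsampling by 2x2 average pooling (assumed operator)."""
    h, w = len(fm), len(fm[0])
    return [
        [(fm[i][j] + fm[i][j + 1] + fm[i + 1][j] + fm[i + 1][j + 1]) / 4.0
         for j in range(0, w, 2)]
        for i in range(0, h, 2)
    ]


def add(a, b):
    """Element-wise sum of two maps of equal size (assumed fusion)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def path_aggregation(backbone):
    """Fuse backbone maps ordered from highest to lowest resolution.

    Each map is assumed to be half the height and width of the one before.
    """
    n = len(backbone)

    # Top-down pathway: upsample the coarser map and add the backbone map.
    top_down = [None] * n
    top_down[-1] = backbone[-1]
    for i in range(n - 2, -1, -1):
        top_down[i] = add(backbone[i], upsample2x(top_down[i + 1]))

    # Bottom-up pathway: each level is enhanced by the downsampled lower
    # level and incorporates the top-down feature of the same size.
    outputs = [top_down[0]]
    for i in range(1, n):
        outputs.append(add(downsample2x(outputs[-1]), top_down[i]))

    # The last level additionally fuses the backbone feature map.
    outputs[-1] = add(outputs[-1], backbone[-1])
    return outputs


if __name__ == "__main__":
    # Three toy levels of size 4x4, 2x2 and 1x1, filled with ones.
    levels = [[[1] * 4 for _ in range(4)], [[1] * 2 for _ in range(2)], [[1]]]
    for fused in path_aggregation(levels):
        print(len(fused), "x", len(fused[0]), "value", fused[0][0])
```

In a real detector each fusion step would be followed by learned convolutions, and the attention modules would reweight the fused maps; the sketch leaves those out on purpose.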












Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant 61873224, Grant 62003295, and Grant 41976182, in part by the S&T Program of Hebei under Grant F2020203037 and Grant F2019203031, and in part by the Science and Technology Research Project of Universities in Hebei under Grant QN2020301.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Yu, H., Li, X., Feng, Y. et al. Multiple attentional path aggregation network for marine object detection. Appl Intell 53, 2434–2451 (2023). https://doi.org/10.1007/s10489-022-03622-0
DOI: https://doi.org/10.1007/s10489-022-03622-0