
IFE-net: improved feature enhancement network for weak feature target recognition in autonomous underwater vehicles

Published online by Cambridge University Press:  08 February 2024

Lei Cai*
Affiliation:
School of Artificial Intelligence, Henan Institute of Science and Technology, Xinxiang, PR China
Bingyuan Zhang
Affiliation:
School of Information Engineering, Henan Institute of Science and Technology, Xinxiang, PR China
Yuejun Li
Affiliation:
School of Information Engineering, Henan Institute of Science and Technology, Xinxiang, PR China
Haojie Chai
Affiliation:
School of Artificial Intelligence, Henan Institute of Science and Technology, Xinxiang, PR China
*
Corresponding author: Lei Cai; Email: cailei2014@126.com

Abstract

Recognizing underwater targets is a crucial component of autonomous underwater vehicle patrol and detection missions. During visual recognition in real underwater environments, the spatial and semantic features of a target often suffer varying degrees of loss, and the scarcity of certain types of underwater samples leads to class-imbalanced data. These problems weaken the target features and seriously degrade the accuracy of underwater target recognition; traditional deep learning methods based on data and feature enhancement cannot achieve an ideal recognition effect. To address these difficulties, this paper proposes an improved feature enhancement network for weak feature target recognition. First, a multi-scale spatial and semantic feature enhancement module is constructed to accurately extract the feature information of the target. Second, the influence of target feature distortion on classification is addressed through multi-scale feature comparison of positive and negative samples. Finally, the Rank & Sort loss function is used to train the deep target detector, solving the problem of recognition accuracy under highly imbalanced sample data. Experimental results show that the recognition accuracy of the proposed method is 2.28% and 3.84% higher than that of existing algorithms on underwater blurred and distorted target images, respectively, which demonstrates the effectiveness and superiority of the proposed method.
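The multi-scale comparison of positive and negative samples mentioned in the abstract follows the general idea of supervised contrastive learning: embeddings sharing a class label are pulled together while other samples are pushed apart. The sketch below is a minimal NumPy illustration of that per-batch loss, not the paper's implementation; the function name, the temperature value, and the single-scale setting are all assumptions for illustration only.

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Illustrative supervised contrastive loss on one batch of embeddings.

    features: (N, D) array of embeddings; labels: (N,) integer class labels.
    Positives for anchor i are all *other* samples with the same label.
    """
    # L2-normalize so similarities are cosine similarities
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = feats @ feats.T / temperature            # (N, N) scaled similarities

    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)              # exclude self-comparisons
    pos = (labels[:, None] == labels[None, :]) & not_self

    # log-softmax over each anchor's non-self similarities
    logits = np.where(not_self, sim, -np.inf)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # mean negative log-probability of the positives, averaged over anchors
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1) \
                 / np.maximum(pos.sum(axis=1), 1)
    return per_anchor.mean()
```

In a multi-scale variant of this idea, a loss of this form would be computed on the feature map at each scale and the results summed; distorted views of a target can then serve as additional positives for their clean counterparts.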

Type
Research Article
Copyright
© The Author(s), 2024. Published by Cambridge University Press
