
Global enhancement network underwater archaeology scene parsing method

Published online by Cambridge University Press: 08 September 2023

Junyan Pan
Affiliation: School of Mathematical Sciences, Henan Institute of Science and Technology, Xinxiang, China

Jishen Jia
Affiliations: School of Mathematical Sciences, Henan Institute of Science and Technology, Xinxiang, China; Henan Digital Agriculture Engineering Technology Research Center, Xinxiang, China

Lei Cai
Affiliation: School of Artificial Intelligence, Henan Institute of Science and Technology, Xinxiang, China
Corresponding author: Lei Cai; Email: cailei2014@126.com

Abstract

Underwater archaeology is of great significance for the transmission of history and culture and for the preservation of underwater heritage, but it is also a challenging task. Underwater heritage sites lie in environments with high sediment content, objects are mostly buried, and the water is turbid, so some object features are missing or blurred, making it difficult to accurately identify and understand the semantics of the various objects in a scene. To tackle these issues, this paper proposes an underwater scene parsing method based on a global enhancement network (GENet). We introduce adaptive dilated convolution by adding an extra regression layer that automatically infers an adaptive dilation coefficient for each scene object. In addition, because blurred features are easily confused during classification, we propose an enhanced classification network that widens the gap between class probabilities by minimizing the loss function. We verified the validity of the proposed model through extensive experiments on the Underwater Shipwreck Scenes (USS) dataset, where it outperforms current state-of-the-art algorithms under three conditions: conventional scenes, semi-buried relics, and turbid water. To verify its generalizability, we also conducted comparative experiments on the publicly available Cityscapes and ADE20K datasets and on the underwater dataset SUIM. The results show that the proposed algorithm also performs well on these public datasets, indicating that it generalizes.
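The abstract's two mechanisms can be made concrete with a short sketch. Below is a minimal PyTorch illustration of (a) a dilated convolution whose dilation coefficient is regressed from the input by an extra layer and (b) a classification loss that widens the gap between class probabilities. All module names, layer shapes, and hyperparameters here are illustrative assumptions, not the authors' GENet implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveDilatedConv(nn.Module):
    # 3x3 convolution whose dilation rate is predicted by an extra
    # regression layer from the input's global context, so that different
    # scene objects can receive different receptive-field sizes.
    def __init__(self, in_ch, out_ch, max_rate=6):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_ch, in_ch, 3, 3))
        nn.init.kaiming_normal_(self.weight)
        self.max_rate = max_rate
        # Regression head: global average pool -> scalar in (0, 1).
        self.rate_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, 1), nn.Sigmoid())

    def forward(self, x):
        # Map the regressed scalar to [1, max_rate]; rounding to an integer
        # (which detaches the rate from the graph -- a real implementation
        # would interpolate differentiably) lets plain conv2d be used.
        rate = 1.0 + self.rate_head(x).mean() * (self.max_rate - 1)
        d = max(1, int(rate.round().item()))
        return F.conv2d(x, self.weight, padding=d, dilation=d)

def enhanced_classification_loss(logits, target, margin_weight=0.1):
    # Cross-entropy plus a term that widens the gap between the two largest
    # class probabilities at each pixel, discouraging the ambiguous
    # predictions that blurred features tend to produce. This margin
    # formulation is an assumed stand-in for the paper's exact loss.
    ce = F.cross_entropy(logits, target)
    probs = logits.softmax(dim=1)
    top2 = probs.topk(2, dim=1).values          # best and runner-up per pixel
    margin = (1.0 - (top2[:, 0] - top2[:, 1])).mean()
    return ce + margin_weight * margin

# Usage on a dummy batch: 8-channel features, 5 semantic classes.
feats = torch.randn(2, 8, 64, 64)
conv = AdaptiveDilatedConv(8, 5)
logits = conv(feats)
loss = enhanced_classification_loss(logits, torch.randint(0, 5, (2, 64, 64)))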

Type: Research Article
Copyright: © The Author(s), 2023. Published by Cambridge University Press

