
Global enhancement network underwater archaeology scene parsing method

Published online by Cambridge University Press: 08 September 2023

Junyan Pan
Affiliation: School of Mathematical Sciences, Henan Institute of Science and Technology, Xinxiang, China

Jishen Jia
Affiliations: School of Mathematical Sciences, Henan Institute of Science and Technology, Xinxiang, China; Henan Digital Agriculture Engineering Technology Research Center, Xinxiang, China

Lei Cai
Affiliation: School of Artificial Intelligence, Henan Institute of Science and Technology, Xinxiang, China
Corresponding author: Lei Cai; Email: cailei2014@126.com

Abstract

Underwater archaeology is of great significance for the transmission of history and culture and for the preservation of underwater heritage, but it is also a challenging task. Underwater heritage sites lie in environments with high sediment content, objects are mostly buried, and the water is turbid, so some object features are missing or blurred, making it difficult to accurately identify and understand the semantics of the various objects in a scene. To tackle these issues, this paper proposes an underwater scene parsing method based on a global enhancement network (GENet). We introduce adaptive dilated convolution by adding an extra regression layer that automatically infers an adaptive dilation coefficient for each scene object. In addition, because blurred features are easily confused during classification, we propose an enhanced classification network that widens the gap between class probabilities by minimizing the loss function. We verified the validity of the proposed model through extensive experiments on the Underwater Shipwreck Scenes (USS) dataset, where it outperforms current state-of-the-art algorithms under three conditions: conventional scenes, semi-buried relics, and turbid water. To verify its generalizability, we also conducted comparative experiments on the publicly available Cityscapes and ADE20K datasets and on the underwater dataset SUIM. The results show that the proposed algorithm also performs well on these public datasets, indicating that it generalizes.
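The abstract's two mechanisms can be made concrete with a short sketch. Below is a minimal PyTorch illustration of (a) a dilated convolution whose dilation coefficient is regressed from the input by an extra layer and (b) a classification loss that widens the gap between class probabilities. All module names, layer shapes, and hyperparameters here are illustrative assumptions, not the authors' GENet implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveDilatedConv(nn.Module):
    # 3x3 convolution whose dilation rate is predicted by an extra
    # regression layer from the input's global context, so that different
    # scene objects can receive different receptive-field sizes.
    def __init__(self, in_ch, out_ch, max_rate=6):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_ch, in_ch, 3, 3))
        nn.init.kaiming_normal_(self.weight)
        self.max_rate = max_rate
        # Regression head: global average pool -> scalar in (0, 1).
        self.rate_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, 1), nn.Sigmoid())

    def forward(self, x):
        # Map the regressed scalar to [1, max_rate]; rounding to an integer
        # (which detaches the rate from the graph -- a real implementation
        # would interpolate differentiably) lets plain conv2d be used.
        rate = 1.0 + self.rate_head(x).mean() * (self.max_rate - 1)
        d = max(1, int(rate.round().item()))
        return F.conv2d(x, self.weight, padding=d, dilation=d)

def enhanced_classification_loss(logits, target, margin_weight=0.1):
    # Cross-entropy plus a term that widens the gap between the two largest
    # class probabilities at each pixel, discouraging the ambiguous
    # predictions that blurred features tend to produce. This margin
    # formulation is an assumed stand-in for the paper's exact loss.
    ce = F.cross_entropy(logits, target)
    probs = logits.softmax(dim=1)
    top2 = probs.topk(2, dim=1).values          # best and runner-up per pixel
    margin = (1.0 - (top2[:, 0] - top2[:, 1])).mean()
    return ce + margin_weight * margin

# Usage on a dummy batch: 8-channel features, 5 semantic classes.
feats = torch.randn(2, 8, 64, 64)
conv = AdaptiveDilatedConv(8, 5)
logits = conv(feats)
loss = enhanced_classification_loss(logits, torch.randint(0, 5, (2, 64, 64)))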

Type: Research Article
Copyright: © The Author(s), 2023. Published by Cambridge University Press

