Special perceptual parsing for Chinese landscape painting scene understanding: a semantic segmentation approach

Yang, Rui; Yang, Honghong; Zhao, Min; Jia, Ru; Wu, Xiaojun; Zhang, Yumei

doi:10.1007/s00521-023-09343-w

Original Article
Published: 27 December 2023

Neural Computing and Applications, volume 36, pages 5231–5249 (2024)

Rui Yang^1,3,
Honghong Yang^2,3,
Min Zhao^1,3,
Ru Jia^1,3,
Xiaojun Wu^1,2,3 &
…
Yumei Zhang^1,2,3

Abstract

Automatic and precise perceptual parsing of Chinese landscape paintings (CLPs) significantly aids the digitization and recreation of artworks. Manually extracting and analyzing objects in CLPs is challenging, even for expert painters with professional knowledge and sharp discernment. Two main factors have restricted the development of CLP parsing: (1) a lack of pixel-level labeled data to supervise model training, and (2) the inherent complexity of CLP images compared to real scenes, characterized by varied scales, diverse textures, and intricate empty spaces. To address these challenges, we first construct a pixel-level annotated CLP segmentation dataset to advance perceptual parsing. We then design a novel CLP Perceptual Parsing (CLPPP) model that fully utilizes the intrinsic features of CLP images. To capture context information dynamically and adaptively, we introduce a set of learnable kernels into the CLPPP model based on the multiscale features of objects within CLPs. This enables the model to learn an appropriate receptive field for context information extraction. Additionally, a positional attention head is devised to suppress intergroup noise and help the kernels gain inter-object position information. This iterative optimization process helps the model learn powerful feature representations for the different textures in CLPs. Experimental results demonstrate that, under consistent conditions, the proposed CLPPP model outperforms state-of-the-art methods on the CLP dataset by a large margin, with mIoU, aAcc, and mAcc scores of 55.45, 75.08, and 71.15, respectively.
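For readers unfamiliar with the three reported scores, the following is a minimal sketch of how mIoU, aAcc, and mAcc are commonly computed from a confusion matrix in semantic segmentation. It is not taken from the article: the function names, the flattened-label input format, and the toy labels are illustrative assumptions, and the authors' evaluation code may differ in details such as how absent classes or ignored pixels are handled.

```python
def confusion_matrix(gt, pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        matrix[g][p] += 1
    return matrix


def segmentation_scores(matrix):
    """Return (mIoU, aAcc, mAcc) as percentages from a confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[c][c] for c in range(n))
    ious, accs = [], []
    for c in range(n):
        tp = matrix[c][c]                                  # correctly labeled pixels of class c
        gt_count = sum(matrix[c])                          # pixels whose true class is c
        pred_count = sum(matrix[r][c] for r in range(n))   # pixels predicted as c
        union = gt_count + pred_count - tp
        if union > 0:       # assumption: skip classes absent from both gt and pred
            ious.append(tp / union)
        if gt_count > 0:    # assumption: skip classes absent from the ground truth
            accs.append(tp / gt_count)
    miou = 100.0 * sum(ious) / len(ious)   # mean of per-class IoU
    aacc = 100.0 * correct / total         # overall pixel accuracy
    macc = 100.0 * sum(accs) / len(accs)   # mean of per-class accuracy
    return miou, aacc, macc


# Toy example with three hypothetical classes (labels flattened to 1-D lists).
gt = [0, 0, 0, 1, 1, 2]
pred = [0, 0, 1, 1, 1, 2]
miou, aacc, macc = segmentation_scores(confusion_matrix(gt, pred, 3))
print(f"mIoU={miou:.2f} aAcc={aacc:.2f} mAcc={macc:.2f}")
```

Because aAcc weights every pixel equally while mIoU and mAcc weight every class equally, a model can score well on aAcc by handling large regions correctly and still score lower on mIoU if small or rare classes are segmented poorly.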

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgements

We thank the students who annotated the data for their diligence and patience. We also thank the School of Computer Science of the Shaanxi Normal University for computing resources.

Funding

This work was partially supported by the National Natural Science Foundation of China (Nos. 62377034, 61907028), the Young Science and Technology Stars in Shaanxi Province (No. 2021KJXX-91), the Fundamental Research Funds for the Central Universities of China (Grant No. GK202101004), and the Shaanxi Key Science and Technology Innovation Team Project (No. 2022TD-26).

Author information

Authors and Affiliations

School of Computer Science, Shaanxi Normal University, Chang’an, Xi’an, 710119, China
Rui Yang, Min Zhao, Ru Jia, Xiaojun Wu & Yumei Zhang
Key Laboratory of Modern Teaching Technology, Ministry of Education, Xi’an, 710062, China
Honghong Yang, Xiaojun Wu & Yumei Zhang
Key Laboratory of Intelligent Computing and Service Technology for Folk Song, Ministry of Culture and Tourism, Xi’an, 710062, China
Rui Yang, Honghong Yang, Min Zhao, Ru Jia, Xiaojun Wu & Yumei Zhang

Corresponding authors

Correspondence to Honghong Yang or Xiaojun Wu.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, R., Yang, H., Zhao, M. et al. Special perceptual parsing for Chinese landscape painting scene understanding: a semantic segmentation approach. Neural Comput & Applic 36, 5231–5249 (2024). https://doi.org/10.1007/s00521-023-09343-w

Received: 25 November 2022
Accepted: 26 November 2023
Published: 27 December 2023
Issue Date: April 2024
DOI: https://doi.org/10.1007/s00521-023-09343-w
