Encoder-decoder network with RMP for tongue segmentation

Kusakunniran, Worapan; Borwarnginn, Punyanuch; Karnjanapreechakorn, Sarattha; Thongkanchorn, Kittikhun; Ritthipravat, Panrasee; Tuakta, Pimchanok; Benjapornlert, Paitoon

doi:10.1007/s11517-022-02761-3

Original Article
Published: 24 January 2023

Volume 61, pages 1193–1207, (2023)
Medical & Biological Engineering & Computing

Worapan Kusakunniran¹,
Punyanuch Borwarnginn¹ (ORCID: 0000-0002-6309-5022),
Sarattha Karnjanapreechakorn¹,
Kittikhun Thongkanchorn¹,
Panrasee Ritthipravat²,
Pimchanok Tuakta³ &
Paitoon Benjapornlert³

Abstract

The tongue and its movements can be used in several medical tasks, such as identifying a disease and tracking rehabilitation. To focus on the tongue region, tongue segmentation is needed to compute a region of interest for further analysis. This paper proposes an encoder-decoder CNN-based architecture for segmenting the tongue in an image. The encoder module mainly extracts tongue features, while the decoder module reconstructs the segmented tongue from the extracted features, as learned from training images. In addition, residual multi-kernel pooling (RMP) is applied in the proposed network to help encode features at multiple scales. The proposed method is evaluated on two publicly available datasets under a scenario of front view and one tongue posture. It is then tested on a newly collected dataset of five tongue postures. The reported performance shows that the proposed method outperforms existing methods in the literature. In addition, re-training can improve how well the trained model applies to an unseen dataset, which would be a necessary step when applying the trained model in a real-world scenario.
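
To make the idea of encoding features at multiple scales concrete, the following is a minimal, dependency-free sketch of a multi-kernel pooling step on a single feature map. It is not the authors' implementation: the kernel sizes (2, 3, 5, 6), the nearest-neighbour upsampling, and the function names `max_pool`, `upsample_nearest` and `rmp_block` are all assumptions chosen for illustration, and the learned convolutions that a real network would place in each branch are omitted.

```python
# Illustrative sketch only: multi-scale pooling in the spirit of an RMP block.
# Kernel sizes and nearest-neighbour upsampling are assumptions, not taken
# from the paper; learned convolutions are deliberately left out.

def max_pool(fmap, k):
    """Non-overlapping k x k max pooling (stride k); edge windows are truncated."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            max(
                fmap[i][j]
                for i in range(r, min(r + k, h))
                for j in range(c, min(c + k, w))
            )
            for c in range(0, w, k)
        ]
        for r in range(0, h, k)
    ]


def upsample_nearest(pooled, k, h, w):
    """Expand each pooled cell back over its k x k window, restoring an h x w map."""
    return [[pooled[r // k][c // k] for c in range(w)] for r in range(h)]


def rmp_block(fmap, kernel_sizes=(2, 3, 5, 6)):
    """Return the input map followed by one pooled-and-upsampled map per kernel size."""
    h, w = len(fmap), len(fmap[0])
    branches = [
        upsample_nearest(max_pool(fmap, k), k, h, w) for k in kernel_sizes
    ]
    # "Residual": the original features are kept alongside the pooled branches.
    return [fmap] + branches


# Usage: a 6 x 6 map whose value at (i, j) is 6 * i + j.
fmap = [[6 * i + j for j in range(6)] for i in range(6)]
channels = rmp_block(fmap)
```

Each branch summarises the map over a different window size, so stacking the branches with the original map gives later layers both fine and coarse context at every pixel. Larger kernels produce smoother, more global summaries; the 6 x 6 branch here collapses the whole example map to its maximum.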

Funding

This project is funded by the National Research Council of Thailand.

Author information

Authors and Affiliations

Faculty of Information and Communication Technology, Mahidol University, Nakhon Pathom, Thailand
Worapan Kusakunniran, Punyanuch Borwarnginn, Sarattha Karnjanapreechakorn & Kittikhun Thongkanchorn
Department of Biomedical Engineering, Faculty of Engineering, Mahidol University, Nakhon Pathom, Thailand
Panrasee Ritthipravat
Department of Rehabilitation Medicine, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
Pimchanok Tuakta & Paitoon Benjapornlert

Corresponding author

Correspondence to Punyanuch Borwarnginn.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kusakunniran, W., Borwarnginn, P., Karnjanapreechakorn, S. et al. Encoder-decoder network with RMP for tongue segmentation. Med Biol Eng Comput 61, 1193–1207 (2023). https://doi.org/10.1007/s11517-022-02761-3

Received: 06 February 2022
Accepted: 26 December 2022
Published: 24 January 2023
Issue Date: May 2023
DOI: https://doi.org/10.1007/s11517-022-02761-3
