Encoder-decoder network with RMP for tongue segmentation

  • Original Article
  • Medical & Biological Engineering & Computing

Abstract

The tongue and its movements can be used for several medical tasks, such as identifying a disease and tracking rehabilitation. To focus on the tongue region, tongue segmentation is needed to compute a region of interest for further analysis. This paper proposes an encoder-decoder CNN-based architecture for segmenting the tongue in an image. The encoder module mainly extracts tongue features, while the decoder module reconstructs a segmented tongue from the extracted features, based on training images. In addition, residual multi-kernel pooling (RMP) is applied in the proposed network to help encode features at multiple scales. The proposed method is evaluated on two publicly available datasets under a scenario of front view and one tongue posture. It is then tested on a newly collected dataset of five tongue postures. The reported performance shows that the proposed method outperforms existing methods in the literature. In addition, re-training could improve how well the trained model performs on an unseen dataset, which would be a necessary step in applying the trained model in real-world scenarios.
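To make the multi-scale idea behind RMP concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the commonly described form of residual multi-kernel pooling: the encoder features are max-pooled with several kernel sizes, each pooled map is reduced to one channel by a 1×1 convolution, upsampled back to the input resolution, and concatenated with the original features. The kernel sizes (2, 3, 5, 6), the shared projection weights, the nearest-neighbour upsampling, and the function names `max_pool2d`, `upsample_nearest`, and `rmp_block` are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def max_pool2d(x, k):
    """Non-overlapping k x k max pooling on a (C, H, W) array.

    Trailing rows/columns that do not fill a whole window are dropped.
    """
    c, h, w = x.shape
    hk, wk = h // k, w // k
    x = x[:, :hk * k, :wk * k]
    return x.reshape(c, hk, k, wk, k).max(axis=(2, 4))


def upsample_nearest(x, out_h, out_w):
    """Nearest-neighbour resize of a (C, h, w) array to (C, out_h, out_w)."""
    _, h, w = x.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return x[:, rows[:, None], cols[None, :]]


def rmp_block(features, proj_weights, kernel_sizes=(2, 3, 5, 6)):
    """Sketch of residual multi-kernel pooling (assumed configuration).

    features:     (C, H, W) encoder output
    proj_weights: (C,) weights of a 1x1 convolution to one channel,
                  shared across branches (an assumption)
    returns:      (C + len(kernel_sizes), H, W) array
    """
    _, h, w = features.shape
    branches = [features]  # the "residual" path keeps the input features
    for k in kernel_sizes:
        pooled = max_pool2d(features, k)
        # A 1x1 convolution to a single channel is a weighted sum over channels.
        reduced = np.tensordot(proj_weights, pooled, axes=([0], [0]))[None]
        branches.append(upsample_nearest(reduced, h, w))
    return np.concatenate(branches, axis=0)


# Usage with random stand-in features (8 channels, 30 x 30).
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 30, 30))
weights = rng.standard_normal(8)
out = rmp_block(feats, weights)
print(out.shape)  # (12, 30, 30): 8 original channels + 4 pooled branches
```

Each pooling branch summarises the feature map over a different receptive field, so the concatenated output carries context at several scales while the original features pass through unchanged. A trained network would learn the projection weights and would typically use a smoother upsampling such as bilinear interpolation.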

Funding

This project is funded by the National Research Council of Thailand.

Author information

Corresponding author

Correspondence to Punyanuch Borwarnginn.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

About this article

Cite this article

Kusakunniran, W., Borwarnginn, P., Karnjanapreechakorn, S. et al. Encoder-decoder network with RMP for tongue segmentation. Med Biol Eng Comput 61, 1193–1207 (2023). https://doi.org/10.1007/s11517-022-02761-3

  • DOI: https://doi.org/10.1007/s11517-022-02761-3
