Abstract
The tongue and its movements can be used for several medical tasks, such as identifying a disease and tracking rehabilitation. To focus on the tongue region, tongue segmentation is needed to compute a region of interest for further analysis. This paper proposes an encoder-decoder CNN-based architecture for segmenting the tongue in an image. The encoder module is mainly used for tongue feature extraction, while the decoder module reconstructs the segmented tongue from the extracted features, as learned from training images. In addition, residual multi-kernel pooling (RMP) is applied in the proposed network to help encode features at multiple scales. The proposed method is evaluated on two publicly available datasets under a scenario of a front view and a single tongue posture. It is then tested on a newly collected dataset of five tongue postures. The reported performance shows that the proposed method outperforms existing methods in the literature. In addition, re-training could improve the performance of the trained model on an unseen dataset, which would be a necessary step in applying the trained model in real-world scenarios.
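To illustrate the multi-scale idea behind RMP, the following is a minimal sketch in plain Python and not the authors' implementation. It pools a single-channel feature map with several kernel sizes, resizes each pooled map back to the input size, and stacks the results with the original map. The kernel sizes (2, 3, 5, 6) are an assumption borrowed from the CE-Net design and are not stated in this abstract. The nearest-neighbour upsampling, the omission of the learned 1x1 convolution, and the function names are also simplifications made for illustration.

```python
def max_pool(fmap, k):
    """Non-overlapping k x k max pooling (stride k); partial border windows are dropped."""
    h, w = len(fmap), len(fmap[0])
    return [
        [max(fmap[i + di][j + dj] for di in range(k) for dj in range(k))
         for j in range(0, w - k + 1, k)]
        for i in range(0, h - k + 1, k)
    ]


def upsample_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a 2D map to out_h x out_w."""
    h, w = len(fmap), len(fmap[0])
    return [
        [fmap[i * h // out_h][j * w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def multi_kernel_pooling(fmap, kernel_sizes=(2, 3, 5, 6)):
    """Stack the input map with pooled-and-upsampled copies at several scales.

    Assumption: kernel sizes follow the CE-Net RMP block; a real network
    would also apply a learned 1x1 convolution to each pooled branch.
    """
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]  # identity branch keeps the original features
    for k in kernel_sizes:
        pooled = max_pool(fmap, k)
        channels.append(upsample_nearest(pooled, h, w))
    return channels


# Toy 6x6 feature map with values 0..35 (hypothetical data).
feature_map = [[r * 6 + c for c in range(6)] for r in range(6)]
stacked = multi_kernel_pooling(feature_map)
print(len(stacked), len(stacked[0]), len(stacked[0][0]))
```

Each output channel summarizes the same feature map at a different receptive-field size, which is the sense in which pooling with multiple kernels encodes multiple scales. The larger the kernel, the coarser the context carried by that channel.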
Funding
This project is funded by the National Research Council of Thailand.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Kusakunniran, W., Borwarnginn, P., Karnjanapreechakorn, S. et al. Encoder-decoder network with RMP for tongue segmentation. Med Biol Eng Comput 61, 1193–1207 (2023). https://doi.org/10.1007/s11517-022-02761-3
DOI: https://doi.org/10.1007/s11517-022-02761-3