Cross-domain Tongue Image Segmentation Based on Deep Adversarial Networks and Entropy Minimization

Zhao, Liang; Zhang, Shuai; Zhao, Xiaomeng

doi:10.1007/978-3-031-46317-4_11

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14359)

Included in the following conference series:

International Conference on Image and Graphics

Abstract

Semantic segmentation of tongue images is a key problem in the modernization of Traditional Chinese Medicine (TCM), and a large body of research is dedicated to it. Although deep learning has improved tongue segmentation performance, generalizing to diverse test domains remains a major challenge. As is well known, the less consistent the data distributions of the source and target domains are, the lower the model's performance in the test domain. Existing supervised semantic segmentation methods struggle with this problem when the target-domain tongue images on which the model generalizes poorly cannot be re-labeled. To address this problem, we design an adversarial training framework with entropy regularization on the target domain, which aims to enforce high certainty in the model's predictions on the target domain during domain alignment. Specifically, we pre-train the tongue image segmentation model with a deep supervised method on the source domain. In addition to the segmentation task, the segmentation model must regularize the entropy of its output on the target domain and maximally confuse the discriminator. The discriminator tries to distinguish whether an output of the segmentation model comes from the source domain or the target domain. In this study, two datasets are constructed, and five-fold cross-validation experiments are performed on them. Experimental results show that tongue image segmentation performance in the open environment improves by 21.5% mIOU (59.2% → 80.7%) after domain adaptation. Compared with pseudo-label learning with thresholds of 0.6 and 0.9, the mIOU of the proposed method is higher by 17% and 16.1%, respectively. Moreover, compared with MinEnt, the mIOU is higher by 6%.
The cross-domain tongue image segmentation method proposed in this paper significantly improves segmentation accuracy in the unlabeled target domain by reducing the influence of the cross-domain discrepancy and enhancing the certainty of the model's output in the target domain.
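
The quantities named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the normalization of entropy by log(number of classes), the binary cross-entropy form of the "confuse the discriminator" term, and the weights `lambda_adv` and `lambda_ent` are all assumptions made for illustration, loosely following common practice in adversarial entropy-minimization methods.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_loss(prob_map, eps=1e-12):
    """Mean Shannon entropy per pixel, scaled to [0, 1] by log(num_classes).

    prob_map is a 2-D grid (list of rows) of per-pixel class distributions.
    """
    pixels = [probs for row in prob_map for probs in row]
    num_classes = len(pixels[0])
    total = 0.0
    for probs in pixels:
        total += -sum(p * math.log(p + eps) for p in probs)
    return total / (len(pixels) * math.log(num_classes))

def confusion_loss(disc_scores, eps=1e-12):
    """Segmenter-side adversarial term (assumed binary cross-entropy form).

    disc_scores are the discriminator's probabilities that each target-domain
    output came from the source domain; the loss is low when it is fooled.
    """
    return -sum(math.log(d + eps) for d in disc_scores) / len(disc_scores)

def target_objective(prob_map, disc_scores, lambda_adv, lambda_ent):
    """Weighted sum of the two target-domain terms (weights are hypothetical)."""
    return (lambda_adv * confusion_loss(disc_scores)
            + lambda_ent * entropy_loss(prob_map))

def mean_iou(pred, target, num_classes=2):
    """Mean intersection-over-union across classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                inter += int(p == c and t == c)
                union += int(p == c or t == c)
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)
```

In this sketch, a pixel with a uniform class distribution contributes the maximum normalized entropy of 1, while a confidently classified pixel contributes almost nothing, which is the sense in which minimizing the term pushes target-domain predictions toward high certainty. `mean_iou` assumes at least one class appears in the two label maps.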

Author information

Authors and Affiliations

Tianjin MedValley Technology Co. Ltd., Tianjin, 300392, China
Liang Zhao, Shuai Zhang & Xiaomeng Zhao

Corresponding author

Correspondence to Shuai Zhang.

Editor information

Editors and Affiliations

Dalian University of Technology, Dalian, China
Huchuan Lu
University of Sydney, Sydney, NSW, Australia
Wanli Ouyang
Shenzhen University, Shenzhen, China
Hui Huang
Tsinghua University, Beijing, China
Jiwen Lu
Dalian University of Technology, Dalian, China
Risheng Liu
Institute of Automation, CAS, Beijing, China
Jing Dong
University of Technology Sydney, Sydney, NSW, Australia
Min Xu

About this paper

Cite this paper

Zhao, L., Zhang, S., Zhao, X. (2023). Cross-domain Tongue Image Segmentation Based on Deep Adversarial Networks and Entropy Minimization. In: Lu, H., et al. Image and Graphics. ICIG 2023. Lecture Notes in Computer Science, vol 14359. Springer, Cham. https://doi.org/10.1007/978-3-031-46317-4_11

DOI: https://doi.org/10.1007/978-3-031-46317-4_11
Published: 29 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46316-7
Online ISBN: 978-3-031-46317-4
eBook Packages: Computer Science, Computer Science (R0)