Cross-domain Tongue Image Segmentation Based on Deep Adversarial Networks and Entropy Minimization

  • Conference paper
Image and Graphics (ICIG 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14359))

Abstract

The semantic segmentation of tongue images is a key problem in the modernization of TCM (Traditional Chinese Medicine), and a large body of research is dedicated to it. Although deep learning has improved tongue segmentation performance, generalizing to diverse test domains remains a major challenge. As is well known, the less consistent the data distributions of the source and target domains are, the lower the model's performance in the test domain. Existing semantic segmentation methods based on supervised learning struggle with this problem when the tongue images in the target domain, on which the model generalizes poorly, cannot be re-labeled. To address this problem, we design an adversarial training framework that regularizes entropy on the target domain, aiming to enforce high certainty in the model's predictions on the target domain during domain alignment. Specifically, we pre-train the tongue image segmentation model on the source domain with a deep supervised method. In addition to the segmentation task, the segmentation model must regularize the entropy of its output on the target domain and maximally confuse the discriminator. The discriminator tries to distinguish whether the output of the segmentation model comes from the source domain or the target domain. In this study, two datasets are constructed, and five-fold cross-validation is performed on them. Experimental results show that tongue image segmentation performance in the open environment improved by 21.5 percentage points of mIoU (59.2% → 80.7%) after domain adaptation. Compared with pseudo-label learning at thresholds of 0.6 and 0.9, the mIoU of the proposed method is higher by 17% and 16.1%, respectively. Moreover, compared with MinEnt, the mIoU is higher by 6%.
The cross-domain tongue image segmentation method proposed in this paper significantly improves segmentation accuracy in the unlabeled target domain by reducing the influence of the cross-domain discrepancy and enhancing the certainty of the model's output in the target domain.
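
The training objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scalar discriminator output, and the loss weights `lambda_ent` and `lambda_adv` are all assumptions made for illustration, and the abstract does not state how the three terms are actually weighted. The sketch only shows how a source-domain segmentation loss, a target-domain entropy term, and an adversarial term might be combined, and how the discriminator's own loss opposes the adversarial term.

```python
import math

# Domain labels for the discriminator (an illustrative convention, not from the paper).
SOURCE, TARGET = 1.0, 0.0


def softmax(logits):
    """Turn per-class logits for one pixel into class probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_entropy(probs, eps=1e-12):
    """Shannon entropy of one pixel's class distribution, scaled to [0, 1]."""
    h = -sum(p * math.log(max(p, eps)) for p in probs)
    return h / math.log(len(probs))


def entropy_loss(prob_map):
    """Mean normalized entropy over all pixels of a target-domain prediction."""
    return sum(pixel_entropy(p) for p in prob_map) / len(prob_map)


def bce(pred, label, eps=1e-12):
    """Binary cross-entropy for a single discriminator output in (0, 1)."""
    return -(label * math.log(max(pred, eps))
             + (1.0 - label) * math.log(max(1.0 - pred, eps)))


def segmenter_loss(seg_loss_src, target_prob_map, d_out_target,
                   lambda_ent=0.01, lambda_adv=0.001):
    """Segmentation loss on source + entropy on target + adversarial term.

    The adversarial term rewards the segmenter when the discriminator
    mistakes a target-domain output for a source-domain one.
    The default weights are hypothetical placeholders.
    """
    adv = bce(d_out_target, SOURCE)
    return seg_loss_src + lambda_ent * entropy_loss(target_prob_map) + lambda_adv * adv


def discriminator_loss(d_out_source, d_out_target):
    """The discriminator tries to tell source outputs from target outputs."""
    return 0.5 * (bce(d_out_source, SOURCE) + bce(d_out_target, TARGET))


if __name__ == "__main__":
    # Two-class (tongue vs. background) predictions for three target pixels.
    uncertain = [softmax([0.1, 0.0]), softmax([0.0, 0.2]), softmax([0.0, 0.0])]
    confident = [softmax([6.0, 0.0]), softmax([0.0, 5.0]), softmax([7.0, 0.0])]
    print("entropy (uncertain):", entropy_loss(uncertain))
    print("entropy (confident):", entropy_loss(confident))
    print("segmenter loss:", segmenter_loss(0.3, uncertain, d_out_target=0.2))
    print("discriminator loss:", discriminator_loss(0.8, 0.2))
```

In this sketch, lowering the entropy term pushes target-domain predictions toward confident, near one-hot outputs, while the adversarial term pulls the target-domain outputs toward what the discriminator considers source-like. In practice these quantities would be computed on tensors inside a deep learning framework and the two losses would be minimized in alternation.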

Author information

Corresponding author

Correspondence to Shuai Zhang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhao, L., Zhang, S., Zhao, X. (2023). Cross-domain Tongue Image Segmentation Based on Deep Adversarial Networks and Entropy Minimization. In: Lu, H., et al. Image and Graphics. ICIG 2023. Lecture Notes in Computer Science, vol 14359. Springer, Cham. https://doi.org/10.1007/978-3-031-46317-4_11

  • DOI: https://doi.org/10.1007/978-3-031-46317-4_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46316-7

  • Online ISBN: 978-3-031-46317-4

  • eBook Packages: Computer Science, Computer Science (R0)
