Abstract
Traditional Chinese Medicine (TCM) has gained prominence in clinical practice, and tongue diagnosis, one of its key techniques, is increasingly combined with Artificial Intelligence (AI) to produce more objective and quantifiable results and to reduce reliance on subjective judgment. However, poor lighting and limited imaging equipment often degrade image clarity, which complicates tongue detection and identification. To address these issues, we propose a Dual-Task Feedback Learning (DTFL) framework that enhances tongue detection in patient images by improving image quality. In our approach, Super-Resolution (SR) is a preliminary task that precedes Tongue Detection (TD), so that the TD network processes higher-quality images and produces more accurate results. To strengthen the interaction between the SR and TD tasks, we incorporate a Feature Alignment (FA) loss, which establishes a feedback connection that lets the SR network acquire task-specific knowledge from the TD network. We also introduce a quality fusion augmentation and an alternate training strategy to address challenges that the FA loss can cause during training. To the best of our knowledge, we are the first to integrate SR into TD. Experiments demonstrate that DTFL significantly improves detection performance by generating SR images that are well suited to TD.
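The feedback idea described in the abstract can be illustrated with a minimal sketch. The abstract does not define the FA loss or the alternate training schedule, so everything below is an assumption made for illustration: the FA term is taken to be an L1 distance between detector features of the SR output and of the high-resolution target, `toy_td_features` is a hypothetical stand-in for a TD backbone, `fa_weight` is an arbitrary weight, and the schedule simply alternates which network is updated at each step.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def l1_distance(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two equally sized, non-empty vectors."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def toy_td_features(image: Vector) -> List[float]:
    """Hypothetical stand-in for detector features: neighbour differences."""
    return [image[i + 1] - image[i] for i in range(len(image) - 1)]


def sr_objective(
    sr_image: Vector,
    hr_image: Vector,
    td_features: Callable[[Vector], Vector],
    fa_weight: float = 0.1,  # assumed weight, not taken from the paper
) -> float:
    """Pixel reconstruction loss plus an assumed feature-alignment term."""
    pixel_loss = l1_distance(sr_image, hr_image)
    # Feedback path: compare what the detector "sees" in the SR output
    # with what it sees in the high-resolution target.
    fa_loss = l1_distance(td_features(sr_image), td_features(hr_image))
    return pixel_loss + fa_weight * fa_loss


def alternate_schedule(num_steps: int) -> List[str]:
    """Assumed schedule: update the SR and TD networks on alternating steps."""
    return ["SR" if step % 2 == 0 else "TD" for step in range(num_steps)]


if __name__ == "__main__":
    hr = [0.0, 1.0, 1.0]  # flattened toy "high-resolution" image
    sr = [0.0, 0.5, 1.0]  # flattened toy SR output
    print(sr_objective(sr, hr, toy_td_features))
    print(alternate_schedule(4))
```

In this sketch the SR objective penalizes differences in detector feature space as well as pixel errors, which is one way a detector could pass task-specific signal back to an SR network. A real implementation would use trained networks and tensors in place of these toy functions.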
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sun, Y., Wei, M., Chen, G. (2025). Dual-Task Feedback Learning for Tongue Detection via Super-Resolution Integration. In: Ide, I., et al. MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science, vol 15520. Springer, Singapore. https://doi.org/10.1007/978-981-96-2054-8_24
DOI: https://doi.org/10.1007/978-981-96-2054-8_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-2053-1
Online ISBN: 978-981-96-2054-8
eBook Packages: Computer Science, Computer Science (R0)