Abstract
Lung-infected area segmentation is crucial for assessing the severity of lung diseases. However, existing image-text multi-modal methods typically rely on labour-intensive annotations for model training, which demand considerable time and expertise. To address this issue, we propose AKGNet, a novel attribute knowledge guided framework for unsupervised lung-infected area segmentation that achieves segmentation solely from image-text data, without any mask annotation. AKGNet performs text attribute knowledge learning, attribute-image cross-attention fusion, and high-confidence pseudo-label exploration simultaneously. It learns statistical information and captures spatial correlations between image and text attributes in the embedding space, iteratively refining the mask to enhance segmentation. Specifically, we introduce a text attribute knowledge learning module that extracts attribute knowledge and deploys it for feature representation learning, enabling the model to learn statistical information and adapt to different attributes. Moreover, we devise an attribute-image cross-attention module that exploits the correlations between attributes and images in the embedding space to capture spatial dependency information, and thus selectively focuses on relevant regions. Finally, a self-training mask improvement process generates pseudo-labels from high-confidence predictions and iteratively improves both the mask and the segmentation. Experimental results on a benchmark medical image dataset demonstrate that the proposed method outperforms state-of-the-art segmentation techniques in unsupervised scenarios.
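To make two of the ideas named in the abstract concrete, the following is a minimal NumPy sketch of attribute-image cross-attention and of pseudo-label selection from high-confidence predictions. It is not the authors' implementation: the function names, the use of plain scaled dot-product attention without learned projections, the thresholds `hi` and `lo`, and the `ignore` value are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(image_tokens, attr_tokens):
    """Hypothetical attribute-image cross-attention.

    image_tokens: (N, d) array, one embedding per image location (queries).
    attr_tokens:  (K, d) array, one embedding per text attribute (keys/values).
    Returns the attended features (N, d) and the attention weights (N, K).
    """
    d = image_tokens.shape[-1]
    scores = image_tokens @ attr_tokens.T / np.sqrt(d)  # (N, K) similarities
    weights = softmax(scores, axis=-1)                  # each row sums to 1
    return weights @ attr_tokens, weights


def high_confidence_pseudo_labels(prob, hi=0.9, lo=0.1, ignore=-1):
    """Keep only confident pixels as pseudo-labels (thresholds are assumed).

    prob: array of predicted foreground probabilities in [0, 1].
    Pixels with prob >= hi become 1, pixels with prob <= lo become 0,
    and everything in between is marked `ignore`.
    """
    labels = np.full(prob.shape, ignore, dtype=np.int64)
    labels[prob >= hi] = 1
    labels[prob <= lo] = 0
    return labels


# Toy usage with random embeddings: 16 image locations, 3 attributes, d = 8.
rng = np.random.default_rng(0)
image_tokens = rng.normal(size=(16, 8))
attr_tokens = rng.normal(size=(3, 8))
fused, weights = cross_attention(image_tokens, attr_tokens)
pseudo = high_confidence_pseudo_labels(np.array([0.95, 0.50, 0.05]))
```

In a self-training loop of the kind the abstract describes, pixels marked `ignore` would typically be excluded from the loss in each round, so that only confident predictions supervise the next refinement step; how AKGNet handles this exactly is not stated in the abstract.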
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
En, Q., Guo, Y. (2024). AKGNet: Attribute Knowledge Guided Unsupervised Lung-Infected Area Segmentation. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol 14943. Springer, Cham. https://doi.org/10.1007/978-3-031-70352-2_16
DOI: https://doi.org/10.1007/978-3-031-70352-2_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70351-5
Online ISBN: 978-3-031-70352-2
eBook Packages: Computer Science (R0)