
CP-CLIP: Core-Periphery Feature Alignment CLIP for Zero-Shot Medical Image Analysis

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2024 (MICCAI 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15003)


Abstract

Multi-modality learning, exemplified by the CLIP model pre-trained on language-image pairs, has demonstrated remarkable zero-shot capabilities and has gained significant attention in the field. However, directly applying language-image pre-trained CLIP to medical image analysis encounters a substantial domain shift, resulting in significant performance degradation due to the inherent disparities between natural (non-medical) and medical image characteristics. To address this challenge and uphold, or even enhance, CLIP's zero-shot capability in medical image analysis, we develop a novel framework, Core-Periphery feature alignment for CLIP (CP-CLIP), tailored for medical images and their corresponding clinical reports. Leveraging the core-periphery organization that has been widely observed in brain networks, we augment CLIP with a novel core-periphery-guided auxiliary network. This auxiliary CP network not only aligns text and image features into a unified latent space more efficiently but also ensures that the alignment is driven by domain-specific core information in medical images and clinical reports. In this way, our approach effectively mitigates the domain shift and further enhances CLIP's zero-shot performance in medical image analysis. Moreover, CP-CLIP exhibits strong explanatory capability, enabling automatic identification of critical regions in clinical analysis. Extensive experiments across five public datasets demonstrate the superiority of CP-CLIP in zero-shot medical image prediction and critical area detection, showing its promise for multimodal feature alignment in medical applications.
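
To make the core-periphery alignment idea concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it assumes frozen 512-dimensional CLIP image and text embeddings, a simple block-structured core-periphery mask (core units connect to all units, periphery units connect only to core units), and a CLIP-style symmetric contrastive loss. The names CorePeripheryLinear, CPAlignmentHead, and core_ratio, as well as the layer sizes, are illustrative assumptions.

```python
# Hypothetical sketch of a core-periphery (CP) guided alignment head on top of
# frozen CLIP features. The CP mask and class names are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


def core_periphery_mask(n_in: int, n_out: int, core_ratio: float = 0.25) -> torch.Tensor:
    """Binary mask with a core-periphery block structure: core units connect
    to everything; periphery units connect only to core units."""
    core_in, core_out = int(n_in * core_ratio), int(n_out * core_ratio)
    mask = torch.zeros(n_out, n_in)
    mask[:core_out, :] = 1.0  # core outputs receive from all inputs
    mask[:, :core_in] = 1.0   # every output receives from core inputs
    return mask


class CorePeripheryLinear(nn.Module):
    """Linear layer whose weights are gated by a fixed core-periphery mask."""
    def __init__(self, n_in: int, n_out: int, core_ratio: float = 0.25):
        super().__init__()
        self.linear = nn.Linear(n_in, n_out)
        self.register_buffer("mask", core_periphery_mask(n_in, n_out, core_ratio))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.linear.weight * self.mask, self.linear.bias)


class CPAlignmentHead(nn.Module):
    """Projects CLIP image/text features through CP-masked layers and aligns
    them in a shared latent space with a symmetric contrastive loss."""
    def __init__(self, dim: int = 512, hidden: int = 1024, core_ratio: float = 0.25):
        super().__init__()
        self.img_proj = nn.Sequential(
            CorePeripheryLinear(dim, hidden, core_ratio), nn.GELU(),
            CorePeripheryLinear(hidden, dim, core_ratio),
        )
        self.txt_proj = nn.Sequential(
            CorePeripheryLinear(dim, hidden, core_ratio), nn.GELU(),
            CorePeripheryLinear(hidden, dim, core_ratio),
        )
        self.logit_scale = nn.Parameter(torch.tensor(2.659))  # log(1/0.07), CLIP-style init

    def forward(self, img_feat: torch.Tensor, txt_feat: torch.Tensor) -> torch.Tensor:
        z_i = F.normalize(self.img_proj(img_feat), dim=-1)
        z_t = F.normalize(self.txt_proj(txt_feat), dim=-1)
        logits = self.logit_scale.exp() * z_i @ z_t.t()
        labels = torch.arange(z_i.size(0), device=z_i.device)
        return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels))


# Usage with random stand-ins for a batch of 8 CLIP image/report embeddings:
head = CPAlignmentHead()
loss = head(torch.randn(8, 512), torch.randn(8, 512))
loss.backward()
```

In this sketch the CP mask concentrates connectivity on a small set of core units, which is one plausible way to bias the alignment toward domain-specific core information while keeping the auxiliary network lightweight.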

Acknowledgments

This work was supported by National Institutes of Health (R01AG075582 and RF1NS128534).

Author information

Corresponding author

Correspondence to Dajiang Zhu.

Ethics declarations

Disclosure of Interests

The authors have no competing interests to declare that are relevant to the content of this article.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yu, X., Wu, Z., Zhang, L., Zhang, J., Lyu, Y., Zhu, D. (2024). CP-CLIP: Core-Periphery Feature Alignment CLIP for Zero-Shot Medical Image Analysis. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15003. Springer, Cham. https://doi.org/10.1007/978-3-031-72384-1_9

  • DOI: https://doi.org/10.1007/978-3-031-72384-1_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72383-4

  • Online ISBN: 978-3-031-72384-1
