Abstract
Multi-modality learning, exemplified by CLIP pre-trained on language-image pairs, has demonstrated remarkable zero-shot capabilities and has gained significant attention in the field. However, directly applying language-image pre-trained CLIP to medical image analysis incurs a substantial domain shift, resulting in significant performance degradation due to inherent disparities between natural (non-medical) and medical image characteristics. To address this challenge and uphold or even enhance CLIP's zero-shot capability in medical image analysis, we develop a novel framework, Core-Periphery feature alignment for CLIP (CP-CLIP), tailored for medical images and their corresponding clinical reports. Leveraging the core-periphery organization widely observed in brain networks, we augment CLIP with a novel core-periphery-guided neural network. This auxiliary CP network not only aligns text and image features into a unified latent space more efficiently but also ensures that the alignment is driven by domain-specific core information in medical images and clinical reports. In this way, our approach effectively mitigates the domain shift and further enhances CLIP's zero-shot performance in medical image analysis. Moreover, CP-CLIP exhibits strong explanatory capability, enabling the automatic identification of critical regions in clinical analysis. Extensive experiments across five public datasets underscore the superiority of CP-CLIP in zero-shot medical image prediction and critical area detection, showing its promising utility for multimodal feature alignment in medical applications.
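The alignment objective that CP-CLIP builds upon is the standard CLIP-style symmetric contrastive loss, which pulls matched image-text embedding pairs together and pushes mismatched pairs apart in a shared latent space. The following is a minimal NumPy sketch of that generic objective only; it does not implement the paper's core-periphery-guided network, and all function names and the temperature value are illustrative assumptions.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Project embeddings onto the unit hypersphere."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    img_emb, txt_emb: (B, D) arrays; row i of each is a matched pair.
    Returns a scalar loss that is small when matched pairs are the
    most similar entries in the batch similarity matrix.
    """
    img = l2_normalize(img_emb)
    txt = l2_normalize(txt_emb)
    logits = img @ txt.T / temperature       # (B, B) cosine similarities
    labels = np.arange(len(logits))          # matched pairs lie on the diagonal

    def cross_entropy(lg):
        lg = lg - lg.max(axis=1, keepdims=True)           # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # average the image-to-text and text-to-image directions
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

As a sanity check, a batch whose image and text embeddings are identical (perfectly aligned) should score a lower loss than the same batch with the text rows shuffled, since shuffling moves the true matches off the diagonal.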
Acknowledgments
This work was supported by National Institutes of Health (R01AG075582 and RF1NS128534).
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yu, X., Wu, Z., Zhang, L., Zhang, J., Lyu, Y., Zhu, D. (2024). CP-CLIP: Core-Periphery Feature Alignment CLIP for Zero-Shot Medical Image Analysis. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15003. Springer, Cham. https://doi.org/10.1007/978-3-031-72384-1_9
DOI: https://doi.org/10.1007/978-3-031-72384-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72383-4
Online ISBN: 978-3-031-72384-1
eBook Packages: Computer Science (R0)