Data-Efficient Radiology Report Generation via Similar Report Features Enhancement

Li, Yanfeng; Sun, Jinghan; Wang, Liansheng

doi:10.1007/978-3-031-82007-6_24

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15384)

Included in the following conference series:

International Workshop on Applications of Medical AI

Abstract

Automatically generating radiology reports with artificial intelligence is a promising way to make the diagnostic process more efficient and to reduce human error. However, existing methods require training on large datasets of image-report pairs, which are often scarce, and the accuracy of the generated reports drops significantly when paired data is limited. To address these challenges, this study introduces a data-efficient method that combines retrieval of similar reports with text fusion enhancement to generate accurate radiology reports despite the scarcity of image-report pairs. Compared with several state-of-the-art approaches trained on the same limited set of data pairs, our method shows improvements on the MIMIC-CXR and IU X-ray benchmarks. It achieves near-optimal results on MIMIC-CXR and comparable results on IU X-ray, highlighting its effectiveness, its potential to improve radiological diagnosis with fewer image-report pairs, and its ability to generate more accurate reports. By enhancing cross-modal feature interaction and demonstrating higher diagnostic accuracy, this work contributes to the fields of clinical medicine and artificial intelligence.
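
The abstract names two ingredients, retrieval of similar reports and text fusion, without giving details. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes that images and reports are already embedded in a shared vector space, that similarity is measured by cosine similarity, and that fusion is a plain concatenation of the image vector with the mean of the retrieved report vectors. The function names `cosine`, `retrieve_similar`, and `fuse` and the toy embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_similar(image_vec, report_vecs, k=2):
    """Return the indices of the k report embeddings most similar to image_vec."""
    ranked = sorted(
        range(len(report_vecs)),
        key=lambda i: cosine(image_vec, report_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


def fuse(image_vec, report_vecs, indices):
    """Concatenate the image vector with the mean of the retrieved report vectors.

    Assumption: simple concatenation stands in for whatever fusion the paper uses.
    """
    dim = len(report_vecs[0])
    mean = [sum(report_vecs[i][d] for i in indices) / len(indices) for d in range(dim)]
    return list(image_vec) + mean


# Toy, hand-made embeddings (hypothetical values, for illustration only).
image_vec = [1.0, 0.0]
report_vecs = [[0.9, 0.1], [0.0, 1.0], [1.0, 0.2]]

top = retrieve_similar(image_vec, report_vecs, k=2)
fused = fuse(image_vec, report_vecs, top)
```

In a real system the fused vector would condition a text decoder; here it only shows how retrieved report features can be attached to the image representation.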

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 62371409).

Author information

Authors and Affiliations

Department of Computer Science at School of Informatics, Xiamen University, Xiamen, China
Yanfeng Li & Liansheng Wang
National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
Jinghan Sun & Liansheng Wang

Corresponding author

Correspondence to Liansheng Wang.

Editor information

Editors and Affiliations

University of Pittsburgh, Pittsburgh, PA, USA
Shandong Wu
National Institute of Biomedical Imaging and Bioengineering, Bethesda, MD, USA
Behrouz Shabestari
Stanford University, Stanford, CA, USA
Lei Xing

Ethics declarations

Disclosure of Interests

The authors have no competing interests to declare that are relevant to the content of this article.

About this paper

Cite this paper

Li, Y., Sun, J., Wang, L. (2025). Data-Efficient Radiology Report Generation via Similar Report Features Enhancement. In: Wu, S., Shabestari, B., Xing, L. (eds) Applications of Medical Artificial Intelligence. AMAI 2024. Lecture Notes in Computer Science, vol 15384. Springer, Cham. https://doi.org/10.1007/978-3-031-82007-6_24

DOI: https://doi.org/10.1007/978-3-031-82007-6_24
Published: 08 February 2025
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-82006-9
Online ISBN: 978-3-031-82007-6
eBook Packages: Computer Science, Computer Science (R0)
