Abstract
Automatically generating radiology reports with artificial intelligence is a promising way to make the diagnostic process more efficient and to reduce human error. However, existing methods must be trained on large datasets of image-report pairs, which are often scarce, and report accuracy drops significantly when only limited paired data are available. To address these challenges, this study introduces a data-efficient method that combines retrieval of similar reports with text fusion enhancement to generate accurate radiology reports despite the scarcity of image-report pairs. Compared with several state-of-the-art approaches trained on the same limited data pairs, our method shows improvements on the MIMIC-CXR and IU X-ray benchmarks: it achieves near-optimal results on MIMIC-CXR and comparable results on IU X-ray. These results highlight its potential to improve radiological diagnosis with fewer image-report pairs and its ability to generate more accurate reports. By enhancing cross-modal feature interaction and demonstrating higher diagnostic accuracy, this work contributes to the fields of clinical medicine and artificial intelligence.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant No. 62371409).
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Y., Sun, J., Wang, L. (2025). Data-Efficient Radiology Report Generation via Similar Report Features Enhancement. In: Wu, S., Shabestari, B., Xing, L. (eds) Applications of Medical Artificial Intelligence. AMAI 2024. Lecture Notes in Computer Science, vol 15384. Springer, Cham. https://doi.org/10.1007/978-3-031-82007-6_24
DOI: https://doi.org/10.1007/978-3-031-82007-6_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-82006-9
Online ISBN: 978-3-031-82007-6
eBook Packages: Computer Science, Computer Science (R0)