Data-Efficient Radiology Report Generation via Similar Report Features Enhancement

  • Conference paper
Applications of Medical Artificial Intelligence (AMAI 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15384)

Abstract

Using artificial intelligence to generate radiology reports automatically is a promising way to make the diagnostic process more efficient and to reduce human error. However, existing methods must be trained on large datasets of image-report pairs, which are often scarce, and report accuracy drops significantly when only limited paired data are available. To address these challenges, this study introduces a data-efficient method that combines retrieval of similar reports with text fusion enhancement to generate accurate radiology reports from few image-report pairs. Compared with several state-of-the-art approaches trained on the same limited data pairs, our method shows improvements on the MIMIC-CXR and IU X-ray benchmarks: it achieves near-optimal results on MIMIC-CXR and comparable results on IU X-ray. These results highlight its effectiveness, its potential to improve radiological diagnosis with fewer image-report pairs, and its ability to generate more accurate reports. By enhancing cross-modal feature interaction and demonstrating higher diagnostic accuracy, this work contributes to the fields of clinical medicine and artificial intelligence.
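The abstract does not specify how similar reports are retrieved or fused, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes that images and reports have already been embedded into a shared vector space, that similarity is measured by cosine similarity, and that fusion is a simple weighted average; the function names, the `alpha` weight, and the toy embeddings are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_similar_reports(query_emb, report_bank, k=2):
    """Return the k reports most similar to the query as (score, text, embedding)."""
    scored = [(cosine(query_emb, emb), text, emb) for text, emb in report_bank]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def fuse_features(query_emb, retrieved_embs, alpha=0.5):
    """Blend the query feature with the mean of the retrieved report features.

    alpha is a hypothetical mixing weight: 0 keeps only the query feature,
    1 keeps only the averaged report features.
    """
    if not retrieved_embs:
        return list(query_emb)
    n = len(retrieved_embs)
    mean_emb = [sum(col) / n for col in zip(*retrieved_embs)]
    return [(1 - alpha) * q + alpha * m for q, m in zip(query_emb, mean_emb)]


# Toy data (assumed): three reports with hand-made 3-dimensional embeddings.
report_bank = [
    ("No acute cardiopulmonary abnormality.", [0.9, 0.1, 0.0]),
    ("Mild cardiomegaly without edema.", [0.1, 0.9, 0.1]),
    ("Right lower lobe opacity, possible pneumonia.", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]  # stands in for an image embedding

top = retrieve_similar_reports(query, report_bank, k=2)
fused = fuse_features(query, [emb for _, _, emb in top], alpha=0.5)

for score, text, _ in top:
    print(f"{score:.3f}  {text}")
print("fused feature:", fused)
```

In a real system the embeddings would come from trained image and text encoders, and the fused feature would condition a report decoder; both are outside the scope of this sketch.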

Acknowledgments

This work was supported by National Natural Science Foundation of China (Grant No. 62371409).

Author information

Corresponding author

Correspondence to Liansheng Wang.

Ethics declarations

Disclosure of Interests

The authors have no competing interests to declare that are relevant to the content of this article.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Li, Y., Sun, J., Wang, L. (2025). Data-Efficient Radiology Report Generation via Similar Report Features Enhancement. In: Wu, S., Shabestari, B., Xing, L. (eds) Applications of Medical Artificial Intelligence. AMAI 2024. Lecture Notes in Computer Science, vol 15384. Springer, Cham. https://doi.org/10.1007/978-3-031-82007-6_24

  • DOI: https://doi.org/10.1007/978-3-031-82007-6_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-82006-9

  • Online ISBN: 978-3-031-82007-6

  • eBook Packages: Computer Science (R0)
