Radiological Report Generation from Chest X-ray Images Using Pre-trained Word Embeddings

Wireless Personal Communications

Abstract

Deep neural networks have assisted radiologists to a large extent by automating the process of radiological report generation. Most researchers have focused on improving the learning focus of the model using attention mechanisms, reinforcement learning, and other techniques, but few have considered the textual information present in the ground-truth radiological reports. In downstream language tasks such as text classification, word embeddings play a vital role in extracting textual features. Inspired by this, we empirically study the impact of different word embedding techniques on the radiological report generation task. In this work, we use a convolutional neural network and a large language model to extract visual and textual features, respectively, and a recurrent neural network to generate the reports. The proposed method outperforms most state-of-the-art methods, achieving the following evaluation metric scores: BLEU-1 = 0.612, BLEU-2 = 0.610, BLEU-3 = 0.608, BLEU-4 = 0.606, ROUGE = 0.811, and CIDEr = 0.317. This work confirms that a pre-trained large language model yields significantly better results than other word embedding techniques.
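
To make the pipeline described above concrete, the sketch below shows one plausible PyTorch realization: a pre-trained CNN encodes the chest X-ray, a frozen pre-trained word embedding table (here taken from BERT) supplies the textual features, and a recurrent decoder generates the report. The paper does not release its implementation, so the backbone (VGG19), the embedding source (bert-base-uncased), and all layer sizes are illustrative assumptions rather than the authors' exact configuration.

```python
# Hypothetical sketch of the CNN + pre-trained-embedding + RNN pipeline;
# model choices and sizes are assumptions, not the paper's configuration.
import torch
import torch.nn as nn
from torchvision import models
from transformers import AutoModel

class ReportGenerator(nn.Module):
    def __init__(self, hidden_size=512):
        super().__init__()
        # Visual encoder: pre-trained CNN with the classifier head removed.
        vgg = models.vgg19(weights="DEFAULT")
        self.cnn = vgg.features                       # conv feature maps
        self.pool = nn.AdaptiveAvgPool2d(1)           # global average pooling
        self.img_proj = nn.Linear(512, hidden_size)   # 512 = VGG19 channels

        # Textual features: frozen pre-trained word embeddings from BERT.
        bert = AutoModel.from_pretrained("bert-base-uncased")
        emb = bert.get_input_embeddings().weight.detach().clone()
        self.embed = nn.Embedding.from_pretrained(emb, freeze=True)

        # Report decoder: an RNN conditioned on the image feature.
        self.rnn = nn.GRU(emb.size(1), hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, emb.size(0))  # vocabulary logits

    def forward(self, images, token_ids):
        v = self.pool(self.cnn(images)).flatten(1)      # (B, 512)
        h0 = torch.tanh(self.img_proj(v)).unsqueeze(0)  # init decoder state
        x = self.embed(token_ids)                       # (B, T, 768)
        y, _ = self.rnn(x, h0)
        return self.out(y)                              # next-token logits
```

The BLEU-n scores quoted above can be computed with standard toolkits. The paper does not name its evaluation library, so the following NLTK example, with made-up reference and candidate sentences, is purely illustrative:

```python
# Illustrative BLEU-1..4 computation with NLTK; sentences are invented.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["the cardiac silhouette is within normal limits".split()]
candidate = "the cardiac silhouette is normal in size".split()

smooth = SmoothingFunction().method1
for n in range(1, 5):
    weights = tuple(1.0 / n for _ in range(n))  # uniform n-gram weights
    score = sentence_bleu(reference, candidate, weights=weights,
                          smoothing_function=smooth)
    print(f"BLEU-{n}: {score:.3f}")
```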


Data Availability

We used a standard publicly available dataset, which is available at https://openi.nlm.nih.gov/faq#collection

Code Availability

The code for this internal research study is available upon request.


Funding

This work is supported by funds received from King Abdulaziz University, Jeddah, Saudi Arabia.

Author information


Contributions

FSA: Methodology, Software, Writing and revising the manuscript critically for important intellectual content, Supervision. NK: Conception and design of study, Acquisition of data, Analysis and interpretation of data, Methodology, Software, Writing and revising the manuscript.

Corresponding author

Correspondence to Navdeep Kaur.

Ethics declarations

Conflict of interest

The authors affirm that they have no known financial or personal relationships or conflicts of interest that could have influenced the work presented in this paper.

Consent to Participate

This study uses a standard publicly available dataset; thus, no consent to participate is required.

Consent to Publish

This study uses a standard publicly available dataset; thus, no consent to publish is required.

Ethical Approval

This study uses a standard publicly available dataset; thus, no ethical approval is required.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Alotaibi, F.S., Kaur, N. Radiological Report Generation from Chest X-ray Images Using Pre-trained Word Embeddings. Wireless Pers Commun 133, 2525–2540 (2023). https://doi.org/10.1007/s11277-024-10886-x

