Radiological Report Generation from Chest X-ray Images Using Pre-trained Word Embeddings

Alotaibi, Fahd Saleh; Kaur, Navdeep

doi:10.1007/s11277-024-10886-x

Published: 24 February 2024

Wireless Personal Communications, Volume 133, pages 2525–2540 (2023)

Abstract

Deep neural networks have helped radiologists considerably by automating the generation of radiological reports. Most researchers have focused on improving the learning focus of the model using attention mechanisms, reinforcement learning, and other techniques. Most of them have not considered the textual information present in the ground-truth radiological reports. In downstream language tasks such as text classification, word embeddings have played a vital role in extracting textual features. Inspired by this, we empirically study the impact of different word embedding techniques on the radiological report generation task. In this work, we use a convolutional neural network and a large language model to extract visual and textual features, respectively. A recurrent neural network is used to generate the reports. The proposed method outperforms most state-of-the-art methods, achieving the following evaluation scores: BLEU-1 = 0.612, BLEU-2 = 0.610, BLEU-3 = 0.608, BLEU-4 = 0.606, ROUGE = 0.811, and CIDEr = 0.317. This work confirms that a pre-trained large language model gives significantly better results than other word embedding techniques.
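The BLEU-n scores quoted in the abstract measure n-gram overlap between a generated report and its ground-truth report. The following is a minimal sketch of sentence-level BLEU with uniform weights, a single reference, and no smoothing. It is not the authors' evaluation code, and the abstract does not say which toolkit they used. The function names (`ngram_counts`, `modified_precision`, `bleu`) and the example sentences are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate, reference, max_n):
    """Sentence-level BLEU-max_n: brevity penalty times the geometric mean
    of the clipped precisions for n = 1..max_n (no smoothing)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    # Candidates shorter than the reference are penalised.
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Hypothetical report sentences, invented for this example.
reference = "the heart size is normal".split()
candidate = "the heart is normal".split()
for n in (1, 2):
    print(f"BLEU-{n} = {bleu(candidate, reference, n):.3f}")
```

Published scores are normally computed at corpus level over the whole test set, often with an established evaluation package, so numbers from a per-sentence sketch like this are not directly comparable to those in the abstract.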

Data Availability

We used a standard, publicly available dataset, which can be accessed at https://openi.nlm.nih.gov/faq#collection

Code Availability

The code for this internal research study is available upon request.

Funding

This work is supported by funds received from King Abdulaziz University, Jeddah, Saudi Arabia.

Author information

Authors and Affiliations

Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
Fahd Saleh Alotaibi
Department of Computer Science and Applications, Mehr Chand Mahajan DAV College for Women, Chandigarh, India
Navdeep Kaur

Contributions

FSA: Methodology, Software, Writing and Revising the manuscript critically for important intellectual content, Supervision. NK: Conception and Design of Study, Acquisition of Data, Analysis and interpretation of Data, Methodology, Software, Writing and Revising the manuscript.

Corresponding author

Correspondence to Navdeep Kaur.

Ethics declarations

Conflict of interest

The authors declare that they have no known financial or personal relationships or conflicts of interest that could have appeared to influence the work presented in this paper.

Consent to Participate

This study uses a standard, publicly available dataset; thus, no consent to participate is required.

Consent to Publish

This study uses a standard, publicly available dataset; thus, no consent to publish is required.

Ethical Approval

This study uses a standard, publicly available dataset; thus, no ethical approval is required.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Alotaibi, F.S., Kaur, N. Radiological Report Generation from Chest X-ray Images Using Pre-trained Word Embeddings. Wireless Pers Commun 133, 2525–2540 (2023). https://doi.org/10.1007/s11277-024-10886-x

Accepted: 29 January 2024
Published: 24 February 2024
Issue Date: December 2023
DOI: https://doi.org/10.1007/s11277-024-10886-x
