Knowledge Graph Embeddings for Multi-lingual Structured Representations of Radiology Reports

  • Conference paper
Data Augmentation, Labelling, and Imperfections (MICCAI 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14379)

Abstract

The way we analyse clinical texts has changed considerably in recent years. The introduction of language models such as BERT led to adaptations for the (bio)medical domain, such as PubMedBERT and ClinicalBERT. These models rely on large databases of archived medical documents. While they perform well in terms of accuracy, their lack of interpretability and limited transferability across languages restrict their use in clinical settings. We introduce a novel lightweight graph-based embedding method designed specifically for radiology reports. It takes into account the structure and composition of the report, while also connecting medical terms in the report through the multi-lingual SNOMED Clinical Terms knowledge base. The resulting graph embedding uncovers the underlying relationships among clinical terms, yielding a representation that is easier for clinicians to understand and clinically more accurate, without relying on large pre-training datasets. We demonstrate this embedding on two tasks: disease classification of X-ray reports and image classification. For disease classification, our model is competitive with its BERT-based counterparts while being orders of magnitude smaller in size and training data requirements. For image classification, we show the effectiveness of the graph embedding in leveraging cross-modal knowledge transfer and show that the method is usable across different languages.
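
To make the idea of a report graph concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a report is given as sections containing sentences, and it links sentences to concept nodes through a tiny hand-written lexicon. The lexicon, the concept identifiers (placeholders, not real SNOMED CT codes), and the function names are all hypothetical. It only shows why mapping terms to shared concepts can make reports in different languages comparable; learning an embedding from the graph is out of scope here.

```python
from collections import defaultdict

# Hypothetical toy lexicon: surface term -> (concept id, parent concept id).
# The ids are placeholders for illustration, NOT real SNOMED CT codes.
TOY_LEXICON = {
    "pleural effusion": ("C_EFFUSION", "C_PLEURAL_DISORDER"),
    "derrame pleural": ("C_EFFUSION", "C_PLEURAL_DISORDER"),
    "pneumothorax": ("C_PNEUMOTHORAX", "C_PLEURAL_DISORDER"),
}


def build_report_graph(report):
    """Build an undirected adjacency map for a single report.

    `report` maps a section name to a list of sentences. Nodes are tuples:
    ("report",), ("section", name), ("sentence", name, index), ("concept", id).
    """
    graph = defaultdict(set)

    def link(a, b):
        # Store each edge in both directions.
        graph[a].add(b)
        graph[b].add(a)

    root = ("report",)
    for section, sentences in report.items():
        section_node = ("section", section)
        link(root, section_node)
        for index, sentence in enumerate(sentences):
            sentence_node = ("sentence", section, index)
            link(section_node, sentence_node)
            lowered = sentence.lower()
            # Naive substring matching stands in for real concept extraction.
            for term, (concept, parent) in TOY_LEXICON.items():
                if term in lowered:
                    link(sentence_node, ("concept", concept))
                    link(("concept", concept), ("concept", parent))
    return dict(graph)


def concept_nodes(graph):
    """Return the set of concept nodes present in a report graph."""
    return {node for node in graph if node[0] == "concept"}


english = build_report_graph({"findings": ["Small left pleural effusion."]})
spanish = build_report_graph({"hallazgos": ["Pequeño derrame pleural izquierdo."]})
```

In this toy setting, `concept_nodes(english)` and `concept_nodes(spanish)` contain the same two concept nodes even though the sentences share no words, which is the property a multi-lingual terminology is meant to provide.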

Notes

  1. github.com/tjvsonsbeek/knowledge_graphs_for_radiology_reports.git

Acknowledgements

This work is financially supported by the Inception Institute of Artificial Intelligence, the University of Amsterdam and the allowance Top consortia for Knowledge and Innovation (TKIs) from the Netherlands Ministry of Economic Affairs and Climate Policy.

Author information

Corresponding author

Correspondence to Tom van Sonsbeek.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

van Sonsbeek, T., Zhen, X., Worring, M. (2024). Knowledge Graph Embeddings for Multi-lingual Structured Representations of Radiology Reports. In: Xue, Y., Chen, C., Chen, C., Zuo, L., Liu, Y. (eds) Data Augmentation, Labelling, and Imperfections. MICCAI 2023. Lecture Notes in Computer Science, vol 14379. Springer, Cham. https://doi.org/10.1007/978-3-031-58171-7_9

  • DOI: https://doi.org/10.1007/978-3-031-58171-7_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-58170-0

  • Online ISBN: 978-3-031-58171-7

  • eBook Packages: Computer Science, Computer Science (R0)
