Abstract
The way we analyse clinical texts has changed substantially in recent years. The introduction of language models such as BERT led to adaptations for the (bio)medical domain, such as PubMedBERT and ClinicalBERT. These models rely on large databases of archived medical documents. While they perform well in terms of accuracy, their lack of interpretability and limited transferability across languages restrict their use in clinical settings. We introduce a novel light-weight graph-based embedding method specifically catering to radiology reports. It takes into account the structure and composition of the report, while also connecting medical terms in the report through the multi-lingual SNOMED Clinical Terms knowledge base. The resulting graph embedding uncovers the underlying relationships among clinical terms, achieving a representation that is more understandable for clinicians and clinically more accurate, without reliance on large pre-training datasets. We demonstrate this embedding on two tasks, namely disease classification of X-ray reports and image classification. For disease classification, our model is competitive with its BERT-based counterparts, while being orders of magnitude smaller in both model size and training data requirements. For image classification, we show the effectiveness of the graph embedding when leveraging cross-modal knowledge transfer, and show how this method can be used across different languages.
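The idea of linking report structure and medical terms through a language-independent concept vocabulary can be illustrated with a small sketch. This is not the authors' implementation: the lexicon, the concept IDs (`C1` to `C4`), the `finding_site` relation, and the function names are all hypothetical placeholders, and the naive substring matching stands in for a real concept extractor. It only shows why an English and a Spanish report describing the same findings could map to the same graph.

```python
# Hypothetical toy lexicon: surface term -> placeholder concept ID.
# The IDs are made up for illustration; they are NOT real SNOMED CT codes.
TOY_LEXICON = {
    "en": {"pleural effusion": "C1", "cardiomegaly": "C2", "lung": "C3", "heart": "C4"},
    "es": {"derrame pleural": "C1", "cardiomegalia": "C2", "pulmón": "C3", "corazón": "C4"},
}

# Hypothetical concept-to-concept relations standing in for a knowledge base.
TOY_RELATIONS = [("C1", "finding_site", "C3"), ("C2", "finding_site", "C4")]


def build_report_graph(sections, lang):
    """Build an adjacency map: report -> sections -> concepts -> related concepts.

    sections maps a section name to its free text; lang selects the lexicon.
    """
    graph = {"report": set()}
    found = set()
    for name, text in sections.items():
        section_node = f"section:{name}"
        graph["report"].add(section_node)
        graph.setdefault(section_node, set())
        lowered = text.lower()
        # Naive substring lookup; a real system would use a proper term matcher.
        for term, concept in TOY_LEXICON[lang].items():
            if term in lowered:
                graph[section_node].add(concept)
                found.add(concept)
    # Connect mentioned concepts to related concepts from the toy knowledge base.
    for head, _relation, tail in TOY_RELATIONS:
        if head in found:
            graph.setdefault(head, set()).add(tail)
    return {node: sorted(neighbours) for node, neighbours in graph.items()}


def concept_nodes(graph):
    """Collect every node that is neither the report root nor a section."""
    nodes = set()
    for node, neighbours in graph.items():
        for candidate in [node, *neighbours]:
            if candidate != "report" and not candidate.startswith("section:"):
                nodes.add(candidate)
    return nodes


english = build_report_graph(
    {"findings": "Small left pleural effusion. Cardiomegaly is present."}, "en"
)
spanish = build_report_graph(
    {"findings": "Pequeño derrame pleural izquierdo. Cardiomegalia."}, "es"
)
```

Because both reports resolve to the same placeholder concept IDs, `english` and `spanish` end up as identical graphs, which is the property a downstream graph encoder could exploit when moving between languages.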
Notes
1. github.com/tjvsonsbeek/knowledge_graphs_for_radiology_reports.git
Acknowledgements
This work is financially supported by the Inception Institute of Artificial Intelligence, the University of Amsterdam and the allowance Top consortia for Knowledge and Innovation (TKIs) from the Netherlands Ministry of Economic Affairs and Climate Policy.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
van Sonsbeek, T., Zhen, X., Worring, M. (2024). Knowledge Graph Embeddings for Multi-lingual Structured Representations of Radiology Reports. In: Xue, Y., Chen, C., Chen, C., Zuo, L., Liu, Y. (eds) Data Augmentation, Labelling, and Imperfections. MICCAI 2023. Lecture Notes in Computer Science, vol 14379. Springer, Cham. https://doi.org/10.1007/978-3-031-58171-7_9
DOI: https://doi.org/10.1007/978-3-031-58171-7_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58170-0
Online ISBN: 978-3-031-58171-7
eBook Packages: Computer Science, Computer Science (R0)