Relation Extraction from Clinical Cases for a Knowledge Graph

  • Conference paper
New Trends in Database and Information Systems (ADBIS 2022)

Abstract

We describe a system for automatic extraction of semantic relations between entities in a medical corpus of clinical cases. It builds upon a previously developed module for entity extraction and upon a morphosyntactic parser. It uses experimentally designed rules based on syntactic dependencies and trigger words, as well as on sequencing and nesting of entities of particular types. The results obtained on a small corpus are promising. Our larger perspective is transforming information extracted from medical texts into knowledge graphs.

Notes

  1. Extract from PubMed: https://pubmed.ncbi.nlm.nih.gov/35365471/.

  2. https://www.slideshare.net/lyonwj/natural-language-processing-with-graph-databases-and-neo4j.

  3. Conditional Random Fields [15] are probabilistic models often used in NLP for sequence labelling tasks as they take into account the context of the samples to label.

  4. Entity types are listed in French and with the English translation, if it differs from French. In the rest of the paper, the English names are used in the core of the texts, while the French ones appear in figures. The entity types substance, anatomy and treatment will sometimes be abbreviated by sub, anat and treat, respectively.

  5. https://spacy.io/.

  6. https://spacy.io/models/fr.

  7. https://github.com/UniversalDependencies/UD_French-Sequoia.

  8. Word \(w_1\) is a syntactic head of word \(w_2\) if there is a syntactic dependency link outgoing from \(w_1\) and incoming in \(w_2\). Most dependency parsing models ensure that each word (except the root of the sentence) has exactly one head, i.e. the dependency graph is a tree.

  9. An asterisk following an entity type signals a negated entity occurrence.
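
The definition of a syntactic head in note 8 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `is_head_of` and `heads_form_tree`, the encoding of a parse as a list of head indices, and the toy French sentence with its hand-written heads are all assumptions made for illustration only.

```python
from typing import List, Optional

# A parse is encoded as a list: heads[i] is the index of the head of word i,
# or None if word i is the root of the sentence (hypothetical encoding).
Heads = List[Optional[int]]


def is_head_of(heads: Heads, w1: int, w2: int) -> bool:
    """True if there is a dependency link outgoing from w1 and incoming in w2."""
    return heads[w2] == w1


def heads_form_tree(heads: Heads) -> bool:
    """True if there is exactly one root and every word reaches it via head links."""
    n = len(heads)
    roots = [i for i, h in enumerate(heads) if h is None]
    if len(roots) != 1:
        return False
    for start in range(n):
        seen = set()
        node = start
        # Follow head links upwards; revisiting a word means there is a cycle.
        while heads[node] is not None:
            if node in seen:
                return False
            seen.add(node)
            nxt = heads[node]
            if not 0 <= nxt < n:
                return False
            node = nxt
    return True


# Toy sentence (hand-annotated, illustrative only):
# "Le patient présente une fièvre"
tokens = ["Le", "patient", "présente", "une", "fièvre"]
heads: Heads = [1, 2, None, 4, 2]

print(is_head_of(heads, 2, 1))   # is "présente" the head of "patient"?
print(heads_form_tree(heads))    # does the head list describe a tree?
```

Because each word stores a single head, the "exactly one head" condition holds by construction in this encoding; the check only needs to confirm a unique root and the absence of cycles.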

References

  1. Abacha, A.B., Zweigenbaum, P.: Automatic extraction of semantic relations between medical entities: a rule based approach. J. Biomed. Semant. 2(Suppl 5), S4+ (2011)

  2. Amavi, J., Halfeld Ferrari, M., Hiot, N.: Natural language querying system through entity enrichment. In: Bellatreche, L., et al. (eds.) TPDL/ADBIS -2020. CCIS, vol. 1260, pp. 36–48. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-55814-7_3

  3. Bodenreider, O.: The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Res. 32(Database-Issue), 267–270 (2004)

  4. Campillos, L., Deléger, L., Grouin, C., Hamon, T., Ligozat, A.L., Névéol, A.: A French clinical corpus with comprehensive semantic annotations: development of the medical entity and relation LIMSI annotated text corpus (MERLOT). Lang. Resour. Eval. 52(2), 571–601 (2017)

  5. Cardon, R., Grabar, N., Grouin, C., Hamon, T.: Présentation de la campagne d’évaluation DEFT 2020 : similarité textuelle en domaine ouvert et extraction d’information précise dans des cas cliniques. In: Cardon, R., Grabar, N., Grouin, C., Hamon, T. (eds.) 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Atelier DÉfi Fouille de Textes, pp. 1–13. ATALA, Nancy, France (2020). https://hal.archives-ouvertes.fr/hal-02784737

  6. Culotta, A., Sorensen, J.: Dependency tree kernels for relation extraction. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-2004), pp. 423–429 (2004)

  7. Embarek, M., Ferret, O.: Learning patterns for building resources about semantic relations in the medical domain. In: Proceedings of the International Conference on Language Resources and Evaluation, LREC 2008, 26 May–1 June 2008, Marrakech, Morocco (2008)

  8. Fader, A., Soderland, S., Etzioni, O.: Identifying relations for open information extraction. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pp. 1535–1545. Association for Computational Linguistics, Edinburgh, Scotland, UK, July 2011. https://aclanthology.org/D11-1142

  9. Francis, N., et al.: Cypher: an evolving query language for property graphs. In: Das, G., Jermaine, C.M., Bernstein, P.A. (eds.) Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, 10–15 June 2018, pp. 1433–1445. ACM (2018)

  10. Franciscus, N., Ren, X., Stantic, B.: Dependency graph for short text extraction and summarization. J. Inf. Telecommun. 3(4), 413–429 (2019)

  11. Fundel, K., Küffner, R., Zimmer, R.: RelEx-relation extraction using dependency parse trees. Bioinformatics 23, 365–371 (2007)

  12. Grabar, N., Grouin, C., Hamon, T., Claveau, V.: Corpus annoté de cas cliniques en français. In: TALN 2019–26e Conference on Traitement Automatique des Langues Naturelles, pp. 1–14. Toulouse, France, July 2019. https://hal.archives-ouvertes.fr/hal-02391878

  13. Grouin, C., Grabar, N., Illouz, G.: Classification de cas cliniques et évaluation automatique de réponses d’étudiants : présentation de la campagne DEFT 2021. In: Denis, P., et al. (eds.) Traitement Automatique des Langues Naturelles, pp. 1–13. ATALA, Lille, France (2021). https://hal.archives-ouvertes.fr/hal-03265926

  14. Hearst, M.A.: Automatic acquisition of hyponyms from large text corpora. In: COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics (1992). https://aclanthology.org/C92-2082

  15. Lafferty, J.D., McCallum, A., Pereira, F.C.N.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the Eighteenth International Conference on Machine Learning, pp. 282–289. ICML 2001, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2001)

  16. Li, Z., Yang, Z., Shen, C., Xu, J., Zhang, Y., Xu, H.: Integrating shortest dependency path and sentence sequence into a deep learning framework for relation extraction in clinical text. BMC Med. Inform. Decis. Mak. 19, 22 (2019)

  17. Mihalcea, R., Tarau, P.: Textrank: bringing order into text. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, EMNLP 2004, A meeting of SIGDAT, a Special Interest Group of the ACL, held in conjunction with ACL 2004, 25–26 July 2004, Barcelona, Spain, pp. 404–411. ACL (2004)

  18. Minard, A.L., Ligozat, A.L., Grau, B.: Multi-class SVM for relation extraction from clinical reports. In: Recent Advances in Natural Language Processing, RANLP 2011, 12–14 September, 2011, Hissar, Bulgaria, pp. 604–609 (2011)

  19. Minard, A.L., Roques, A., Hiot, N., Halfeld Ferrari, M., Savary, A.: DOING@DEFT: cascade de CRF pour l’annotation d’entités cliniques imbriquées. In: Cardon, R., Grabar, N., Grouin, C., Hamon, T. (eds.) 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Atelier DÉfi Fouille de Textes, pp. 66–78. ATALA, Nancy, France (2020). https://hal.archives-ouvertes.fr/hal-02784743

  20. Mintz, M., Bills, S., Snow, R., Jurafsky, D.: Distant supervision for relation extraction without labeled data. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pp. 1003–1011. Association for Computational Linguistics, Suntec, Singapore, August 2009. https://aclanthology.org/P09-1113

  21. Ramadier, L., Lafourcade, M.: Patrons sémantiques pour l’extraction de relations entre termes - Application aux comptes rendus radiologiques. In: TALN: Traitement Automatique des Langues Naturelles. jep-taln2016, Paris, France, July 2016. https://hal.archives-ouvertes.fr/hal-01382323

  22. Rindflesch, T.C., Bean, C.A., Sneiderman, C.A.: Argument identification for arterial branching predications asserted in cardiac catheterization reports. In: AMIA Annual Symposium Proceedings, pp. 704–708 (2000)

  23. Uzuner, O., Mailoa, J., Ryan, R., Sibanda, T.: Semantic relations for problem-oriented medical records. Artif. Intell. Med. 50, 63–73 (2010)

Acknowledgements

This work was partly supported by the ICVL federation and RTR-DIAMS. It was carried out in the context of the DOING action of GDR-MADICS.

Author information

Corresponding author

Correspondence to Agata Savary.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Savary, A., Silvanovich, A., Minard, AL., Hiot, N., Halfeld Ferrari, M. (2022). Relation Extraction from Clinical Cases for a Knowledge Graph. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_33

  • DOI: https://doi.org/10.1007/978-3-031-15743-1_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15742-4

  • Online ISBN: 978-3-031-15743-1

  • eBook Packages: Computer Science, Computer Science (R0)
