Relation Extraction from Clinical Cases for a Knowledge Graph

Savary, Agata; Silvanovich, Alena; Minard, Anne-Lyse; Hiot, Nicolas; Halfeld Ferrari, Mirian

doi:10.1007/978-3-031-15743-1_33

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1652)

Included in the following conference series: European Conference on Advances in Databases and Information Systems


Abstract

We describe a system for automatic extraction of semantic relations between entities in a medical corpus of clinical cases. It builds upon a previously developed module for entity extraction and upon a morphosyntactic parser. It uses experimentally designed rules based on syntactic dependencies and trigger words, as well as on sequencing and nesting of entities of particular types. The results obtained on a small corpus are promising. Our larger perspective is transforming information extracted from medical texts into knowledge graphs.
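To make the idea of a rule combining a trigger word with syntactic dependencies more concrete, here is a minimal sketch. It is not the system described in the paper: the `Token` class, the `RULES` table, the trigger word "relieved", the relation label "treats" and the entity type "pathology" are all hypothetical and chosen only for illustration. The parse is hand-written rather than produced by a real parser.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Token:
    text: str
    head: int                     # index of the syntactic head; -1 marks the root
    entity: Optional[str] = None  # entity type from a prior entity-extraction step


# Hypothetical rule table (an assumption, not taken from the paper):
# trigger word -> (type of first argument, relation label, type of second argument)
RULES = {
    "relieved": ("substance", "treats", "pathology"),
}


def extract_relations(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Return (source, relation, target) triples licensed by a trigger word."""
    relations = []
    for i, tok in enumerate(tokens):
        rule = RULES.get(tok.text.lower())
        if rule is None:
            continue
        src_type, label, dst_type = rule
        # Keep only entities that are direct dependents of the trigger word.
        deps = [t for t in tokens if t.head == i and t.entity is not None]
        for src in deps:
            for dst in deps:
                if src.entity == src_type and dst.entity == dst_type:
                    relations.append((src.text, label, dst.text))
    return relations


# Hand-annotated toy sentence: "Aspirin relieved the headache"
sentence = [
    Token("Aspirin", head=1, entity="substance"),
    Token("relieved", head=-1),
    Token("the", head=3),
    Token("headache", head=1, entity="pathology"),
]
print(extract_relations(sentence))
```

A real rule set would presumably also cover indirect dependents, sequencing and nesting of entities, which this sketch leaves out.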


Notes

1. Extract from PubMed: https://pubmed.ncbi.nlm.nih.gov/35365471/
2. https://www.slideshare.net/lyonwj/natural-language-processing-with-graph-databases-and-neo4j
3. Conditional Random Fields [15] are probabilistic models often used in NLP for sequence labelling tasks as they take into account the context of the samples to label.
4. Entity types are listed in French and with the English translation, if it differs from French. In the rest of the paper, the English names are used in the core of the texts, while the French ones appear in figures. The entity types substance, anatomy and treatment will sometimes be abbreviated by sub, anat and treat, respectively.
5. https://spacy.io/
6. https://spacy.io/models/fr
7. https://github.com/UniversalDependencies/UD_French-Sequoia
8. Word \(w_1\) is a syntactic head of word \(w_2\) if there is a syntactic dependency link outgoing from \(w_1\) and incoming in \(w_2\). Most dependency parsing models ensure that each word (except the root of the sentence) has exactly one head, i.e. the dependency graph is a tree.
9. An asterisk following an entity type signals a negated entity occurrence.
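
The definition of a syntactic head in note 8 can be illustrated with a small sketch. It assumes a sentence is stored as a list `heads`, where `heads[i]` is the index of the head of word `i` and `-1` marks the root. This encoding and the function names `is_head` and `is_tree` are assumptions made for the example, not part of the paper. Storing a single head index per word already enforces "exactly one head"; the check below additionally verifies that there is a single root and no cycle.

```python
from typing import List


def is_head(heads: List[int], w1: int, w2: int) -> bool:
    """True if word w1 is the syntactic head of word w2."""
    return heads[w2] == w1


def is_tree(heads: List[int]) -> bool:
    """Check that the head list encodes a tree: one root, every word reaches it."""
    n = len(heads)
    # Exactly one word may be the root (head index -1).
    if sum(1 for h in heads if h == -1) != 1:
        return False
    for start in range(n):
        seen = set()
        node = start
        # Follow head links upward until the root is reached.
        while heads[node] != -1:
            if node in seen or not 0 <= heads[node] < n:
                return False  # cycle or dangling head index
            seen.add(node)
            node = heads[node]
    return True


# Toy parse of a four-word sentence whose second word is the root.
heads = [1, -1, 3, 1]
print(is_head(heads, 1, 0), is_tree(heads))
```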

References

Abacha, A.B., Zweigenbaum, P.: Automatic extraction of semantic relations between medical entities: a rule based approach. J. Biomed. Semant. 2(Suppl 5), S4+ (2011)
Amavi, J., Halfeld Ferrari, M., Hiot, N.: Natural language querying system through entity enrichment. In: Bellatreche, L., et al. (eds.) TPDL/ADBIS 2020. CCIS, vol. 1260, pp. 36–48. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-55814-7_3
Bodenreider, O.: The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Res. 32(Database-Issue), 267–270 (2004)
Campillos, L., Deléger, L., Grouin, C., Hamon, T., Ligozat, A.L., Névéol, A.: A French clinical corpus with comprehensive semantic annotations: development of the medical entity and relation LIMSI annotated text corpus (MERLOT). Lang. Resour. Eval. 52(2), 571–601 (2017)
Cardon, R., Grabar, N., Grouin, C., Hamon, T.: Présentation de la campagne d’évaluation DEFT 2020 : similarité textuelle en domaine ouvert et extraction d’information précise dans des cas cliniques. In: Cardon, R., Grabar, N., Grouin, C., Hamon, T. (eds.) 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Atelier DÉfi Fouille de Textes, pp. 1–13. ATALA, Nancy, France (2020). https://hal.archives-ouvertes.fr/hal-02784737
Culotta, A., Sorensen, J.: Dependency tree kernels for relation extraction. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-2004), pp. 423–429 (2004)
Embarek, M., Ferret, O.: Learning patterns for building resources about semantic relations in the medical domain. In: Proceedings of the International Conference on Language Resources and Evaluation, LREC 2008, 26 May–1 June 2008, Marrakech, Morocco (2008)
Fader, A., Soderland, S., Etzioni, O.: Identifying relations for open information extraction. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pp. 1535–1545. Association for Computational Linguistics, Edinburgh, Scotland, UK, July 2011. https://aclanthology.org/D11-1142
Francis, N., et al.: Cypher: an evolving query language for property graphs. In: Das, G., Jermaine, C.M., Bernstein, P.A. (eds.) Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, 10–15 June 2018, pp. 1433–1445. ACM (2018)
Franciscus, N., Ren, X., Stantic, B.: Dependency graph for short text extraction and summarization. J. Inf. Telecommun. 3(4), 413–429 (2019)
Fundel, K., Küffner, R., Zimmer, R.: RelEx-relation extraction using dependency parse trees. Bioinformatics 23, 365–371 (2007)
Grabar, N., Grouin, C., Hamon, T., Claveau, V.: Corpus annoté de cas cliniques en français. In: TALN 2019–26e Conference on Traitement Automatique des Langues Naturelles, pp. 1–14. Toulouse, France, July 2019. https://hal.archives-ouvertes.fr/hal-02391878
Grouin, C., Grabar, N., Illouz, G.: Classification de cas cliniques et évaluation automatique de réponses d’étudiants : présentation de la campagne DEFT 2021. In: Denis, P., et al. (eds.) Traitement Automatique des Langues Naturelles, pp. 1–13. ATALA, Lille, France (2021). https://hal.archives-ouvertes.fr/hal-03265926
Hearst, M.A.: Automatic acquisition of hyponyms from large text corpora. In: COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics (1992). https://aclanthology.org/C92-2082
Lafferty, J.D., McCallum, A., Pereira, F.C.N.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the Eighteenth International Conference on Machine Learning, pp. 282–289. ICML 2001, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2001)
Li, Z., Yang, Z., Shen, C., Xu, J., Zhang, Y., Xu, H.: Integrating shortest dependency path and sentence sequence into a deep learning framework for relation extraction in clinical text. BMC Med. Inform. Decis. Mak. 19, 22 (2019)
Mihalcea, R., Tarau, P.: Textrank: bringing order into text. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, EMNLP 2004, A meeting of SIGDAT, a Special Interest Group of the ACL, held in conjunction with ACL 2004, 25–26 July 2004, Barcelona, Spain, pp. 404–411. ACL (2004)
Minard, A.L., Ligozat, A.L., Grau, B.: Multi-class SVM for relation extraction from clinical reports. In: Recent Advances in Natural Language Processing, RANLP 2011, 12–14 September, 2011, Hissar, Bulgaria, pp. 604–609 (2011)
Minard, A.L., Roques, A., Hiot, N., Halfeld Ferrari, M., Savary, A.: DOING@DEFT: cascade de CRF pour l’annotation d’entités cliniques imbriquées. In: Cardon, R., Grabar, N., Grouin, C., Hamon, T. (eds.) 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Atelier DÉfi Fouille de Textes, pp. 66–78. ATALA, Nancy, France (2020). https://hal.archives-ouvertes.fr/hal-02784743
Mintz, M., Bills, S., Snow, R., Jurafsky, D.: Distant supervision for relation extraction without labeled data. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pp. 1003–1011. Association for Computational Linguistics, Suntec, Singapore, August 2009. https://aclanthology.org/P09-1113
Ramadier, L., Lafourcade, M.: Patrons sémantiques pour l’extraction de relations entre termes - Application aux comptes rendus radiologiques. In: TALN: Traitement Automatique des Langues Naturelles. jep-taln2016, Paris, France, July 2016. https://hal.archives-ouvertes.fr/hal-01382323
Rindflesch, T.C., Bean, C.A., Sneiderman, C.A.: Argument identification for arterial branching predications asserted in cardiac catheterization reports. In: AMIA Annual Symposium Proceedings, pp. 704–708 (2000)
Uzuner, O., Mailoa, J., Ryan, R., Sibanda, T.: Semantic relations for problem-oriented medical records. Artif. Intell. Med. 50, 63–73 (2010)

Acknowledgements

This work was partly supported by the ICVL federation and RTR-DIAMS. It was carried out in the context of the DOING action of GDR-MADICS.

Author information

Authors and Affiliations

LISN, Paris-Saclay University, CNRS, Orsay, France
Agata Savary
LIFO, Université d’Orléans, INSA CVL, Orléans, France
Alena Silvanovich, Nicolas Hiot & Mirian Halfeld Ferrari
LLL, Université d’Orléans, Orléans, France
Anne-Lyse Minard
EnnovLabs – Ennov, Paris, France
Nicolas Hiot

Corresponding author

Correspondence to Agata Savary.

Editor information

Editors and Affiliations

Politecnico di Torino, Turin, Italy
Silvia Chiusano
Politecnico di Torino, Turin, Italy
Tania Cerquitelli
Poznań University of Technology, Poznań, Poland
Robert Wrembel
Norwegian University of Science and Technology, Trondheim, Norway
Kjetil Nørvåg
University of Genoa, Genoa, Italy
Barbara Catania
CNRS, Villeurbanne Cedex, France
Genoveva Vargas-Solar
University of Calabria, Rende, Italy
Ester Zumpano

About this paper

Cite this paper

Savary, A., Silvanovich, A., Minard, AL., Hiot, N., Halfeld Ferrari, M. (2022). Relation Extraction from Clinical Cases for a Knowledge Graph. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_33

DOI: https://doi.org/10.1007/978-3-031-15743-1_33
Published: 29 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15742-4
Online ISBN: 978-3-031-15743-1
eBook Packages: Computer Science, Computer Science (R0)
