DOI: 10.1145/3587259.3627556
research-article

A Full-Fledged Framework for Combining Entity Linking Systems and Components

Published: 05 December 2023

Abstract

Named entity recognition and disambiguation, often referred to as entity linking, is the task of automatically identifying knowledge graph entities in text documents. While a variety of entity linking systems based on very different approaches exist, these systems implicitly share certain processing steps in their pipelines. Despite this, they have mainly been used as stand-alone solutions. In this paper, we propose a framework for combining entity linking methods. It allows multiple entity linking systems, and especially their individual components, to be combined without restriction, with the aim of achieving the best possible performance. In addition, user-developed entity linking systems or components can be easily tested and automatically evaluated against other systems without having to set those systems up first. Our framework is knowledge graph agnostic, so entity linking systems can be compared across knowledge graphs. Furthermore, the framework enables recommendation of entity linking methods or components, supporting the goal of achieving the best performance in a given context. We demonstrate that non-domain-expert users are able to deploy the framework within minutes and integrate previously unknown homebrew systems into it in less than an hour. Our framework is fully open source and available on GitHub, along with Docker containers and tutorials (including Jupyter Notebooks).
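To illustrate what combining components of different entity linking systems can look like, the following is a minimal sketch and not the framework's actual API. It assumes a common decomposition into mention detection, candidate generation, and disambiguation, which the abstract does not spell out. All names (`Mention`, `dictionary_detector`, `combine`, `link`) and the `ex:` identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Mention:
    start: int  # character offset of the mention in the text
    text: str   # surface form as it appears in the text


# A mention detector maps a text to a set of mentions (hypothetical interface).
Detector = Callable[[str], Set[Mention]]


def dictionary_detector(vocabulary: List[str]) -> Detector:
    """Toy detector: reports the first occurrence of each known surface form."""
    def detect(text: str) -> Set[Mention]:
        found: Set[Mention] = set()
        for term in vocabulary:
            pos = text.find(term)
            if pos >= 0:
                found.add(Mention(pos, term))
        return found
    return detect


def combine(detectors: List[Detector], mode: str = "union") -> Detector:
    """Merge the outputs of several detectors by set union or intersection."""
    def detect(text: str) -> Set[Mention]:
        results = [d(text) for d in detectors]
        if not results:
            return set()
        if mode == "union":
            return set().union(*results)
        if mode == "intersection":
            return set.intersection(*results)
        raise ValueError(f"unknown mode: {mode}")
    return detect


def link(text: str, detector: Detector,
         candidates: Dict[str, List[str]]) -> Dict[Mention, str]:
    """Candidate generation by dictionary lookup; the 'disambiguator' is a
    placeholder that simply picks the first candidate."""
    links: Dict[Mention, str] = {}
    for mention in sorted(detector(text), key=lambda m: m.start):
        options = candidates.get(mention.text, [])
        if options:
            links[mention] = options[0]
    return links


if __name__ == "__main__":
    system_a = dictionary_detector(["Paris", "France"])
    system_b = dictionary_detector(["Paris"])
    kg = {"Paris": ["ex:Paris"], "France": ["ex:France"]}  # made-up identifiers
    sentence = "Paris is the capital of France."
    print(link(sentence, combine([system_a, system_b], "union"), kg))
    print(link(sentence, combine([system_a, system_b], "intersection"), kg))
```

Union favors recall (any system's mention is kept), while intersection favors precision (only mentions all systems agree on survive). Because the candidate dictionary is passed in as a parameter, the same sketch works with identifiers from any knowledge graph.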

Cited By

  • (2024) Understanding the Impact of Entity Linking on the Topology of Entity Co-occurrence Networks for Social Media Analysis. Knowledge Engineering and Knowledge Management, 69–85. https://doi.org/10.1007/978-3-031-77792-9_5. Online publication date: 20-Nov-2024

Published In

K-CAP '23: Proceedings of the 12th Knowledge Capture Conference 2023
December 2023
270 pages
ISBN: 9798400701412
DOI: 10.1145/3587259
Editors: Brent Venable, Daniel Garijo, Brian Jalaian
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Entity Linking
  2. FAIR
  3. Framework
  4. NERD Orchestration
  5. NLP
  6. Recommender System
  7. Semantic Web

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • BMBF

Conference

K-CAP '23
K-CAP '23: Knowledge Capture Conference 2023
December 5 - 7, 2023
Pensacola, FL, USA

Acceptance Rates

Overall Acceptance Rate 55 of 198 submissions, 28%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 17
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 15 Feb 2025
