DOI: 10.1145/3308560.3314199
research-article

Towards Better Entity Linking Evaluation

Published: 13 May 2019

ABSTRACT

The Entity Linking (EL) task is concerned with linking entity mentions in a text collection with their corresponding knowledge-base entries. Despite the progress made in evaluating EL systems, much work remains, and this PhD research tackles several open issues in EL evaluation. Among these issues, we stress (a) the lack of consensus on the definition of “entity” and the lack of evaluation metrics that allow for different notions of entity, (b) the lack of datasets that allow for cross-language comparison, and (c) the focus on evaluating high-level systems rather than low-level techniques. Our hypothesis is that, by addressing these challenges and better understanding the performance of EL systems, we can create a more general, more configurable EL framework that can be better adapted to the needs of a particular application. In the early stages of this PhD work, we have identified these problems and begun to address (a) and (b), publishing initial results that constitute a significant step forward in our investigation. However, further challenges must be addressed before we reach our goal. Our next steps thus involve proposing a more fluid definition of “entity” that is adaptable to different applications, defining quality measures that allow for comparing EL approaches targeting different types of entities, and creating a customizable EL framework that allows for composing and evaluating individual techniques as appropriate to a particular task.

        Published in

        WWW '19: Companion Proceedings of The 2019 World Wide Web Conference
        May 2019
        1331 pages
        ISBN: 9781450366755
        DOI: 10.1145/3308560

        Copyright © 2019 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%