DOI: 10.1145/2513166.2513177
research-article

Exploring re-ranking approaches for joint named-entity recognition and linking

Published: 02 November 2013

ABSTRACT

Recognizing names and linking them to structured data is a fundamental task in text analysis. Existing approaches typically perform these two steps using a pipeline architecture: they use a Named-Entity Recognition (NER) system to find the boundaries of mentions in text, and an Entity Linking (EL) system to connect the mentions to entries in structured or semi-structured repositories like Wikipedia. However, the two tasks are tightly coupled, and each type of system can benefit significantly from the kind of information provided by the other. In this proposal, we present a joint model for NER and EL, called NEREL, that takes a large set of candidate mentions from typical NER systems and a large set of candidate entity links from EL systems, and ranks the candidate mention-entity pairs together to make joint predictions. In our initial NER and EL experiments across three datasets, NEREL significantly outperforms or comes close to the performance of two state-of-the-art NER systems, and it outperforms 6 competing EL systems. On the benchmark MSNBC dataset, NEREL provides a 60% reduction in error over the next-best NER system and a 68% reduction in error over the next-best EL system.
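The joint ranking idea described in the abstract can be illustrated with a small sketch. This is not the NEREL implementation: the linear scoring function, the greedy non-overlap decoding, the feature names (`ner_confidence`, `link_prior`), and the toy values are all assumptions made for illustration. It only shows how scoring mention-entity pairs together lets linking evidence influence which mention boundary is chosen.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Candidate:
    """A candidate mention-entity pair (all field names are illustrative)."""
    start: int                  # index of the first token in the span
    end: int                    # index one past the last token
    entity: Optional[str]       # knowledge-base title, or None for "no link"
    features: Dict[str, float]  # hypothetical feature name -> value


def score(candidate: Candidate, weights: Dict[str, float]) -> float:
    """Linear score: dot product of feature values and weights."""
    return sum(weights.get(name, 0.0) * value
               for name, value in candidate.features.items())


def overlaps(a: Candidate, b: Candidate) -> bool:
    """True if the two token spans share at least one token."""
    return a.start < b.end and b.start < a.end


def rerank(candidates: List[Candidate],
           weights: Dict[str, float]) -> List[Candidate]:
    """Rank all pairs together, then greedily keep non-overlapping ones."""
    ranked = sorted(candidates, key=lambda c: score(c, weights), reverse=True)
    selected: List[Candidate] = []
    for cand in ranked:
        if not any(overlaps(cand, kept) for kept in selected):
            selected.append(cand)
    # Return the chosen pairs in document order.
    return sorted(selected, key=lambda c: c.start)


# Toy input for the tokens ["New", "York", "Times", "reported"].
# Feature values and weights are made up, not taken from the paper.
candidates = [
    Candidate(0, 2, "New_York_City",
              {"ner_confidence": 0.6, "link_prior": 0.5}),
    Candidate(0, 3, "The_New_York_Times",
              {"ner_confidence": 0.5, "link_prior": 0.9}),
    Candidate(2, 3, "Time_(magazine)",
              {"ner_confidence": 0.2, "link_prior": 0.3}),
]
weights = {"ner_confidence": 1.0, "link_prior": 1.0}
best = rerank(candidates, weights)
```

In this toy case a pipeline that committed to the span the NER feature prefers ("New York") could not recover the longer mention, whereas scoring the pairs jointly selects the span covering "New York Times" because the linking feature outweighs the small difference in NER confidence. A learned model would replace the hand-set weights.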


Published in

PIKM '13: Proceedings of the sixth workshop on Ph.D. students in information and knowledge management
November 2013
52 pages
ISBN: 9781450324229
DOI: 10.1145/2513166

Copyright © 2013 ACM

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

PIKM '13 paper acceptance rate: 6 of 13 submissions, 46%
Overall acceptance rate: 25 of 62 submissions, 40%
