Exploring re-ranking approaches for joint named-entity recognition and linking

Author:
Avirup Sil

Temple University, Philadelphia, PA, USA


PIKM '13: Proceedings of the sixth workshop on Ph.D. students in information and knowledge management, November 2013, Pages 11–18. https://doi.org/10.1145/2513166.2513177

Published: 02 November 2013


ABSTRACT

Recognizing names and linking them to structured data is a fundamental task in text analysis. Existing approaches typically perform these two steps using a pipeline architecture: they use a Named-Entity Recognition (NER) system to find the boundaries of mentions in text, and an Entity Linking (EL) system to connect the mentions to entries in structured or semi-structured repositories like Wikipedia. However, the two tasks are tightly coupled, and each type of system can benefit significantly from the kind of information provided by the other. In this proposal, we present a joint model for NER and EL, called NEREL, that takes a large set of candidate mentions from typical NER systems and a large set of candidate entity links from EL systems, and ranks the candidate mention-entity pairs together to make joint predictions. In our initial NER and EL experiments across three datasets, NEREL significantly outperforms or comes close to the performance of two state-of-the-art NER systems, and it outperforms 6 competing EL systems. On the benchmark MSNBC dataset, NEREL provides a 60% reduction in error over the next-best NER system and a 68% reduction in error over the next-best EL system.
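
The joint ranking idea described in the abstract can be illustrated with a minimal sketch. It assumes a simple linear scoring function and a greedy selection of non-overlapping spans; the feature names (`ner_conf`, `link_prior`), the weights, and the candidate pairs are hypothetical and chosen only for illustration. This is not the NEREL model itself, whose features and training procedure are not given in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate (mention span, entity) pair with illustrative features."""
    start: int       # token index where the mention begins
    end: int         # token index one past the mention's last token
    entity: str      # candidate knowledge-base entry (hypothetical titles below)
    features: dict   # hypothetical feature name -> value


def score(candidate, weights):
    """Linear score: weighted sum of the candidate's feature values."""
    return sum(weights.get(name, 0.0) * value
               for name, value in candidate.features.items())


def rerank(candidates, weights):
    """Sort candidate mention-entity pairs by score, best first."""
    return sorted(candidates, key=lambda c: score(c, weights), reverse=True)


def select_non_overlapping(ranked):
    """Greedily keep the highest-ranked pairs whose spans do not overlap."""
    chosen = []
    for cand in ranked:
        if all(cand.end <= c.start or cand.start >= c.end for c in chosen):
            chosen.append(cand)
    return chosen


# Hypothetical candidates for the tokens: ["New", "York", "Times", "reported"]
candidates = [
    Candidate(0, 2, "New_York_City", {"ner_conf": 0.6, "link_prior": 0.5}),
    Candidate(0, 3, "The_New_York_Times", {"ner_conf": 0.7, "link_prior": 0.8}),
    Candidate(2, 3, "Time_(magazine)", {"ner_conf": 0.3, "link_prior": 0.2}),
]
weights = {"ner_conf": 1.0, "link_prior": 1.0}  # hand-set, not learned

predictions = select_non_overlapping(rerank(candidates, weights))
for p in predictions:
    print(p.start, p.end, p.entity)
```

Because boundaries and links are scored together, a strong link score for the longer span can override a shorter span that an NER system alone might have preferred, which is the coupling between the two tasks that the abstract points to.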

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
2. Information systems
  1. Information retrieval
    1. Document representation
      1. Content analysis and feature selection

Published in
PIKM '13: Proceedings of the sixth workshop on Ph.D. students in information and knowledge management
November 2013
52 pages
ISBN: 9781450324229
DOI: 10.1145/2513166
Program Chairs: Fabian M. Suchanek (Télécom ParisTech, France), Anisoara Nica (SAP AG, Canada)
Copyright © 2013 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
entity disambiguation
entity linking
named entity recognition
Qualifiers
- research-article
Acceptance Rates
PIKM '13 Paper Acceptance Rate: 6 of 13 submissions, 46%
Overall Acceptance Rate: 25 of 62 submissions, 40%