
Graph-Based Named Entity Linking with Wikipedia

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6997)

Abstract

Named entity linking (NEL) grounds entity mentions to their corresponding Wikipedia article. State-of-the-art supervised NEL systems use features over the rich Wikipedia document and link-graph structure.

Graph-based measures have been effective over WordNet for word sense disambiguation (WSD). We draw parallels between NEL and WSD, motivating our unsupervised NEL approach that exploits the Wikipedia article and category link graphs. Our system achieves 85.5% accuracy on the TAC 2010 shared task, competitive with the best supervised and unsupervised systems.
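
To make the general idea concrete, the following is a minimal sketch of graph-based candidate ranking, assuming a simple degree-style connectivity score over an undirected link graph. The abstract does not specify which graph measures the system uses, so this is not the paper's method. The toy link data, the entity names, and the helpers `build_undirected_graph` and `rank_candidates` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_undirected_graph(links):
    """Build an undirected adjacency map from (source, target) article links."""
    graph = defaultdict(set)
    for source, target in links:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def rank_candidates(candidates, context_entities, graph):
    """Score each candidate article by the number of context entities
    it is directly linked to, then sort by descending score.

    This is a degree-style connectivity score used as an illustrative
    assumption; ties are broken alphabetically for determinism.
    """
    context = set(context_entities)
    scores = {c: len(graph.get(c, set()) & context) for c in candidates}
    ranked = sorted(candidates, key=lambda c: (-scores[c], c))
    return ranked, scores


# Hypothetical toy link graph (not real Wikipedia data).
toy_links = [
    ("Java (island)", "Indonesia"),
    ("Java (island)", "Jakarta"),
    ("Jakarta", "Indonesia"),
    ("Java (programming language)", "Sun Microsystems"),
]
graph = build_undirected_graph(toy_links)

# The ambiguous mention "Java" appears in a document that also mentions
# entities already resolved to "Indonesia" and "Jakarta".
ranked, scores = rank_candidates(
    candidates=["Java (programming language)", "Java (island)"],
    context_entities=["Indonesia", "Jakarta"],
    graph=graph,
)
print(ranked[0], scores)
```

In this toy setting the candidate with more direct links to the document's other entities ranks first. No training data is involved, which is the sense in which such an approach is unsupervised.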




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hachey, B., Radford, W., Curran, J.R. (2011). Graph-Based Named Entity Linking with Wikipedia. In: Bouguettaya, A., Hauswirth, M., Liu, L. (eds) Web Information System Engineering – WISE 2011. WISE 2011. Lecture Notes in Computer Science, vol 6997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24434-6_16


  • DOI: https://doi.org/10.1007/978-3-642-24434-6_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24433-9

  • Online ISBN: 978-3-642-24434-6

  • eBook Packages: Computer Science, Computer Science (R0)
