Supervised Learning of Entity Disambiguation Models by Negative Sample Selection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10761)

Abstract

The objective of Entity Linking is to connect an entity mention in a text to a known entity in a knowledge base. The general approach to this task is to generate, for a given mention, a set of candidate entities from the knowledge base and then, in a second step, to determine the best one. This paper focuses on this second step and proposes a method based on learning a function that discriminates an entity from the candidates with which it is most easily confused. Our contribution lies in the strategy for learning such a model efficiently while keeping it compatible with large knowledge bases. We propose three strategies with different efficiency/performance trade-offs, which are experimentally validated on six datasets from the TAC evaluation campaigns, using Freebase and DBpedia as reference knowledge bases.
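The idea of training a disambiguation model against an entity's most ambiguous competitors can be illustrated with a minimal sketch. The abstract does not detail the three proposed strategies, so the code below is only one possible reading and not the authors' method: the similarity scores, the entity identifiers, and the function names `select_negatives` and `build_training_set` are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical candidate record: (entity_id, similarity_to_mention).
# The similarity measure itself is an assumption, not taken from the paper.
Candidate = Tuple[str, float]


def select_negatives(candidates: List[Candidate], gold_id: str, k: int) -> List[str]:
    """Return the k wrong candidates that look most like the mention."""
    wrong = [c for c in candidates if c[0] != gold_id]
    # Highest similarity first: these are the most ambiguous competitors.
    wrong.sort(key=lambda c: c[1], reverse=True)
    return [entity_id for entity_id, _ in wrong[:k]]


def build_training_set(candidates: List[Candidate], gold_id: str, k: int) -> List[Tuple[str, int]]:
    """Label the gold entity 1 and the selected negatives 0."""
    examples = [(gold_id, 1)]
    examples += [(neg, 0) for neg in select_negatives(candidates, gold_id, k)]
    return examples


# Invented candidates for the mention "Paris", for illustration only.
candidates = [
    ("Paris_France", 0.92),
    ("Paris_Texas", 0.81),
    ("Paris_Hilton", 0.64),
    ("Paris_mythology", 0.37),
]
training_set = build_training_set(candidates, gold_id="Paris_France", k=2)
```

Restricting the negatives to the top-k confusable candidates keeps the training set small even when the knowledge base is large, which is the kind of efficiency/performance trade-off the abstract refers to.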

Notes

  1. We used the tool MITIE for this step (https://github.com/mit-nlp/MITIE).

  2. For the entity mention, we took the whole lexical context as we did not have an entity recognizer for all the entity types of the knowledge base.

  3. https://www.csie.ntu.edu.tw/~cjlin/liblinear/

  4. The comparison is not absolutely fair since we used the data from other years for training, which were not available to the participants.

Acknowledgments

This work was partly supported by the F1409071Q CuratedMedia project.

Author information

Corresponding author

Correspondence to Romaric Besançon.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Daher, H., Besançon, R., Ferret, O., Borgne, H.L., Daquo, AL., Tamaazousti, Y. (2018). Supervised Learning of Entity Disambiguation Models by Negative Sample Selection. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2017. Lecture Notes in Computer Science, vol 10761. Springer, Cham. https://doi.org/10.1007/978-3-319-77113-7_26

  • DOI: https://doi.org/10.1007/978-3-319-77113-7_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77112-0

  • Online ISBN: 978-3-319-77113-7

  • eBook Packages: Computer Science, Computer Science (R0)