Abstract
The objective of Entity Linking is to connect an entity mention in a text to a known entity in a knowledge base. The general approach to this task is to generate, for a given mention, a set of candidate entities from the knowledge base and then, in a second step, to select the best one. This paper focuses on this second step and proposes a method based on learning a function that discriminates an entity from the candidates with which it is most easily confused. Our contribution lies in the strategy used to learn such a model efficiently while keeping it compatible with large knowledge bases. We propose three strategies with different efficiency/performance trade-offs, which are experimentally validated on six datasets from the TAC evaluation campaigns, using Freebase and DBpedia as reference knowledge bases.
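The idea of training against the most ambiguous candidates can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the three strategies, so the cosine similarity, the function name `select_training_pairs`, the parameter `k`, and the toy entity identifiers and vectors below are all assumptions made for illustration. The sketch keeps the gold entity as a positive example and, as negatives, only the `k` wrong candidates closest to the mention.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is null)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_training_pairs(mention_vec, gold_id, candidates, k=2):
    """Build (entity_id, label) training pairs for one mention.

    candidates maps a hypothetical entity id to a feature vector.
    The gold entity gets label 1; the k wrong candidates most similar
    to the mention (the "hardest" negatives) get label 0.
    """
    scored = [
        (cosine(mention_vec, vec), eid)
        for eid, vec in candidates.items()
        if eid != gold_id
    ]
    scored.sort(reverse=True)  # most similar wrong candidates first
    hardest = [eid for _, eid in scored[:k]]
    return [(gold_id, 1)] + [(eid, 0) for eid in hardest]


# Toy data: identifiers and vectors are invented for the example.
candidates = {
    "Paris_(France)": [1.0, 0.0, 0.2],
    "Paris_(Texas)": [0.9, 0.1, 0.3],
    "Paris_Hilton": [0.3, 1.0, 0.0],
    "Paris_(mythology)": [0.0, 0.2, 1.0],
}
mention = [1.0, 0.0, 0.1]
pairs = select_training_pairs(mention, "Paris_(France)", candidates, k=2)
print(pairs)
```

Restricting the negatives to a few close competitors keeps the training set small, which is one plausible way to stay tractable when the knowledge base contains many entities; the resulting pairs could then be fed to any binary classifier or ranker.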
Notes
- 1. We used the tool MITIE for this step (https://github.com/mit-nlp/MITIE).
- 2. For the entity mention, we took the whole lexical context as we did not have an entity recognizer for all the entity types of the knowledge base.
- 3.
- 4. The comparison is not absolutely fair since we used the data from other years for training, which were not available to the participants.
Acknowledgments
This work was partly supported by the F1409071Q CuratedMedia project.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Daher, H., Besançon, R., Ferret, O., Borgne, H.L., Daquo, AL., Tamaazousti, Y. (2018). Supervised Learning of Entity Disambiguation Models by Negative Sample Selection. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2017. Lecture Notes in Computer Science, vol 10761. Springer, Cham. https://doi.org/10.1007/978-3-319-77113-7_26
DOI: https://doi.org/10.1007/978-3-319-77113-7_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77112-0
Online ISBN: 978-3-319-77113-7
eBook Packages: Computer Science, Computer Science (R0)