Abstract
As one of the most important components in knowledge graph construction, entity linking has drawn increasing attention over the last decade. In this paper, we propose two improvements towards better entity linking. First, we propose a simple but effective coarse-to-fine unsupervised knowledge base (KB) extraction approach that improves the quality of the KB, allowing entity linking to be conducted more efficiently. Second, we propose a highway network framework that bridges key words and sequential information captured with a self-attention mechanism to better represent both local and global information. Detailed experiments on six public entity linking datasets verify the effectiveness of both approaches.
Acknowledgements
This work was supported by the key project of the National Natural Science Foundation of China (Grant No. 61836007), the normal project of the National Natural Science Foundation of China (Grant No. 61876118) and the project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Author information
Mingyang Li is now pursuing his master's degree in the School of Computer Science and Technology at Soochow University, China. His research interests include knowledge graphs, natural language processing and machine learning.
Yuqing Xing is now pursuing her master's degree in the School of Computer Science and Technology at Soochow University, China. Her research interests include machine learning and natural language processing. Specifically, she focuses on Chinese discourse analysis, which aims to organize discourse texts into tree structures according to the rhetorical relations between text spans.
Fang Kong is a full professor in the School of Computer Science and Technology at Soochow University, China. She received her PhD in Computer Science from the School of Computer Science and Technology at Soochow University, China in 2009. She worked as a postdoctoral research fellow at the National University of Singapore, Singapore between 2011 and 2013. Her research interests include knowledge graphs, discourse analysis and natural language processing.
Guodong Zhou received his PhD degree in 1999 from the National University of Singapore, Singapore. He is a full professor in the School of Computer Science and Technology and the Director of the Natural Language Processing Laboratory at Soochow University, China. His research interests include information retrieval, discourse analysis and natural language processing.
Cite this article
Li, M., Xing, Y., Kong, F. et al. Towards better entity linking. Front. Comput. Sci. 16, 162308 (2022). https://doi.org/10.1007/s11704-020-0192-9
DOI: https://doi.org/10.1007/s11704-020-0192-9