Abstract
In this paper we propose two novel approaches to enhance cross-lingual entity linking (CLEL). The first is based on cross-lingual information networks that are aligned using monolingual information extraction; the second uses topic modeling to ensure global consistency. Applied to a strong baseline system that combines state-of-the-art machine translation with monolingual entity linking, these approaches achieve an 11.2% improvement in B-Cubed+ F-measure. Our system achieved highly competitive results in the NIST Text Analysis Conference (TAC) Knowledge Base Population (KBP2011) evaluation. We also provide detailed qualitative and quantitative analysis of the contribution of each approach and of the remaining challenges.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cassidy, T., Ji, H., Deng, H., Zheng, J., Han, J. (2012). Analysis and Refinement of Cross-Lingual Entity Linking. In: Catarci, T., Forner, P., Hiemstra, D., Peñas, A., Santucci, G. (eds) Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics. CLEF 2012. Lecture Notes in Computer Science, vol 7488. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33247-0_1
DOI: https://doi.org/10.1007/978-3-642-33247-0_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33246-3
Online ISBN: 978-3-642-33247-0
eBook Packages: Computer Science, Computer Science (R0)