Abstract
Entity linking is widely used in entity retrieval and semantic search. It links mentions in unstructured documents to their corresponding entries in a knowledge base (KB). Frequently used KBs (e.g., Wikipedia) usually contain abundant information about each entity, such as properties, name variations, and text descriptions, which helps to find candidates and disambiguate links. In this paper, we link organization names in Chinese documents to a list-like KB. Unlike typical KBs, the records in our KB are simply the full names of Chinese organizations. The many name variations and abbreviations that appear in documents cannot be matched directly to any organization name in the KB and introduce ambiguity, which makes the linking task difficult. First, we enrich the KB with abbreviations: using information from Hudong Baike and other sources, we design a pattern-based full-name annotation method that helps generate abbreviations for all the names in the KB. To resolve the ambiguity problem, we propose a two-stage link generation approach that exploits the co-occurrence of abbreviations and full names in the same document or document cluster, where the full names linked in the first stage constrain the linking of abbreviations in the second stage. We apply our approach to a corpus of police inquiry documents. The experimental results show that our approach is effective and significantly outperforms the one-stage approach in terms of precision and recall.
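The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_abbreviation_index`, `link_mentions`), the exact-substring matching, and the toy KB entries are all assumptions made for illustration. The sketch also assumes the abbreviations have already been generated, and it handles a single document rather than a document cluster.

```python
from collections import defaultdict

# Toy KB (illustrative only): full name -> abbreviations assumed to be
# already generated for it. Both names share the abbreviation "南大".
TOY_KB = {
    "南京大学": ["南大"],
    "南昌大学": ["南大"],
}


def build_abbreviation_index(kb_abbreviations):
    """Invert {full name: [abbreviations]} into {abbreviation: {full names}}."""
    index = defaultdict(set)
    for full_name, abbreviations in kb_abbreviations.items():
        for abbreviation in abbreviations:
            index[abbreviation].add(full_name)
    return index


def link_mentions(document, kb_abbreviations):
    """Return {mention: linked full name, or None if still ambiguous}."""
    index = build_abbreviation_index(kb_abbreviations)

    # Stage 1: link full names that occur verbatim in the document.
    linked_full_names = {name for name in kb_abbreviations if name in document}
    links = {name: name for name in linked_full_names}

    # Mask the matched full names so that an abbreviation which happens to be
    # a substring of a full name is not counted as a separate mention.
    remaining = document
    for name in linked_full_names:
        remaining = remaining.replace(name, " ")

    # Stage 2: link abbreviations, preferring candidates whose full name
    # was already linked in stage 1 (the co-occurrence constraint).
    for abbreviation, candidates in index.items():
        if abbreviation not in remaining:
            continue
        supported = candidates & linked_full_names
        pool = supported or candidates
        links[abbreviation] = next(iter(pool)) if len(pool) == 1 else None
    return links
```

With this toy KB, a document that mentions both "南京大学" and "南大" links the abbreviation to "南京大学", while a document that contains only "南大" leaves it unresolved (`None`), which is the kind of ambiguity a one-stage lookup cannot settle.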
Acknowledgements
This work is funded by the 3rd Research Institute of the Ministry of Public Security through project No. C13601. We thank Tong Ruan for guidance on the project and Chen Wang for proofreading.
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xue, C., Wang, H., Jin, B., Wang, M., Gao, D. (2014). Effective Chinese Organization Name Linking to a List-Like Knowledge Base. In: Zhao, D., Du, J., Wang, H., Wang, P., Ji, D., Pan, J. (eds) The Semantic Web and Web Science. CSWS 2014. Communications in Computer and Information Science, vol 480. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45495-4_9
DOI: https://doi.org/10.1007/978-3-662-45495-4_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45494-7
Online ISBN: 978-3-662-45495-4
eBook Packages: Computer Science, Computer Science (R0)