Abstract
Relation extraction aims to discover relations between entities mentioned in plain text. It can be used to generate semantic data in the form of RDF triples that represent facts. In this paper, we focus on relation extraction from Chinese text, which has been studied less than relation extraction from English text. Chinese words and phrases are highly ambiguous in both syntax and semantics. As a result, Chinese NLP tools can be insufficient when a sentence is too long or its structure is too complex. Unfortunately, this is the case in real-world data. To address this limitation of current Chinese NLP tools, we propose a method called sentence rolling, which generates several enhanced inputs from the original input to help produce the correct relation candidates. To rank these candidates appropriately, we apply a voting approach based on several statistic-based ranking functions. Further, a Relation KB is used to help determine the subject and the object of the selected relation candidate. We carried out comprehensive experiments on both a real-world news corpus and benchmark data combining the Chinese Treebank and the Chinese Dependency Treebank. The experimental results show that the method significantly improves relation extraction performance over existing methods at a reasonable time cost.
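The abstract does not specify the voting rule or the ranking functions, so the following is only a minimal sketch of how several rankings of relation candidates could be combined by voting. It assumes a Borda count as the voting rule; the function name borda_vote, the candidate strings, and the three toy scorers are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence

Candidate = str
Scorer = Callable[[Candidate], float]


def borda_vote(candidates: Sequence[Candidate],
               scorers: Sequence[Scorer]) -> List[Candidate]:
    """Combine several rankings with a Borda count (an assumed voting rule)."""
    n = len(candidates)
    points: Dict[Candidate, int] = defaultdict(int)
    for scorer in scorers:
        # Each scorer ranks all candidates, best first. sorted() is stable,
        # so candidates with equal scores keep their input order.
        ranked = sorted(candidates, key=scorer, reverse=True)
        for position, cand in enumerate(ranked):
            # The top candidate gets n - 1 points, the last one gets 0.
            points[cand] += n - 1 - position
    # Most points first; ties keep the original candidate order.
    return sorted(candidates, key=lambda c: points[c], reverse=True)


# Hypothetical relation candidates and toy statistics (illustration only).
candidates = ["located in", "acquired", "founded"]
frequency = {"located in": 30, "acquired": 7, "founded": 12}
cooccurrence = {"located in": 0.5, "acquired": 0.6, "founded": 0.8}

scorers = [
    lambda c: frequency[c],      # assumed corpus-frequency score
    lambda c: cooccurrence[c],   # assumed entity co-occurrence score
    lambda c: -len(c),           # assumed preference for shorter phrases
]

print(borda_vote(candidates, scorers))
```

A rank-based rule such as this one avoids having to normalise scores that live on different scales, which is one reason it is a common default for combining heterogeneous ranking functions; the paper's actual scheme may differ.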
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sheng, M., Qiu, L., Wu, C., Wang, H., Yu, Y. (2013). Effective Chinese Relation Extraction by Sentence Rolling and Candidate Ranking. In: Qi, G., Tang, J., Du, J., Pan, J.Z., Yu, Y. (eds) Linked Data and Knowledge Graph. CSWS 2013. Communications in Computer and Information Science, vol 406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54025-7_13
DOI: https://doi.org/10.1007/978-3-642-54025-7_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54024-0
Online ISBN: 978-3-642-54025-7
eBook Packages: Computer Science, Computer Science (R0)