Effective Chinese Relation Extraction by Sentence Rolling and Candidate Ranking

Sheng, Meilun; Qiu, Lin; Wu, Chenyang; Wang, Haofen; Yu, Yong

doi:10.1007/978-3-642-54025-7_13

Part of the book series: Communications in Computer and Information Science (CCIS, volume 406)

Included in the following conference series:

China Semantic Web Symposium and Web Science Conference

Abstract

Relation extraction aims to discover relations between entities mentioned in plain text. It can be used to generate semantic data in the form of RDF triples representing facts. In this paper, we focus on relation extraction from Chinese text, which has been studied less than relation extraction from English text. Chinese words and phrases are highly ambiguous in both syntax and semantics. As a result, Chinese NLP tools can be inadequate when a sentence is too long or its structure is too complex. Unfortunately, this is often the case in real-world data. To address this limitation of current Chinese NLP tools, we propose a method called sentence rolling, which generates several enhanced inputs from the original input to help produce the correct relation candidates. To rank these candidates appropriately, we apply a voting approach based on several statistics-based ranking functions. Further, a Relation KB is used to help determine the subject and object parts of the selected relation candidate. We carried out comprehensive experiments on both a real-world news corpus and benchmark data combining the Chinese Treebank and the Chinese Dependency Treebank. The experimental results show that the method significantly improves relation extraction performance over existing methods at a reasonable time cost.
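
The abstract does not say which voting scheme or which ranking functions are used, so the following is only a minimal sketch of how several statistical ranking functions could be combined by voting. It assumes a Borda count, and the candidate strings, the three scoring tables, and the name `borda_vote` are all hypothetical illustrations rather than the authors' method.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence

Candidate = str
Scorer = Callable[[Candidate], float]


def borda_vote(candidates: Sequence[Candidate],
               scorers: Sequence[Scorer]) -> List[Candidate]:
    """Rank candidates by summing Borda points over every scorer's ranking.

    Assumption: Borda count stands in for the unspecified voting scheme.
    """
    points: Dict[Candidate, int] = defaultdict(int)
    n = len(candidates)
    for scorer in scorers:
        # sorted() is stable, so tied candidates keep their input order.
        ranking = sorted(candidates, key=scorer, reverse=True)
        for position, cand in enumerate(ranking):
            # Best candidate earns n - 1 points, worst earns 0.
            points[cand] += n - 1 - position
    return sorted(candidates, key=lambda c: points[c], reverse=True)


# Hypothetical relation candidates, e.g. produced from several rolled inputs.
candidates = ["founded", "located in", "acquired"]

# Hypothetical statistics; real ranking functions would come from corpus data.
frequency = {"founded": 5, "located in": 2, "acquired": 3}
coverage = {"founded": 0.6, "located in": 0.9, "acquired": 0.2}
kb_hits = {"founded": 1, "located in": 0, "acquired": 0}

scorers: List[Scorer] = [
    lambda c: frequency[c],
    lambda c: coverage[c],
    lambda c: kb_hits[c],
]

ranked = borda_vote(candidates, scorers)
print(ranked)  # ['founded', 'located in', 'acquired']
```

A rank-based vote like this is insensitive to the scale of each individual score, which is one plausible reason to vote over rankings instead of summing raw scores; whether the paper makes the same choice cannot be determined from the abstract.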

Author information

Authors and Affiliations

Apex Lab, Shanghai Jiao Tong University, China
Meilun Sheng, Lin Qiu, Chenyang Wu, Haofen Wang & Yong Yu

Editor information

Editors and Affiliations

Southeast University, Jiangning, China
Guilin Qi
Tsinghua University, Beijing, China
Jie Tang
Guangdong University of Foreign Studies, China
Jianfeng Du
Department of Computing Science, University of Aberdeen, AB24 3UE, Aberdeen, UK
Jeff Z. Pan
Shanghai Jiao Tong University, China
Yong Yu

About this paper

Cite this paper

Sheng, M., Qiu, L., Wu, C., Wang, H., Yu, Y. (2013). Effective Chinese Relation Extraction by Sentence Rolling and Candidate Ranking. In: Qi, G., Tang, J., Du, J., Pan, J.Z., Yu, Y. (eds) Linked Data and Knowledge Graph. CSWS 2013. Communications in Computer and Information Science, vol 406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54025-7_13

DOI: https://doi.org/10.1007/978-3-642-54025-7_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54024-0
Online ISBN: 978-3-642-54025-7
eBook Packages: Computer Science, Computer Science (R0)
