A Semi-automated Entity Relation Extraction Mechanism with Weakly Supervised Learning for Chinese Medical Webpages

Liu, Zhao; Tong, Jian; Gu, Jinguang; Liu, Kai; Hu, Bo

doi:10.1007/978-3-319-59858-1_5

Conference paper
First Online: 26 May 2017

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10219)

Abstract

Medical entity relation extraction is of great significance for medical text data mining and for building medical knowledge graphs. However, the medical field demands very high data accuracy, and current medical entity relation extraction systems struggle to reach it. A main technical difficulty lies in how to obtain high-precision medical data and automatically generate an annotated training sample set. In this paper, we propose a weakly supervised system for automatic medical entity relation extraction. First, we design a visual annotation tool that automatically generates crawl scripts, which crawl medical data from sites where an entity and its attributes are stored separately. Then, based on the structure of the acquired data, we propose a weakly supervised hypothesis to automatically generate positive training samples. Finally, we use a CNN model to extract medical entity relations. Experiments show that the method is feasible and accurate.
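
The abstract does not spell out the weakly supervised hypothesis, so the following is only a minimal sketch under an assumed, distant-supervision-style reading: if a sentence mentions both an entity and one of its separately stored attribute values, the sentence is taken as a positive sample for that relation. The names `Sample`, `split_sentences`, and `generate_positive_samples`, the record layout, and the toy data are all hypothetical and are not taken from the paper.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    sentence: str
    entity: str
    value: str
    relation: str


def split_sentences(text):
    # Split on Chinese and Western sentence terminators (assumed rule).
    parts = re.split(r"[。！？!?\n]+", text)
    return [p.strip() for p in parts if p.strip()]


def generate_positive_samples(records, corpus):
    """Label sentences that mention an entity together with a known value.

    records: {entity: {relation: [attribute values]}}, e.g. crawled
             structured data (hypothetical layout).
    corpus:  iterable of free-text passages.
    """
    samples = []
    for text in corpus:
        for sentence in split_sentences(text):
            for entity, attributes in records.items():
                if entity not in sentence:
                    continue
                for relation, values in attributes.items():
                    for value in values:
                        # Assumed hypothesis: co-occurrence implies the relation.
                        if value in sentence:
                            samples.append(
                                Sample(sentence, entity, value, relation)
                            )
    return samples


# Toy data: 感冒 = common cold, 咳嗽 = cough, 发热 = fever, 板蓝根 = isatis root.
records = {"感冒": {"symptom": ["咳嗽", "发热"], "drug": ["板蓝根"]}}
corpus = [
    "感冒常见症状包括咳嗽和发热。多喝水有助于恢复。",
    "板蓝根常用于治疗感冒。",
]

for s in generate_positive_samples(records, corpus):
    print(s.relation, s.entity, s.value)
```

Samples produced this way are noisy, since co-occurrence does not guarantee that the sentence actually expresses the relation; in a pipeline like the one described, they would serve as training input for the relation classifier rather than as final output.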

Acknowledgement

This work is supported by the National Natural Science Foundation of China (61272110, 61602350), the Key Projects of the National Social Science Foundation of China (11&ZD189), the State Key Lab of Software Engineering Open Foundation of Wuhan University (SKLSE2012-09-07), and the NSF of Wuhan University of Science and Technology of China under grant number 2016xz016.

Author information

Authors and Affiliations

College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430065, China
Zhao Liu, Jian Tong, Jinguang Gu & Kai Liu
Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan, 430065, China
Zhao Liu, Jian Tong, Jinguang Gu & Kai Liu
Kingdee Cloud Platform Department, Kingdee International Software Group Co., Ltd., Shenzhen, 518057, China
Bo Hu

Corresponding author

Correspondence to Jinguang Gu.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Chunxiao Xing
Tsinghua University, Beijing, China
Yong Zhang
Beijing Foreign Studies University, Beijing, China
Ye Liang

About this paper

Cite this paper

Liu, Z., Tong, J., Gu, J., Liu, K., Hu, B. (2017). A Semi-automated Entity Relation Extraction Mechanism with Weakly Supervised Learning for Chinese Medical Webpages. In: Xing, C., Zhang, Y., Liang, Y. (eds) Smart Health. ICSH 2016. Lecture Notes in Computer Science, vol 10219. Springer, Cham. https://doi.org/10.1007/978-3-319-59858-1_5

DOI: https://doi.org/10.1007/978-3-319-59858-1_5
Published: 26 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59857-4
Online ISBN: 978-3-319-59858-1
eBook Packages: Computer Science, Computer Science (R0)
