Abstract
Chinese chronic disease entity extraction aims to extract health-related entities from online questions and answers (QA). Our research tackles the challenges of this task from three aspects: constructing Chinese health lexicons, developing features, and tagging equivalence conjunctions. We construct large-scale Chinese health lexicons from expert knowledge and Web resources; develop a feature extraction approach that draws character, part-of-speech, and lexical features from QA data; and improve answer entity extraction by leveraging equivalence conjunctions in Chinese (punctuation marks and conjunctive words) to capture dependencies between entity tags. Experiments on question and answer entity extraction show that our proposed features improve Precision, Recall, and F-1 score, and that considering equivalence conjunctions further improves Precision and F-1 score.
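As a rough illustration of the three feature families named in the abstract (character, part-of-speech, and lexical) together with an equivalence-conjunction flag, the following is a minimal sketch only. The toy lexicon, the conjunction set, the feature names, and the per-character POS tags are all assumptions made for this example; they are not the lexicons, tag set, or feature templates used in the paper.

```python
from typing import Dict, List, Set

# Hypothetical toy resources, chosen only for illustration.
DISEASE_LEXICON: Set[str] = {"糖尿病", "高血压"}
EQUIV_CONJUNCTIONS: Set[str] = {"、", "和", "及", "或"}


def lexicon_flags(text: str, lexicon: Set[str]) -> List[bool]:
    """Mark every character covered by a lexicon entry that occurs in text."""
    flags = [False] * len(text)
    for entry in lexicon:
        start = text.find(entry)
        while start != -1:
            for i in range(start, start + len(entry)):
                flags[i] = True
            start = text.find(entry, start + 1)
    return flags


def char_features(text: str, pos_tags: List[str]) -> List[Dict[str, object]]:
    """Build one feature dict per character of text.

    pos_tags is assumed to hold one POS tag per character, e.g. the tag of
    the word containing that character as produced by an external tagger.
    """
    if len(pos_tags) != len(text):
        raise ValueError("expected one POS tag per character")
    in_lex = lexicon_flags(text, DISEASE_LEXICON)
    features = []
    for i, ch in enumerate(text):
        features.append({
            "char": ch,                                        # character feature
            "prev_char": text[i - 1] if i > 0 else "<BOS>",    # left context
            "next_char": text[i + 1] if i < len(text) - 1 else "<EOS>",
            "pos": pos_tags[i],                                # part-of-speech feature
            "in_lexicon": in_lex[i],                           # lexical feature
            "is_equiv_conj": ch in EQUIV_CONJUNCTIONS,         # conjunction flag
        })
    return features


# "Diabetes and hypertension": two lexicon entries joined by a conjunction.
text = "糖尿病和高血压"
pos = ["n", "n", "n", "c", "n", "n", "n"]
features = char_features(text, pos)
```

Feature dicts of this shape could be passed to a sequence labeler. The conjunction flag is one simple way to expose to the model that the entities on either side of "和" are likely to share a tag.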
Acknowledgments
This work was supported by the National High-tech R&D Program of China (Grant No. SS2015AA020102), the National Basic Research Program of China (Grant No. 2011CB302302), the 1000-Talent program, and the Tsinghua University Initiative Scientific Research Program. We thank Qingbo Cao at Tsinghua University for research assistance.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, Y., Zhang, Y., Yin, Y., Xu, J., Xing, C., Chen, H. (2016). Chronic Disease Related Entity Extraction in Online Chinese Question and Answer Services. In: Zheng, X., Zeng, D., Chen, H., Leischow, S. (eds) Smart Health. ICSH 2015. Lecture Notes in Computer Science, vol 9545. Springer, Cham. https://doi.org/10.1007/978-3-319-29175-8_6
DOI: https://doi.org/10.1007/978-3-319-29175-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29174-1
Online ISBN: 978-3-319-29175-8
eBook Packages: Computer Science, Computer Science (R0)