Extraction of Temporal Events from Clinical Text Using Semi-supervised Conditional Random Fields

Moharasan, Gandhimathi; Ho, Tu-Bao

doi:10.1007/978-3-319-61845-6_41

Conference paper
First Online: 24 June 2017

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10387)

Abstract

The huge amount of clinical text in Electronic Medical Records (EMRs) has opened the way for text processing and information extraction in healthcare and medical research. Extracting temporal information from clinical text is much more difficult than from newswire text because of the implicit expression of temporal information, the domain-specific nature of the text and its poor writing quality, among other factors. Despite these constraints, existing work has established rule-based, machine learning and hybrid methods that extract temporal information with the help of annotated corpora. However, obtaining annotated corpora is costly and time-consuming and requires much manual effort, and their resulting small size inevitably affects processing quality. Motivated by this fact, we propose a novel two-stage semi-supervised framework that exploits abundant unannotated clinical text to automatically detect temporal events and subsequently improve the stability and accuracy of temporal event extraction. We trained and evaluated our semi-supervised model with the selected features of the testing dataset, resulting in an F-measure of 89.76% for event extraction.

Acknowledgments

This work is partially funded by Vietnam National University at Ho Chi Minh City under grant number B2015-42-02, and by the Japan Advanced Institute of Science and Technology under the Data Science Project. We also thank Mayo Clinic and the Informatics for Integrating Biology and the Bedside (I2B2) organizers for providing access to the annotated I2B2 temporal relations corpus.

Author information

Authors and Affiliations

Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa, Japan
Gandhimathi Moharasan & Tu-Bao Ho
John Von Neumann Institute, VNU-HCM, Ho Chi Minh City, Vietnam
Tu-Bao Ho

Corresponding author

Correspondence to Gandhimathi Moharasan.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Ying Tan
Kyushu University, Fukuoka, Japan
Hideyuki Takagi
Southern University of Science and Technology, Shenzhen, China
Yuhui Shi

About this paper

Cite this paper

Moharasan, G., Ho, TB. (2017). Extraction of Temporal Events from Clinical Text Using Semi-supervised Conditional Random Fields. In: Tan, Y., Takagi, H., Shi, Y. (eds) Data Mining and Big Data. DMBD 2017. Lecture Notes in Computer Science, vol 10387. Springer, Cham. https://doi.org/10.1007/978-3-319-61845-6_41

DOI: https://doi.org/10.1007/978-3-319-61845-6_41
Published: 24 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61844-9
Online ISBN: 978-3-319-61845-6
eBook Packages: Computer Science, Computer Science (R0)
