
Extraction of Temporal Events from Clinical Text Using Semi-supervised Conditional Random Fields

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10387)

Abstract

The large volume of clinical text in Electronic Medical Records (EMRs) has opened the way for text processing and information extraction in healthcare and medical research. Extracting temporal information from clinical text is much more difficult than from newswire text because of the implicit expression of temporal information, the domain-specific nature of the text, and poor writing quality, among other factors. Despite these constraints, existing work has established rule-based, machine learning, and hybrid methods that extract temporal information with the help of annotated corpora. However, obtaining annotated corpora is costly and time-consuming and requires considerable manual effort, so the corpora are small, which inevitably affects processing quality. Motivated by this fact, we propose a novel two-stage semi-supervised framework that exploits the abundant unannotated clinical text to automatically detect temporal events and subsequently improve the stability and accuracy of temporal event extraction. We trained and evaluated our semi-supervised model with the selected features of the testing dataset, resulting in an F-measure of 89.76% for event extraction.
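For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F-measure for event extraction can be computed. It is not the authors' evaluation code: it assumes exact span matching and the balanced F1 variant, and the span representation and the function name `f_measure` are hypothetical choices made for illustration.

```python
from typing import Set, Tuple

# Hypothetical representation: an event mention as (start_token, end_token).
Span = Tuple[int, int]


def f_measure(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), assuming exact span matching."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data (invented for illustration, not taken from the paper).
gold = {(0, 1), (4, 4), (7, 9), (12, 12)}
predicted = {(0, 1), (4, 4), (7, 8)}
p, r, f1 = f_measure(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In this toy case two of the three predicted spans match gold spans exactly, so precision is 2/3 and recall is 2/4. Evaluations that give credit for partial overlap would score the third prediction differently.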

Notes

  1. https://www.i2b2.org/NLP/TemporalRelations/

  2. https://sourceforge.net/projects/crfpp/

  3. http://mallet.cs.umass.edu/semi-sup-fst.php

  4. https://metamap.nlm.nih.gov/

  5. http://www.nactem.ac.uk/GENIA/tagger/

  6. http://www.nltk.org/api/nltk.tag.html

Acknowledgments

This work is partially funded by Vietnam National University at Ho Chi Minh City under grant number B2015-42-02, and by the Japan Advanced Institute of Science and Technology under the Data Science Project. We also thank Mayo Clinic and the Informatics for Integrating Biology and the Bedside (I2B2) organizers for providing access to the annotated I2B2 temporal relations corpus.

Author information

Corresponding author

Correspondence to Gandhimathi Moharasan.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Moharasan, G., Ho, TB. (2017). Extraction of Temporal Events from Clinical Text Using Semi-supervised Conditional Random Fields. In: Tan, Y., Takagi, H., Shi, Y. (eds) Data Mining and Big Data. DMBD 2017. Lecture Notes in Computer Science, vol 10387. Springer, Cham. https://doi.org/10.1007/978-3-319-61845-6_41

  • DOI: https://doi.org/10.1007/978-3-319-61845-6_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61844-9

  • Online ISBN: 978-3-319-61845-6

  • eBook Packages: Computer Science, Computer Science (R0)
