Temporal Tagging of Noisy Clinical Texts in Brazilian Portuguese

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2018)

Abstract

Temporal expressions occur in many types of text, including clinical texts. Current research on temporal expressions relies on rule-based systems, machine learning, or hybrid approaches and, in most cases, targets annotated (labeled) news texts written in well-formed English. In this paper, we propose a rule-based method to extract and normalize temporal expressions from noisy, unlabeled clinical texts (discharge summaries) written in Brazilian Portuguese. The results are comparable to those of state-of-the-art work with the same purpose in other languages: the proposed method reached an F1 score of 88.92% for the extraction step and an F1 score of 87.89% for the normalization step.
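To make the two steps named in the abstract concrete, the following is a minimal sketch of rule-based extraction followed by normalization to ISO 8601 values. It is not the authors' rule set, which the abstract does not describe. The names `MONTHS`, `NUMERIC`, `TEXTUAL`, and `tag` are hypothetical, and the sketch assumes day-first numeric dates, two-digit years in the 2000s, and abbreviated or unaccented month names as stand-ins for noisy spellings.

```python
import re
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical month lexicon: full names plus three-letter abbreviations.
_NAMES = ["janeiro", "fevereiro", "março", "abril", "maio", "junho",
          "julho", "agosto", "setembro", "outubro", "novembro", "dezembro"]
MONTHS = {}
for number, name in enumerate(_NAMES, start=1):
    MONTHS[name] = number
    MONTHS[name[:3]] = number
MONTHS["marco"] = 3  # assumed noisy variant without the cedilla

# Extraction rules (assumptions, not taken from the paper).
NUMERIC = re.compile(r"\b(\d{1,2})[/.-](\d{1,2})[/.-](\d{4}|\d{2})\b")
TEXTUAL = re.compile(r"\b(\d{1,2})\s+de\s+([a-zç]+)\.?\s+de\s+(\d{4})\b",
                     re.IGNORECASE)


def _iso(year: int, month: int, day: int) -> Optional[str]:
    """Normalize to YYYY-MM-DD, or return None for impossible dates."""
    try:
        return date(year, month, day).isoformat()
    except ValueError:
        return None


def tag(text: str) -> List[Tuple[str, str]]:
    """Return (surface form, normalized value) pairs in text order."""
    found = []
    for m in NUMERIC.finditer(text):
        day, month, year = (int(g) for g in m.groups())
        if year < 100:
            year += 2000  # assumption: two-digit years are in the 2000s
        value = _iso(year, month, day)
        if value:
            found.append((m.start(), m.group(0), value))
    for m in TEXTUAL.finditer(text):
        month = MONTHS.get(m.group(2).lower())
        if month is None:
            continue  # the word between the two "de" is not a known month
        value = _iso(int(m.group(3)), month, int(m.group(1)))
        if value:
            found.append((m.start(), m.group(0), value))
    found.sort()
    return [(surface, value) for _, surface, value in found]


if __name__ == "__main__":
    print(tag("Paciente internado em 12/03/2015, alta em 3 de abril de 2015."))
```

Keeping extraction (the regular expressions) separate from normalization (`_iso`) mirrors the two evaluation steps reported in the abstract, so each could be scored on its own.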

Notes

  1. https://github.com/HeidelTime/heideltime

Author information

Corresponding author

Correspondence to Rafael Faria de Azevedo.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

de Azevedo, R.F., Rodrigues, J.P.S., da Silva Reis, M.R., Moro, C.M.C., Paraiso, E.C. (2018). Temporal Tagging of Noisy Clinical Texts in Brazilian Portuguese. In: Villavicencio, A., et al. Computational Processing of the Portuguese Language. PROPOR 2018. Lecture Notes in Computer Science, vol 11122. Springer, Cham. https://doi.org/10.1007/978-3-319-99722-3_24

  • DOI: https://doi.org/10.1007/978-3-319-99722-3_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99721-6

  • Online ISBN: 978-3-319-99722-3

  • eBook Packages: Computer Science (R0)
