Extraction of temporal information from social media messages using the BERT model

  • Research Article
Earth Science Informatics

Abstract

Extracting temporal information from social media messages is critically important to several geographical applications. Based on the characteristics of temporal descriptions in Chinese text, the time expression patterns formed by combinations of time units are summarized. A deep learning-based information extraction algorithm, BERT-BiLSTM-CRF, is proposed for automatically extracting temporal information from social media messages. Building on the bidirectional long short-term memory-conditional random field (BiLSTM-CRF) model, the BERT (bidirectional encoder representations from transformers) pretrained language model is used to improve the generalization ability of the word vector model and to capture long-range contextual information; the resulting word vectors are then input into the BiLSTM-CRF model for further training. The proposed model is evaluated on a purpose-built corpus of manually annotated Chinese social media messages. Among the models compared, BERT-BiLSTM-CRF achieves the highest average F1-score, 85%. The experimental results show that the proposed method outperforms the current state-of-the-art models.
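To make the role of the CRF output layer in a BiLSTM-CRF tagger concrete, the following is a minimal sketch of Viterbi decoding over BIO labels. It is not the authors' implementation: the tag set (`O`, `B-TIME`, `I-TIME`), the transition scores, the emission scores (which in the full model would come from the BERT and BiLSTM layers), and the English example tokens are all assumptions chosen for illustration.

```python
from typing import Dict, List

# Hypothetical BIO tag set for time expressions (an assumption, not from the paper).
TAGS = ["O", "B-TIME", "I-TIME"]
NEG = -1e4  # large penalty that effectively forbids a transition

# TRANSITIONS[prev][cur]: hand-set illustrative scores; a real CRF learns these.
TRANSITIONS: Dict[str, Dict[str, float]] = {
    "O":      {"O": 0.0, "B-TIME": 0.0, "I-TIME": NEG},  # I-TIME may not follow O
    "B-TIME": {"O": 0.0, "B-TIME": 0.0, "I-TIME": 0.5},
    "I-TIME": {"O": 0.0, "B-TIME": 0.0, "I-TIME": 0.5},
}
START: Dict[str, float] = {"O": 0.0, "B-TIME": 0.0, "I-TIME": NEG}


def viterbi(emissions: List[Dict[str, float]]) -> List[str]:
    """Return the highest-scoring tag sequence given per-token emission scores."""
    if not emissions:
        return []
    score = {t: START[t] + emissions[0][t] for t in TAGS}
    backpointers: List[Dict[str, str]] = []
    for emit in emissions[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in TAGS:
            # Best previous tag for reaching `cur` at this position.
            prev = max(TAGS, key=lambda p: score[p] + TRANSITIONS[p][cur])
            new_score[cur] = score[prev] + TRANSITIONS[prev][cur] + emit[cur]
            pointers[cur] = prev
        score = new_score
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag.
    path = [max(TAGS, key=lambda t: score[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def extract_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Join tokens labelled B-TIME/I-TIME into time-expression strings."""
    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-TIME":
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I-TIME" and current:
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Made-up emission scores; note that token "3" scores highest for I-TIME on its own.
TOKENS = ["quake", "at", "3", "pm", "today"]
EMISSIONS = [
    {"O": 2.0, "B-TIME": 0.1, "I-TIME": 0.1},
    {"O": 1.5, "B-TIME": 0.2, "I-TIME": 0.3},
    {"O": 0.2, "B-TIME": 1.0, "I-TIME": 1.2},
    {"O": 0.1, "B-TIME": 0.3, "I-TIME": 1.5},
    {"O": 0.3, "B-TIME": 0.4, "I-TIME": 1.1},
]

tags = viterbi(EMISSIONS)
print(tags)                         # ['O', 'O', 'B-TIME', 'I-TIME', 'I-TIME']
print(extract_spans(TOKENS, tags))  # ['3 pm today']
```

In this toy input, choosing the best tag for each token independently would label "3" as `I-TIME` directly after an `O`, which is not a valid BIO sequence. Decoding jointly with transition scores yields `B-TIME` instead, which is the usual motivation for placing a CRF layer on top of per-token scores.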

Acknowledgements

This study was supported by the National Natural Science Foundation of China (Nos. 42050101, U1711267, 41871311, 41871305), the Open Research Project of the Hubei Key Laboratory of Intelligent Geo-Information Processing (No. KLIGIP-2021A01), Major scientific and technological innovation projects in Shandong Province (2019JZZY020105), the China Postdoctoral Science Foundation (No. 2021M702991), and the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) (No. CUG2106116).

Author information

Corresponding author

Correspondence to Qinjun Qiu.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ma, K., Tan, Y., Tian, M. et al. Extraction of temporal information from social media messages using the BERT model. Earth Sci Inform 15, 573–584 (2022). https://doi.org/10.1007/s12145-021-00756-6

  • DOI: https://doi.org/10.1007/s12145-021-00756-6
