Abstract
In disasters, a large amount of information is exchanged via SMS messages. The content of these messages can be of high value and strategic interest. SMS messages tend to be informal and to contain abbreviations and misspellings, which are problems for current information extraction tools. Here, we describe an architecture designed to address these problems through four components: linguistic processing, temporal processing, event processing, and information fusion. We then present a case study over a corpus of SMS messages sent to an electric utility company, together with a prototype built with Python and NLTK to validate the architecture's information extraction components, which obtained a precision of 88%, a recall of 59%, and an F-measure (F1) of 71%. The work also serves as a roadmap for the treatment of emergency SMS messages in Portuguese.
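As a quick consistency check on the reported figures, the sketch below recomputes F1 as the harmonic mean of precision and recall. It assumes the abstract uses the standard balanced F-measure; the function name `f1_score` is illustrative and is not taken from the authors' prototype.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract (88% precision, 59% recall).
precision = 0.88
recall = 0.59

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")  # about 0.706, which rounds to the reported 71%
```

The gap between precision and recall suggests that the prototype's extractions were mostly correct but that it missed a sizeable share of the relevant items; the harmonic mean penalizes this imbalance more than an arithmetic mean would.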
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Monteiro, D., de Lima, V.L.S. (2015). A SMS Information Extraction Architecture to Face Emergency Situations. In: Pereira, F., Machado, P., Costa, E., Cardoso, A. (eds) Progress in Artificial Intelligence. EPIA 2015. Lecture Notes in Computer Science, vol 9273. Springer, Cham. https://doi.org/10.1007/978-3-319-23485-4_74
DOI: https://doi.org/10.1007/978-3-319-23485-4_74
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23484-7
Online ISBN: 978-3-319-23485-4
eBook Packages: Computer Science (R0)