Abstract
This paper presents an analysis of a small, experimentally collected corpus of messages exchanged through an instant messaging (IM) programme. The data is analysed from the point of view of automatic parsing. Special attention is paid to two problems associated with IM discourse: the semantic multi-tasking of conversation partners (the interweaving of topics) and the non-standard spelling found in such dialogues. The contents of the corpus are also compared with two other types of written dialogue: SMS messages and conversations between human users and chatterbots. Finally, some solutions are proposed to facilitate the automatic parsing of IM messages.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Walkowska, J. (2010). An NLP-Oriented Analysis of the Instant Messaging Discourse. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2010. Lecture Notes in Computer Science, vol 6231. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15760-8_73
DOI: https://doi.org/10.1007/978-3-642-15760-8_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15759-2
Online ISBN: 978-3-642-15760-8
eBook Packages: Computer Science, Computer Science (R0)