Synonyms
Glossary
- NLP: Natural Language Processing
Definition
The term "microtext" was proposed by US Navy researchers (Dela Rosa and Ellen 2009) to describe a type of written text document that has three characteristics: (a) it is very short, typically one or two sentences, and possibly as little as a single word; (b) it is written in an informal manner and unedited for quality and thus may use loose grammar, a conversational tone, vocabulary errors, and uncommon abbreviations and acronyms; and (c) it is semi-structured in the NLP sense, in that it includes some metadata such as a time stamp, an author, or the name of a field it was entered into. Microtexts have become omnipresent in today's world: they are notably found in online chat discussions; online forum posts; user comments posted on online material such as videos, pictures, and news stories; Facebook newsfeeds and Twitter updates; Internet...
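The three characteristics in the definition can be illustrated with a small data structure. The following Python sketch is only an illustration under stated assumptions: the `Microtext` class, the `MAX_SENTENCES` threshold, and the helper functions are hypothetical names introduced here, not part of the definition by Dela Rosa and Ellen (2009). It checks characteristics (a) and (c) with crude heuristics and does not attempt to measure the informality described in (b).

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Assumed threshold: the definition says "typically one or two sentences".
MAX_SENTENCES = 2


@dataclass
class Microtext:
    """A short text plus the optional metadata that makes it semi-structured."""
    text: str
    author: Optional[str] = None
    timestamp: Optional[datetime] = None
    source_field: Optional[str] = None  # name of the field the text was entered into


def count_sentences(text: str) -> int:
    """Rough count: split on runs of '.', '!' or '?' and count non-empty pieces."""
    pieces = re.split(r"[.!?]+", text)
    return sum(1 for piece in pieces if piece.strip())


def has_metadata(message: Microtext) -> bool:
    """Characteristic (c): at least one metadata item is present."""
    return any(
        value is not None
        for value in (message.author, message.timestamp, message.source_field)
    )


def looks_like_microtext(message: Microtext) -> bool:
    """Heuristic check of characteristics (a) and (c) only; (b) is not tested."""
    is_short = 1 <= count_sentences(message.text) <= MAX_SENTENCES
    return is_short and has_metadata(message)
```

For instance, `looks_like_microtext(Microtext("omg c u l8r", author="alice"))` would be accepted by this heuristic, while a three-sentence message or a message without any metadata would not. A real system would need a more robust sentence splitter, since informal text often omits punctuation altogether.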
References
Baldwin T, Chai JY (2011) Beyond normalization: pragmatics of word form in text messages. In: 5th international joint conference on natural language processing, Chiang Mai, 8–13 Nov 2011
Barbosa L, Feng J (2010) Robust sentiment detection on Twitter from biased and noisy data. In: Proceedings of the 23rd international conference on computational linguistics, Beijing, pp 36–44
Baron N, Ling R (2007) Text messaging and IM: linguistic comparison of American college data. J Lang Psychol Stud 26:291–298
Chen L, Wang W, Sheth AP (2012) Are Twitter users equal in predicting elections? A study of user groups in predicting 2012 U.S. republican presidential primaries. In: SocInfo 2012, Lausanne. LNCS 7710, Springer, pp 379–392
Cormack GV, Gómez Hidalgo JM, Puertas Sánz E (2007) Spam filtering for short messages. In: Proceedings of the 16th ACM conference on information and knowledge management (ACM CIKM'07), Lisbon, pp 313–320
Cvijikj IP, Michahelles F (2011) Monitoring trends on Facebook. In: Ninth IEEE international conference on dependable, autonomic and secure computing, Zurich, 12–14 Dec 2011, pp 895–202
Dela Rosa K, Ellen J (2009) Text classification methodologies applied to micro-text in military chat. In: Proceedings of the international conference on machine learning and applications, Miami Beach, pp 710–714
Dong H, Hui SC, He Y (2006) Structural analysis of chat messages for topic detection. Online Inf Rev 30(5):496–516
Ellen J (2011) All about microtext: a working definition and a survey of current microtext research within artificial intelligence and natural language processing. In: ICAART (1), Rome, pp 329–336
Ferrara K, Brunner H, Whittemore G (1991) Interactive written discourse as an emergent register. Writ Commun 8:8–34
Go A, Bhayani R, Huang L (2009) Twitter sentiment classification using distant supervision. Technical report, Stanford
Healy M, Delany S, Zamolotskikh A (2005) An assessment of case-based reasoning for short text messages. In: Creaney N (ed) Proceedings of the 16th Irish conference on artificial intelligence and cognitive science, pp 257–266
Kolenda T, Hansen LK, Larsen J (2001) Signal detection using ICA: application to chat room topic spotting. In: Proceedings of the third international conference on independent component analysis and blind source separation, San Diego, pp 540–545
Liu F, Weng F, Wang B, Liu Y (2011) Insertion, deletion, or substitution?: normalizing text messages without pre-categorization nor supervision. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, Portland, vol 2, pp 71–76
Pak A, Paroubek P (2010) Twitter as a corpus for sentiment analysis and opinion mining. In: Proceedings of the seventh conference on international language resources and evaluation, Valletta. European Language
Paolillo JC (1999) The virtual speech community: social network and language variation on IRC. In: Proceedings of the 32nd annual Hawaii international conference on system sciences, Maui
Petrovic S, Osborne M, Lavrenko V (2010) The Edinburgh Twitter corpus. In: Proceedings of the NAACL HLT workshop on computational linguistics in a world of social media, Los Angeles, pp 25–26
Ritterman J, Osborne M, Klein E (2009) Using prediction markets and Twitter to predict a swine flu pandemic. In: 1st international workshop on mining social media – 13th conference of the Spanish association for artificial intelligence
Takahashi T, Tomioka R, Yamanishi K (2011) Discovering emerging topics in social streams via link anomaly detection. In: 11th IEEE international conference on data mining, Tokyo, 11–14 Dec 2011, pp 1230–1235
Wang AH (2010) Don't follow me – spam detection in Twitter. In: Proceedings of the international conference on security and cryptography (SECRYPT 2010), Athens, pp 142–151
Wu T, Khan FM, Fisher TA, Shuler LA, Pottenger WM (2002) Posting act tagging using transformation-based learning. In: The proceedings of the workshop on foundations of data mining and discovery, IEEE international conference on data mining (ICDM'02), Dec 2002
Copyright information
© 2014 Springer Science+Business Media New York
About this entry
Cite this entry
Khoury, R., Khoury, R., Hamou-Lhadj, A. (2014). Microtext Processing. In: Alhajj, R., Rokne, J. (eds) Encyclopedia of Social Network Analysis and Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6170-8_353
DOI: https://doi.org/10.1007/978-1-4614-6170-8_353
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6169-2
Online ISBN: 978-1-4614-6170-8
eBook Packages: Computer Science; Reference Module Computer Science and Engineering