Microtext Processing

  • Reference work entry
Encyclopedia of Social Network Analysis and Mining

Synonyms

Comment; Instant message; Microblog; Microtext; Post; SMS; Status update; Tweet

Glossary

NLP: Natural Language Processing

Definition

The term “microtext” was proposed by US Navy researchers (Dela Rosa and Ellen 2009) to describe a type of written text document that has three characteristics: (a) it is very short, typically one or two sentences, and possibly as little as a single word; (b) it is written in an informal manner and unedited for quality and thus may use loose grammar, a conversational tone, vocabulary errors, and uncommon abbreviations and acronyms; and (c) it is semi-structured in the NLP sense, in that it includes some metadata such as a time stamp, an author, or the name of a field it was entered into. Microtexts have become omnipresent in today’s world: they are notably found in online chat discussions; online forum posts; user comments posted on online material such as videos, pictures, and news stories; Facebook newsfeeds and Twitter updates; Internet...

References

  • Baldwin T, Chai JY (2011) Beyond normalization: pragmatics of word form in text messages. In: 5th international joint conference on natural language processing, Chiang Mai, 8–13 Nov 2011

  • Barbosa L, Feng J (2010) Robust sentiment detection on Twitter from biased and noisy data. In: Proceedings of the 23rd international conference on computational linguistics, Beijing, pp 36–44

  • Baron N, Ling R (2007) Text messaging and IM: linguistic comparison of American college data. J Lang Psychol Stud 26:291–298

  • Chen L, Wang W, Sheth AP (2012) Are Twitter users equal in predicting elections? A study of user groups in predicting 2012 U.S. Republican presidential primaries. In: SocInfo 2012, Lausanne. LNCS 7710, Springer, pp 379–392

  • Cormack GV, Gómez Hidalgo JM, Puertas Sánz E (2007) Spam filtering for short messages. In: Proceedings of the 16th ACM conference on information and knowledge management (ACM CIKM’07), Lisbon, pp 313–320

  • Cvijikj IP, Michahelles F (2011) Monitoring trends on Facebook. In: Ninth IEEE international conference on dependable, autonomic and secure computing, Zurich, 12–14 Dec 2011, pp 895–902

  • Dela Rosa K, Ellen J (2009) Text classification methodologies applied to micro-text in military chat. In: Proceedings of the international conference on machine learning and applications, Miami Beach, pp 710ā€“714

  • Dong H, Hui SC, He Y (2006) Structural analysis of chat messages for topic detection. Online Inf Rev 30(5):496–516

  • Ellen J (2011) All about microtext: a working definition and a survey of current microtext research within artificial intelligence and natural language processing. In: ICAART (1), Rome, pp 329–336

  • Ferrara K, Brunner H, Whittemore G (1991) Interactive written discourse as an emergent register. Writ Commun 8:8–34

  • Go A, Bhayani R, Huang L (2009) Twitter sentiment classification using distant supervision. Technical report, Stanford

  • Healy M, Delany S, Zamolotskikh A (2005) An assessment of case-based reasoning for short text messages. In: Creaney N (ed) Proceedings of the 16th Irish conference on artificial intelligence and cognitive science, pp 257–266

  • Kolenda T, Hansen LK, Larsen J (2001) Signal detection using ICA: application to chat room topic spotting. In: Proceedings of the third international conference on independent component analysis and blind source separation, San Diego, pp 540–545

  • Liu F, Weng F, Wang B, Liu Y (2011) Insertion, deletion, or substitution?: normalizing text messages without pre-categorization nor supervision. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, Portland, vol 2, pp 71–76

  • Pak A, Paroubek P (2010) Twitter as a corpus for sentiment analysis and opinion mining. In: Proceedings of the seventh conference on international language resources and evaluation, Valletta. European Language

  • Paolillo JC (1999) The virtual speech community: social network and language variation on IRC. In: Proceedings of the 32nd annual Hawaii international conference on system sciences, Maui

  • Petrovic S, Osborne M, Lavrenko V (2010) The Edinburgh Twitter corpus. In: Proceedings of the NAACL HLT workshop on computational linguistics in a world of social media, Los Angeles, pp 25–26

  • Ritterman J, Osborne M, Klein E (2009) Using prediction markets and Twitter to predict a swine flu pandemic. In: 1st international workshop on mining social media – 13th conference of the Spanish association for artificial intelligence

  • Takahashi T, Tomioka R, Yamanishi K (2011) Discovering emerging topics in social streams via link anomaly detection. In: 11th IEEE international conference on data mining, Tokyo, 11–14 Dec 2011, pp 1230–1235

  • Wang AH (2010) Don’t follow me – spam detection in Twitter. In: Proceedings of the international conference on security and cryptography (SECRYPT 2010), Athens, pp 142–151

  • Wu T, Khan FM, Fisher TA, Shuler LA, Pottenger WM (2002) Posting act tagging using transformation-based learning. In: The proceedings of the workshop on foundations of data mining and discovery, IEEE international conference on data mining (ICDM’02), Dec 2002

© 2014 Springer Science+Business Media New York

Cite this entry

Khoury, R., Khoury, R., Hamou-Lhadj, A. (2014). Microtext Processing. In: Alhajj, R., Rokne, J. (eds) Encyclopedia of Social Network Analysis and Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6170-8_353
