Mood sensing from social media texts and its applications

  • Regular Paper
Knowledge and Information Systems

Abstract

We present a large-scale mood analysis of social media texts. The paper is organised in three parts: (1) we address the problem of feature selection and classification of mood in the blogosphere; (2) we extract global mood patterns at different levels of aggregation from a large-scale data set of approximately 18 million documents; and (3) we extract the mood trajectory for an egocentric user and study how it can be used to detect subtle emotion signals in a user-centric manner, supporting the discovery of hyper-groups of communities based on sentiment information. For mood classification, we use two feature sets proposed in psychology and show that these features are efficient, do not require a training phase and yield classification results comparable to state-of-the-art supervised feature selection schemes. On mood patterns, we provide empirical results for mood organisation in the blogosphere that are analogous to the structure of human emotion proposed independently in the psychology literature. On community structure discovery, we show that a sentiment-based approach can yield useful insights into community formation.

Notes

  1. For example, blog text was found to have a higher occurrence of the 1st person singular than conversations [31].

  2. For example, one of the main reasons for writing, cited by bloggers, is to speak their minds: www.intac.net/breakdown-of-the-blogosphere/, accessed August 2011.

  3. From the state of the blogosphere 2008 at http://technorati.com.

  4. www.proxem.com.

  5. http://www.liwc.net/descriptiontable1.php, accessed July 2011.

  6. http://www.icwsm.org/2009/data/, retrieved November 2011.

  7. Consistent with what is reported in [36].

  8. For the full list of predefined moods, visit http://www.livejournal.com.

  9. http://www.livejournal.com/moodlist.bml, accessed July 2011.

  10. http://www.liwc.net/descriptiontable1.php, accessed July 2011.

  11. http://www.livejournal.com/browse/, accessed July 2011.

References

  1. Adams B, Phung D, Venkatesh S (2010) Discovery of latent subcommunities in a blog’s readership. ACM Trans Web 4(3):1–30

  2. Backstrom L, Huttenlocher D, Kleinberg J, Lan X (2006) Group formation in large social networks: membership, growth, and evolution. In: Proceedings of the ACM international conference on knowledge discovery and data mining (SIGKDD), pp 44–54

  3. Berendt B, Hanser C (2007) Tags are not metadata, but ‘just more content’-to some people. In: Proceedings of the international AAAI conference on weblogs and social media (ICWSM)

  4. Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022

  5. Bolón-Canedo V, Sánchez-Maroño N, Alonso-Betanzos A (2013) A review of feature selection methods on synthetic data. Knowl Inf Syst 34:483–519

  6. Bradley MM, Lang PJ (1999) Affective norms for English words (ANEW): instruction manual and affective ratings. University of Florida, Gainesville

  7. Cambria E, Hussain A, Havasi C, Eckl C, Munro J (2010) Towards crowd validation of the UK national health service. In: Proceedings of the web science conference (WebSci)

  8. Fan TK, Chang CH (2010) Sentiment-oriented contextual advertising. Knowl Inf Syst 23:321–344

  9. Farahat AK, Ghodsi A, Kamel MS (2012) Efficient greedy feature selection for unsupervised learning. Knowl Inf Syst 1–26. doi:10.1007/s10115-012-0538-1

  10. Feng S, Wang D, Yu G, Gao W, Wong KF (2011) Extracting common emotions from blogs based on fine-grained sentiment clustering. Knowl Inf Syst 27:281–302

  11. Frey BJ, Dueck D (2007) Clustering by passing messages between data points. Science 315:972–976

  12. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. ACM SIGKDD Explor Newsl 11(1):10–18

  13. Hayes C, Avesani P (2007) Using tags and clustering to identify topic-relevant blogs. In: Proceedings of the international AAAI conference on weblogs and social media (ICWSM)

  14. Hu X, Downie JS (2007) Exploring mood metadata: relationships with genre, artist and usage metadata. In: Proceedings of the international conference on music information retrieval

  15. Kumar R, Novak J, Tomkins A (2006) Structure and evolution of online social networks. In: Proceedings of the ACM international conference on knowledge discovery and data mining (SIGKDD), p 617

  16. Leshed G, Kaye JJ (2006) Understanding how bloggers feel: recognizing affect in blog posts. In: Proceedings of the ACM conference on human factors in computing systems (SIGCHI), p 1024

  17. Long C, Zhang J, Huang M, Zhu X, Li M, Ma B (2012) Estimating feature ratings through an effective review selection approach. Knowl Inf Syst (accepted)

  18. McCallum A, Wang X, Corrada-Emmanuel A (2007) Topic and role discovery in social networks with experiments on Enron and academic email. J Artif Intell Res 30:249–272

  19. McCallum A, Wang X, Mohanty N (2007) Joint group and topic discovery from relations and text. Lect Notes Comput Sci 4503:28

  20. Mishne G (2005) Experiments with mood classification in blog posts. In: Proceedings of ACM workshop on stylistic analysis of text for information access

  21. Mishne G, Glance N (2006) Predicting movie sales from blogger sentiment. In: Proceedings of the AAAI spring symposium on computational approaches to analysing weblogs

  22. Mohtasseb H, Ahmed A (2012) Two-layered blogger identification model integrating profile and instance-based methods. Knowl Inf Syst 31(1):1–21

  23. Nallapati R, Cohen W (2008) Link-PLSA-LDA: a new unsupervised model for topics and influence of blogs. In: Proceedings of the international AAAI conference on weblogs and social media (ICWSM)

  24. Negoescu RA, Adams B, Phung D, Venkatesh S, Gatica-Perez D (2009) Flickr hypergroups. In: Proceedings of the ACM international conference on multimedia, pp 813–816

  25. Nguyen T, Phung D, Adams B, Tran T, Venkatesh S (2010) Classification and pattern discovery of mood in weblogs. Adv Knowl Discov Data Min 6119:283–290

  26. Nguyen T, Phung D, Adams B, Tran T, Venkatesh S (2010) Hyper-community detection in the blogosphere. In: Proceedings of the ACM workshop on social media, in conjunction with the ACM international conference on multimedia (ACM-MM). ACM, Firenze, Italy

  27. Nigam K, Hurst M (2004) Towards a robust metric of opinion. In: AAAI spring symposium on exploring attitude and affect in text, pp 598–603

  28. Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr 2(1–2):1–135

  29. Pang B, Lee L, Vaithyanathan S (2002) Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of the ACL conference on empirical methods in natural language processing, pp 79–86

  30. Pennebaker JW, Chung CK, Ireland M, Gonzales A, Booth RJ (2007) The development and psychometric properties of LIWC2007. LIWC, Austin

  31. Pennebaker JW, Francis ME, Booth RJ (2007) Linguistic inquiry and word count (LIWC) [computer software]. LIWC, Austin

  32. Russell JA (1980) A circumplex model of affect. J Pers Soc Psychol 39(6):1161–1178

  33. Russell JA (2003) Core affect and the psychological construction of emotion. Psychol Rev 110(1):145

  34. Sebastiani F (2002) Machine learning in automated text categorization. ACM Comput Surv 34(1):1–47

  35. Song X, Lin CY, Tseng BL, Sun MT (2005) Modeling and predicting personal information dissemination behavior. In: Proceedings of the ACM international conference on knowledge discovery and data mining (SIGKDD), pp 479–488

  36. Sood SO, Vasserman L (2009) ESSE: exploring mood on the web. In: Proceedings of the international AAAI conference on weblogs and social media (ICWSM)

  37. Tausczik YR, Pennebaker JW (2010) The psychological meaning of words: LIWC and computerized text analysis methods. J Lang Soc Psychol 29(1):24

  38. Teh YW, Jordan MI (2010) Hierarchical Bayesian nonparametric models with applications. In: Hjort N, Holmes C, Müller P, Walker S (eds) Bayesian nonparametrics: principles and practice. Cambridge University Press, Cambridge

  39. Tsuruoka Y, Tsujii J (2005) Bidirectional inference with the easiest-first strategy for tagging sequence data. In: Proceedings of the conference on human language technology and empirical methods in natural language processing, pp 467–474

  40. Yang Y, Pedersen JO (1997) A comparative study on feature selection in text categorization. In: Proceedings of the international conference on machine learning (ICML), pp 412–420

Author information

Corresponding author

Correspondence to Thin Nguyen.

About this article

Cite this article

Nguyen, T., Phung, D., Adams, B. et al. Mood sensing from social media texts and its applications. Knowl Inf Syst 39, 667–702 (2014). https://doi.org/10.1007/s10115-013-0628-8

  • DOI: https://doi.org/10.1007/s10115-013-0628-8