Abstract
We present a large-scale analysis of mood in social media texts. The paper is organised in three parts: (1) we address the problem of feature selection and classification of mood in the blogosphere; (2) we extract global mood patterns at different levels of aggregation from a large-scale data set of approximately 18 million documents; and (3) we extract the mood trajectory of an egocentric user and study how it can be used to detect subtle emotion signals in a user-centric manner, supporting the discovery of hyper-groups of communities based on sentiment information. For mood classification, we use two feature sets proposed in psychology and show that these features are efficient, do not require a training phase and yield classification results comparable to state-of-the-art supervised feature selection schemes. For mood patterns, we provide empirical results on the organisation of mood in the blogosphere that are analogous to the structure of human emotion proposed independently in the psychology literature. For community structure discovery, we show that a sentiment-based approach can yield useful insights into community formation.
Notes
For example, blog text was found to have a higher occurrence of the 1st person singular than conversations [31].
For example, one of the main reasons for writing, cited by bloggers, is to speak their minds: www.intac.net/breakdown-of-the-blogosphere/, accessed August 2011.
From the state of the blogosphere 2008 at http://technorati.com.
http://www.liwc.net/descriptiontable1.php, accessed July 2011.
http://www.icwsm.org/2009/data/, retrieved November 2011.
Consistent with what is reported in [36].
For the full list of predefined moods, visit http://www.livejournal.com.
http://www.livejournal.com/moodlist.bml, accessed July 2011.
http://www.liwc.net/descriptiontable1.php, accessed July 2011.
http://www.livejournal.com/browse/, accessed July 2011.
References
Adams B, Phung D, Venkatesh S (2010) Discovery of latent subcommunities in a blog’s readership. ACM Trans Web 4(3):1–30
Backstrom L, Huttenlocher D, Kleinberg J, Lan X (2006) Group formation in large social networks: membership, growth, and evolution. In: Proceedings of the ACM international conference on knowledge discovery and data mining (SIGKDD), pp 44–54
Berendt B, Hanser C (2007) Tags are not metadata, but ‘just more content’-to some people. In: Proceedings of the international AAAI conference on weblogs and social media (ICWSM)
Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022
Bolón-Canedo V, Sánchez-Maroño N, Alonso-Betanzos A (2013) A review of feature selection methods on synthetic data. Knowl Inf Syst 34:483–519
Bradley MM, Lang PJ (1999) Affective norms for English words (ANEW): instruction manual and affective ratings. University of Florida, Gainesville
Cambria E, Hussain A, Havasi C, Eckl C, Munro J (2010) Towards crowd validation of the UK national health service. In: Proceedings of the web science conference (WebSci)
Fan TK, Chang CH (2010) Sentiment-oriented contextual advertising. Knowl Inf Syst 23:321–344
Farahat AK, Ghodsi A, Kamel MS (2012) Efficient greedy feature selection for unsupervised learning. Knowl Inf Syst 1–26. doi:10.1007/s10115-012-0538-1
Feng S, Wang D, Yu G, Gao W, Wong KF (2011) Extracting common emotions from blogs based on fine-grained sentiment clustering. Knowl Inf Syst 27:281–302
Frey BJ, Dueck D (2007) Clustering by passing messages between data points. Science 315:972–976
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The weka data mining software: an update. ACM SIGKDD Explor Newsl 11(1):10–18
Hayes C, Avesani P (2007) Using tags and clustering to identify topic-relevant blogs. In: Proceedings of the international AAAI conference on weblogs and social media (ICWSM)
Hu X, Downie JS (2007) Exploring mood metadata: relationships with genre, artist and usage metadata. In: Proceedings of the international conference on music information retrieval
Kumar R, Novak J, Tomkins A (2006) Structure and evolution of online social networks. In: Proceedings of the ACM international conference on knowledge discovery and data mining (SIGKDD), p 617
Leshed G, Kaye JJ (2006) Understanding how bloggers feel: recognizing affect in blog posts. In: Proceedings of the ACM conference on human factors in computing systems (SIGCHI), p 1024
Long C, Zhang J, Huang M, Zhu X, Li M, Ma B (2012) Estimating feature ratings through an effective review selection approach. Knowl Inf Syst (accepted)
McCallum A, Wang X, Corrada-Emmanuel A (2007) Topic and role discovery in social networks with experiments on enron and academic email. J Artif Intell Res 30:249–272
McCallum A, Wang X, Mohanty N (2007) Joint group and topic discovery from relations and text. Lect Notes Comput Sci 4503:28
Mishne G (2005) Experiments with mood classification in blog posts. In: Proceedings of ACM workshop on stylistic analysis of text for information access
Mishne G, Glance N (2006) Predicting movie sales from blogger sentiment. In: Proceedings of the AAAI spring symposium on computational approaches to analysing weblogs
Mohtasseb H, Ahmed A (2012) Two-layered blogger identification model integrating profile and instance-based methods. Knowl Inf Syst 31(1):1–21
Nallapati R, Cohen W (2008) Link-PLSA-LDA: a new unsupervised model for topics and influence of blogs. In: Proceedings of the international AAAI conference on weblogs and social media (ICWSM)
Negoescu RA, Adams B, Phung D, Venkatesh S, Gatica-Perez D (2009) Flickr hypergroups. In: Proceedings of the ACM international conference on multimedia, pp 813–816
Nguyen T, Phung D, Adams B, Tran T, Venkatesh S (2010) Classification and pattern discovery of mood in weblogs. Adv Knowl Discov Data Min 6119:283–290
Nguyen T, Phung D, Adams B, Tran T, Venkatesh S (2010) Hyper-community detection in the blogosphere. In: Proceedings of the ACM workshop on social media, in conjunction with the ACM international conference on multimedia (ACM-MM). ACM, Firenze, Italy
Nigam K, Hurst M (2004) Towards a robust metric of opinion. In: AAAI spring symposium on exploring attitude and affect in text, pp 598–603
Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr 2(1–2):1–135
Pang B, Lee L, Vaithyanathan S (2002) Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of the ACL conference on empirical methods in natural language processing, pp 79–86
Pennebaker JW, Chung CK, Ireland M, Gonzales A, Booth RJ (2007) The development and psychometric properties of LIWC2007. LIWC, Austin
Pennebaker JW, Francis ME, Booth RJ (2007) Linguistic inquiry and word count (LIWC) [computer software]. LIWC, Austin
Russell JA (1980) A circumplex model of affect. J Pers Soc Psychol 39(6):1161–1178
Russell JA (2003) Core affect and the psychological construction of emotion. Psychol Rev 110(1):145
Sebastiani F (2002) Machine learning in automated text categorization. ACM Comput Surv 34(1):1–47
Song X, Lin CY, Tseng BL, Sun MT (2005) Modeling and predicting personal information dissemination behavior. In: Proceedings of the ACM international conference on knowledge discovery and data mining (SIGKDD), pp 479–488
Sood SO, Vasserman L (2009) ESSE: exploring mood on the web. In: Proceedings of the international AAAI conference on weblogs and social media (ICWSM)
Tausczik YR, Pennebaker JW (2010) The psychological meaning of words: LIWC and computerized text analysis methods. J Lang Soc Psychol 29(1):24
Teh YW, Jordan MI (2010) Hierarchical bayesian nonparametric models with applications. In: Hjort N, Holmes C, Müller P, Walker S (eds) Bayesian nonparametrics: principles and practice. Cambridge University Press, Cambridge
Tsuruoka Y, Tsujii J (2005) Bidirectional inference with the easiest-first strategy for tagging sequence data. In: Proceedings of the conference on human language technology and empirical methods in natural language processing, pp 467–474
Yang Y, Pedersen JO (1997) A comparative study on feature selection in text categorization. In: Proceedings of the international conference on machine learning (ICML), pp 412–420
Cite this article
Nguyen, T., Phung, D., Adams, B. et al. Mood sensing from social media texts and its applications. Knowl Inf Syst 39, 667–702 (2014). https://doi.org/10.1007/s10115-013-0628-8
DOI: https://doi.org/10.1007/s10115-013-0628-8