User group based emotion detection and topic discovery over short text

World Wide Web

Abstract

In recent years, with the development of social media platforms, more and more people express their emotions online through short messages. Detecting emotions and the topics they relate to in such data is valuable. However, the feature sparsity of short texts poses challenges for joint topic-emotion models. In many cases, it is necessary to know not only what people think of specific topics, but also which individuals give similar feedback and what characteristics these users have. In this paper, we propose a user group based topic-emotion model named UGTE for emotion detection and topic discovery, which alleviates the feature sparsity problem of short texts. Specifically, the characteristics of each user are used to discover groups of individuals who share similar emotions, and UGTE aggregates the short texts within a group into long pseudo-documents. Experiments conducted on a real-world short text dataset validate the effectiveness of the proposed model.
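To make the aggregation idea in the abstract concrete, the following is a minimal sketch of pooling short texts by user group into pseudo-documents. It is not the UGTE model: UGTE discovers groups from user characteristics as part of inference, whereas here the group assignments are simply given. The function name `build_pseudo_documents`, the sample messages, and the `user_to_group` mapping are all hypothetical and serve only as illustration.

```python
from collections import defaultdict


def build_pseudo_documents(messages, user_to_group):
    """Pool the tokens of all short messages written by users of the same group.

    messages: iterable of (user, text) pairs.
    user_to_group: dict mapping each user to a group id (assumed known here;
    the paper's model infers groups instead).
    """
    pseudo_docs = defaultdict(list)
    for user, text in messages:
        group = user_to_group[user]
        # Naive whitespace tokenization, for illustration only.
        pseudo_docs[group].extend(text.lower().split())
    return dict(pseudo_docs)


# Hypothetical toy data: four short messages from four users in two groups.
messages = [
    ("alice", "Love the new phone"),
    ("bob", "Battery life is awful"),
    ("carol", "Great camera on this phone"),
    ("dave", "Awful battery again"),
]
user_to_group = {"alice": 0, "carol": 0, "bob": 1, "dave": 1}

pseudo_docs = build_pseudo_documents(messages, user_to_group)
```

Each pseudo-document holds more word co-occurrences than any single message, which is the intuition behind using aggregation to counter feature sparsity.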

Notes

  1. http://www.affective-sciences.org/researchmaterial

Acknowledgment

This work has been supported by Top-Up Fund (TFG-04) and Seed Fund (SFG-10) for General Research Fund / Early Career Scheme and Interdisciplinary Research Scheme of the Dean’s Research Fund 2018-19 (FLASS/DRF/IDS-3), Departmental Collaborative Research Fund 2019 (MIT/DCRF-R2/18-19), Funding Support to General Research Fund Proposal (RG 39/2019-2020R) and the Internal Research Grant (RG 90/2018-2019R) of The Education University of Hong Kong, and LEO Dr David P. Chan Institute of Data Science, Lingnan University, Hong Kong. The work has also been supported by the Research Grants Council of the Hong Kong Special Administrative Region, China (Collaborative Research Fund, project number C1031-18G).

Author information

Corresponding author

Correspondence to Yanghui Rao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix:

For clarity, the numerical results of Figures 2–7 are provided as follows.

Table 9 Coherence@10 of UGTE_ID and baselines with different topic numbers when |G| = 10, where the best results are highlighted in boldface
Table 10 Coherence@20 of UGTE_ID and baselines with different topic numbers when |G| = 10, where the best results are highlighted in boldface
Table 11 Coherence@30 of UGTE_ID and baselines with different topic numbers when |G| = 10, where the best results are highlighted in boldface
Table 12 Accuracy of UGTE_ID and baselines with different topic numbers when |G| = 10, where the best results are highlighted in boldface
Table 13 Kappa Score of UGTE_ID and baselines with different topic numbers when |G| = 10, where the best results are highlighted in boldface
Table 14 The mean and variance of topic discovery and emotion discovery of UGTE_ID over different numbers of user groups, where the best results are highlighted in boldface
Table 15 The mean and variance values of impact of extremely short text on UGTE_ID and MSTM

About this article

Cite this article

Feng, J., Rao, Y., Xie, H. et al. User group based emotion detection and topic discovery over short text. World Wide Web 23, 1553–1587 (2020). https://doi.org/10.1007/s11280-019-00760-3
