Multi-document influence on readers: augmenting social emotion prediction by learning document interactions

  • Original Article

Neural Computing and Applications

Abstract

Social emotion prediction aims to predict the emotions, typically as emotion distributions, that documents (e.g., news articles) evoke in readers. It contributes significantly to social media applications such as opinion summarization, election prediction, and the study of societal emotions. Recent studies have focused on encoding consecutive word sequences in documents with neural network models and on leveraging topical information; however, documents sharing similar topics or relating to similar events also influence the emotions they evoke in readers, so the interactions among documents can significantly affect social emotion prediction. In this paper, we propose a novel approach that models the interactions among documents by constructing a heterogeneous graph. This graph captures document interactions based on global word co-occurrence patterns in a corpus and on the emotional scores of words obtained from emotion lexicons. Additionally, we develop a heterogeneous graph convolutional attention network (HGCA) to embed the heterogeneous graph; it captures the importance of different neighboring nodes and different node types, enabling comprehensive emotion prediction. Furthermore, we develop a Taylor-series-expansion-based Transformer (Tayformer) to derive initial node representations that can be co-trained with our graph network at low memory complexity. Experimental results on four benchmark datasets show the effectiveness of our method.



Data availability

SinaNews2016, SemEval, and ISEAR are publicly available. SinaNews2017 is available from the corresponding author on reasonable request.

Notes

  1. http://news.sina.com.cn/s/wh/2016-10-07/doc-ifxwrhpn9259416.shtml.

  2. http://news.sina.com.cn.

  3. https://bosonnlp.com/dev/resource.

  4. https://saifmohammad.com/WebDocs/NRC-Emoticon-AffLexNegLex-v1.0.zip.

  5. https://github.com/fxsjy/jieba.

  6. https://www.nltk.org/.

  7. https://code.google.com/archive/p/word2vec/.

  8. https://pytorch.org/.

References

  1. Adikari A, Nawaratne R, De Silva D et al (2021) Emotions of COVID-19: content analysis of self-reported information using artificial intelligence. J Med Internet Res 23(4):e27341

  2. Bao S, Xu S, Zhang L et al (2009) Joint emotion-topic modeling for social affective text mining. In: 2009 Ninth IEEE international conference on data mining. IEEE, pp 699–704

  3. Bao S, Xu S, Zhang L et al (2012) Mining social emotions from affective text. IEEE Trans Knowl Data Eng 24(9):1658–1670

  4. Czopp AM, Kay AC, Cheryan S (2015) Positive stereotypes are pervasive and powerful. Perspect Psychol Sci A J Assoc Psychol Sci 10(4):451

  5. Dai L, Wang B, Xiang W et al (2022) A hybrid semantic-topic co-encoding network for social emotion classification. In: Advances in knowledge discovery and data mining: 26th Pacific-Asia conference, PAKDD 2022, Chengdu, China, May 16–19, 2022, proceedings, part I. Springer, Cham, pp 587–598

  6. Dai Y, Shou L, Gong M et al (2022) Graph fusion network for text classification. Knowl Based Syst 236:107659

  7. Guan X, Peng Q, Li X et al (2019) Social emotion prediction with attention-based hierarchical neural network. In: IAEAC, pp 1001–1005

  8. Katharopoulos A, Vyas A, Pappas N et al (2020) Transformers are RNNs: fast autoregressive transformers with linear attention. In: International conference on machine learning. PMLR, pp 5156–5165

  9. Katz P, Singleton M, Wicentowski R (2007) SWAT-MP: the SemEval-2007 systems for task 5 and task 14. In: Proceedings of the fourth international workshop on semantic evaluations (SemEval-2007), pp 308–313

  10. Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp 4171–4186

  11. Kim Y (2014) Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882

  12. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980

  13. Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907

  14. Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: ICLR

  15. Li H, Yan Y, Wang S et al (2023) Text classification on heterogeneous information network via enhanced GCN and knowledge. Neural Comput Appl 35(20):14911–14927

  16. Li X, Peng Q, Sun Z et al (2019) Predicting social emotions from readers’ perspective. IEEE Trans Affect Comput 10(02):255–264

  17. Li X, Rao Y, Xie H et al (2019) Social emotion classification based on noise-aware training. Data Knowl Eng 123:101605

  18. Li Y, Zemel R, Brockschmidt M et al (2016) Gated graph sequence neural networks. In: Proceedings of ICLR’16

  19. Lin C, He Y (2009) Joint sentiment/topic model for sentiment analysis. In: Proceedings of the 18th ACM conference on Information and knowledge management, pp 375–384

  20. Lin Y, Meng Y, Sun X et al (2021) BertGCN: transductive text classification by combining GCN and BERT. arXiv preprint arXiv:2105.05727

  21. Liu Y, Ott M, Goyal N et al (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692

  22. Ma Y, Zong L, Yang Y et al (2019) News2vec: news network embedding with subnode information. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), pp 4843–4852

  23. Marcheggiani D, Titov I (2017) Encoding sentences with graph convolutional networks for semantic role labeling. In: EMNLP, pp 1506–1515

  24. Mou X, Peng Q, Sun Z et al (2023) A deep learning framework for news readers’ emotion prediction based on features from news article and pseudo comments. IEEE Trans Cybern 53(4):2186–2199

  25. Piao Y, Lee S, Lee D et al (2022) Sparse structure learning via graph neural networks for inductive document classification. In: Proceedings of the AAAI conference on artificial intelligence, pp 11165–11173

  26. Quan X, Wang Q, Zhang Y et al (2015) Latent discriminative models for social emotion detection with emotional dependency. ACM Trans Inf Syst. https://doi.org/10.1145/2749459

  27. Ragesh R, Sellamanickam S, Iyer A et al (2021) HeteGCN: heterogeneous graph convolutional networks for text classification. In: Proceedings of the 14th ACM international conference on web search and data mining, pp 860–868

  28. Ramage D, Hall D, Nallapati R et al (2009) Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora. In: Proceedings of the 2009 conference on empirical methods in natural language processing, pp 248–256

  29. Ramage D, Dumais S, Liebling D (2010) Characterizing microblogs with topic models. In: Proceedings of the international AAAI conference on web and social media

  30. Rao Y (2016) Contextual sentiment topic model for adaptive social emotion classification. IEEE Intell Syst 31(1):41–47

  31. Rao Y, Li Q, Mao X et al (2014) Sentiment topic models for social emotion mining. Inf Sci 266:90–100

  32. Rao Y, Li Q, Wenyin L et al (2014) Affective topic model for social emotion detection. Neural Netw 58:29–37

  33. Ren H, Lu W, Xiao Y et al (2022) Graph convolutional networks in language and vision: a survey. Knowl Based Syst 251:109250

  34. Scherer KR, Wallbott HG (1994) Evidence for universality and cultural variation of differential emotion response patterning. J Pers Soc Psychol 66(2):310

  35. Song Y, Shi S, Li J et al (2018) Directional skip-gram: explicitly distinguishing left and right context for word embeddings. In: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 2 (short papers), pp 175–180

  36. Strapparava C, Mihalcea R (2007) Semeval-2007 task 14: affective text. In: Proceedings of the fourth international workshop on semantic evaluations (SemEval-2007), pp 70–74

  37. Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008

  38. Veličković P, Cucurull G, Casanova A et al (2018) Graph attention networks. In: International conference on learning representations

  39. Wang C, Wang B (2020) An end-to-end topic-enhanced self-attention network for social emotion classification. In: Proceedings of the web conference 2020, pp 2210–2219

  40. Wang C, Wang B, Xiang W et al (2019) Encoding syntactic dependency and topical information for social emotion classification. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval, pp 881–884

  41. Wang FL, Zhao Z, Cheng G et al (2023) Weighted cluster-level social emotion classification across domains. Int J Mach Learn Cybern 14(7):2385–2394

  42. Wang K, Han SC, Long S et al (2022) ME-GCN: multi-dimensional edge-embedded graph convolutional networks for semi-supervised text classification. arXiv preprint arXiv:2204.04618

  43. Wang K, Ding Y, Han SC (2023) Graph neural networks for text classification: a survey. arXiv preprint arXiv:2304.11534

  44. Yang T, Hu L, Shi C et al (2021) HGAT: heterogeneous graph attention networks for semi-supervised short text classification. ACM Trans Inf Syst. https://doi.org/10.1145/3450352

  45. Yao L, Mao C, Luo Y (2019) Graph convolutional networks for text classification. In: AAAI, pp 7370–7377

  46. Zhang Y, Yu X, Cui Z et al (2020) Every document owns its structure: inductive text classification via graph neural networks. In: Proceedings of the 58th annual meeting of the association for computational linguistics. Association for Computational Linguistics, pp 334–339. https://doi.org/10.18653/v1/2020.acl-main.31

  47. Zhao X, Wang C, Yang Z et al (2016) Online news emotion prediction with bidirectional LSTM. In: International conference on web-age information management. Springer, Berlin, pp 238–250


Author information

Corresponding author

Correspondence to Qinke Peng.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Proof of the memory complexity of Tayformer

The original attention mechanism [37] is defined as:

$$\begin{aligned} \rm{Attention}({\varvec{Q}},{\varvec{K}},{\varvec{V}})=\rm{softmax}\left( \frac{{\varvec{Q}}{\varvec{K}}^T}{\sqrt{d_k}}\right) {\varvec{V}} \end{aligned}$$
(A1)

where \({\varvec{Q}},{\varvec{K}}\in {\mathbb {R}}^{n\times {d_k}}, {\varvec{V}}\in {\mathbb {R}}^{n\times {d_v}}\).

Computing \({\varvec{Q}}{\varvec{K}}^T\) produces an \(n\times n\) matrix, which makes the memory complexity of attention \({{\mathcal {O}}}(n^2)\). Without the softmax, we could compute \({\varvec{K}}^T{\varvec{V}}\) first and then multiply by \({\varvec{Q}}\); since \(n\gg d_v\) for long sequences, this reordered attention has complexity \({{\mathcal {O}}}(n)\).
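The reordering is nothing more than the associativity of matrix multiplication. A minimal sketch (ours, using PyTorch as in the paper's experimental setup) makes the intermediate sizes explicit:

```python
# Without the softmax, (Q K^T) V == Q (K^T V); only the intermediates differ.
import torch

n, d_k, d_v = 4096, 64, 64
Q = torch.randn(n, d_k, dtype=torch.float64)
K = torch.randn(n, d_k, dtype=torch.float64)
V = torch.randn(n, d_v, dtype=torch.float64)

quadratic = (Q @ K.T) @ V  # materializes an n x n matrix: O(n^2) memory
linear = Q @ (K.T @ V)     # materializes a d_k x d_v matrix: O(n) overall

assert torch.allclose(quadratic, linear)
```

The softmax, however, couples every entry of a row of \({\varvec{Q}}{\varvec{K}}^T\), so the product cannot be reordered directly. To recover the reordering, we first rewrite Eq. (A1) row-wise as follows: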

$$\begin{aligned} \rm{Attn}({\varvec{Q}},{\varvec{K}},{\varvec{V}})_i=\frac{\sum ^N_{j=1}e^{\frac{{{\varvec{q}}_i}^T {\varvec{k}}_j}{\sqrt{d_k}}}{\varvec{v}}_j}{\sum ^N_{j=1}e^{\frac{{{\varvec{q}}_i}^T {\varvec{k}}_j}{\sqrt{d_k}}}} \end{aligned}$$
(A2)

Following [8], Eq. (A2) is attention written row-wise, where the similarity score is the exponential of the scaled dot product between \({\varvec{Q}}\) and \({\varvec{K}}\). Replacing this score with an arbitrary non-negative similarity function yields the generalized attention equation:

$$\begin{aligned} \rm{Attn}({\varvec{Q}},{\varvec{K}},{\varvec{V}})_i=\frac{\sum ^N_{j=1}\rm{sim}({\varvec{q}}_i,{\varvec{k}}_j){\varvec{v}}_j}{\sum ^N_{j=1}\rm{sim}({\varvec{q}}_i,{\varvec{k}}_j)} \end{aligned}$$
(A3)

By first-order Taylor series expansion, \(e^{\frac{{{\varvec{q}}_i}^T {\varvec{k}}_j}{\sqrt{d_k}}}\) can be approximated as:

$$\begin{aligned} e^{\frac{{{\varvec{q}}_i}^T {\varvec{k}}_j}{\sqrt{d_k}}}\approx 1+\frac{{{\varvec{q}}_i}^T {\varvec{k}}_j}{\sqrt{d_k}} \end{aligned}$$
(A4)
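As a quick numeric illustration (ours, not from the paper), the first-order approximation is tight when the argument is small, which the normalization introduced next guarantees:

```python
# The error of the first-order expansion e^x ~ 1 + x is roughly x^2 / 2,
# so it stays small for the normalized, rescaled dot products used below.
import math

for x in (-0.5, -0.1, 0.0, 0.1, 0.5):
    print(f"x={x:+.2f}  e^x={math.exp(x):.4f}  1+x={1 + x:.4f}")
```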

Here we l2-normalize \({\varvec{q}}_i\) and \({\varvec{k}}_j\) so that their dot product lies in \([-1,1]\), which keeps the similarity non-negative, and propose our similarity function:

$$\begin{aligned} \rm{sim}({\varvec{q}}_i,{\varvec{k}}_j)=1+\frac{1}{d_k}\cdot \left( \frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert }\right) ^T\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) \end{aligned}$$
(A5)

Substituting Eq. (A5) into Eq. (A3) gives the updated attention mechanism:

$$\begin{aligned} \begin{aligned} \rm{Attn}({\varvec{Q}},{\varvec{K}},{\varvec{V}})_i&=\frac{\sum ^N_{j=1}\left[ 1+\frac{1}{d_k}\cdot \left( \frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert }\right) ^T\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) \right] {\varvec{v}}_j}{\sum ^N_{j=1}1+\frac{1}{d_k}\cdot (\frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert })^T\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) }\\&=\frac{\sum ^N_{j=1}{\varvec{v}}_j+\frac{1}{d_k}\cdot \left( \frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert }\right) ^T\cdot \sum ^N_{j=1}\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) \cdot {\varvec{v}}_j}{N+\frac{1}{d_k}\cdot \left( \frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert }\right) ^T\cdot \sum ^N_{j=1}\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) } \end{aligned} \end{aligned}$$
(A6)

The above equation is easier to follow when the numerator is written in vectorized form:

$$\begin{aligned} ({\varvec{I}}+{\varvec{Q}}_{l_2}\cdot {\varvec{K}}_{l_2}^T)\cdot {\varvec{V}}={\varvec{V}}+{\varvec{Q}}_{l_2}\cdot ({\varvec{K}}_{l_2}^T\cdot {\varvec{V}}) \end{aligned}$$
(A7)

where \({\varvec{Q}}_{l_2}\) and \({\varvec{K}}_{l_2}\) are the row-wise l2-normalized \({\varvec{Q}}\) and \({\varvec{K}}\). Equation (A1) shows that the computational cost of softmax attention scales as \({{\mathcal {O}}}(n^2)\), where \(n\) is the sequence length. In contrast, the Tayformer attention of Eq. (A6) has \({{\mathcal {O}}}(n)\) time and memory complexity, because \(\sum ^N_{j=1}\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) \cdot {\varvec{v}}_j\) and \(\sum ^N_{j=1}\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right)\) can be computed once and reused for every query.
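For concreteness, here is a minimal PyTorch sketch of Eqs. (A6) and (A7) (our illustration, not the authors' released code; single head and no batching for clarity):

```python
# Linearized attention of Eqs. (A6)-(A7): l2-normalize Q and K, compute the
# key-value summaries once, then answer every query in O(d_k * d_v) time.
import torch
import torch.nn.functional as F

def tayformer_attention(Q: torch.Tensor, K: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    n, d_k = Q.shape
    Q_l2 = F.normalize(Q, p=2, dim=-1)   # (n, d_k), rows q_i / ||q_i||
    K_l2 = F.normalize(K, p=2, dim=-1)   # (n, d_k), rows k_j / ||k_j||
    KV = K_l2.T @ V                      # (d_k, d_v): summary over all k_j, v_j
    K_sum = K_l2.sum(dim=0)              # (d_k,):     summary over all k_j
    numerator = V.sum(dim=0) + (Q_l2 @ KV) / d_k    # (n, d_v), Eq. (A7)
    denominator = n + (Q_l2 @ K_sum) / d_k          # (n,)
    return numerator / denominator.unsqueeze(-1)

# Example: a sequence of 1024 tokens with d_k = d_v = 64
out = tayformer_attention(torch.randn(1024, 64), torch.randn(1024, 64), torch.randn(1024, 64))
```

Because only the \(d_k\times d_v\) and \(d_k\)-dimensional summaries are stored, memory grows linearly in the sequence length \(n\).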

In terms of multiplications and additions, the total cost of softmax attention scales as \({{\mathcal {O}}}(n^2\max \{d_k,d_v\})\), where \(d_k\) and \(d_v\) are the dimensionalities of the queries/keys and the values. In contrast, Tayformer first computes \({\varvec{K}}_{l_2}^T{\varvec{V}}\) and then multiplies by \({\varvec{Q}}_{l_2}\), which requires \({{\mathcal {O}}}(nd_kd_v)\) additions and multiplications. For very long sequences, \(n\gg d_v, d_k\), so the complexities of the Transformer and the Tayformer simplify to \({{\mathcal {O}}}(n^2)\) and \({{\mathcal {O}}}(n)\), respectively.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Mou, X., Peng, Q., Sun, Z. et al. Multi-document influence on readers: augmenting social emotion prediction by learning document interactions. Neural Comput & Applic 36, 6701–6719 (2024). https://doi.org/10.1007/s00521-024-09420-8

