Abstract
Social emotion prediction aims to predict the emotions evoked in readers, for example, the emotion distributions elicited by documents such as news articles. It supports social media applications such as opinion summarization, election prediction, and the study of societal emotions. Recent studies have focused on encoding the consecutive word sequences of documents with neural network models and on leveraging topical information; however, documents that share similar topics or relate to similar events also influence the emotions they evoke in readers, and these interactions among documents can significantly affect social emotion prediction. In this paper, we propose a novel approach that models the interactions among documents by constructing a heterogeneous graph, which captures document interactions based on global word co-occurrence patterns in a corpus and on the emotional scores of words obtained from emotion lexicons. We further develop a heterogeneous graph convolutional attention network (HGCA) to embed the heterogeneous graph; it captures the importance of different neighboring nodes and different node types, enabling comprehensive emotion prediction. In addition, we develop a Taylor series expansion-based Transformer (Tayformer) to derive initialized node representations that can be co-trained with our graph network at low memory complexity. Experimental results on four benchmark datasets demonstrate the effectiveness of our method.
Data availability
SinaNews2016, SemEval, and ISEAR are publicly available. SinaNews2017 is available from the corresponding author on reasonable request.
References
Adikari A, Nawaratne R, De Silva D et al (2021) Emotions of COVID-19: content analysis of self-reported information using artificial intelligence. J Med Internet Res 23(4):e27341
Bao S, Xu S, Zhang L et al (2009) Joint emotion-topic modeling for social affective text mining. In: 2009 Ninth IEEE international conference on data mining. IEEE, pp 699–704
Bao S, Xu S, Zhang L et al (2012) Mining social emotions from affective text. IEEE Trans Knowl Data Eng 24(9):1658–1670
Czopp AM, Kay AC, Cheryan S (2015) Positive stereotypes are pervasive and powerful. Perspect Psychol Sci 10(4):451
Dai L, Wang B, Xiang W et al (2022) A hybrid semantic-topic co-encoding network for social emotion classification. In: Advances in knowledge discovery and data mining: 26th Pacific-Asia conference, PAKDD 2022, Chengdu, China, May 16–19, 2022, proceedings, part I. Springer, Cham, pp 587–598
Dai Y, Shou L, Gong M et al (2022) Graph fusion network for text classification. Knowl Based Syst 236:107659
Guan X, Peng Q, Li X et al (2019) Social emotion prediction with attention-based hierarchical neural network. In: IAEAC, pp 1001–1005
Katharopoulos A, Vyas A, Pappas N et al (2020) Transformers are RNNs: fast autoregressive transformers with linear attention. In: International conference on machine learning. PMLR, pp 5156–5165
Katz P, Singleton M, Wicentowski R (2007) SWAT-MP: the semeval-2007 systems for task 5 and task 14. In: Proceedings of the fourth international workshop on semantic evaluations (SemEval-2007), pp 308–313
Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp 4171–4186
Kim Y (2014) Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882
Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980
Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907
Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: ICLR
Li H, Yan Y, Wang S et al (2023) Text classification on heterogeneous information network via enhanced GCN and knowledge. Neural Comput Appl 35(20):14911–14927
Li X, Peng Q, Sun Z et al (2019) Predicting social emotions from readers’ perspective. IEEE Trans Affect Comput 10(02):255–264
Li X, Rao Y, Xie H et al (2019) Social emotion classification based on noise-aware training. Data Knowl Eng 123:101605
Li Y, Zemel R, Brockschmidt M et al (2016) Gated graph sequence neural networks. In: Proceedings of ICLR’16
Lin C, He Y (2009) Joint sentiment/topic model for sentiment analysis. In: Proceedings of the 18th ACM conference on Information and knowledge management, pp 375–384
Lin Y, Meng Y, Sun X et al (2021) BertGCN: transductive text classification by combining GCN and BERT. arXiv preprint arXiv:2105.05727
Liu Y, Ott M, Goyal N et al (2019) Roberta: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692
Ma Y, Zong L, Yang Y et al (2019) News2vec: news network embedding with subnode information. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), pp 4843–4852
Marcheggiani D, Titov I (2017) Encoding sentences with graph convolutional networks for semantic role labeling. In: EMNLP, pp 1506–1515
Mou X, Peng Q, Sun Z et al (2023) A deep learning framework for news readers’ emotion prediction based on features from news article and pseudo comments. IEEE Trans Cybern 53(4):2186–2199
Piao Y, Lee S, Lee D et al (2022) Sparse structure learning via graph neural networks for inductive document classification. In: Proceedings of the AAAI conference on artificial intelligence, pp 11165–11173
Quan X, Wang Q, Zhang Y et al (2015) Latent discriminative models for social emotion detection with emotional dependency. ACM Trans Inf Syst. https://doi.org/10.1145/2749459
Ragesh R, Sellamanickam S, Iyer A et al (2021) Hetegcn: heterogeneous graph convolutional networks for text classification. In: Proceedings of the 14th ACM international conference on web search and data mining, pp 860–868
Ramage D, Hall D, Nallapati R et al (2009) Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora. In: Proceedings of the 2009 conference on empirical methods in natural language processing, pp 248–256
Ramage D, Dumais S, Liebling D (2010) Characterizing microblogs with topic models. In: Proceedings of the international AAAI conference on web and social media
Rao Y (2016) Contextual sentiment topic model for adaptive social emotion classification. IEEE Intell Syst 31(1):41–47
Rao Y, Li Q, Mao X et al (2014) Sentiment topic models for social emotion mining. Inf Sci 266:90–100
Rao Y, Li Q, Wenyin L et al (2014) Affective topic model for social emotion detection. Neural Netw 58:29–37
Ren H, Lu W, Xiao Y et al (2022) Graph convolutional networks in language and vision: a survey. Knowl Based Syst 251:109250
Scherer KR, Wallbott HG (1994) Evidence for universality and cultural variation of differential emotion response patterning. J Pers Soc Psychol 66(2):310
Song Y, Shi S, Li J et al (2018) Directional skip-gram: explicitly distinguishing left and right context for word embeddings. In: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 2 (short papers), pp 175–180
Strapparava C, Mihalcea R (2007) Semeval-2007 task 14: affective text. In: Proceedings of the fourth international workshop on semantic evaluations (SemEval-2007), pp 70–74
Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008
Veličković P, Cucurull G, Casanova A et al (2018) Graph attention networks. In: International conference on learning representations
Wang C, Wang B (2020) An end-to-end topic-enhanced self-attention network for social emotion classification. In: Proceedings of the web conference 2020, pp 2210–2219
Wang C, Wang B, Xiang W et al (2019) Encoding syntactic dependency and topical information for social emotion classification. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval, pp 881–884
Wang FL, Zhao Z, Cheng G et al (2023) Weighted cluster-level social emotion classification across domains. Int J Mach Learn Cybern 14(7):2385–2394
Wang K, Han SC, Long S et al (2022) Me-gcn: multi-dimensional edge-embedded graph convolutional networks for semi-supervised text classification. arXiv preprint arXiv:2204.04618
Wang K, Ding Y, Han SC (2023b) Graph neural networks for text classification: a survey. arXiv preprint arXiv:2304.11534
Yang T, Hu L, Shi C et al (2021) Hgat: Heterogeneous graph attention networks for semi-supervised short text classification. ACM Trans Inf Syst. https://doi.org/10.1145/3450352
Yao L, Mao C, Luo Y (2019) Graph convolutional networks for text classification. In: AAAI, pp 7370–7377
Zhang Y, Yu X, Cui Z et al (2020) Every document owns its structure: inductive text classification via graph neural networks. In: Proceedings of the 58th annual meeting of the association for computational linguistics. Association for Computational Linguistics, pp 334–339. https://doi.org/10.18653/v1/2020.acl-main.31
Zhao X, Wang C, Yang Z et al (2016) Online news emotion prediction with bidirectional LSTM. In: International conference on web-age information management. Springer, Berlin, pp 238–250
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A: Proof on the memory complexity of Tayformer
Original attention mechanism:

$$\begin{aligned} \mathrm {Attention}({\varvec{Q}},{\varvec{K}},{\varvec{V}})=\mathrm {softmax}\left( {\varvec{Q}}{\varvec{K}}^\top \right) {\varvec{V}}, \end{aligned}$$ (A1)

where \({\varvec{Q}},{\varvec{K}}\in {\mathbb {R}}^{n\times {d_k}}, {\varvec{V}}\in {\mathbb {R}}^{n\times {d_v}}\).
The computation of \({\varvec{Q}}{\varvec{K}}^\top \) produces an \(n\times n\) matrix, which makes the complexity of attention \({{\mathcal {O}}}(n^2)\). If there were no softmax, we could compute \({\varvec{K}}^\top {\varvec{V}}\) first and then multiply by \({\varvec{Q}}\); the complexity of this reordered attention is \({{\mathcal {O}}}(n)\) when \(n\gg d_v\). To exploit this, we first rewrite Eq. (A1) row by row as follows:

$$\begin{aligned} {\varvec{v}}'_i=\frac{\sum _{j=1}^{n}e^{{\varvec{q}}_i^\top {\varvec{k}}_j}{\varvec{v}}_j}{\sum _{j=1}^{n}e^{{\varvec{q}}_i^\top {\varvec{k}}_j}}. \end{aligned}$$ (A2)
According to Katharopoulos et al. [8], Eq. (A2) implements a specific attention whose similarity score is the exponential of the dot product between \({\varvec{q}}_i\) and \({\varvec{k}}_j\). We can therefore write the generalized attention equation:

$$\begin{aligned} {\varvec{v}}'_i=\frac{\sum _{j=1}^{n}\mathrm {sim}({\varvec{q}}_i,{\varvec{k}}_j){\varvec{v}}_j}{\sum _{j=1}^{n}\mathrm {sim}({\varvec{q}}_i,{\varvec{k}}_j)}. \end{aligned}$$ (A3)
According to the first-order Taylor series expansion, \(e^{{\varvec{q}}_i^\top {\varvec{k}}_j}\) can be approximated as:

$$\begin{aligned} e^{{\varvec{q}}_i^\top {\varvec{k}}_j}\approx 1+{\varvec{q}}_i^\top {\varvec{k}}_j. \end{aligned}$$ (A4)
Here we \(\ell _2\)-normalize \({\varvec{q}}_i\) and \({\varvec{k}}_j\) to ensure the approximation is non-negative, and propose our similarity function:

$$\begin{aligned} \mathrm {sim}({\varvec{q}}_i,{\varvec{k}}_j)=1+\left( \frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert _2}\right) ^\top \left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert _2}\right) . \end{aligned}$$ (A5)

Since the normalized dot product is a cosine similarity lying in \([-1,1]\), \(\mathrm {sim}({\varvec{q}}_i,{\varvec{k}}_j)\in [0,2]\) and is therefore non-negative.
Substituting Eq. (A5) into Eq. (A3) yields the updated attention mechanism:

$$\begin{aligned} {\varvec{v}}'_i=\frac{\sum _{j=1}^{n}{\varvec{v}}_j+\left( \frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert _2}\right) ^\top \sum _{j=1}^{n}\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert _2}\right) {\varvec{v}}_j^\top }{n+\left( \frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert _2}\right) ^\top \sum _{j=1}^{n}\frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert _2}}. \end{aligned}$$ (A6)
The above equation is easier to follow when the numerator is written in vectorized form as follows:

$$\begin{aligned} {\varvec{1}}_n\left( {\varvec{1}}_n^\top {\varvec{V}}\right) +{\varvec{Q}}_{l_2}\left( {\varvec{K}}_{l_2}^\top {\varvec{V}}\right) , \end{aligned}$$
where \({\varvec{1}}_n\) is the all-ones vector of length \(n\), and \({\varvec{Q}}_{l_2},{\varvec{K}}_{l_2}\) are the results of row-wise \(\ell _2\) normalization of \({\varvec{Q}}\) and \({\varvec{K}}\), respectively. Equation (A1) shows that the computational cost of softmax attention scales with \({{\mathcal {O}}}(n^2)\), where \(n\) is the sequence length. In contrast, Tayformer in Eq. (A6) has time and memory complexity \({{\mathcal {O}}}(n)\), because we can compute \(\sum _{j=1}^{n}\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) {\varvec{v}}_j^\top \) and \(\sum _{j=1}^{n}\frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\) once and reuse them for every query.
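For concreteness, the following is a minimal NumPy sketch of Eq. (A6); it is an illustrative transcription of the formula rather than an official or optimized implementation, and the function name, variable names, and the small `eps` guard against division by zero are our own choices.

```python
import numpy as np

def tayformer_attention(Q, K, V, eps=1e-6):
    """Sketch of Eq. (A6): linear attention with similarity 1 + cos(q_i, k_j).
    Q, K: (n, d_k); V: (n, d_v). Names and eps are illustrative choices."""
    # Row-wise l2 normalization (Q_{l2} and K_{l2} in the text).
    Qn = Q / (np.linalg.norm(Q, axis=-1, keepdims=True) + eps)
    Kn = K / (np.linalg.norm(K, axis=-1, keepdims=True) + eps)

    # Shared summaries: computed once and reused for every query -> O(n).
    kv = Kn.T @ V            # (d_k, d_v): sum_j k_hat_j v_j^T
    k_sum = Kn.sum(axis=0)   # (d_k,):     sum_j k_hat_j
    v_sum = V.sum(axis=0)    # (d_v,):     sum_j v_j

    n = V.shape[0]
    numer = v_sum + Qn @ kv  # (n, d_v): numerator of Eq. (A6)
    denom = n + Qn @ k_sum   # (n,):     denominator of Eq. (A6)
    return numer / denom[:, None]
```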
The total cost of softmax attention in terms of multiplications and additions scales as \({{\mathcal {O}}}(n^2\max \{d_k,d_v\})\), where \(d_k\) and \(d_v\) are the dimensionalities of the queries/keys and the values, respectively. In contrast, Tayformer first computes \({\varvec{K}}^\top {\varvec{V}}\) and then multiplies by \({\varvec{Q}}\), so the new values can be computed with \({{\mathcal {O}}}(nd_kd_v)\) additions and multiplications. For very long sequences, \(n\gg d_v,d_k\); thus, the complexities of the Transformer and Tayformer can be rewritten as \({{\mathcal {O}}}(n^2)\) and \({{\mathcal {O}}}(n)\), respectively.
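This reordering argument can be checked numerically. The sketch below (with assumed, illustrative shapes) verifies that, once the softmax is removed, associativity gives the same result while avoiding the \(n\times n\) matrix:

```python
import numpy as np

# Assumed, illustrative sizes: long sequence, small head dimensions.
n, d_k, d_v = 4096, 64, 64
rng = np.random.default_rng(0)
Q = rng.normal(size=(n, d_k))
K = rng.normal(size=(n, d_k))
V = rng.normal(size=(n, d_v))

out_quadratic = (Q @ K.T) @ V  # materializes an n x n matrix: O(n^2) memory
out_linear = Q @ (K.T @ V)     # materializes a d_k x d_v matrix: O(n) overall

assert np.allclose(out_quadratic, out_linear)  # equal up to float error
```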
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mou, X., Peng, Q., Sun, Z. et al. Multi-document influence on readers: augmenting social emotion prediction by learning document interactions. Neural Comput & Applic 36, 6701–6719 (2024). https://doi.org/10.1007/s00521-024-09420-8