Multi-document influence on readers: augmenting social emotion prediction by learning document interactions

  • Original Article

Neural Computing and Applications

Abstract

Social emotion prediction aims to predict the emotions, typically as emotion distributions, that documents (e.g., news articles) evoke in readers. It contributes significantly to social media applications such as opinion summarization, election prediction, and the study of societal emotions. Recent studies have focused on encoding consecutive word sequences in documents with neural network models and on leveraging topical information; however, documents sharing similar topics or relating to similar events also influence the emotions they evoke in readers, so the interactions among documents can significantly affect social emotion prediction. In this paper, we propose a novel approach that models the interactions among documents by constructing a heterogeneous graph. This graph captures document interactions based on global word co-occurrence patterns in a corpus and on the emotional scores of words obtained from emotion lexicons. Additionally, we develop a heterogeneous graph convolutional attention network (HGCA) to embed the heterogeneous graph; it captures the importance of different neighboring nodes and different node types, enabling comprehensive emotion prediction. Furthermore, we develop a Taylor-series-expansion-based Transformer (Tayformer) to derive initial node representations that can be co-trained with our graph network at low memory complexity. Experimental results on four benchmark datasets show the effectiveness of our method.



Data availability

SinaNews2016, SemEval, and ISEAR are publicly available. SinaNews2017 is available from the corresponding author on reasonable request.

Notes

  1. http://news.sina.com.cn/s/wh/2016-10-07/doc-ifxwrhpn9259416.shtml.

  2. http://news.sina.com.cn.

  3. https://bosonnlp.com/dev/resource.

  4. https://saifmohammad.com/WebDocs/NRC-Emoticon-AffLexNegLex-v1.0.zip.

  5. https://github.com/fxsjy/jieba.

  6. https://www.nltk.org/.

  7. https://code.google.com/archive/p/word2vec/.

  8. https://pytorch.org/.

References

  1. Adikari A, Nawaratne R, De Silva D et al (2021) Emotions of COVID-19: content analysis of self-reported information using artificial intelligence. J Med Internet Res 23(4):e27341

  2. Bao S, Xu S, Zhang L et al (2009) Joint emotion-topic modeling for social affective text mining. In: 2009 Ninth IEEE international conference on data mining. IEEE, pp 699–704

  3. Bao S, Xu S, Zhang L et al (2012) Mining social emotions from affective text. IEEE Trans Knowl Data Eng 24(9):1658–1670

  4. Czopp AM, Kay AC, Cheryan S (2015) Positive stereotypes are pervasive and powerful. Perspect Psychol Sci A J Assoc Psychol Sci 10(4):451

  5. Dai L, Wang B, Xiang W et al (2022) A hybrid semantic-topic co-encoding network for social emotion classification. In: Advances in knowledge discovery and data mining: 26th Pacific-Asia conference, PAKDD 2022, Chengdu, China, May 16–19, 2022, proceedings, part I. Springer, Cham, pp 587–598

  6. Dai Y, Shou L, Gong M et al (2022) Graph fusion network for text classification. Knowl Based Syst 236:107659

  7. Guan X, Peng Q, Li X et al (2019) Social emotion prediction with attention-based hierarchical neural network. In: IAEAC, pp 1001–1005

  8. Katharopoulos A, Vyas A, Pappas N et al (2020) Transformers are RNNs: fast autoregressive transformers with linear attention. In: International conference on machine learning. PMLR, pp 5156–5165

  9. Katz P, Singleton M, Wicentowski R (2007) SWAT-MP: the SemEval-2007 systems for task 5 and task 14. In: Proceedings of the fourth international workshop on semantic evaluations (SemEval-2007), pp 308–313

  10. Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp 4171–4186

  11. Kim Y (2014) Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882

  12. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980

  13. Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907

  14. Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: ICLR

  15. Li H, Yan Y, Wang S et al (2023) Text classification on heterogeneous information network via enhanced GCN and knowledge. Neural Comput Appl 35(20):14911–14927

  16. Li X, Peng Q, Sun Z et al (2019) Predicting social emotions from readers’ perspective. IEEE Trans Affect Comput 10(02):255–264

  17. Li X, Rao Y, Xie H et al (2019) Social emotion classification based on noise-aware training. Data Knowl Eng 123:101605

  18. Li Y, Zemel R, Brockschmidt M et al (2016) Gated graph sequence neural networks. In: Proceedings of ICLR’16

  19. Lin C, He Y (2009) Joint sentiment/topic model for sentiment analysis. In: Proceedings of the 18th ACM conference on Information and knowledge management, pp 375–384

  20. Lin Y, Meng Y, Sun X et al (2021) BertGCN: transductive text classification by combining GCN and BERT. arXiv preprint arXiv:2105.05727

  21. Liu Y, Ott M, Goyal N et al (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692

  22. Ma Y, Zong L, Yang Y et al (2019) News2vec: news network embedding with subnode information. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), pp 4843–4852

  23. Marcheggiani D, Titov I (2017) Encoding sentences with graph convolutional networks for semantic role labeling. In: EMNLP, pp 1506–1515

  24. Mou X, Peng Q, Sun Z et al (2023) A deep learning framework for news readers’ emotion prediction based on features from news article and pseudo comments. IEEE Trans Cybern 53(4):2186–2199

  25. Piao Y, Lee S, Lee D et al (2022) Sparse structure learning via graph neural networks for inductive document classification. In: Proceedings of the AAAI conference on artificial intelligence, pp 11165–11173

  26. Quan X, Wang Q, Zhang Y et al (2015) Latent discriminative models for social emotion detection with emotional dependency. ACM Trans Inf Syst. https://doi.org/10.1145/2749459

  27. Ragesh R, Sellamanickam S, Iyer A et al (2021) HeteGCN: heterogeneous graph convolutional networks for text classification. In: Proceedings of the 14th ACM international conference on web search and data mining, pp 860–868

  28. Ramage D, Hall D, Nallapati R et al (2009) Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora. In: Proceedings of the 2009 conference on empirical methods in natural language processing, pp 248–256

  29. Ramage D, Dumais S, Liebling D (2010) Characterizing microblogs with topic models. In: Proceedings of the international AAAI conference on web and social media

  30. Rao Y (2016) Contextual sentiment topic model for adaptive social emotion classification. IEEE Intell Syst 31(1):41–47

  31. Rao Y, Li Q, Mao X et al (2014) Sentiment topic models for social emotion mining. Inf Sci 266:90–100

  32. Rao Y, Li Q, Wenyin L et al (2014) Affective topic model for social emotion detection. Neural Netw 58:29–37

  33. Ren H, Lu W, Xiao Y et al (2022) Graph convolutional networks in language and vision: a survey. Knowl Based Syst 251:109250

  34. Scherer KR, Wallbott HG (1994) Evidence for universality and cultural variation of differential emotion response patterning. J Pers Soc Psychol 66(2):310

  35. Song Y, Shi S, Li J et al (2018) Directional skip-gram: explicitly distinguishing left and right context for word embeddings. In: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 2 (short papers), pp 175–180

  36. Strapparava C, Mihalcea R (2007) Semeval-2007 task 14: affective text. In: Proceedings of the fourth international workshop on semantic evaluations (SemEval-2007), pp 70–74

  37. Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008

  38. Veličković P, Cucurull G, Casanova A et al (2018) Graph attention networks. In: International conference on learning representations

  39. Wang C, Wang B (2020) An end-to-end topic-enhanced self-attention network for social emotion classification. In: Proceedings of the web conference 2020, pp 2210–2219

  40. Wang C, Wang B, Xiang W et al (2019) Encoding syntactic dependency and topical information for social emotion classification. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval, pp 881–884

  41. Wang FL, Zhao Z, Cheng G et al (2023) Weighted cluster-level social emotion classification across domains. Int J Mach Learn Cybern 14(7):2385–2394

  42. Wang K, Han SC, Long S et al (2022) ME-GCN: multi-dimensional edge-embedded graph convolutional networks for semi-supervised text classification. arXiv preprint arXiv:2204.04618

  43. Wang K, Ding Y, Han SC (2023) Graph neural networks for text classification: a survey. arXiv preprint arXiv:2304.11534

  44. Yang T, Hu L, Shi C et al (2021) HGAT: heterogeneous graph attention networks for semi-supervised short text classification. ACM Trans Inf Syst. https://doi.org/10.1145/3450352

  45. Yao L, Mao C, Luo Y (2019) Graph convolutional networks for text classification. In: AAAI, pp 7370–7377

  46. Zhang Y, Yu X, Cui Z et al (2020) Every document owns its structure: inductive text classification via graph neural networks. In: Proceedings of the 58th annual meeting of the association for computational linguistics. Association for Computational Linguistics, pp 334–339. https://doi.org/10.18653/v1/2020.acl-main.31

  47. Zhao X, Wang C, Yang Z et al (2016) Online news emotion prediction with bidirectional LSTM. In: International conference on web-age information management. Springer, Berlin, pp 238–250


Author information

Corresponding author

Correspondence to Qinke Peng.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Proof of the memory complexity of Tayformer

The original attention mechanism [37] is defined as:

$$\begin{aligned} \rm{Attention}({\varvec{Q}},{\varvec{K}},{\varvec{V}})=\rm{softmax}\left( \frac{{\varvec{Q}}{\varvec{K}}^T}{\sqrt{d_k}}\right) {\varvec{V}} \end{aligned}$$
(A1)

where \({\varvec{Q}},{\varvec{K}}\in {\mathbb {R}}^{n\times {d_k}}, {\varvec{V}}\in {\mathbb {R}}^{n\times {d_v}}\).

Computing \({\varvec{Q}}{\varvec{K}}^T\) produces an \(n\times n\) matrix, which makes the memory complexity of attention \({{\mathcal {O}}}(n^2)\). Without the softmax, we could compute \({\varvec{K}}^T{\varvec{V}}\) first and then multiply by \({\varvec{Q}}\); since \(n\gg d_v\) for long sequences, this reordered attention has complexity \({{\mathcal {O}}}(n)\).
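The reordering is nothing more than the associativity of matrix multiplication. A minimal sketch (ours, using PyTorch as in the paper's experimental setup) makes the intermediate sizes explicit:

```python
# Without the softmax, (Q K^T) V == Q (K^T V); only the intermediates differ.
import torch

n, d_k, d_v = 4096, 64, 64
Q = torch.randn(n, d_k, dtype=torch.float64)
K = torch.randn(n, d_k, dtype=torch.float64)
V = torch.randn(n, d_v, dtype=torch.float64)

quadratic = (Q @ K.T) @ V  # materializes an n x n matrix: O(n^2) memory
linear = Q @ (K.T @ V)     # materializes a d_k x d_v matrix: O(n) overall

assert torch.allclose(quadratic, linear)
```

The softmax, however, couples every entry of a row of \({\varvec{Q}}{\varvec{K}}^T\), so the product cannot be reordered directly. To recover the reordering, we first rewrite Eq. (A1) row-wise as follows: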

$$\begin{aligned} \rm{Attn}({\varvec{Q}},{\varvec{K}},{\varvec{V}})_i=\frac{\sum ^N_{j=1}e^{\frac{{{\varvec{q}}_i}^T {\varvec{k}}_j}{\sqrt{d_k}}}{\varvec{v}}_j}{\sum ^N_{j=1}e^{\frac{{{\varvec{q}}_i}^T {\varvec{k}}_j}{\sqrt{d_k}}}} \end{aligned}$$
(A2)

Following [8], Eq. (A2) is attention written row-wise, where the similarity score is the exponential of the scaled dot product between \({\varvec{Q}}\) and \({\varvec{K}}\). Replacing this score with an arbitrary non-negative similarity function yields the generalized attention equation:

$$\begin{aligned} \rm{Attn}({\varvec{Q}},{\varvec{K}},{\varvec{V}})_i=\frac{\sum ^N_{j=1}\rm{sim}({\varvec{q}}_i,{\varvec{k}}_j){\varvec{v}}_j}{\sum ^N_{j=1}\rm{sim}({\varvec{q}}_i,{\varvec{k}}_j)} \end{aligned}$$
(A3)

By first-order Taylor series expansion, \(e^{\frac{{{\varvec{q}}_i}^T {\varvec{k}}_j}{\sqrt{d_k}}}\) can be approximated as:

$$\begin{aligned} e^{\frac{{{\varvec{q}}_i}^T {\varvec{k}}_j}{\sqrt{d_k}}}\approx 1+\frac{{{\varvec{q}}_i}^T {\varvec{k}}_j}{\sqrt{d_k}} \end{aligned}$$
(A4)
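As a quick numeric illustration (ours, not from the paper), the first-order approximation is tight when the argument is small, which the normalization introduced next guarantees:

```python
# The error of the first-order expansion e^x ~ 1 + x is roughly x^2 / 2,
# so it stays small for the normalized, rescaled dot products used below.
import math

for x in (-0.5, -0.1, 0.0, 0.1, 0.5):
    print(f"x={x:+.2f}  e^x={math.exp(x):.4f}  1+x={1 + x:.4f}")
```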

Here we l2-normalize \({\varvec{q}}_i\) and \({\varvec{k}}_j\) so that their dot product lies in \([-1,1]\), which keeps the similarity non-negative, and propose our similarity function:

$$\begin{aligned} \rm{sim}({\varvec{q}}_i,{\varvec{k}}_j)=1+\frac{1}{d_k}\cdot \left( \frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert }\right) ^T\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) \end{aligned}$$
(A5)

Substituting Eq. (A5) into Eq. (A3) gives the updated attention mechanism:

$$\begin{aligned} \begin{aligned} \rm{Attn}({\varvec{Q}},{\varvec{K}},{\varvec{V}})_i&=\frac{\sum ^N_{j=1}\left[ 1+\frac{1}{d_k}\cdot \left( \frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert }\right) ^T\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) \right] {\varvec{v}}_j}{\sum ^N_{j=1}1+\frac{1}{d_k}\cdot (\frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert })^T\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) }\\&=\frac{\sum ^N_{j=1}{\varvec{v}}_j+\frac{1}{d_k}\cdot \left( \frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert }\right) ^T\cdot \sum ^N_{j=1}\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) \cdot {\varvec{v}}_j}{N+\frac{1}{d_k}\cdot \left( \frac{{\varvec{q}}_i}{\Vert {\varvec{q}}_i\Vert }\right) ^T\cdot \sum ^N_{j=1}\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) } \end{aligned} \end{aligned}$$
(A6)

The above equation is easier to follow when the numerator is written in vectorized form:

$$\begin{aligned} ({\varvec{I}}+{\varvec{Q}}_{l_2}\cdot {\varvec{K}}_{l_2}^T)\cdot {\varvec{V}}={\varvec{V}}+{\varvec{Q}}_{l_2}\cdot ({\varvec{K}}_{l_2}^T\cdot {\varvec{V}}) \end{aligned}$$
(A7)

where \({\varvec{Q}}_{l_2}\) and \({\varvec{K}}_{l_2}\) are the row-wise l2-normalized \({\varvec{Q}}\) and \({\varvec{K}}\). Equation (A1) shows that the computational cost of softmax attention scales as \({{\mathcal {O}}}(n^2)\), where \(n\) is the sequence length. In contrast, the Tayformer attention of Eq. (A6) has \({{\mathcal {O}}}(n)\) time and memory complexity, because \(\sum ^N_{j=1}\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right) \cdot {\varvec{v}}_j\) and \(\sum ^N_{j=1}\left( \frac{{\varvec{k}}_j}{\Vert {\varvec{k}}_j\Vert }\right)\) can be computed once and reused for every query.
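For concreteness, here is a minimal PyTorch sketch of Eqs. (A6) and (A7) (our illustration, not the authors' released code; single head and no batching for clarity):

```python
# Linearized attention of Eqs. (A6)-(A7): l2-normalize Q and K, compute the
# key-value summaries once, then answer every query in O(d_k * d_v) time.
import torch
import torch.nn.functional as F

def tayformer_attention(Q: torch.Tensor, K: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    n, d_k = Q.shape
    Q_l2 = F.normalize(Q, p=2, dim=-1)   # (n, d_k), rows q_i / ||q_i||
    K_l2 = F.normalize(K, p=2, dim=-1)   # (n, d_k), rows k_j / ||k_j||
    KV = K_l2.T @ V                      # (d_k, d_v): summary over all k_j, v_j
    K_sum = K_l2.sum(dim=0)              # (d_k,):     summary over all k_j
    numerator = V.sum(dim=0) + (Q_l2 @ KV) / d_k    # (n, d_v), Eq. (A7)
    denominator = n + (Q_l2 @ K_sum) / d_k          # (n,)
    return numerator / denominator.unsqueeze(-1)

# Example: a sequence of 1024 tokens with d_k = d_v = 64
out = tayformer_attention(torch.randn(1024, 64), torch.randn(1024, 64), torch.randn(1024, 64))
```

Because only the \(d_k\times d_v\) and \(d_k\)-dimensional summaries are stored, memory grows linearly in the sequence length \(n\).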

In terms of multiplications and additions, the total cost of softmax attention scales as \({{\mathcal {O}}}(n^2\max \{d_k,d_v\})\), where \(d_k\) and \(d_v\) are the dimensionalities of the queries/keys and the values. In contrast, Tayformer first computes \({\varvec{K}}_{l_2}^T{\varvec{V}}\) and then multiplies by \({\varvec{Q}}_{l_2}\), which requires \({{\mathcal {O}}}(nd_kd_v)\) additions and multiplications. For very long sequences, \(n\gg d_v, d_k\), so the complexities of the Transformer and the Tayformer simplify to \({{\mathcal {O}}}(n^2)\) and \({{\mathcal {O}}}(n)\), respectively.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Mou, X., Peng, Q., Sun, Z. et al. Multi-document influence on readers: augmenting social emotion prediction by learning document interactions. Neural Comput & Applic 36, 6701–6719 (2024). https://doi.org/10.1007/s00521-024-09420-8

