Cross-domain sentiment classification via parameter transferring and attention sharing mechanism
Introduction
Classic text sentiment classification methods assume that the training and testing data are independent and identically distributed. However, under realistic conditions, different domains exhibit distribution differences [1], [2]. The cross-domain sentiment classification (CDSC) task uses source-domain resources to perform sentiment classification in target domains [3], [4]. CDSC not only extends the application scope of transfer learning in text-based social media but also improves low-resource text sentiment classification, alleviating the problem of insufficient labeled samples in specific domains. In addition, CDSC promotes the rapid development of industrial applications related to text-based sentiment analysis [5].
Deep learning methods have achieved exceptional results on text sentiment classification tasks in recent years but require abundant labeled training data, and annotating domain-specific data is extremely labor-intensive [6]. Moreover, the distribution of sentiment varies across domains: users tend to express sentiment with different words in different domains [7], [8]. Thus, sentiment expression is domain-dependent, and classifiers trained on one domain generalize poorly to others, so specific sentiment transfer strategies must be introduced. Sentiment transfer across domains seeks domain-invariant representations as a bridge to achieve cross-domain transfer.
Documents in social media have a three-level semantic structure: word, sentence, and document. The sentiments of the words in a sentence determine the sentiment of that sentence, while the sentiments of the sentences determine the overall sentiment of the document [9], [10], [11]. In addition, different words and sentences contribute differently to the overall sentiment expression [12]. The attention mechanism can effectively improve sequence-to-sequence models by performing weighted transformations [13], [14]. Intuitively, the word- and sentence-level attention weights learned in the source domain can guide the training of the target-domain attention weights, so that the words and sentences most important for sentiment decisions are identified. Fig. 1 shows examples of word- and sentence-level attention visualizations for reviews in the kitchen and electronics domains. For instance, the sentiment polarity of sentences 1 and 6 can easily be determined from the words “happy” and “disappoint”, respectively. Similarly, the sentiment polarity of the review can be determined from sentences 7 and 9. At the same time, document representations in different domains can share similar sentence and word attention models: in the kitchen and electronics domains, sentences 4 and 11 have similar word attention weights, while samples (a) and (c) have similar sentence attention weights.
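The word-level attention described above can be sketched as follows. This is an illustrative NumPy implementation of the standard hierarchical-attention scoring scheme (hidden states scored against a learned context vector, softmax-normalized, then averaged), not the authors' exact code; the names `word_attention` and `context_vector` are ours.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def word_attention(hidden_states, context_vector):
    """Aggregate word hidden states into a sentence vector.

    hidden_states: (num_words, dim) word encoder outputs
    context_vector: (dim,) learned word-level context vector
    """
    scores = np.tanh(hidden_states) @ context_vector   # alignment scores
    alpha = softmax(scores)                            # attention weights (sum to 1)
    sentence_vec = alpha @ hidden_states               # attention-weighted sum
    return sentence_vec, alpha

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 8))       # five words, 8-dim hidden states
u = rng.normal(size=8)            # word-level context vector
vec, alpha = word_attention(h, u)
```

The same pattern is applied one level up, with sentence vectors in place of word hidden states, to obtain the document representation.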
Two challenges must generally be addressed in CDSC. First, how can the structure and parameters of deep neural networks be transferred across domains? A model transfer mechanism can be adopted to transfer the structure and parameters of the model, and the fine-tuning strategy in deep transfer learning is a key scientific issue here. Second, how can attention mechanisms be shared between neural network models trained in different domains? The sentiment attention mechanism can serve as a bridge connecting different domains, with the word- and sentence-level attention mechanisms mutually guiding each other's training.
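The first challenge, level-wise parameter transfer, can be sketched as initializing the target-domain network from the source-domain network and then fine-tuning only selected levels. This minimal sketch models parameters as plain dicts for illustration; the function and key names are ours, not the paper's.

```python
def transfer_parameters(sdn_params, tdn_params, transfer_levels=("word", "sentence")):
    """Copy source-domain (SDN) parameters into the target-domain network (TDN)
    for the chosen levels; remaining levels keep their own initialization."""
    for level in transfer_levels:
        tdn_params[level] = dict(sdn_params[level])  # copy, do not alias
    return tdn_params

# Toy parameter sets: word/sentence encoders plus a classifier head.
sdn = {"word": {"W": [1.0, 2.0]}, "sentence": {"W": [3.0]}, "classifier": {"W": [0.5]}}
tdn = {"word": {"W": [0.0, 0.0]}, "sentence": {"W": [0.0]}, "classifier": {"W": [0.9]}}

tdn = transfer_parameters(sdn, tdn)
# Word and sentence levels now carry SDN weights; the classifier stays target-specific
# and would be fine-tuned on target-domain data.
```

In a deep learning framework the same idea is typically realized by loading a saved state dict into the target model and freezing or fine-tuning layers selectively.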
A CDSC method via a parameter transferring and attention sharing mechanism (PTASM) is proposed in this paper. First, the sentiment information of important words and sentences in a text is modeled through a hierarchical attention network (HAN), and a document-level distributed representation is obtained. Pre-trained language models, namely global vectors for word representation (GloVe) and bidirectional encoder representations from transformers (BERT), provide the input to the HAN. Two HANs are designed: a source domain network (SDN) and a target domain network (TDN). Then, word- and sentence-level parameter transferring mechanisms are introduced for model transfer. In addition, word- and sentence-level attention alignment relationships across domains are considered, and an effective cross-domain attention sharing mechanism is designed. Finally, PTASM is validated with GloVe and BERT on benchmark CDSC datasets. Experiments indicate that PTASM-BERT not only achieves high cross-domain classification accuracy but also automatically learns the degree of feature alignment between domains. Moreover, parameter transferring and attention sharing at both the word and sentence levels are verified to outperform transfer at a single level.
The contributions of this paper can be summarized as follows.
- A CDSC method via a parameter transferring and attention sharing mechanism is proposed, which effectively transfers sentiment knowledge via model transfer and avoids overfitting. The attention sharing mechanism serves as a bridge across domains.
- The model transfer strategy transfers the network parameters. In the attention sharing mechanism, the attention weights of different feature spaces guide one another, reducing the cosine distance between attention weights.
- PTASM is validated on Amazon review datasets, and the parameter selection is experimentally verified. Experiments demonstrate that PTASM-BERT improves accuracy over several baseline methods.
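The attention sharing objective mentioned above, reducing the cosine distance between source- and target-domain attention weights, can be sketched as a simple penalty term. This is an illustrative implementation under our own naming; the exact loss formulation in the paper may weight or combine terms differently.

```python
import numpy as np

def cosine_distance(a, b, eps=1e-12):
    """1 - cosine similarity between two attention weight vectors."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    cos = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    return 1.0 - cos

def attention_sharing_loss(source_alphas, target_alphas):
    """Average cosine distance over aligned attention vectors.

    Each element is one attention distribution (e.g. word weights within a
    sentence, or sentence weights within a document)."""
    return float(np.mean([cosine_distance(s, t)
                          for s, t in zip(source_alphas, target_alphas)]))

src = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]   # source-domain attention weights
tgt = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1]]   # target-domain attention weights
loss = attention_sharing_loss(src, tgt)     # small when attention is aligned
```

Minimizing this term alongside the classification loss encourages the two networks to attend to aligned positions, which is the "bridge" role the attention sharing mechanism plays.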
The remainder of this paper is organized as follows. Section 2 reviews model transfer methods and the attention mechanism in sentiment analysis. Section 3 presents the transferable neural network framework and the attention sharing mechanism. Section 4 introduces the experimental setup. Section 5 reports the experimental results and analyses. Section 6 concludes the paper and outlines promising future work.
Related works
First, existing model transfer methods for CDSC tasks are summarized in this section. Then, the attention mechanism for text sentiment classification tasks is explained. Finally, the model transfer and attention sharing approaches used to achieve sentiment transfer across domains are introduced.
Proposed model
First, the notations used in PTASM and problem definitions of CDSC are introduced in this section. Then, the architecture of transferable neural networks and HAN are presented. Afterward, the parameter transferring strategy and the attention sharing mechanism are discussed. Finally, the training details of PTASM are presented.
Experimental setup
The Amazon review datasets are first presented in this section. Subsequently, the parameter settings in PTASM are studied. Then, the comparison baselines are shown, followed by the evaluation metrics.
Results and analyses
First, the accuracy rates of different methods are compared to verify the effect of PTASM. Then, the effectiveness of parameter variation is validated on the CDSC results, including the influence of the transferred level, attention weight, and pre-training epoch. Subsequently, case studies and visualization are provided and displayed. Finally, error analysis and examples of incorrectly divided samples are provided.
Conclusions and future works
A CDSC method based on PTASM is presented in this study. This method enables efficient sentiment transfer across domains. Parameter transferring can transfer the model parameters of the HAN, while attention sharing can share cross-domain position information. Experiments on publicly available Amazon product evaluation datasets have shown that a combination of representational and transfer learning can construct a text-based CDSC system. Furthermore, the method can be effectively used for the
CRediT authorship contribution statement
Chuanjun Zhao: Funding acquisition, Project administration, Writing - original draft, Methodology, Software, Supervision, Resources, Writing - review & editing. Suge Wang: Conceptualization, Formal analysis, Funding acquisition, Writing - original draft. Deyu Li: Funding acquisition, Project administration, Resources, Writing - review & editing. Xianzhi Liu: Conceptualization, Resources. Xinyi Yang: Data curation, Formal analysis. Jinfeng Liu: Conceptualization, Formal analysis.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This study is supported by the National Natural Science Foundation of China (Grant Nos. 61906110, 62076158, 61632011, 62072294), the Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (Grant No. 2019L0500), the Shanxi Application Basic Research Plan (Grant No. 201901D211414), and the Key Research and Development Projects of Shanxi Province (Grant No. 201903D421041).
References (48)
- Affective computing and sentiment analysis, IEEE Intell. Syst. (2016)
- Coarse alignment of topic and sentiment: a unified model for cross-lingual sentiment classification, IEEE Trans. Neural Networks Learn. Syst. (2021)
- Exploiting social and local contexts propagation for inducing Chinese microblog-specific sentiment lexicons, Comput. Speech Language (2019)
- Multi-granularity three-way decisions with adjustable hesitant fuzzy linguistic multigranulation decision-theoretic rough sets over two universes, Inf. Sci. (2020)
- SenticNet 6: ensemble application of symbolic and subsymbolic AI for sentiment analysis
- Position-aware hierarchical transfer model for aspect-level sentiment classification, Inf. Sci. (2020)
- A Practical Guide to Sentiment Analysis (2018)
- Deep transfer learning for social media cross-domain sentiment classification
- K.-P. Lai, W. Lam, J.C.S. Ho, Domain-Aware Recurrent Neural Network for Cross-Domain Sentiment Classification, ...
- E2SAM: evolutionary ensemble of sentiment analysis methods for domain adaptation, Inf. Sci. (2019)
- Leveraging multiple features for document sentiment classification, Inf. Sci.
- Aspect-level sentiment classification with HEAT (hierarchical attention) network
- Learning sentiment sentence representation with multiview attention model, Inf. Sci.
- Research progress on cross-domain text sentiment classification, J. Software
- E-commerce product review sentiment classification based on a naïve Bayes continuous learning framework, Inf. Process. Manage.
- Cross-domain sentiment classification with bidirectional contextualized transformer language models, IEEE Access
- Cross-domain sentiment classification with word embeddings and canonical correlation analysis
- Learning sentence embeddings with auxiliary tasks for cross-domain sentiment classification
- Multi-source domain adaptation with joint learning for cross-domain sentiment classification, Knowl.-Based Syst.
- Cross-domain sentiment classification by capsule network with semantic rules, IEEE Access