Cross-domain sentiment classification via parameter transferring and attention sharing mechanism
Introduction
Classic text sentiment classification methods assume that the training and testing data are independent and identically distributed. However, under realistic conditions, different domains exhibit distribution differences [1], [2]. The cross-domain sentiment classification (CDSC) task uses source-domain resources to perform sentiment classification in target domains [3], [4]. CDSC not only extends the application scope of transfer learning in text-based social media but also improves low-resource text sentiment classification, alleviating the problem of insufficient labeled samples in specific domains. In addition, CDSC promotes the rapid development of industrial applications related to text-based sentiment analysis [5].
Deep learning methods have achieved exceptional results on text sentiment classification tasks in recent years but require abundant labeled training data, and annotating domain-specific data is extremely labor-intensive [6]. Moreover, the distribution of sentiment varies across domains: users tend to express sentiment with different words in different domains [7], [8]. Thus, sentiment expression is domain-dependent, and classifiers trained on one domain generalize poorly to others, so specific sentiment transfer strategies must be introduced. Sentiment transfer across domains seeks domain-invariant representations as a bridge to achieve cross-domain transfer.
Documents in social media have a three-level semantic structure: word, sentence, and document. The sentiments of the words in a sentence determine the sentiment of that sentence, while the sentiments of the sentences determine the overall sentiment of the document [9], [10], [11]. In addition, different words and sentences contribute differently to the overall sentiment expression [12]. The attention mechanism can effectively improve sequence-to-sequence models by performing weighted transformations [13], [14]. Intuitively, the word- and sentence-level attention weights learned in the source domain can guide the training of the target-domain attention weights, so that the words and sentences most important for sentiment decisions are identified. Fig. 1 shows examples of word- and sentence-level attention visualizations for reviews in the kitchen and electronics domains. For instance, the sentiment polarity of sentences 1 and 6 can easily be determined from the words “happy” and “disappoint”, respectively. Similarly, the sentiment polarity of the review can be determined from sentences 7 and 9. At the same time, document representations in different domains can share similar sentence and word attention models: in the kitchen and electronics domains, sentences 4 and 11 have similar word attention weights, while samples (a) and (c) have similar sentence attention weights.
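The word-level attention described above can be sketched as follows. This is an illustrative NumPy implementation of the standard hierarchical-attention scoring scheme (hidden states scored against a learned context vector, softmax-normalized, then averaged), not the authors' exact code; the names `word_attention` and `context_vector` are ours.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def word_attention(hidden_states, context_vector):
    """Aggregate word hidden states into a sentence vector.

    hidden_states: (num_words, dim) word encoder outputs
    context_vector: (dim,) learned word-level context vector
    """
    scores = np.tanh(hidden_states) @ context_vector   # alignment scores
    alpha = softmax(scores)                            # attention weights (sum to 1)
    sentence_vec = alpha @ hidden_states               # attention-weighted sum
    return sentence_vec, alpha

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 8))       # five words, 8-dim hidden states
u = rng.normal(size=8)            # word-level context vector
vec, alpha = word_attention(h, u)
```

The same pattern is applied one level up, with sentence vectors in place of word hidden states, to obtain the document representation.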
Two challenges must generally be addressed in CDSC. First, how can the structure and parameters of deep neural networks be transferred across domains? A model transfer mechanism can be adopted to transfer the structure and parameters of the model, and the fine-tuning strategy in deep transfer learning is a key scientific issue here. Second, how can attention mechanisms be shared between neural network models trained in different domains? The sentiment attention mechanism can serve as a bridge connecting different domains, with the word- and sentence-level attention mechanisms mutually guiding each other's training.
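The first challenge, level-wise parameter transfer, can be sketched as initializing the target-domain network from the source-domain network and then fine-tuning only selected levels. This minimal sketch models parameters as plain dicts for illustration; the function and key names are ours, not the paper's.

```python
def transfer_parameters(sdn_params, tdn_params, transfer_levels=("word", "sentence")):
    """Copy source-domain (SDN) parameters into the target-domain network (TDN)
    for the chosen levels; remaining levels keep their own initialization."""
    for level in transfer_levels:
        tdn_params[level] = dict(sdn_params[level])  # copy, do not alias
    return tdn_params

# Toy parameter sets: word/sentence encoders plus a classifier head.
sdn = {"word": {"W": [1.0, 2.0]}, "sentence": {"W": [3.0]}, "classifier": {"W": [0.5]}}
tdn = {"word": {"W": [0.0, 0.0]}, "sentence": {"W": [0.0]}, "classifier": {"W": [0.9]}}

tdn = transfer_parameters(sdn, tdn)
# Word and sentence levels now carry SDN weights; the classifier stays target-specific
# and would be fine-tuned on target-domain data.
```

In a deep learning framework the same idea is typically realized by loading a saved state dict into the target model and freezing or fine-tuning layers selectively.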
A CDSC method via a parameter transferring and attention sharing mechanism (PTASM) is proposed in this paper. First, the sentiment information of important words and sentences in a text is modeled through a hierarchical attention network (HAN), and a document-level distributed representation is obtained. Pre-trained language models, namely global vectors for word representation (GloVe) and bidirectional encoder representations from transformers (BERT), provide the input to the HAN. Two HANs are designed: a source domain network (SDN) and a target domain network (TDN). Then, word- and sentence-level parameter transferring mechanisms are introduced for model transfer. In addition, word- and sentence-level attention alignment relationships across domains are considered, and an effective cross-domain attention sharing mechanism is designed. Finally, PTASM is validated with GloVe and BERT on benchmark CDSC datasets. Experiments indicate that PTASM-BERT not only achieves high cross-domain classification accuracy but also automatically learns the degree of feature alignment between domains. Moreover, parameter transferring and attention sharing at both the word and sentence levels are verified to outperform transfer at a single level.
The contributions of this paper can be summarized as follows.
- A CDSC method via a parameter transferring and attention sharing mechanism is proposed, which effectively transfers sentiment knowledge via model transfer and avoids overfitting. The attention sharing mechanism serves as a bridge across domains.
- The model transfer strategy transfers the network parameters. In the attention sharing mechanism, the attention weights of different feature spaces guide one another, reducing the cosine distance between attention weights.
- PTASM is validated on Amazon review datasets, and the parameter selection is experimentally verified. Experiments demonstrate that PTASM-BERT improves accuracy over several baseline methods.
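The attention sharing objective mentioned above, reducing the cosine distance between source- and target-domain attention weights, can be sketched as a simple penalty term. This is an illustrative implementation under our own naming; the exact loss formulation in the paper may weight or combine terms differently.

```python
import numpy as np

def cosine_distance(a, b, eps=1e-12):
    """1 - cosine similarity between two attention weight vectors."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    cos = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    return 1.0 - cos

def attention_sharing_loss(source_alphas, target_alphas):
    """Average cosine distance over aligned attention vectors.

    Each element is one attention distribution (e.g. word weights within a
    sentence, or sentence weights within a document)."""
    return float(np.mean([cosine_distance(s, t)
                          for s, t in zip(source_alphas, target_alphas)]))

src = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]   # source-domain attention weights
tgt = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1]]   # target-domain attention weights
loss = attention_sharing_loss(src, tgt)     # small when attention is aligned
```

Minimizing this term alongside the classification loss encourages the two networks to attend to aligned positions, which is the "bridge" role the attention sharing mechanism plays.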
The remainder of this paper is organized as follows. Section 2 reviews model transfer methods and the attention mechanism in sentiment analysis. Section 3 presents the transferable neural network framework and the attention sharing mechanism. Section 4 introduces the experimental setup. Section 5 reports the experimental results and analyses. Section 6 concludes the paper and outlines promising future work.
Related works
First, existing model transfer methods for CDSC tasks are summarized in this section. Then, the attention mechanism for text sentiment classification tasks is explained. Finally, the model transfer and attention sharing approaches used to achieve sentiment transfer across domains are introduced.
Proposed model
First, the notations used in PTASM and problem definitions of CDSC are introduced in this section. Then, the architecture of transferable neural networks and HAN are presented. Afterward, the parameter transferring strategy and the attention sharing mechanism are discussed. Finally, the training details of PTASM are presented.
Experimental setup
The Amazon review datasets are first presented in this section. Subsequently, the parameter settings in PTASM are studied. Then, the comparison baselines are shown, followed by the evaluation metrics.
Results and analyses
First, the accuracy rates of different methods are compared to verify the effect of PTASM. Then, the effectiveness of parameter variation is validated on the CDSC results, including the influence of the transferred level, attention weight, and pre-training epoch. Subsequently, case studies and visualization are provided and displayed. Finally, error analysis and examples of incorrectly divided samples are provided.
Conclusions and future works
A CDSC method based on PTASM is presented in this study. This method enables efficient sentiment transfer across domains. Parameter transferring can transfer the model parameters of the HAN, while attention sharing can share cross-domain position information. Experiments on publicly available Amazon product evaluation datasets have shown that a combination of representational and transfer learning can construct a text-based CDSC system. Furthermore, the method can be effectively used for the
CRediT authorship contribution statement
Chuanjun Zhao: Funding acquisition, Project administration, Writing - original draft, Methodology, Software, Supervision, Resources, Writing - review & editing. Suge Wang: Conceptualization, Formal analysis, Funding acquisition, Writing - original draft. Deyu Li: Funding acquisition, Project administration, Resources, Writing - review & editing. Xianzhi Liu: Conceptualization, Resources. Xinyi Yang: Data curation, Formal analysis. Jinfeng Liu: Conceptualization, Formal analysis.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This study is supported by the National Natural Science Foundation of China (Grant Nos. 61906110, 62076158, 61632011, 62072294), the Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (Grant No. 2019L0500), the Shanxi Application Basic Research Plan (Grant No. 201901D211414), and the Key Research and Development Projects of Shanxi Province (Grant No. 201903D421041).
References (48)
- Affective computing and sentiment analysis, IEEE Intell. Syst. (2016)
- Coarse alignment of topic and sentiment: a unified model for cross-lingual sentiment classification, IEEE Trans. Neural Networks Learn. Syst. (2021)
- Exploiting social and local contexts propagation for inducing Chinese microblog-specific sentiment lexicons, Comput. Speech Language (2019)
- Multi-granularity three-way decisions with adjustable hesitant fuzzy linguistic multigranulation decision-theoretic rough sets over two universes, Inf. Sci. (2020)
- SenticNet 6: ensemble application of symbolic and subsymbolic AI for sentiment analysis
- Position-aware hierarchical transfer model for aspect-level sentiment classification, Inf. Sci. (2020)
- A Practical Guide to Sentiment Analysis (2018)
- Deep transfer learning for social media cross-domain sentiment classification
- K.-P. Lai, W. Lam, J.C.S. Ho, Domain-Aware Recurrent Neural Network for Cross-Domain Sentiment Classification, ...
- E2SAM: evolutionary ensemble of sentiment analysis methods for domain adaptation, Inf. Sci. (2019)
- Leveraging multiple features for document sentiment classification, Inf. Sci.
- Aspect-level sentiment classification with HEAT (hierarchical attention) network
- Learning sentiment sentence representation with multiview attention model, Inf. Sci.
- Research progress on cross-domain text sentiment classification, J. Software
- E-commerce product review sentiment classification based on a naïve Bayes continuous learning framework, Inf. Process. Manage.
- Cross-domain sentiment classification with bidirectional contextualized transformer language models, IEEE Access
- Cross-domain sentiment classification with word embeddings and canonical correlation analysis
- Learning sentence embeddings with auxiliary tasks for cross-domain sentiment classification
- Multi-source domain adaptation with joint learning for cross-domain sentiment classification, Knowl.-Based Syst.
- Cross-domain sentiment classification by capsule network with semantic rules, IEEE Access