Elsevier

Information Sciences

Volume 578, November 2021, Pages 281-296
Cross-domain sentiment classification via parameter transferring and attention sharing mechanism

https://doi.org/10.1016/j.ins.2021.07.001

Highlights

  • A parameter transferring and attention sharing mechanism is proposed to effectively transfer sentiment knowledge through model transfer methods and avoid overfitting.

  • Model transfer is presented to transfer the network parameters from the source domain network to the target domain network.

  • An attention sharing mechanism is designed to share the attentional weights of different feature spaces in different domains.

  • The prediction results show the validity and advantages of the sentiment transfer strategy.

Abstract

Training data in a specific domain are often insufficient in the area of text sentiment classification. Cross-domain sentiment classification (CDSC) is usually utilized to extend the application scope of transfer learning in text-based social media and effectively solve the problem of insufficient data labeling in specific domains. Hence, this paper proposes a CDSC method via a parameter transferring and attention sharing mechanism (PTASM), whose architecture includes a source domain network (SDN) and a target domain network (TDN). First, hierarchical attentional networks with pre-trained language models, such as global vectors for word representation (GloVe) and bidirectional encoder representations from transformers (BERT), are constructed. Word- and sentence-level parameter transferring mechanisms are introduced in the model transfer. Then, parameter transfer and fine-tuning techniques are adopted to transfer network parameters from the SDN to the TDN. Moreover, sentiment attention can serve as a bridge for sentiment transfer across different domains. Finally, word- and sentence-level attention mechanisms are introduced, and sentiment attention is shared at the two levels across domains. Extensive experiments show that the PTASM-BERT method achieves state-of-the-art results on Amazon review cross-domain datasets.

Introduction

Classic text sentiment classification methods assume that the training and testing domains are independent and identically distributed. However, different domains exhibit distribution differences under realistic conditions [1], [2]. The cross-domain sentiment classification (CDSC) task adopts source domain resources to achieve sentiment classification in target domains [3], [4]. The utilization of CDSC not only extends the application scope of transfer learning in text-based social media but also improves the performance of low-resource text sentiment classification tasks, thereby alleviating the problem of insufficient labeled samples in specific domains. In addition, CDSC is conducive to promoting the rapid development of industrial applications related to text-based sentiment analysis [5].

Deep learning methods have achieved exceptional results on text sentiment classification tasks in recent years but require abundant labeled training data. However, annotating domain-specific data is an extremely labor-intensive task [6]. The distribution of sentiment varies across different domains: users tend to express sentiment with different words in different domains [7], [8]. Thus, the expression of sentiment is domain-dependent. Generalizing classifiers trained in different domains is difficult; thus, specific sentiment transfer strategies must be introduced. Sentiment transfer across different domains aims to seek domain invariance as a bridge to achieve cross-domain transfer.

Documents in social media have a three-level semantic structure, that is, word-sentence-document. The sentiment of the words composing a sentence determines the sentiment of that sentence, while the sentiment of sentences determines the overall sentiment of the document [9], [10], [11]. In addition, various words and sentences contribute differently to the overall sentiment expression [12]. The attention mechanism can effectively improve sequence-to-sequence models by performing weighted transformations [13], [14]. Intuition suggests that the word- and sentence-level attentional weights obtained from source domain training can guide the training of the target domain attentional weights, so that the words and sentences most important for sentiment decision-making are identified. Fig. 1 shows examples of word- and sentence-level attention visualizations for reviews in the kitchen and electronics domains. For instance, the sentiment polarity of sentences 1 and 6 can easily be determined from the words "happy" and "disappoint". Similarly, the sentiment polarity of the review can be determined by using sentences 7 and 9. Simultaneously, document representations in different domains can share similar sentence and word attention models. For instance, in the kitchen and electronics domains, sentences 4 and 11 have similar word attention weights, while samples (a) and (c) have similar sentence attention weights.
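The weighted transformation performed by such an attention layer can be sketched minimally as follows. This is an illustrative example only, not the authors' implementation: the variable names, dimensions, and the learned context vector are assumptions.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(hidden, context):
    """Score each hidden state against a (learned) context vector,
    normalize the scores with softmax, and return the weighted sum."""
    scores = hidden @ context          # one score per word/sentence
    weights = softmax(scores)          # attention distribution
    return weights @ hidden, weights   # pooled representation, weights

# Toy example: 3 "word" hidden states of dimension 4.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(3, 4))
context = rng.normal(size=4)
pooled, w = attention_pool(hidden, context)
```

The same pooling applied once over words (per sentence) and once over sentences (per document) yields the word-sentence-document hierarchy described above.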

Two challenges should generally be addressed in CDSC. First, how can the structure and parameters of deep neural networks be transferred across different domains? The model transfer mechanism can be adopted to transfer the structure and parameters of the model; meanwhile, the fine-tuning strategy in deep transfer learning is a key scientific issue for this subject. Second, how can attention mechanisms be shared among neural network models trained in different domains? The sentiment attention mechanism can be used as a bridge to connect different domains, and the word- and sentence-level attention mechanisms can mutually guide each other's training.

A CDSC method via a parameter transferring and attention sharing mechanism (PTASM) is proposed in this paper. First, the sentiment information of important words and sentences in the text is modeled through a hierarchical attentional network (HAN), and a document-level distributed representation is determined. Pre-trained language models, such as global vectors for word representation (GloVe) and bidirectional encoder representations from transformers (BERT), are used as input for the HAN, and two HANs, namely, a source domain network (SDN) and a target domain network (TDN), are designed. Then, word- and sentence-level parameter transferring mechanisms are introduced in the model transfer. In addition, word- and sentence-level attention alignment relationships across different domains are considered, and an effective cross-domain attention sharing mechanism is designed. Last, PTASM is validated with GloVe and BERT on benchmark CDSC datasets. Experiments indicate that the PTASM-BERT method not only obtains high cross-domain classification accuracy but also automatically learns the alignment degree of features between domains. Moreover, parameter transferring and attention sharing at both the word and sentence levels are verified to be better than those at a single level.
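The parameter transferring step, i.e., initializing the TDN with parameters copied from the trained SDN at selected levels before fine-tuning, can be sketched as follows. The dictionary-based parameter naming is a hypothetical simplification, not the authors' code:

```python
import numpy as np

def transfer_parameters(source_params, levels=("word", "sentence")):
    """Initialize target-domain parameters by copying the chosen
    levels (word and/or sentence) from the source-domain network."""
    return {name: value.copy()
            for name, value in source_params.items()
            if name.split("/")[0] in levels}

# Hypothetical SDN parameters keyed as "<level>/<tensor>".
sdn = {"word/W": np.ones((4, 4)), "sentence/W": np.zeros((4, 4))}
tdn = transfer_parameters(sdn)                       # transfer both levels
tdn_word_only = transfer_parameters(sdn, ("word",))  # word level only
```

After this initialization, the copied parameters would be fine-tuned on target-domain data; restricting `levels` corresponds to the single-level transfer variants compared in the experiments.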

The contributions of this paper can be summarized as follows.

  • A CDSC method via parameter transferring and attention sharing mechanism, which can effectively transfer sentiment knowledge via model transfer methods and avoid overfitting, is proposed. Meanwhile, the attention sharing mechanism is utilized as a bridge across different domains.

  • The model transfer strategy is adopted to transfer the network parameters. The attentional weights of different feature spaces for the attention sharing mechanism can guide one another, and the cosine distance of attentional weights is reduced.

  • PTASM is validated on Amazon review datasets, and the parameter selection is experimentally verified. Experiments demonstrate that PTASM-BERT can improve the accuracy compared with several baseline methods.
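The attention sharing idea in the contributions, reducing the cosine distance between attentional weights across domains, could look roughly like the sketch below. This is an assumption-laden illustration: the function name and the toy weight vectors are not from the paper.

```python
import numpy as np

def cosine_distance(a, b, eps=1e-8):
    """1 minus cosine similarity between two attention-weight vectors;
    minimizing it as a loss term pulls the two distributions together."""
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

# Attention weights over the same positions in source and target domains.
src_att = np.array([0.7, 0.2, 0.1])
tgt_att = np.array([0.6, 0.3, 0.1])
sharing_loss = cosine_distance(src_att, tgt_att)
```

Such a term would be added to each domain's classification loss so that the word- and sentence-level attention distributions guide one another during joint training.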

The remainder of this paper is arranged as follows. Section 2 discusses model transfer methods and attention mechanisms in sentiment analysis. Section 3 proposes the transferable neural network framework and the attention sharing mechanism. Section 4 introduces the experimental setup. Section 5 shows the experimental results and analyses of the proposed approach. Section 6 summarizes the paper and provides promising future works.

Section snippets

Related works

First, existing model transfer methods for CDSC tasks are summarized in this section. Then, the attention mechanism for text sentiment classification tasks is explained. Finally, the model transfer and attention sharing approaches are introduced to achieve sentiment transfer across different domains.

Proposed model

First, the notations used in PTASM and problem definitions of CDSC are introduced in this section. Then, the architecture of transferable neural networks and HAN are presented. Afterward, the parameter transferring strategy and the attention sharing mechanism are discussed. Finally, the training details of PTASM are presented.

Experimental setup

The Amazon review datasets are first presented in this section. Subsequently, the parameter settings in PTASM are studied. Then, the comparison baselines are shown, followed by the evaluation metrics.

Results and analyses

First, the accuracy rates of different methods are compared to verify the effect of PTASM. Then, the effectiveness of parameter variation is validated on the CDSC results, including the influence of the transferred level, attention weight, and pre-training epoch. Subsequently, case studies and visualization are provided and displayed. Finally, error analysis and examples of incorrectly divided samples are provided.

Conclusions and future works

A CDSC method based on PTASM is presented in this study. This method enables efficient sentiment transfer across domains. Parameter transferring can transfer the model parameters of the HAN, while attention sharing can share cross-domain position information. Experiments on publicly available Amazon product evaluation datasets have shown that a combination of representational and transfer learning can construct a text-based CDSC system. Furthermore, the method can be effectively used for the

CRediT authorship contribution statement

Chuanjun Zhao: Funding acquisition, Project administration, Writing - original draft, Methodology, Software, Supervision, Resources, Writing - review & editing. Suge Wang: Conceptualization, Formal analysis, Funding acquisition, Writing - original draft. Deyu Li: Funding acquisition, Project administration, Resources, Writing - review & editing. Xianzhi Liu: Conceptualization, Resources. Xinyi Yang: Data curation, Formal analysis. Jinfeng Liu: Conceptualization, Formal analysis.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This study is supported by the National Natural Science Foundation of China (Grant Nos. 61906110, 62076158, 61632011, 62072294), the Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (Grant No. 2019L0500), the Shanxi Applied Basic Research Plan (Grant No. 201901D211414), and the Key Research and Development Projects of Shanxi Province (Grant No. 201903D421041).

References (48)

  • E. Cambria, Affective computing and sentiment analysis, IEEE Intell. Syst. (2016)
  • D. Wang et al., Coarse alignment of topic and sentiment: a unified model for cross-lingual sentiment classification, IEEE Trans. Neural Networks Learn. Syst. (2021)
  • C. Zhao et al., Exploiting social and local contexts propagation for inducing Chinese microblog-specific sentiment lexicons, Comput. Speech Language (2019)
  • C. Zhang et al., Multi-granularity three-way decisions with adjustable hesitant fuzzy linguistic multigranulation decision-theoretic rough sets over two universes, Inf. Sci. (2020)
  • E. Cambria et al., SenticNet 6: ensemble application of symbolic and subsymbolic AI for sentiment analysis
  • J. Zhou et al., Position-aware hierarchical transfer model for aspect-level sentiment classification, Inf. Sci. (2020)
  • E. Cambria et al., A Practical Guide to Sentiment Analysis (2018)
  • C. Zhao et al., Deep transfer learning for social media cross-domain sentiment classification
  • K.-P. Lai, W. Lam, J.C.S. Ho, Domain-aware recurrent neural network for cross-domain sentiment classification, ...
  • M. López et al., E2SAM: evolutionary ensemble of sentiment analysis methods for domain adaptation, Inf. Sci. (2019)
  • L. Kong et al., Leveraging multiple features for document sentiment classification, Inf. Sci. (2020)
  • J. Cheng et al., Aspect-level sentiment classification with HEAT (hierarchical attention) network
  • J. Liu, Y. Zhang, Attention modeling for targeted sentiment, in: M. Lapata, P. Blunsom, A. Koller (Eds.), Proceedings ...
  • Y. Zhang et al., Learning sentiment sentence representation with multiview attention model, Inf. Sci. (2021)
  • C. Zhao et al., Research progress on cross-domain text sentiment classification, J. Software (2020)
  • F. Xu et al., E-commerce product review sentiment classification based on a naïve Bayes continuous learning framework, Inf. Process. Manage. (2020)
  • B. Myagmar et al., Cross-domain sentiment classification with bidirectional contextualized transformer language models, IEEE Access (2019)
  • T. Al-Moslmi, M. Albared, A. Al-Shabi, S. Abdullah, et al., Bidirectional feature transfer for cross-domain sentiment ...
  • Z. Li, Y. Zhang, Y. Wei, Y. Wu, Q. Yang, End-to-end adversarial memory network for cross-domain sentiment ...
  • N.X. Bach et al., Cross-domain sentiment classification with word embeddings and canonical correlation analysis
  • X. Glorot, A. Bordes, Y. Bengio, Domain adaptation for large-scale sentiment classification: a deep learning approach, ...
  • J. Yu et al., Learning sentence embeddings with auxiliary tasks for cross-domain sentiment classification
  • C. Zhao et al., Multi-source domain adaptation with joint learning for cross-domain sentiment classification, Knowl.-Based Syst. (2019)
  • B. Zhang et al., Cross-domain sentiment classification by capsule network with semantic rules, IEEE Access (2018)