Elsevier

Applied Soft Computing

Volume 112, November 2021, 107792

Sentiment classification using attention mechanism and bidirectional long short-term memory network

https://doi.org/10.1016/j.asoc.2021.107792

Highlights

  • Sentiment classification based on Attention-based Bidirectional long short-term memory network.

  • A mix embedding model for sentiment classification based on the Word2Vec and GloVe.

  • Propose an effective sentiment classification method for large-scale microblog text.

Abstract

We propose a sentiment classification method for large-scale microblog text based on the attention mechanism and the bidirectional long short-term memory network (SC-ABiLSTM). In an experimental study on real-world large-scale microblog data, we compare the accuracy of our proposed method with that of baseline methods and demonstrate its efficacy. While sentiment classification of social media data has been extensively studied, the main novelty of our study is the implementation of the attention mechanism in a deep learning network for analyzing large-scale social media data.

Introduction

In the era of big data, online users frequently express their opinions on various social media platforms. For example, Microblog has become one of the most popular online broadcast media for Internet users. As the volume of text published by microblog users grows exponentially, microblog text, which contains numerous social media users’ subjective opinions, has become a very valuable source of information [1]. The function of sentiment classification is to assign the opinions expressed in text objects to sentiment categories, e.g., positive, negative, or neutral [2], [3]. Sentiment classification of microblog text has attracted great attention in recent years, mainly because microblog text has many distinct characteristics (e.g., real-time, short, and diverse information elements). However, these distinct characteristics also present new challenges for sentiment classification [1].

With the increasing need for higher accuracy, using machine learning methods for sentiment classification is becoming a key topic in social media research. Previous studies of sentiment classification have used machine learning methods such as Naive Bayes (NB) [4], Support Vector Machine (SVM) [5], and Maximum Entropy (ME) [6]. Faced with the high-dimensional and sparse structure of microblog text, most studies of machine learning models for sentiment classification focus on designing effective handcrafted features to achieve better prediction performance. However, sentiment feature engineering requires significant time and effort [7]. Thus, how to extract features automatically and efficiently has become a key research question [8].

Bidirectional long short-term memory (BiLSTM) [9] can access both the previous and subsequent context by combining forward and backward hidden layers. Thus, BiLSTM is more effective than LSTM for sequence modeling tasks [10]. It has been successfully applied to sentiment classification [11], [12], [13]. For example, Nguyen and Nguyen (2018) designed a convolutional N-gram BiLSTM word embedding algorithm for multilingual opinion mining on YouTube [12]. For sentiment classification, a disadvantage of BiLSTM is that it has difficulty extracting the important information that can improve classification accuracy. By assigning different weights, the attention mechanism can extract important information from the context that BiLSTM ignores [10]. The attention mechanism has several important applications across NLP, such as speech recognition [14], [15] and neural machine translation [16], [17]. The combination of BiLSTM and the attention mechanism can further improve accuracy in multiple application areas, including aspect-level sentiment classification [18], [19], multi-domain sentiment classification [20], [21], and implicit sentiment classification [22].
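To make the weighting idea concrete, the following is a minimal sketch of attention pooling over a sequence of hidden states, written in plain Python. It is not the SC-ABiLSTM implementation: the dot-product scoring against a `query` vector, the function names, and the toy numbers are all assumptions chosen for illustration. Each hidden state receives a softmax weight, and the weighted sum serves as a single sentence representation.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each hidden state by its dot-product score with `query`.

    hidden_states: list of equal-length vectors, one per token
    query: hypothetical context vector (an assumption of this sketch)
    Returns (weights, pooled_vector).
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, pooled


# Toy example: three tokens with 2-dimensional hidden states.
states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
weights, sentence_vec = attention_pool(states, query=[1.0, 1.0])
```

In this toy input the second token has the highest score, so it dominates the pooled vector; in a trained model the scoring parameters would be learned rather than fixed by hand.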

Microblog text contains complex and abundant sentiments that reflect users’ opinions on a given topic [1], [23]. While previous studies have investigated sentiment classification of English microblogs such as Twitter [24], [25], [26], [27] and Chinese microblogs such as Sina microblog [1], [23], [28], [29], most of these studies have used open datasets or small-scale datasets. Because it is challenging to extract opinions and classify sentiments using real-world large-scale microblog datasets, the main objective of our study is to design attention-based neural network models that are capable of achieving superior classification performance with large-scale microblog datasets.

In this paper, we propose SC-ABiLSTM, a sentiment classification method for large-scale microblog text based on the attention mechanism and the bidirectional long short-term memory network. As the first step of our proposed method, n-gram sentiment features are extracted from the microblog text with word embedding methods in a convolutional layer. After BiLSTM produces the contextual sentiment sequences, the connections between sentiment aspect words and their context words are captured, with the different attention factors taken into account sequentially and more accurately. Consequently, sentiment classification of large-scale microblog text can be done using SC-ABiLSTM with improved accuracy.
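The first step described above, extracting n-gram features with a convolutional layer, can be illustrated with a small sketch. The code below slides a single filter over n consecutive word vectors and applies a ReLU. The filter values, the embeddings, the ReLU activation, and the name `ngram_conv` are assumptions made for illustration; the paper's actual layer configuration is not given in this excerpt.

```python
def ngram_conv(embeddings, kernel, bias=0.0):
    """Apply one convolutional filter over windows of n consecutive words.

    embeddings: list of word vectors (one per token, equal length)
    kernel: n rows of filter weights, each as long as a word vector
    Returns one ReLU-activated feature per n-gram window.
    """
    n = len(kernel)
    features = []
    for start in range(len(embeddings) - n + 1):
        window = embeddings[start:start + n]
        activation = bias + sum(
            w * x
            for k_row, vec in zip(kernel, window)
            for w, x in zip(k_row, vec)
        )
        features.append(max(0.0, activation))  # ReLU
    return features


# Toy example: four tokens, 2-dimensional embeddings, one bigram filter.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
bigram_filter = [[1.0, -1.0], [0.5, 0.5]]
bigram_features = ngram_conv(toy_embeddings, bigram_filter)
```

A sequence of four tokens yields three bigram features. A real layer would use many filters and learned weights, and its outputs would feed the BiLSTM stage.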

This study makes three main contributions. First, we demonstrate that SC-ABiLSTM can recognize the sentiment words in large-scale microblog text more effectively, thereby classifying the correlations between sentiment words and their contextual words with higher accuracy. Second, we demonstrate that SC-ABiLSTM is capable of classifying negative sentiment words of four different types (i.e., sadness, disgust, anger, and fear) more precisely and, as a result, can facilitate more timely and effective crisis responses. Third, our experimental results demonstrate that, for sentiment classification of large-scale English and Chinese microblog datasets, the proposed model achieves better performance than all the baseline models.

The remainder of this paper is organized as follows. A brief literature review of sentiment classification and the attention-based LSTM is presented in Section 2. The research methodology is presented and explained in Section 3. Our experimental results are reported in Section 4. Theoretical contributions and practical implications are discussed in Section 5. Finally, Sections 6 and 7 conclude the paper and discuss several potential directions for future research.

Section snippets

Related literature

Classification problems are very important in many theoretical and practical applications [30]. With the development of AI and big data, more efforts are made to improve the quality and transparency of the classification solutions based on multi-dimension features. The research focus has moved from standard single-label classification to Multi-Label Classification (MLC) that is often combined with attention mechanism based deep learning methods. Interesting achievements have been obtained on

Methodology

As shown in Fig. 1, there are three main steps in the architecture of the SC-ABiLSTM model [10], [12], [20], [48], [93], [94]. First, words in the microblog corpus are used as the basic units to form a sequence of words. N-gram features are extracted from the microblog text based on word embedding methods (e.g., Word2Vec and GloVe). Each word is then mapped into a multidimensional continuous-valued vector based on the trained word vectors, and a word vector matrix representation of the entire sentence
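The word-to-vector mapping described above can be sketched as a simple table lookup. In the sketch below, the two dictionaries stand in for trained Word2Vec and GloVe vectors; their contents are invented toy values. Combining the two sources by concatenation and mapping unknown words to zero vectors are both assumptions of this sketch, since the excerpt does not state how the mixed embedding is formed.

```python
def sentence_matrix(tokens, w2v_table, glove_table, w2v_dim, glove_dim):
    """Map each token to one row: its Word2Vec vector followed by its GloVe vector.

    Unknown tokens fall back to zero vectors (an assumption of this sketch).
    Returns a list of rows, i.e. a (len(tokens) x (w2v_dim + glove_dim)) matrix.
    """
    rows = []
    for token in tokens:
        w2v_vec = w2v_table.get(token, [0.0] * w2v_dim)
        glove_vec = glove_table.get(token, [0.0] * glove_dim)
        rows.append(list(w2v_vec) + list(glove_vec))
    return rows


# Hypothetical toy lookup tables, not trained vectors.
toy_w2v = {"love": [0.9, 0.1], "phone": [0.2, 0.7]}
toy_glove = {"love": [0.8, 0.0, 0.3], "phone": [0.1, 0.6, 0.4]}

matrix = sentence_matrix(
    ["love", "this", "phone"], toy_w2v, toy_glove, w2v_dim=2, glove_dim=3
)
```

Each row of `matrix` corresponds to one word of the sentence, so the sentence becomes a matrix that later layers can process.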

Experimental setup

A series of comparative experiments is designed to evaluate the performance of the proposed model for sentiment classification using various benchmark datasets. To train SC-ABiLSTM for target extraction and sentiment classification, the following public and manually labeled opinion corpora are used:

(1) NLP&CC2013. NLPCC2013 (Natural Language Processing and Chinese Computing, https://download.csdn.net/download/bright_man/10291480?utm_source=bbsseo) is an open sentiment polarity

Contributions to the literature

While several deep learning models have been used in online users’ sentiment classification, many previous studies in this domain have focused on analyzing English social media corpuses [54], [70], [113], [114]. Some researchers have recently studied multilingual opinion mining. [115] examined multilingual opinion mining on YouTube using Italian and English corpuses. [116] studied supervised sentiment analyses of English and Spanish tweets in a multilingual environment. [95] performed sentiment

Conclusion

In this paper, we proposed a sentiment classification method for large-scale microblog text. It combines the attention mechanism and BiLSTM (SC-ABiLSTM) to extract social media opinions and classify sentiments using real-world large-scale microblog datasets. The Word2Vec (CBOW, Skip-gram), GloVe, and one-hot methods are used to train semantic embeddings by predicting the target words of blog text from their context. Experiments are conducted on seven benchmark
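As a brief illustration of the embedding methods named above: CBOW predicts a target word from its surrounding context, Skip-gram predicts each context word from the target, and one-hot encoding simply marks a word's position in the vocabulary. The sketch below only generates the training pairs and the one-hot vector; it does not train anything. The window size, the function names, and the example tokens are assumptions for illustration.

```python
def cbow_pairs(tokens, window=1):
    """Build (context_words, target_word) pairs, as used by CBOW."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens, window=1):
    """Build (target_word, context_word) pairs, as used by Skip-gram."""
    return [
        (target, context_word)
        for context, target in cbow_pairs(tokens, window)
        for context_word in context
    ]


def one_hot(token, vocab):
    """Encode a token as a 0/1 vector over an ordered vocabulary."""
    return [1 if entry == token else 0 for entry in vocab]


tokens = ["great", "phone", "love", "it"]
cbow = cbow_pairs(tokens)
skipgram = skipgram_pairs(tokens)
encoded = one_hot("love", sorted(set(tokens)))
```

With a window of 1, the four tokens give four CBOW pairs and six Skip-gram pairs, which shows how Skip-gram expands each context into separate prediction targets.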

Limitations and future research

As the Internet becomes more pervasive around the world, social media platforms have gained increasing popularity. Thus, there is an urgent need for more efficient methods for classifying online users’ sentiments. The existing sentiment classification models are often unable to meet the demands of real-world big data processing, especially for large-scale sentiment mining. Therefore, how to mine social media sentiments precisely and efficiently is still a challenge faced by both practitioners

CRediT authorship contribution statement

Peng Wu: Conceptualization, Methodology, Writing – original draft, Writing – review & editing, Funding acquisition. Xiaotong Li: Methodology, Writing – original draft, Writing – review & editing. Chen Ling: Conceptualization, Methodology, Writing – original draft. Shengchun Ding: Conceptualization, Methodology. Si Shen: Conceptualization, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Funding Acknowledgments

National Natural Science Foundation of China: Award Number: 71774084, Recipient: Peng Wu

National Natural Science Foundation of China: Award Number: 71974094, Recipient: Si Shen

Program for Jiangsu Excellent Scientific and Technological Innovation Team: Award Number: [2020]10, Recipient: Peng Wu

Major National Social Science Bidding Foundation of China: Award Number: 20&ZD142, Recipient: Yuelin Li

Jiangsu Postdoctoral Research Foundation, China: Award Number: 2020Z193, Recipient: Chen Ling

References (117)

  • Dong H. et al. A many-objective feature selection for multi-label classification. Knowledge-Based Systems (2020)
  • Yun D. et al. Dual aggregated feature pyramid network for multi label classification. Pattern Recognition Letters (2021)
  • He Z.-F. et al. Joint multi-label classification and label correlations with missing labels and feature selection. Knowledge-Based Systems (2019)
  • Paul D. et al. Multi-objective PSO based online feature selection for multi-label classification. Knowledge-Based Systems (2021)
  • Lv J. et al. Compact learning for multi-label classification. Pattern Recognition (2021)
  • Bello M. et al. Deep neural network to extract high-level features and labels in multi-label classification problems. Neurocomputing (2020)
  • Nápoles G. et al. Long-term cognitive network-based architecture for multi-label classification. Neural Networks (2021)
  • Liu D. The effectiveness of three-way classification with interpretable perspective. Information Sciences (2021)
  • Bello M. et al. Data quality measures based on granular computing for multi-label classification. Information Sciences (2021)
  • Zhou C. et al. Multi-label graph node classification with label attentive neighborhood convolution. Expert Systems with Applications (2021)
  • Liang Y. et al. Fusion of heterogeneous attention mechanisms in multi-view convolutional neural network for text classification. Information Sciences (2021)
  • Pande S. et al. Adaptive hybrid attention network for hyperspectral image classification. Pattern Recognition Letters (2021)
  • Jiang L. et al. DECAB-LSTM: Deep contextualized attentional bidirectional LSTM for cancer hallmark classification. Knowledge-Based Systems (2020)
  • Li X. et al. A hybrid medical text classification framework: Integrating attentive rule construction and neural network. Neurocomputing (2021)
  • Wang P. et al. A hybrid approach to classifying wikipedia article quality flaws with feature fusion framework. Expert Systems with Applications (2021)
  • Chen M.-Y. et al. Modeling public mood and emotion: Stock market trend prediction with anticipatory computing approach. Computers in Human Behavior (2019)
  • Yin C. et al. Reposting negative information on microblogs: Do personality traits matter? Information Processing & Management (2020)
  • Poria S. et al. Ensemble application of convolutional neural networks and multiple kernel learning for multimodal sentiment analysis. Neurocomputing (2017)
  • Kudugunta S. et al. Deep neural networks for bot detection. Information Sciences (2018)
  • Zhang D. et al. Modeling and simulating of reservoir operation using the artificial neural network, support vector regression, deep learning algorithm. Journal of Hydrology (2018)
  • Li Q. et al. Mining opinion summarizations using convolutional neural networks in Chinese microblogging systems. Knowledge-Based Systems (2016)
  • Rezaeinia S.M. et al. Sentiment analysis based on improved pre-trained word embeddings. Expert Systems with Applications (2019)
  • J A.K. et al. Aspect-based opinion ranking framework for product reviews using a Spearman’s rank correlation coefficient method. Information Sciences (2018)
  • Sun S. et al. A review of natural language processing techniques for opinion mining systems. Information Fusion (2017)
  • Ouertatani A. et al. Argued opinion extraction from festivals and cultural events on Twitter. Procedia Computer Science (2018)
  • Chen L. et al. Two-layer fuzzy multiple random forest for speech emotion recognition in human–robot interaction. Information Sciences (2020)
  • Chen T. et al. Emotion recognition using empirical mode decomposition and approximation entropy. Computers & Electrical Engineering (2018)
  • Chen T. et al. Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN. Expert Systems with Applications (2017)
  • Song M. et al. Attention-based long short-term memory network using sentiment lexicon embedding for aspect-level sentiment analysis in Korean. Information Processing & Management (2019)
  • Hopp T. et al. Does negative campaign advertising stimulate uncivil communication on social media? Measuring audience response using big data. Computers in Human Behavior (2017)
  • Mayshak R. et al. The impact of negative online social network content on expressed sentiment, executive function, and working memory. Computers in Human Behavior (2016)
  • Taran S. et al. Emotion recognition from single-channel EEG signals using a two-stage correlation and instantaneous frequency-based filtering method. Computer Methods and Programs in Biomedicine (2019)
  • Xiao F. et al. DAA: Dual LSTMs with adaptive attention for image captioning. Neurocomputing (2019)
  • Geng Z. et al. Semantic relation extraction using sequential and tree-structured LSTM with attention. Information Sciences (2020)
  • Ma R. et al. Feature-based compositing memory networks for aspect-based sentiment classification in social internet of things. Future Generation Computer Systems (2019)
  • Yang C. et al. Aspect-based sentiment analysis with alternating coattention networks. Information Processing & Management (2019)
  • Ruwa N. et al. Triple attention network for sentimental visual question answering. Computer Vision and Image Understanding (2019)
  • Shuang K. et al. AELA-DLSTMs: Attention-enabled and location-aware double LSTMs for aspect-level sentiment classification. Neurocomputing (2019)
  • Giatsoglou M. et al. Sentiment analysis leveraging emotions and word embeddings. Expert Systems with Applications (2017)
  • Symeonidis S. et al. A comparative evaluation of pre-processing techniques and their interactions for Twitter sentiment analysis. Expert Systems with Applications (2018)

    Peng Wu is a Professor in the School of Economics and Management, Nanjing University of Science and Technology. His research mainly involves online users’ behavior analysis and sentiment analysis.

    Prof. Xiaotong Li is a Professor of Information Systems at the College of Business, University of Alabama in Huntsville. He has served on the editorial board of Marketing Science, and he is an associate editor of Electronic Commerce Research and Applications.

    Chen Ling is an Associate Professor in the School of Economics and Management, Nanjing University of Science and Technology. His research mainly involves crowd simulation and sentiment analysis.

    Shengchun Ding is a Professor in the School of Economics and Management, Nanjing University of Science and Technology. Her research mainly involves big data analysis, deep learning, and natural language processing.

    Si Shen is an Associate Professor in the School of Economics and Management, Nanjing University of Science and Technology. Her research mainly involves big data analysis and machine learning.
