Fuzzy rule based unsupervised sentiment analysis from social media posts

https://doi.org/10.1016/j.eswa.2019.112834

Highlights

  • A fuzzy rule-based approach is proposed for sentiment analysis of tweets.

  • Formulation of nine fuzzy rules to compute sentiment of each tweet.

  • The proposed unsupervised approach is suitable for any sentiment lexicon.

  • The approach can be adapted to both bipolar and tripolar sentiment analysis tasks.

Abstract

In this paper, we compute the sentiment of social media posts using a novel set of fuzzy rules, evaluated with multiple lexicons and datasets. The proposed fuzzy system integrates Natural Language Processing techniques and Word Sense Disambiguation with a novel unsupervised system of nine fuzzy rules that classifies each post into a positive, negative or neutral sentiment class. We perform a comparative analysis of our method on nine public Twitter datasets, with three sentiment lexicons, against four state-of-the-art approaches for unsupervised Sentiment Analysis and one state-of-the-art method for supervised machine learning. Traditionally, Sentiment Analysis of Twitter data is performed using a single lexicon. Our results can help researchers choose which lexicon is best suited to social media. The fusion of fuzzy logic with lexicons for sentiment classification provides a new paradigm in Sentiment Analysis. Our method can be adapted to any lexicon and any dataset (two-class or three-class sentiment). In experiments on benchmark datasets, our approach yields higher performance than the state-of-the-art.

Introduction

Sentiment Analysis is a challenging research problem, especially on social media. Users can freely express their views, opinions and feelings on trending events, topics, etc. via social media posts. These posts need to be analysed to determine the sentiment they convey. Sentiment Analysis, also referred to as emotion AI, involves analyzing views expressed in written text so as to understand and gauge human emotions. Social media allows users world-wide to connect and interact with each other and to express their opinions on general topics. Social Sentiment Analysis can be used to improve customer service and marketing, and also serves as a measure of social media performance. In recent years, the impact of social media websites on daily life has become so considerable that even information on large and small incidents or disasters is gathered via social media sites. Users portray not only the content of events but also their feelings about them (Yoo, Song, & Jeong, 2018). The automated extraction of sentiment from these posts, and their classification into different polarities (positive, negative or neutral), has received extensive attention from researchers during the past decade.

Twitter is one of the most popular social media platforms, with a respectable 255 million monthly active users. Some of the challenges in analysing tweets are the use of informal language, short forms and abbreviations, and the heavy use of emoticons and slang. Twitter is a microblogging service, and the limited length of tweets makes it difficult to compute their polarity. In this paper, we apply a fuzzy rule-based unsupervised approach that processes tweets in such a way as to overcome these challenges. We have implemented our approach on multiple public Twitter datasets using multiple lexicons. The proposed fuzzy rule-based approach can compute sentiment for both two-class and three-class sentiment datasets. Two-class datasets have only positive and negative sentiment, while three-class datasets have neutral sentiment as well.

Fuzzy logic is an extension of deterministic logic in which a truth value can range from 0 to 1 rather than being restricted to a binary value. The primary aim of the theory of fuzzy logic is to turn a black-and-white problem into a grey one (Zadeh, 2015). In the field of artificial intelligence, possibly the easiest way to represent human knowledge is to transform it into natural language expressions in the form of IF-THEN rules. These rules are based on natural language representations and models, which are themselves based on fuzzy sets and fuzzy logic (Ross, 2010). Classification systems based on fuzzy rules are powerful and widely acknowledged tools for pattern recognition and classification. Because of their fuzziness, these systems can handle uncertainty, ambiguity and vagueness very efficiently (López, Río, Benítez, & Herrera, 2015). We have used the concept of a fuzzy rule-based system to create our own nine fuzzy rules that determine the sentiment of each tweet.
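As a minimal illustration of graded truth values and a single IF-THEN rule, the following Python sketch uses the common minimum, maximum and complement operators. The operator choice, the function names and the example degrees 0.7 and 0.2 are assumptions made for illustration; they are not taken from the paper.

```python
def fuzzy_and(a: float, b: float) -> float:
    """Conjunction of two truth degrees in [0, 1] using the minimum operator."""
    return min(a, b)


def fuzzy_or(a: float, b: float) -> float:
    """Disjunction of two truth degrees in [0, 1] using the maximum operator."""
    return max(a, b)


def fuzzy_not(a: float) -> float:
    """Complement of a truth degree in [0, 1]."""
    return 1.0 - a


# Hypothetical degrees to which a post is "positive" and "negative".
positive = 0.7
negative = 0.2

# IF the post is positive AND NOT negative THEN its sentiment is positive.
# The rule fires to the degree that its whole antecedent is true.
firing_strength = fuzzy_and(positive, fuzzy_not(negative))
print(firing_strength)
```

Unlike a binary rule, which either fires or does not, this rule fires to a degree, so a later stage can weigh it against competing rules.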

The main contributions of this paper are: i) the formulation of nine fuzzy rules to compute the sentiment of each tweet; ii) an unsupervised approach that is suitable for any sentiment lexicon; iii) suitability for any dataset (two-class or three-class); and iv) a comparison of our proposed rule-based approach for Sentiment Analysis with four state-of-the-art methods for unsupervised sentiment classification and one state-of-the-art method for supervised machine learning. The rest of the paper is organized as follows. Section 2 describes the state-of-the-art in Sentiment Analysis from social media, while our proposed fuzzy rule-based system is presented in Section 3. Section 4 covers the experimental setup and implementation. Results are discussed in Section 5. The overall conclusions are drawn in Section 6.

Section snippets

Related work

In recent years, a lot of progress has been achieved in the task of sentiment classification of social media posts. Among social media posts, tweets are the most popular. Most researchers have classified tweets according to the sentiment they contain. The different methods for Sentiment Analysis of social media posts can be classified as supervised, semi-supervised and unsupervised approaches. In social media, to keep track of user opinion behavior, historical information about users

Proposed fuzzy rule system for sentiment analysis

In this section, we present the details of the proposed fuzzy logic-based model. Fig. 1 describes the framework of a fuzzy logic-based model. Fuzzification is the process of making a crisp quantity fuzzy. The crisp (real-valued) inputs are mapped to fuzzy sets whose elements have a degree of membership computed using fuzzy membership functions (MF). In this work, we select the triangular fuzzy membership function because it is easy to understand and commonly used.
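The fuzzification step can be sketched with a triangular membership function in Python. The function name, the three sets Low, Medium and High, and their parameters on the interval [0, 1] are assumptions chosen for illustration; the paper's actual parameters are not given in this excerpt.

```python
def triangular_mf(x: float, a: float, b: float, c: float) -> float:
    """Membership of x in a triangle with feet a and c and peak b (a <= b <= c).

    Shoulder shapes are allowed by setting a == b or b == c.
    """
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge


# Assumed fuzzy sets over a score range of [0, 1].
FUZZY_SETS = {
    "Low": (0.0, 0.0, 0.5),
    "Medium": (0.0, 0.5, 1.0),
    "High": (0.5, 1.0, 1.0),
}


def fuzzify(score: float) -> dict:
    """Map a crisp score to its degree of membership in each fuzzy set."""
    return {name: triangular_mf(score, *params) for name, params in FUZZY_SETS.items()}


memberships = fuzzify(0.1)
# Under these assumed sets: roughly Low 0.8, Medium 0.2, High 0.0.
print({name: round(value, 3) for name, value in memberships.items()})
```

A crisp score therefore belongs to several sets at once, each to a different degree, which is what lets the rules downstream combine partial evidence.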

In the field of artificial

Experimental setup and implementation

This section reports the experimental setup and implementation of the proposed fuzzy rule-based classifier for Sentiment Analysis. We have implemented our fuzzy rule-based system in Python version 3.6.5. The system has an Intel Core i5 processor, a 64-bit operating system and 8GB RAM. The code containing the implementation of our work is available at: https://github.com/SrishtiVashishtha. Most papers use the Twitter API to extract tweets, but we have used publicly available datasets. In this

Processing of a single tweet

In this section, we present how a single tweet is processed by our proposed fuzzy rule-based unsupervised Sentiment Analysis model. The processing of a sample tweet from the Nuclear Twitter Dataset (2019) using the VADER lexicon (Gilbert & Hutto, 2014) is shown in Fig. 5. Initially, text preprocessing is done.

Then we apply the VADER lexicon's polarity_scores(a) method, which outputs a positive score (TweetPos) of 0.1 and a negative score (TweetNeg) of 0.1 for the tweet. The fuzzy sets Low,
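To make the flow from lexicon scores to a sentiment class concrete, the following Python sketch runs the scores TweetPos = 0.1 and TweetNeg = 0.1 through a small Mamdani-style system: fuzzification, nine rules, aggregation and centroid defuzzification. Everything beyond the two scores is an assumption for illustration: the set parameters, the 3x3 rule table, the output universe [0, 1], the class thresholds at 1/3 and 2/3, and all function names are hypothetical and are not taken from the paper.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b (shoulders allowed)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)


# Assumed input sets for both scores, and assumed output sets, all on [0, 1].
INPUT_SETS = {"low": (0.0, 0.0, 0.5), "medium": (0.0, 0.5, 1.0), "high": (0.5, 1.0, 1.0)}
OUTPUT_SETS = {"negative": (0.0, 0.0, 0.5), "neutral": (0.0, 0.5, 1.0), "positive": (0.5, 1.0, 1.0)}

# Hypothetical rule table: (TweetPos level, TweetNeg level) -> output class.
RULES = {
    ("low", "low"): "neutral",
    ("medium", "low"): "positive",
    ("high", "low"): "positive",
    ("low", "medium"): "negative",
    ("medium", "medium"): "neutral",
    ("high", "medium"): "positive",
    ("low", "high"): "negative",
    ("medium", "high"): "negative",
    ("high", "high"): "neutral",
}


def classify(tweet_pos, tweet_neg, steps=100):
    """Return (class label, defuzzified score) for one tweet's lexicon scores."""
    pos = {name: tri(tweet_pos, *p) for name, p in INPUT_SETS.items()}
    neg = {name: tri(tweet_neg, *p) for name, p in INPUT_SETS.items()}

    # Each rule fires with the minimum of its two antecedent memberships;
    # rules sharing an output class are combined with the maximum.
    strength = {label: 0.0 for label in OUTPUT_SETS}
    for (pos_level, neg_level), label in RULES.items():
        fired = min(pos[pos_level], neg[neg_level])
        strength[label] = max(strength[label], fired)

    # Clip each output set at its strength, aggregate with max,
    # then take the centroid over a uniform grid on [0, 1].
    numerator = denominator = 0.0
    for i in range(steps + 1):
        z = i / steps
        mu = max(min(strength[label], tri(z, *OUTPUT_SETS[label])) for label in OUTPUT_SETS)
        numerator += z * mu
        denominator += mu
    score = numerator / denominator if denominator else 0.5

    if score < 1 / 3:
        return "negative", score
    if score > 2 / 3:
        return "positive", score
    return "neutral", score


label, score = classify(0.1, 0.1)
print(label, round(score, 3))
```

Under these assumed parameters, equal positive and negative scores produce symmetric rule strengths, so the centroid falls in the middle of the output range and the tweet is labelled neutral. For a two-class dataset, one option would be to drop the neutral band and split at the midpoint, although that variant is not shown here.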

Conclusion

In this paper, we have proposed a fuzzy rule-based approach for Sentiment Analysis of social media posts, specifically for Twitter datasets. The novelty of this paper lies in i) the formulation of nine fuzzy rules to evaluate the sentiment class of tweets, ii) an unsupervised approach that can be adapted to any lexicon and iii) to any dataset (two-class or three-class). Two-class datasets have positive and negative sentiment classes while three-class datasets have an additional neutral

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement

Srishti Vashishtha: Conceptualization, Formal analysis, Methodology, Validation, Writing - original draft, Writing - review & editing, Software, Investigation, Resources, Data curation, Visualization. Seba Susan: Conceptualization, Methodology, Investigation, Writing - review & editing, Supervision, Project administration, Funding acquisition.

References (75)

  • L.A. Zadeh

    Fuzzy logic: A personal perspective

    Fuzzy Sets and Systems

    (2015)
  • A. Agarwal et al.

    Sentiment analysis of twitter data

  • Apple Twitter Dataset, Retrieved Jan 31, (2019). from...
  • S. Baccianella et al.

    Sentiwordnet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining

  • Y. Bae et al.

    Sentiment analysis of twitter audiences: Measuring the positive or negative influence of popular twitterers

    Journal of the American Society for Information Science and Technology

    (2012)
  • S. Banerjee et al.

    An adapted Lesk algorithm for word sense disambiguation using WordNet

  • P. Barnaghi et al.

    Opinion mining and sentiment polarity on twitter and correlation between events and sentiment

  • C. Baziotis et al.

    Datastories at semeval-2017 task 4: Deep lstm with attention for message-level and topic-based sentiment analysis

  • S. Bird et al.

    Natural language processing with Python: Analyzing text with the natural language toolkit

    (2009)
  • D.C. Cavalcanti et al.

    Good to be bad? Distinguishing between positive and negative citations in scientific impact

  • Y.C. Chang et al.

    Fuzzy interpolative reasoning for sparse fuzzy-rule-based systems based on the areas of fuzzy sets

    IEEE Transactions on Fuzzy Systems

    (2008)
  • Cliche, M. (2017). BB_twtr at SemEval-2017 task 4: Twitter sentiment analysis with CNNs and LSTMs. arXiv preprint...
  • L.C. Duţu et al.

    A fast and accurate rule-base generation method for Mamdani fuzzy systems

    IEEE Transactions on Fuzzy Systems

    (2018)
  • G. Gautam et al.

    Sentiment analysis of twitter data using machine learning approaches and semantic analysis

  • A. Giachanou et al.

    Like it or not: A survey of twitter sentiment analysis methods

    ACM Computing Surveys (CSUR)

    (2016)
  • E. Gilbert et al.

    VADER: A parsimonious rule-based model for sentiment analysis of social media text

  • Go, A., Bhayani, R., & Huang, L. (2009). Twitter sentiment classification using distant supervision. CS224N Project...
  • B. Gokulakrishnan et al.

    Opinion mining and sentiment analysis on a twitter data stream

  • H. Hamdan et al.

    Experiments with DBpedia, WordNet and SentiWordNet as resources for sentiment analysis in micro-blogging

  • Haque, M. (2014). Sentiment analysis by using fuzzy logic. arXiv preprint...
  • A. Hassan et al.

    Twitter sentiment analysis: A bootstrap ensemble framework

  • H. Hellendoorn et al.

    Defuzzification in fuzzy controllers

    Journal of Intelligent & Fuzzy Systems

    (1993)
  • H. Ishibuchi et al.

    Effect of rule weights in fuzzy rule-based classification systems

    IEEE Transactions on Fuzzy Systems

    (2001)
  • H. Ishibuchi et al.

    Rule weight specification in fuzzy rule-based classification systems

    IEEE Transactions on Fuzzy Systems

    (2005)
  • A.P. Jain et al.

    Sentiments analysis of Twitter data using data mining

  • J.S.R. Jang et al.

    Neuro-fuzzy and soft computing-a computational approach to learning and machine intelligence [Book Review]

    IEEE Transactions on Automatic Control

    (1997)
  • C. Jefferson et al.

    Fuzzy approach for sentiment analysis
