A proposed scheme for sentiment analysis: Effective feature reduction based on statistical information of SentiWordNet

Sajjad Tofighy (Department of Computer Science and Engineering and Information Technology, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran)
Seyed Mostafa Fakhrahmad (School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran)

Kybernetes

ISSN: 0368-492X

Article publication date: 5 March 2018

Issue publication date: 2 May 2018

Abstract

Purpose

This paper proposes a statistical and context-aware feature reduction algorithm that improves sentiment classification accuracy. Classifying reviews of different granularities into two classes, positive and negative polarity, is among the objectives of sentiment analysis. Feature engineering is one of the major issues in sentiment analysis because it severely affects both the time complexity and the accuracy of sentiment classification.

Design/methodology/approach

In this paper, a feature reduction method is proposed that uses context-based knowledge as well as statistical knowledge about synsets. To this end, a one-dimensional representation proposed for SentiWordNet is used to calculate statistical knowledge, namely the polarity concentration and variation tendency of each synset. Feature reduction involves two phases. In the first phase, features that satisfy both semantic and statistical similarity conditions are placed in the same cluster. In the second phase, the features are ranked and those given lower ranks are eliminated. The experiments are conducted with support vector machine (SVM), naive Bayes (NB), decision tree (DT) and k-nearest neighbors (KNN) algorithms, which classify the vectors of unigram and bigram features into two classes of positive or negative sentiment.
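The two-phase reduction can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define polarity concentration or variation tendency, so the feature statistics, the semantic group labels, the tolerance `tol`, the ranking criterion (absolute polarity concentration) and all function names below are hypothetical stand-ins chosen only to show the cluster-then-rank-then-eliminate shape of the method. The classification step with SVM, NB, DT and KNN is outside the sketch.

```python
# Hypothetical per-feature statistics; the values and group labels are
# invented for illustration and do not come from SentiWordNet.
FEATURES = {
    # feature: (semantic_group, polarity_concentration, variation_tendency)
    "good": ("quality", 0.80, 0.10),
    "great": ("quality", 0.85, 0.12),
    "fine": ("quality", 0.30, 0.40),
    "bad": ("defect", -0.75, 0.15),
    "poor": ("defect", -0.70, 0.18),
    "table": ("object", 0.02, 0.05),
}


def cluster_features(features, tol=0.1):
    """Phase 1 (assumed form): put a feature in an existing cluster when it
    shares the cluster's semantic group and both statistics lie within `tol`
    of the cluster's first member; otherwise start a new cluster."""
    clusters = []
    for name, (group, conc, var) in features.items():
        for cluster in clusters:
            g, c, v = cluster["key"]
            if g == group and abs(c - conc) <= tol and abs(v - var) <= tol:
                cluster["members"].append(name)
                break
        else:
            clusters.append({"key": (group, conc, var), "members": [name]})
    return clusters


def rank_and_prune(clusters, keep):
    """Phase 2 (assumed criterion): rank clusters by the absolute polarity
    concentration of their first member and keep only the top `keep`."""
    ranked = sorted(clusters, key=lambda c: abs(c["key"][1]), reverse=True)
    return ranked[:keep]


def reduce_features(features, tol=0.1, keep=2):
    """Map every surviving feature to its cluster representative."""
    kept = rank_and_prune(cluster_features(features, tol), keep)
    return {m: c["members"][0] for c in kept for m in c["members"]}


if __name__ == "__main__":
    print(reduce_features(FEATURES))
```

In this toy setting, features that end up in the same cluster share one dimension of the feature vector, and features in eliminated clusters are dropped before the vectors are passed to a classifier.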

Findings

The results show that the applied clustering algorithm reduces the number of SentiWordNet synsets to less than half, which in turn reduces the size of the feature vector by less than half. In addition, the accuracy of sentiment classification improves by at least 1.5 per cent.

Originality/value

The presented feature reduction method is the first to use synset clustering for feature reduction. The feature reduction algorithm in this paper first aggregates similar features into clusters and then eliminates unsatisfactory clusters.

Citation

Tofighy, S. and Fakhrahmad, S.M. (2018), "A proposed scheme for sentiment analysis: Effective feature reduction based on statistical information of SentiWordNet", Kybernetes, Vol. 47 No. 5, pp. 957-984. https://doi.org/10.1108/K-06-2017-0229

Publisher

Emerald Publishing Limited

Copyright © 2018, Emerald Publishing Limited
