Short text opinion detection using ensemble of classifiers and semantic indexing

doi:10.1016/j.eswa.2016.06.025

Expert Systems with Applications

Volume 62, 15 November 2016, Pages 243-249

https://doi.org/10.1016/j.eswa.2016.06.025

Highlights

• An ensemble system to perform opinion detection in short text messages is proposed.
• The model combines state-of-the-art classification methods and NLP techniques.
• The proposed ensemble can improve the performance of most text categorization tasks.
• Experimental results on nine real, public English datasets are reported.
• The proposed method is statistically superior to the compared approaches.

Abstract

The popularity of social networks has attracted the attention of companies. The growing number of connected users and messages posted per day makes these environments fruitful for detecting needs, tendencies, opinions, and other interesting information that can feed marketing and sales departments. However, most social networks impose a size limit on messages, which leads users to compact them by using abbreviations, slang, and symbols. As a consequence, these problems impact the sample representation and degrade classification performance. To address this, we propose an ensemble system that finds the best way to combine state-of-the-art text processing approaches, such as text normalization and semantic indexing techniques, with traditional classification methods to automatically detect opinion in short text messages. Our experiments were carefully designed to ensure statistically sound results, which indicate that the proposed system achieves higher performance than the individual established classifiers.

Introduction

Digital inclusion has increased the number of Internet users, which has recently been responsible for much of the success of social networks. In such applications, users can share and read information and perform many other activities. Among the shared information, users often post opinions and rate products. According to a press release from ComScore¹, online reviews have a significant impact on purchasing behavior. Consequently, companies have noticed how important it is to analyze huge amounts of messages quickly to discover the tendencies and opinions of users.

The use of classification methods in opinion detection has been presented in several works (Denecke, 2008, Luo, Zeng, Duan, 2016, Pang, Lee, Vaithyanathan, 2002). However, in most cases, it is still very difficult to identify the polarity of text samples extracted from social networks because, besides being very short, they are often rife with idioms, slang, symbols, emoticons and abbreviations, which make even tokenization a challenging task (Denecke, 2008).

Noise in text messages can appear in different ways. The following phrase offers an example: “dz ne1 knw h2 ripair dis terrible LPT? :(”. It contains the misspelled words “dz”, “ne1”, “knw”, “h2” and “dis”, the abbreviation “LPT” and the symbol “:(”. To transcribe such a phrase into proper English, a Lingo dictionary² would be needed along with a standard dictionary, associating each slang term, symbol or abbreviation with a correct term. After a text normalization step, the input phrase would be translated to “Does anyone know how to repair this terrible printer? :(”, and the symbol at the end would mean that the author has a sad or dissatisfied sentiment about the product.
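The normalization step for this example phrase can be sketched as a dictionary lookup. This is a minimal illustration, not the authors' implementation: the `LINGO` table is a hypothetical toy lexicon built only from the tokens of the example, and the tokenizer and `normalize` function are assumptions made for the sketch.

```python
import re

# Hypothetical toy lexicon covering only the tokens of the example phrase;
# it stands in for a real Lingo dictionary and is not the authors' resource.
LINGO = {
    "dz": "does",
    "ne1": "anyone",
    "knw": "know",
    "h2": "how to",
    "ripair": "repair",
    "dis": "this",
    "lpt": "printer",
}

# A token is either a run of word characters or a run of symbols,
# so an emoticon such as ":(" survives as a single token.
TOKEN = re.compile(r"\w+|[^\w\s]+")


def normalize(message, lexicon=LINGO):
    """Replace each known noisy token with its standard form."""
    tokens = TOKEN.findall(message)
    replaced = [lexicon.get(tok.lower(), tok) for tok in tokens]
    text = " ".join(replaced)
    # Reattach sentence punctuation to the preceding word.
    text = re.sub(r"\s+([?.!,;])", r"\1", text)
    # Capitalize the first character of the normalized message.
    return text[:1].upper() + text[1:]


print(normalize("dz ne1 knw h2 ripair dis terrible LPT? :("))
```

Tokens missing from the lexicon, such as “terrible” and the emoticon, pass through unchanged, so the emoticon remains available as a sentiment cue for later stages.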

In addition to noisy messages, there are other well-known problems described in the literature, such as sarcasm, words that are ambiguous in context (polysemy) and different words with the same meaning (synonymy). When such cases are properly handled, better results can be achieved (Mostafa, 2013, Pang, Lee, 2008).

Both the synonymy and polysemy problems can have their effects minimized by semantic indexing for word sense disambiguation (Navigli, Ponzetto, 2012, Taieb, Aouicha, Hamadou, 2013). The semantic dictionaries used for this purpose associate meanings to words by finding similar terms given the context of the message. In general, their effectiveness relies on the quality of the terms extracted from the samples. However, common natural language processing tools may not be suitable for short texts, which demand proper tools for this context (Bontcheva, Derczynski, Funk, Greenwood, Maynard, Aswani, 2013, Maynard, Bontcheva, Rout, 2012).

Even after dealing with the problems of polysemy and synonymy, the resulting terms may not be enough to detect opinion because the original messages are usually very short. Some recent works recommend employing ontology models to analyze each term and find associated new terms (with the same meaning) to enrich the original sample with more features (Kontopoulos, Berberidis, Dergiades, & Bassiliades, 2013).

Terms obtained through ontology models and semantic indexing (a process called expansion) are more representative for classification methods if they can be related to an individual polarity. Accordingly, recent works also show that lexical dictionaries can enhance classification performance (Mostafa, 2013, Nastase, Strube, 2013).

Original samples can be processed by different text processing techniques, and the resulting text samples become inputs to classification methods. Since there are several feature processing techniques and different established classification methods, an ensemble system that naturally integrates these approaches could overcome individual drawbacks, reach a better hypothesis and consequently enhance the overall prediction performance. Ensemble strategies are commonly applied in the literature to combine the outputs of several classifiers into an integrated final output (Dietterich, 2000, Wang, Sun, Ma, Xu, Gu, 2014, Xia, Zong, Li, 2011).
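One common way to combine the outputs of several classifiers into a single output is majority voting. The sketch below illustrates that general idea only: the text does not state which combination rule the proposed system uses, so the rule, the `majority_vote` function and the toy labels are all assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    `predictions` holds one list of labels per classifier, aligned by sample.
    When two labels receive the same number of votes, the label that appears
    first among the votes (that is, from the earliest classifier) is returned.
    """
    combined = []
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Illustrative outputs of three hypothetical classifiers on three messages.
outputs = [
    ["pos", "neg", "pos"],
    ["pos", "pos", "neg"],
    ["neg", "neg", "neg"],
]
print(majority_vote(outputs))
```

Using an odd number of classifiers on a two-class problem avoids ties altogether, which keeps the outcome independent of the order in which the classifiers are listed.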

In this scenario, we have designed and evaluated an ensemble system to perform opinion detection in short text messages extracted from social networks. Our model combines text normalization methods with state-of-the-art natural language processing techniques to improve the quality of the extracted features, which are then used by established machine learning approaches. The results demonstrate that our proposal clearly outperforms established methods available in the literature.

This paper is organized as follows: Section 2 presents the most relevant related work. Text normalization and semantic indexing techniques are described in Section 3. Section 4 presents the proposed ensemble system. Experimental methodology is described in Section 5. Section 6 presents the achieved results and main conclusions are provided in Section 7.

Section snippets

Related work

Opinion detection is the task of analyzing huge amounts of information from thousands (or millions) of users to detect the majority opinion about any subject under discussion. Understanding and reacting quickly to such opinions allows companies to guide their marketing and aids decision making (Mostafa, 2013, Pang, Lee, Vaithyanathan, 2002). According to results available in the literature, this task is far from being properly solved for many reasons, such as difficulties in dealing with

Text processing techniques

In scenarios where messages are short and rife with idioms, symbols and abbreviations, employing a simple bag of words is generally not enough to achieve satisfactory results (Gabrilovich & Markovitch, 2005). Often, a lexical normalization step is needed to translate obfuscated messages into standard English. Next, as messages can be very short, the number of features may not be enough to lead to good performance, mainly when problems of synonymy or polysemy are frequent. In this way,

The ensemble system

The proposed ensemble system is divided into two distinct stages: model selection and classification.

In model selection, the first step is to perform a grid search to set the main parameters of the methods that compose the system. As this process is time-consuming, only a stratified random sample of the original dataset is used. The next step is to employ text processing techniques ( $E_{1}, \dots, E_{k}$ ) to normalize and expand the original input samples. All possible merging rules are used and
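Drawing a stratified random sample for the grid search can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the sampling fraction, the seed and the `stratified_sample` function are hypothetical, since the text does not specify them.

```python
import random
from collections import defaultdict


def stratified_sample(labels, fraction, seed=0):
    """Return sorted sample indices that preserve the class proportions.

    Each class contributes round(fraction * class_size) indices, with at
    least one index per class so that no class disappears from the sample.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    chosen = []
    for indices in by_class.values():
        size = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, size))
    return sorted(chosen)


# Illustrative dataset: 80 positive and 20 negative messages.
labels = ["pos"] * 80 + ["neg"] * 20
subset = stratified_sample(labels, fraction=0.25)
print(len(subset), sum(labels[i] == "neg" for i in subset))
```

Because each class is sampled separately, the 80/20 class ratio of the full set carries over to the subset, which keeps the parameter search representative of the original class balance.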

Methodology

To lend credibility to the results and make the experiments reproducible, we detail the experimental methodology as follows.

Results

Table 5 presents the average F-measure and standard deviation achieved by each evaluated classifier over each dataset. Bold values indicate the best scores.

The results indicate that, under the same conditions and methodology, the proposed ensemble system clearly outperformed every individual classifier evaluated. However, to ensure that the results were not obtained by chance, we have performed a statistical analysis using the non-parametric Friedman test (
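The Friedman test ranks the methods on each dataset and checks whether their mean ranks differ more than chance would allow. The sketch below computes the standard statistic $\chi^2_F = \frac{12N}{k(k+1)}\left[\sum_j R_j^2 - \frac{k(k+1)^2}{4}\right]$ for $N$ datasets and $k$ methods, where $R_j$ is the mean rank of method $j$. It omits the tie correction, the function names are hypothetical, and the scores are invented for illustration; they are not the values of Table 5.

```python
def average_ranks(scores):
    """Rank the scores of one dataset (1 = highest); ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(table):
    """Return (chi-square statistic, mean ranks) for a datasets-by-methods table."""
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    mean_ranks = [total / n for total in rank_sums]
    squared = sum(rank * rank for rank in mean_ranks)
    chi2 = 12 * n / (k * (k + 1)) * (squared - k * (k + 1) ** 2 / 4)
    return chi2, mean_ranks


# Invented F-measures: rows are datasets, columns are three methods.
scores = [
    [0.90, 0.85, 0.80],
    [0.88, 0.84, 0.79],
    [0.91, 0.86, 0.82],
    [0.87, 0.83, 0.78],
]
print(friedman_statistic(scores))
```

The statistic is compared against a chi-square distribution with $k - 1$ degrees of freedom; a value above the critical threshold rejects the hypothesis that all methods perform equally.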

Conclusion and future work

The task of automatically detecting opinion in short messages posted on social networks remains a real challenge. Two main issues make it difficult to apply established classification algorithms in this field of research: the low number of features that can be extracted per message and the fact that messages are filled with idioms, abbreviations, and symbols.

To fill these gaps, we proposed an ensemble system that automatically combines the most recommended

Acknowledgment

The authors would like to thank the Brazilian agencies FAPESP (grant 2014/01237-7) and Capes for their financial support.

References (40)

V. Nastase et al.
Transforming Wikipedia into a large scale multilingual concept network
Artificial Intelligence
(2013)
R. Navigli et al.
An experimental study of graph connectivity for unsupervised word sense disambiguation
IEEE Transactions on Pattern Analysis and Machine Intelligence
(2010)
R. Navigli et al.
Multilingual WSD with just a few lines of code: the BabelNet API
Proc. of 2012 ACL
(2012)
P.F. Nemenyi
Distribution-free multiple comparisons
(1963)
J. Quinlan
C4.5: Programs for machine learning
(1993)
H. Saif et al.
Evaluation datasets for twitter sentiment analysis: a survey and a new dataset, the sts-gold
1st international workshop on emotion and sentiment in social and expressive media: Approaches and perspectives from AI (ESSEM 2013)
(2013)
A. Agarwal et al.
Sentiment analysis of twitter data
Proceedings of 2011 LSM
(2011)
E. Agirre et al.
Word sense disambiguation
(2006)
D. Aha et al.
Instance-based learning algorithms
Machine Learning
(1991)
T. Almeida et al.
Spam filtering: how the dimensionality reduction affects the accuracy of naive bayes classifiers
JISA
(2011)

Analytics, S. (2011). Dataset - Twitter sentiment. http://www.sananalytics.com/lab/twitter-sentiment/. [Online;...

K. Bontcheva et al.
Twitie: an open-source information extraction pipeline for microblog text
Proceedings of 2013 RANLP
(2013)
P. Cook et al.
An unsupervised model for text message normalization
Proceedings of the 2009 CALC
(2009)
C. Cortes et al.
Support-vector networks
Machine Learning
(1995)
K. Denecke
Using SentiWordNet for multilingual sentiment analysis
Proceedings of 2008 ICDEW
(2008)
T.G. Dietterich
Ensemble methods in machine learning
Proceedings of 2000 MCS
(2000)
Y. Freund et al.
Experiments with a new boosting algorithm
Proceedings of the 13th ICML
(1996)
M. Friedman
A comparison of alternative tests of significance for the problem of m rankings
The Annals of Mathematical Statistics
(1940)
E. Gabrilovich et al.
Feature generation for text categorization using world knowledge
Proceedings of the 19th IJCAI
(2005)
A. Go et al.
Twitter sentiment classification using distant supervision
Technical Report
(2009)
