Effective Use of Evaluation Measures for the Validation of Best Classifier in Urdu Sentiment Analysis

  • Published in: Cognitive Computation

Abstract

Sentiment analysis (SA) can support decision making, drawing conclusions, or recommending appropriate solutions for business, political, and other problems. At the same time, reliable ways are required to verify the results obtained from SA. Within biologically inspired approaches to machine learning, obtaining reliable results is challenging but important, and properly verified and validated results are preferred by the research community. This research pursues reliable results by applying three standard evaluation measures. First, SA of Urdu is performed. After collection and annotation of the data, five classifiers, i.e., PART, Naive Bayes Multinomial Text, LibSVM (support vector machine), decision tree (J48), and k-nearest neighbor (KNN, implemented as IBk), are employed using Weka. After 10-fold cross-validation, the three top classifiers, i.e., LibSVM, J48, and IBk, are selected on the basis of high accuracy, precision, recall, and F-measure; among these, IBk emerges as the best. To verify this result, the labels of the sentences (positive, negative, or neutral) are predicted using separate training and test data, followed by the application of three standard evaluation measures: McNemar's test, the kappa statistic, and root mean squared error. IBk performs much better than the other two classifiers. A number of steps are taken to make this result more reliable, including the combined use of the three evaluation measures to obtain a confirmed and validated result, which is the main contribution of this research. It is concluded with confidence that IBk is the best classifier in this case.
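The three validation measures named above can be sketched in plain Python. This is a minimal illustration, not the authors' Weka pipeline: the continuity-corrected form of McNemar's statistic and the −1/0/+1 numeric mapping of the sentiment labels for RMSE are assumptions made for the sketch.

```python
from math import sqrt

def mcnemar_chi2(gold, pred_a, pred_b):
    """McNemar's chi-squared statistic (with continuity correction) for
    two classifiers evaluated on the same labeled sentences."""
    # b: A correct, B wrong; c: A wrong, B correct (the discordant pairs)
    b = sum(1 for g, a, p in zip(gold, pred_a, pred_b) if a == g and p != g)
    c = sum(1 for g, a, p in zip(gold, pred_a, pred_b) if a != g and p == g)
    if b + c == 0:
        return 0.0
    return (abs(b - c) - 1) ** 2 / (b + c)

def cohens_kappa(gold, pred):
    """Kappa statistic: agreement between predictions and gold labels,
    corrected for the agreement expected by chance."""
    n = len(gold)
    labels = set(gold) | set(pred)
    observed = sum(1 for g, p in zip(gold, pred) if g == p) / n
    expected = sum((gold.count(l) / n) * (pred.count(l) / n) for l in labels)
    return 1.0 if expected == 1 else (observed - expected) / (1 - expected)

def rmse(gold, pred, scores=None):
    """Root mean squared error after mapping the three sentiment labels
    onto a numeric scale (the -1/0/+1 scale is an assumption here)."""
    scores = scores or {"negative": -1, "neutral": 0, "positive": 1}
    n = len(gold)
    return sqrt(sum((scores[g] - scores[p]) ** 2
                    for g, p in zip(gold, pred)) / n)
```

For example, with gold labels `["positive", "negative", "neutral", "positive"]` and two classifiers that each make one (different) error, the two discordant pairs give a McNemar statistic of 0.5, i.e., no significant difference between the classifiers at that sample size. In practice Weka reports kappa and RMSE directly in its evaluation output.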


Notes

  1. http://www.cs.waikato.ac.nz/ml/weka/

  2. http://standardwisdom.com/softwarejournal/2011/12/confusion-matrix-another-single-value-metric-kappastatistic/

  3. http://statweb.stanford.edu/~susan/courses/s60/split/node60.html

References

  1. Cambria E, Schuller B, Xia Y, Havasi C. New avenues in opinion mining and sentiment analysis. IEEE Intell Syst. 2013;28(2):15–21.

  2. Palogiannidi E, Kolovou A, Christopoulou F, Kokkinos F, Iosif E, Malandrakis N, et al. Tweester at SemEval-2016 Task 4: sentiment analysis in Twitter using semantic-affective model adaptation. In: 10th International Workshop on Semantic Evaluation (SemEval 2016); 2016; San Diego, USA.

  3. Cambria E. Affective computing and sentiment analysis. IEEE Intell Syst. 2016;31(2):102–7.

  4. Ofek N, Rokach L, Cambria E, Hussain A, Shabtai A. Unsupervised commonsense knowledge enrichment for domain-specific sentiment analysis. Cogn Comput. 2016;8(3):467–77.

  5. Oneto L, Bisio F, Cambria E, Anguita D. Statistical learning theory and ELM for big social data analysis. IEEE Comput Intell Mag. 2016;11(3):45–55.

  6. Bautin M, Vijayarenu L, Skiena S. International sentiment analysis for news and blogs. In: Second International Conference on Weblogs and Social Media (ICWSM); 2008; Seattle, WA.

  7. Cambria E, Poria S, Bajpai R, Schuller B. SenticNet 4: a semantic resource for sentiment analysis based on conceptual primitives. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers; 2016; Japan.

  8. Appel O, Chiclana F, Carter J, Fujita H. A hybrid approach to the sentiment analysis problem at the sentence level. Knowl-Based Syst. 2016;108:110–24.

  9. Minhas S, Hussain A. From spin to swindle: identifying falsification in financial text. Cogn Comput. 2016;8:729–45.

  10. Khan FH, Qamar U, Bashir S. Multi-objective model selection (MOMS)-based semi-supervised framework for sentiment analysis. Cogn Comput. 2016;8(4):614–28.

  11. Dashtipour K, Poria S, Hussain A, Cambria E, Hawalah AYA, Gelbukh A, et al. Multilingual sentiment analysis: state of the art and independent comparison of techniques. Cogn Comput. 2016;8:757–71.

  12. Bilal M, Israr H, Shahid M, Khan A. Sentiment classification of Roman-Urdu opinions using Naïve Bayesian, decision tree and KNN classification techniques. J King Saud Univ Comput Inf Sci. 2015.

  13. Syed AZ, Muhammad A, Enríquez AMM. Lexicon based sentiment analysis of Urdu text using SentiUnits. In: Proceedings of the 9th Mexican International Conference on Artificial Intelligence (MICAI); 2010; Berlin Heidelberg: Springer.

  14. Syed AZ, Muhammad A, Enríquez AMM. Adjectival phrases as the sentiment carriers in Urdu. J Am Sci. 2011;7(3):644–52.

  15. Syed AZ, Muhammad A, Enríquez AMM. Associating targets with SentiUnits: a step forward in sentiment analysis of Urdu text. Artif Intell Rev. 2014;41(4):535–61.

  16. Daud M, Khan R, Daud A. Roman Urdu opinion mining system (RUOMiS). CSEIJ. 2014;4(6):1–9.

  17. Dietterich TG. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput. 1998;10:1895–923.

  18. Bouckaert RR, Frank E. Evaluating the replicability of significance tests for comparing learning algorithms. In: 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD); 2004.

  19. Demšar J. Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res. 2006;7:1–30.

  20. Bostanci B, Bostanci E. An evaluation of classification algorithms using McNemar's test. In: Seventh International Conference on Bio-Inspired Computing: Theories and Applications; 2013; New Delhi: Springer (Advances in Intelligent Systems and Computing).

  21. Westfall PH, Troendle JF, Pennello G. Multiple McNemar tests. Biometrics. 2010;66(4):1185–91.

  22. Vieira S, Kaymak U, Sousa J. Cohen's kappa coefficient as a performance measure for feature selection. In: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE); 2010; Piscataway.

  23. Ben-David A. Comparison of classification accuracy using Cohen's weighted kappa. Expert Syst Appl. 2008;34(2):825–32.

  24. Petrakos M, Benediktsson J. The effect of classifier agreement on the accuracy of the combined classifier in decision level fusion. IEEE Trans Geosci Remote Sens. 2001;39(11):2539–46.

  25. Caruana R, Niculescu-Mizil A. An empirical comparison of supervised learning algorithms. In: 23rd International Conference on Machine Learning (ICML); 2006; New York: ACM.

  26. Tushkanova O. Comparative analysis of the numerical measures for mining associative and causal relationships in big data. In: Creativity in Intelligent Technologies and Data Science, First Conference Proceedings (CIT&DS); 2015; Russia.

  27. Braga-Neto UM. Classification and error estimation for discrete data. Curr Genomics. 2009;10(7):446–62.

  28. Siegel S, Castellan NJ. Nonparametric statistics for the behavioral sciences. 2nd ed. McGraw-Hill; 1988.

  29. McHugh M. Interrater reliability: the kappa statistic. Biochem Med. 2012;22:276–82.

  30. Viera AJ, Garrett JM. Understanding interobserver agreement: the kappa statistic. Fam Med. 2005;37(5):360–3.

  31. Silva C, Ribeiro B. The importance of stop word removal on recall values in text categorization. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN); 2003; IEEE.

  32. Sun X, Yang Z. Generalized McNemar's test for homogeneity of the marginal distributions. In: SAS Global Forum; 2008; Cary: SAS Institute.

  33. McNemar Q. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika. 1947;12:153–7.

  34. Witten IH, Frank E, Hall MA. Data mining: practical machine learning tools and techniques. 3rd ed. Morgan Kaufmann; 2011.

  35. Japkowicz N, Shah M. Evaluating learning algorithms: a classification perspective. Cambridge: Cambridge University Press; 2011.

Author information

Corresponding author

Correspondence to Neelam Mukhtar.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Informed Consent

All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2008 [6].

Human and Animal Rights

This article does not contain any studies with human participants or animals performed by any of the authors.

About this article

Cite this article

Mukhtar, N., Khan, M.A. & Chiragh, N. Effective Use of Evaluation Measures for the Validation of Best Classifier in Urdu Sentiment Analysis. Cogn Comput 9, 446–456 (2017). https://doi.org/10.1007/s12559-017-9481-5

