Abstract
Users and customers commonly share comments and reviews about products on social networking sites. An analyst must process these reviews carefully to extract meaningful information from them, and classifying the sentiment associated with each review is one such processing step. Reviews are usually written as text, and when they are processed each word of a review is treated as a feature. Feature selection is therefore needed to pick the most informative features from the full feature set. In this paper, a machine learning algorithm, the support vector machine (SVM), is used to select the best features from the training data. The selected features are then given as input to an artificial neural network (ANN) for further processing. The performance of the proposed approach is evaluated on two datasets, the IMDb dataset and the polarity dataset, using precision, recall, F-measure, and accuracy.
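The flow described in the abstract (words as features, SVM-based feature selection, an ANN classifier, then the four evaluation measures) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy reviews, the hyperparameters, the choice of k = 4 and all function names are hypothetical. The SVM here is a simple linear one trained by sub-gradient descent, and the ANN is a single-hidden-layer sigmoid network.

```python
import math
import random

# Toy labelled reviews (hypothetical, not from the paper): 1 = positive, 0 = negative.
DATA = [
    ("great film with a moving story", 1),
    ("wonderful acting and a great plot", 1),
    ("a moving and wonderful film", 1),
    ("boring film with a dull story", 0),
    ("awful acting and a boring plot", 0),
    ("a dull and awful film", 0),
]

def bag_of_words(texts):
    """Return (vocabulary, count vectors); each distinct word is one feature."""
    vocab = sorted({w for t in texts for w in t.split()})
    index = {w: i for i, w in enumerate(vocab)}
    rows = []
    for t in texts:
        row = [0.0] * len(vocab)
        for w in t.split():
            row[index[w]] += 1.0
        rows.append(row)
    return vocab, rows

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Per-example sub-gradient descent on the L2-regularised hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            s = 1.0 if yi == 1 else -1.0          # map {0, 1} labels to {-1, +1}
            margin = s * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            for j in range(len(w)):
                hinge = s * xi[j] if margin < 1 else 0.0
                w[j] -= lr * (lam * w[j] - hinge)
            if margin < 1:
                b += lr * s
    return w, b

def top_k_features(w, k):
    """Indices of the k SVM weights with the largest magnitude."""
    ranked = sorted(range(len(w)), key=lambda j: -abs(w[j]))
    return sorted(ranked[:k])

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_ann(X, y, hidden=3, epochs=500, lr=0.5, seed=0):
    """One-hidden-layer sigmoid network trained with plain backpropagation."""
    rng = random.Random(seed)
    n = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [sigmoid(sum(wj * xj for wj, xj in zip(W1[i], x)) + b1[i])
             for i in range(hidden)]
        return h, sigmoid(sum(W2[i] * h[i] for i in range(hidden)) + b2)

    for _ in range(epochs):
        for x, t in zip(X, y):
            h, out = forward(x)
            d_out = out - t                       # cross-entropy gradient at the output
            for i in range(hidden):
                d_h = d_out * W2[i] * h[i] * (1.0 - h[i])
                W2[i] -= lr * d_out * h[i]
                for j in range(n):
                    W1[i][j] -= lr * d_h * x[j]
                b1[i] -= lr * d_h
            b2 -= lr * d_out
    return lambda x: 1 if forward(x)[1] >= 0.5 else 0

def evaluate(y_true, y_pred):
    """Precision, recall, F-measure and accuracy for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "accuracy": (tp + tn) / len(pairs)}

texts = [t for t, _ in DATA]
labels = [y for _, y in DATA]
vocab, X = bag_of_words(texts)
w, _ = train_linear_svm(X, labels)
keep = top_k_features(w, k=4)                     # k = 4 is an arbitrary choice
X_sel = [[row[j] for j in keep] for row in X]
predict = train_ann(X_sel, labels)
scores = evaluate(labels, [predict(x) for x in X_sel])
print([vocab[j] for j in keep], scores)
```

The scores printed here are computed on the same six reviews used for training, so they only demonstrate the data flow; a meaningful estimate would need held-out data.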
Cite this article
Tripathy, A., Anand, A. & Rath, S.K. Document-level sentiment classification using hybrid machine learning approach. Knowl Inf Syst 53, 805–831 (2017). https://doi.org/10.1007/s10115-017-1055-z