Abstract
Sentiment classification aims to classify documents according to their overall sentiment orientation, and it plays an important role in many web applications, such as electronic commerce. Machine learning is an effective method for such tasks. In general, a classifier for a given training set is determined by a feature type, a weighting function and a classification algorithm. Users must therefore decide in advance which of these to apply, which is troublesome because the same classifier performs differently in different domains. To deal with this problem, we develop a three-phase framework based on assembling multiple classifiers. In order to choose the optimal combination of classifiers, we propose a criterion that estimates the quality of a combination from the sentiment classification accuracy and the diversity of the results generated by its classifiers. Moreover, we experimentally study the effect of the number of classifiers selected. With our solution, users can achieve good performance without having to choose among the many possible classifier configurations. We perform extensive experiments to demonstrate the effectiveness of our solution across different domains.
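The abstract does not state the criterion's formula, so the following is only an illustrative sketch of the general idea of scoring a combination by both accuracy and diversity. Everything in it is an assumption made for illustration: the function names, the use of mean pairwise disagreement as the diversity measure, the weight `alpha`, the exhaustive search over subsets of size `k`, and majority voting as the way to combine the selected classifiers. It is not the method of the paper.

```python
from itertools import combinations


def accuracy(preds, labels):
    """Fraction of validation documents whose predicted polarity is correct."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def disagreement(a, b):
    """Fraction of documents on which two classifiers predict differently."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def combination_score(names, predictions, labels, alpha=0.5):
    """Hypothetical quality score: alpha * mean accuracy + (1 - alpha) * diversity.

    Diversity is assumed here to be the mean pairwise disagreement
    among the classifiers in the combination.
    """
    mean_acc = sum(accuracy(predictions[n], labels) for n in names) / len(names)
    pairs = list(combinations(names, 2))
    if pairs:
        diversity = sum(
            disagreement(predictions[a], predictions[b]) for a, b in pairs
        ) / len(pairs)
    else:
        diversity = 0.0
    return alpha * mean_acc + (1 - alpha) * diversity


def select_combination(predictions, labels, k, alpha=0.5):
    """Exhaustively pick the size-k subset with the highest score."""
    return max(
        combinations(sorted(predictions), k),
        key=lambda names: combination_score(names, predictions, labels, alpha),
    )


def majority_vote(names, predictions):
    """Combine binary (0/1) predictions of the chosen classifiers by majority."""
    n_docs = len(predictions[names[0]])
    return [
        1 if 2 * sum(predictions[m][i] for m in names) > len(names) else 0
        for i in range(n_docs)
    ]
```

Here `predictions` would map a classifier name (for example, a feature type, weighting function and algorithm triple) to its 0/1 predictions on a held-out validation set, and `labels` would hold the true polarities. Exhaustive search is only practical for a small pool of candidate classifiers; a greedy search would be a natural substitute for larger pools.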
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lin, Y., Wang, X., Zhang, J., Zhou, A. (2012). Assembling the Optimal Sentiment Classifiers. In: Wang, X.S., Cruz, I., Delis, A., Huang, G. (eds) Web Information Systems Engineering - WISE 2012. WISE 2012. Lecture Notes in Computer Science, vol 7651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35063-4_20
DOI: https://doi.org/10.1007/978-3-642-35063-4_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35062-7
Online ISBN: 978-3-642-35063-4
eBook Packages: Computer Science, Computer Science (R0)