Abstract
With the rapid growth of e-commerce, product reviews on the Web have become an important source of information for customers deciding whether to buy a product. Because there are often too many reviews for a customer to read, automatically classifying them by sentiment orientation (i.e., positive or negative) has become a research problem. In this paper, an effective feature selection method based on Fisher’s discriminant ratio is proposed for sentiment classification of product review text. To validate the proposed method, we compare it with feature selection methods based on information gain and mutual information, using a support vector machine as the classifier. Six subexperiments are conducted by combining the different feature selection methods with two kinds of candidate feature sets. On a corpus of 1006 car review documents, the experimental results indicate that Fisher’s discriminant ratio based on word frequency estimation performs best, with an F value of 83.3%, when the candidate features are the words that appear in both positive and negative texts.
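The abstract does not state the scoring formula, so the following is only a minimal sketch of how a word-frequency-based Fisher’s discriminant ratio could rank candidate words. It assumes the common two-class form, (mean_pos - mean_neg)^2 / (var_pos + var_neg), computed over per-document relative word frequencies; the function names, the toy documents, and the handling of zero variance are all illustrative assumptions, not the authors’ implementation. The support vector machine classification step is omitted.

```python
from statistics import mean, pvariance


def fisher_ratio(pos_freqs, neg_freqs):
    """Assumed two-class Fisher ratio: squared mean gap over summed variances."""
    between = (mean(pos_freqs) - mean(neg_freqs)) ** 2
    within = pvariance(pos_freqs) + pvariance(neg_freqs)
    if within == 0:
        # Illustrative choice: no spread within either class.
        return float("inf") if between > 0 else 0.0
    return between / within


def relative_freq(word, doc):
    """Relative frequency of `word` in a tokenized document (0.0 if empty)."""
    return doc.count(word) / len(doc) if doc else 0.0


def select_features(pos_docs, neg_docs, k):
    """Return the k highest-scoring words among those seen in both classes."""
    # Candidate set: words that appear in both positive and negative texts.
    candidates = set().union(*pos_docs) & set().union(*neg_docs)
    scores = {
        w: fisher_ratio(
            [relative_freq(w, d) for d in pos_docs],
            [relative_freq(w, d) for d in neg_docs],
        )
        for w in candidates
    }
    # Sort by descending score, breaking ties alphabetically.
    return sorted(candidates, key=lambda w: (-scores[w], w))[:k]


# Hypothetical toy corpus of tokenized reviews.
pos_docs = [["good", "car", "good"], ["car", "good", "engine"]]
neg_docs = [["bad", "car", "engine"], ["good", "bad", "car"]]
top_words = select_features(pos_docs, neg_docs, k=1)
```

In this toy corpus, "car" occurs equally often in every document and therefore scores zero, while "good" is much more frequent in the positive documents and ranks first. The selected words would then serve as the feature set for a classifier.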
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, S., Li, D., Wei, Y., Li, H. (2009). A Feature Selection Method Based on Fisher’s Discriminant Ratio for Text Sentiment Classification. In: Liu, W., Luo, X., Wang, F.L., Lei, J. (eds) Web Information Systems and Mining. WISM 2009. Lecture Notes in Computer Science, vol 5854. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05250-7_10
DOI: https://doi.org/10.1007/978-3-642-05250-7_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05249-1
Online ISBN: 978-3-642-05250-7
eBook Packages: Computer Science, Computer Science (R0)