A Feature Selection Method Based on Fisher’s Discriminant Ratio for Text Sentiment Classification

  • Conference paper
Web Information Systems and Mining (WISM 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5854)

Abstract

With the rapid growth of e-commerce, product reviews on the Web have become an important source of information for customers deciding whether to buy a product. Because there are often too many reviews for a customer to read, automatically classifying them by sentiment orientation (i.e., positive or negative) has become a research problem. In this paper, we propose an effective feature selection method based on Fisher’s discriminant ratio for sentiment classification of product review text. To validate the proposed method, we compare it with methods based on information gain and mutual information, using a support vector machine as the classifier. Six sub-experiments are conducted by combining the different feature selection methods with two kinds of candidate feature sets. On 1006 car review documents, the experimental results indicate that Fisher’s discriminant ratio based on word frequency estimation performs best, with an F value of 83.3%, when the candidate features are the words that appear in both positive and negative texts.
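
The abstract does not give the paper's exact formulation, so the following is only a minimal sketch of the general idea, assuming the textbook form of Fisher's discriminant ratio for a single feature: the squared difference of the class means divided by the sum of the class variances. The function names `fisher_ratio` and `select_features`, the per-document term-count input, and the toy data are all hypothetical and not taken from the paper.

```python
import numpy as np

def fisher_ratio(pos, neg, eps=1e-12):
    """Score each term by (mean_pos - mean_neg)^2 / (var_pos + var_neg).

    pos, neg: (n_docs, n_terms) arrays of per-document term frequencies.
    This is the textbook ratio, assumed here; the paper's own estimator
    may differ. eps only guards against division by zero.
    """
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    mean_diff_sq = (pos.mean(axis=0) - neg.mean(axis=0)) ** 2
    var_sum = pos.var(axis=0) + neg.var(axis=0)
    return mean_diff_sq / (var_sum + eps)

def select_features(pos, neg, k, require_both=True):
    """Return indices of up to k top-scoring terms.

    With require_both=True, candidates are limited to terms that occur
    in both positive and negative documents (one of the candidate sets
    mentioned in the abstract).
    """
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    scores = fisher_ratio(pos, neg)
    if require_both:
        in_both = (pos.sum(axis=0) > 0) & (neg.sum(axis=0) > 0)
        scores = np.where(in_both, scores, -np.inf)
    order = np.argsort(-scores, kind="stable")  # highest score first
    return [int(i) for i in order[:k] if np.isfinite(scores[i])]

# Toy data (hypothetical): 3 positive and 3 negative documents, 4 terms.
pos_docs = [[3, 0, 1, 1], [2, 0, 1, 2], [4, 0, 1, 0]]
neg_docs = [[0, 2, 1, 1], [1, 3, 1, 0], [0, 1, 1, 2]]

top_shared = select_features(pos_docs, neg_docs, k=2)
top_any = select_features(pos_docs, neg_docs, k=2, require_both=False)
```

In this toy example, term 1 separates the classes well but occurs only in negative documents, so it is selected only when `require_both=False`. The selected columns would then be passed to a classifier such as a support vector machine.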

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, S., Li, D., Wei, Y., Li, H. (2009). A Feature Selection Method Based on Fisher’s Discriminant Ratio for Text Sentiment Classification. In: Liu, W., Luo, X., Wang, F.L., Lei, J. (eds) Web Information Systems and Mining. WISM 2009. Lecture Notes in Computer Science, vol 5854. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05250-7_10

  • DOI: https://doi.org/10.1007/978-3-642-05250-7_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05249-1

  • Online ISBN: 978-3-642-05250-7

  • eBook Packages: Computer Science, Computer Science (R0)
