Comparing of Multi-class Text Classification Methods for Automatic Ratings of Consumer Reviews

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12832)

Abstract

Consumer reviews may show ratings that are inconsistent with their content as a result of sarcastic feedback. Consequently, such reviews cannot provide valuable feedback for improving firms' products and services. One possible solution is to use the review content itself to identify the true rating. In this work, different multi-class classification methods were applied to assign automatic ratings on a 5-star scale to consumer reviews whose original ratings were inconsistent with their content. Two term weighting schemes (tf-idf and tf-igm) and five supervised machine learning algorithms (k-NN, MNB, RF, XGBoost and SVM) were compared. The dataset was downloaded from the Amazon website, and language experts helped to assign the correct rating to each consumer review. In the evaluation, the multi-class classifier built with SVM and tf-igm returned the best results for automatic rating of consumer reviews, improving accuracy and F1 over the other methods by 11.7% and 10.5% on average, respectively.
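
To make the difference between the two term weighting schemes concrete, here is a minimal Python sketch; it is not the authors' implementation. It assumes the common tf-idf form tf × log(N/df) and the tf-igm form proposed by Chen, Zhang, Long and Zhang (2016), w = tf × (1 + λ · igm) with igm = f_1 / Σ_r (f_r · r), where f_r is a term's class-specific document frequency ranked in descending order. The function names, the toy reviews, and λ = 7 are illustrative assumptions; this preview does not state the paper's exact settings.

```python
import math
from collections import Counter, defaultdict


def tf_idf_weights(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf})
    return weights


def igm(class_doc_freqs):
    """Inverse gravity moment of one term from its per-class document frequencies."""
    ranked = sorted(class_doc_freqs, reverse=True)
    # f_1 divided by the rank-weighted sum of all class frequencies.
    return ranked[0] / sum(f * r for r, f in enumerate(ranked, start=1))


def tf_igm_weights(docs, labels, lam=7.0):
    """Supervised weighting: tf * (1 + lam * igm). lam = 7 is an assumed default."""
    classes = sorted(set(labels))
    per_class = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        per_class[label].update(set(doc))
    vocab = {t for doc in docs for t in doc}
    igm_of = {t: igm([per_class[c][t] for c in classes]) for t in vocab}
    return [
        {t: tf * (1.0 + lam * igm_of[t]) for t, tf in Counter(doc).items()}
        for doc in docs
    ]


# Hypothetical toy reviews with star ratings as class labels.
docs = [["great", "phone"], ["bad", "phone"], ["great", "battery"]]
labels = [5, 1, 5]
print(tf_idf_weights(docs)[0])
print(tf_igm_weights(docs, labels)[0])
```

In this toy data, "great" occurs only in 5-star reviews while "phone" is spread evenly over both classes, so tf-igm gives "great" the larger weight, whereas tf-idf, which ignores class labels, weights the two terms equally.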

Author information

Corresponding author

Correspondence to Jantima Polpinij.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Polpinij, J., Luaphol, B. (2021). Comparing of Multi-class Text Classification Methods for Automatic Ratings of Consumer Reviews. In: Chomphuwiset, P., Kim, J., Pawara, P. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2021. Lecture Notes in Computer Science, vol 12832. Springer, Cham. https://doi.org/10.1007/978-3-030-80253-0_15

  • DOI: https://doi.org/10.1007/978-3-030-80253-0_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-80252-3

  • Online ISBN: 978-3-030-80253-0

  • eBook Packages: Computer Science, Computer Science (R0)
