Comparing of Multi-class Text Classification Methods for Automatic Ratings of Consumer Reviews

Polpinij, Jantima; Luaphol, Bancha

doi:10.1007/978-3-030-80253-0_15

Jantima Polpinij & Bancha Luaphol

Conference paper
First Online: 27 June 2021

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12832)

Abstract

Consumer review ratings are sometimes inconsistent with the review content as a result of sarcastic feedback. Such reviews cannot provide firms with valuable feedback for improving their products and services. One possible solution is to use the review content to identify the true rating. In this work, different multi-class classification methods were applied to automatically assign ratings on a 5-star scale to consumer reviews whose original ratings were inconsistent with their content. Two term weighting schemes (tf-idf and tf-igm) and five supervised machine learning algorithms (k-NN, MNB, RF, XGBoost and SVM) were compared. The dataset was downloaded from the Amazon website, and language experts helped to correct the rating of each consumer review. In the evaluation, the multi-class classifier built with SVM and tf-igm returned the best results for automatic rating of consumer reviews, improving on the other methods by an average of 11.7% in accuracy and 10.5% in F1.
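
To make the difference between the two term weighting schemes concrete, the following is a minimal sketch, not the authors' implementation. It assumes the tf-igm definition introduced by Chen et al. (2016), where igm(t) = f_1 / sum_r (f_r * r) over class-specific document frequencies sorted in descending order, and the weight is tf * (1 + lambda * igm). The toy reviews, the function names and the value lambda = 7 are all illustrative assumptions.

```python
import math
from collections import Counter

# Toy labelled reviews (hypothetical data; star ratings act as class labels).
DOCS = [
    ("great product love it", 5),
    ("great value works great", 5),
    ("okay product nothing special", 3),
    ("terrible product broke fast", 1),
    ("terrible support never again", 1),
]


def class_doc_freq(docs):
    """Map term -> Counter{label: number of documents of that class containing the term}."""
    freq = {}
    for text, label in docs:
        for term in set(text.split()):
            freq.setdefault(term, Counter())[label] += 1
    return freq


def idf(term, docs, freq):
    """Plain inverse document frequency: log(N / df)."""
    df = sum(freq[term].values())
    return math.log(len(docs) / df)


def igm(term, freq):
    """Inverse gravity moment: f_1 / sum_r (f_r * r), with f_r sorted descending."""
    counts = sorted(freq[term].values(), reverse=True)
    return counts[0] / sum(f * r for r, f in enumerate(counts, start=1))


def tf_idf(tf, term, docs, freq):
    """Unsupervised weight: ignores how the term is spread over classes."""
    return tf * idf(term, docs, freq)


def tf_igm(tf, term, freq, lam=7.0):
    """Supervised weight: tf * (1 + lam * igm); lam = 7.0 is an assumed setting."""
    return tf * (1.0 + lam * igm(term, freq))


FREQ = class_doc_freq(DOCS)

if __name__ == "__main__":
    for term in ("terrible", "product"):
        print(term, round(tf_idf(1, term, DOCS, FREQ), 3), round(tf_igm(1, term, FREQ), 3))
```

In this toy corpus, "terrible" occurs only in 1-star reviews and so receives the maximum igm of 1, while "product" is spread evenly over three classes and receives a much lower igm. tf-idf separates the two terms less sharply because it does not use class labels.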

Author information

Authors and Affiliations

Intellect Laboratory, Department of Computer Science, Faculty of Informatics, Mahasarakham University, Talat, Thailand
Jantima Polpinij
Department of Digital Technology, Faculty of Administrative Science, Kalasin University, Kalasin, Thailand
Bancha Luaphol

Corresponding author

Correspondence to Jantima Polpinij.

Editor information

Editors and Affiliations

Mahasarakham University, Maha Sarakham, Thailand
Phatthanaphong Chomphuwiset
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Junmo Kim
Mahasarakham University, Maha Sarakham, Thailand
Pornntiwa Pawara

About this paper

Cite this paper

Polpinij, J., Luaphol, B. (2021). Comparing of Multi-class Text Classification Methods for Automatic Ratings of Consumer Reviews. In: Chomphuwiset, P., Kim, J., Pawara, P. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2021. Lecture Notes in Computer Science, vol 12832. Springer, Cham. https://doi.org/10.1007/978-3-030-80253-0_15

DOI: https://doi.org/10.1007/978-3-030-80253-0_15
Published: 27 June 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80252-3
Online ISBN: 978-3-030-80253-0
eBook Packages: Computer Science, Computer Science (R0)
