Abstract
Consumer reviews often carry ratings that are inconsistent with their content as a result of sarcastic feedback. Consequently, they cannot provide firms with valuable feedback for improving products and services. One possible solution is to use the content of each review to identify its true rating. In this work, different multi-class classification methods were applied to automatically assign ratings on a 5-star scale to consumer reviews whose original ratings were inconsistent with their content. Two term weighting schemes (tf-idf and tf-igm) and five supervised machine learning algorithms (k-NN, MNB, RF, XGBoost and SVM) were compared. The dataset was downloaded from the Amazon website, and language experts helped to assign the correct rating to each consumer review. In the evaluation, the multi-class classifier built with SVM and tf-igm returned the best results for automatic rating of consumer reviews, improving accuracy and F1 over the other methods by an average of 11.7% and 10.5%, respectively.
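The two term weighting schemes named above can be illustrated with a minimal sketch. It assumes the commonly cited formulation of tf-igm, in which a term's inverse gravity moment is its highest class-specific frequency divided by the rank-weighted sum of all its class-specific frequencies, and the final weight is tf * (1 + lambda * igm). The function names, the toy reviews, the use of class-specific document frequency, and the default lambda = 7.0 are assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def tf_idf_weights(docs):
    """Unsupervised baseline: raw term frequency times log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return [
        {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in Counter(doc).items()}
        for doc in docs
    ]


def igm(class_freqs):
    """Inverse gravity moment: top class frequency over the rank-weighted sum."""
    ranked = sorted(class_freqs, reverse=True)
    denom = sum(f * rank for rank, f in enumerate(ranked, start=1))
    return ranked[0] / denom if denom else 0.0


def tf_igm_weights(docs, labels, lam=7.0):
    """Supervised weighting: tf * (1 + lam * igm); lam = 7.0 is an assumed default."""
    classes = sorted(set(labels))
    # Class-specific document frequency of every term (an assumed choice of frequency).
    class_df = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        class_df[label].update(set(doc))
    vocab = {t for doc in docs for t in doc}
    igm_of = {t: igm([class_df[c][t] for c in classes]) for t in vocab}
    return [
        {t: tf * (1.0 + lam * igm_of[t]) for t, tf in Counter(doc).items()}
        for doc in docs
    ]


# Hypothetical toy reviews (already tokenised) with star ratings as class labels.
docs = [["great", "phone"], ["terrible", "phone"], ["great", "battery"]]
labels = [5, 1, 5]

tfidf = tf_idf_weights(docs)
tfigm = tf_igm_weights(docs, labels)
print(tfidf[0])
print(tfigm[0])
```

In this toy set "great" and "phone" each occur in two of the three reviews, so tf-idf weights them equally. Because tf-igm uses the class labels, it weights "great" (seen only in 5-star reviews) above "phone" (seen in both classes). Weighted vectors like these would then be passed to a classifier such as an SVM.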
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Polpinij, J., Luaphol, B. (2021). Comparing of Multi-class Text Classification Methods for Automatic Ratings of Consumer Reviews. In: Chomphuwiset, P., Kim, J., Pawara, P. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2021. Lecture Notes in Computer Science, vol 12832. Springer, Cham. https://doi.org/10.1007/978-3-030-80253-0_15
DOI: https://doi.org/10.1007/978-3-030-80253-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80252-3
Online ISBN: 978-3-030-80253-0
eBook Packages: Computer Science, Computer Science (R0)