Abstract
Customer reviews and star ratings are widely used on e-commerce and review sites for the public to express their opinions. To help the online public make decisions, items (e.g., products, services, movies, books) are typically represented and ordered by a star rating aggregated from all of their reviews. Existing approaches average the star ratings or apply other statistical functions to aggregate them. However, these approaches need large numbers of reviews to work effectively, whereas many new items have only a few. In this paper, we argue that ranking items is at the core of review aggregation. Hence, we cast the problem of ranking a set of items as a learning-to-rank (L2R) problem to address the issue of review scarcity. We devise a rank-oriented loss function to directly optimize the ranking of groups of items. Standard L2R models require ranking labels for training, but ground-truth item rankings are not always available. Therefore, we propose to aggregate star ratings for items with large numbers of reviews to automatically generate weak-supervision ranking labels for training. We further propose to extract features from review contents, rating distributions and helpfulness information to train the ranking model. Extensive experiments on an Amazon dataset showed that our model is very effective compared to state-of-the-art heuristic aggregation approaches, regression, and standard L2R approaches.
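The weak-supervision step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the plain mean as the aggregation function, the `min_reviews` threshold, the function names `weak_ranking_labels` and `preference_pairs`, and the toy ratings are all assumptions made for illustration. The sketch ranks only items with enough reviews by their aggregated star rating and then turns that ranking into ordered pairs that a ranking model could be trained on.

```python
from statistics import mean
from typing import Dict, List, Tuple


def weak_ranking_labels(
    ratings: Dict[str, List[int]], min_reviews: int = 3
) -> List[Tuple[str, float]]:
    """Rank well-reviewed items by mean star rating (assumed aggregation)."""
    scored = [
        (item, mean(stars))
        for item, stars in ratings.items()
        if len(stars) >= min_reviews  # items with too few reviews get no label
    ]
    # Highest mean first; ties broken by item id so the order is deterministic.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def preference_pairs(ranked: List[Tuple[str, float]]) -> List[Tuple[str, str]]:
    """Turn a ranked list into (better, worse) pairs, skipping tied scores."""
    return [
        (a, b)
        for i, (a, score_a) in enumerate(ranked)
        for b, score_b in ranked[i + 1:]
        if score_a > score_b
    ]


# Toy data, invented for this example only.
ratings = {
    "item_a": [5, 5, 4, 5],
    "item_b": [3, 4, 3],
    "item_c": [5],  # a single review: excluded from weak labelling
    "item_d": [2, 1, 2, 3],
}
ranked = weak_ranking_labels(ratings)
pairs = preference_pairs(ranked)
```

In this sketch `item_c` receives no weak label because it has a single review. Items like it are the ones the trained ranking model would be asked to place, using features drawn from review contents, rating distributions and helpfulness information.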
Notes
- 1. Technically, for Amazon these are products, but we wish to maintain consistent terminology and hence use items.
- 2.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Shaalan, Y., Zhang, X., Chan, J. (2018). Learning to Rank Items of Minimal Reviews Using Weak Supervision. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10937. Springer, Cham. https://doi.org/10.1007/978-3-319-93034-3_50
DOI: https://doi.org/10.1007/978-3-319-93034-3_50
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93033-6
Online ISBN: 978-3-319-93034-3
eBook Packages: Computer Science, Computer Science (R0)