ABSTRACT
E-commerce applications rely heavily on session-based recommendation algorithms to improve the shopping experience of their customers. Recent progress in session-based recommendation algorithms shows great promise. However, translating that promise into real-world outcomes is challenging for several reasons, chiefly the large number and varying characteristics of the available models. In this paper, we discuss the approach and lessons learned from identifying and deploying a successful session-based recommendation algorithm for a leading e-commerce application in the home-improvement domain. To this end, we first evaluate fourteen session-based recommendation algorithms in an offline setting using eight popular evaluation metrics on three datasets. The results indicate that offline evaluation does not provide enough insight to make an informed decision, since no method is a clear winner on all metrics. Additionally, we observe that standard offline evaluation metrics fall short for this application. Specifically, they reward an algorithm only when it predicts the exact item that the user clicked next or eventually purchased. In a practical scenario, however, there are near-identical products that, although assigned different identifiers, should be considered equally good recommendations. To overcome these limitations, we perform an additional round of evaluation, in which human experts provide both objective and subjective feedback on the recommendations of the five algorithms that performed best in the offline evaluation. We find that the experts’ opinions often differ from the offline evaluation results. Analysis of the feedback confirms that the performance of all models is significantly higher when near-identical product recommendations are counted as relevant. Finally, we run an A/B test with one of the models that performed best in the human evaluation phase. The treatment model increased conversion rate by 15.6% and revenue per visit by 18.5% when compared with a leading third-party solution.
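The difference between exact-match scoring and scoring that treats near-identical products as relevant can be illustrated with a small sketch. This is not the evaluation code used in the paper. The function name `hit_rate_at_k`, the `group_of` mapping from an item identifier to a near-identical-product group, and the sample item identifiers are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Sequence


def hit_rate_at_k(
    recommendations: Sequence[List[str]],
    targets: Sequence[str],
    k: int,
    group_of: Optional[Dict[str, str]] = None,
) -> float:
    """Fraction of sessions whose target item appears in the top-k list.

    When group_of is given (a hypothetical mapping from item id to a
    near-identical-product group id), two items in the same group are
    treated as the same item. Without it, only exact id matches count.
    """
    if len(recommendations) != len(targets):
        raise ValueError("need exactly one target per recommendation list")
    if not targets:
        return 0.0

    def canonical(item: str) -> str:
        # Items without a group entry stand for themselves.
        return group_of.get(item, item) if group_of else item

    hits = 0
    for recs, target in zip(recommendations, targets):
        top_k = {canonical(r) for r in recs[:k]}
        if canonical(target) in top_k:
            hits += 1
    return hits / len(targets)


# Illustrative data only: two colour variants of the same drill.
group_of = {
    "drill-red-18v": "drill-18v",
    "drill-blue-18v": "drill-18v",
}
recs = [["drill-blue-18v", "hammer"], ["saw", "ladder"]]
targets = ["drill-red-18v", "ladder"]

exact = hit_rate_at_k(recs, targets, k=2)
relaxed = hit_rate_at_k(recs, targets, k=2, group_of=group_of)
print(exact, relaxed)
```

In this toy example the first session counts as a miss under exact matching, because the recommended drill has a different identifier from the one the user clicked, but as a hit once both variants map to the same group. How such groups would be built in practice (catalog attributes, expert labels, or something else) is left open here.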