Common Pitfalls in Training and Evaluating Recommender Systems

Authors: Hung-Hsuan Chen, Hsin-Chien Huang, Wen Tsui

ACM SIGKDD Explorations Newsletter, Volume 19, Issue 1

Pages 37 - 45

https://doi.org/10.1145/3137597.3137601

Published: 01 September 2017

Abstract

This paper formally presents four common pitfalls in training and evaluating recommendation algorithms for information systems. Specifically, we show that separating the server logs into training and test data for model generation and model evaluation can be problematic if the training and test data are selected improperly. In addition, we show that click-through rate, a common metric for measuring and comparing the performance of different recommendation algorithms, may not be a good measure of profitability, that is, the income a recommendation module brings to a website. Moreover, we demonstrate that evaluating recommendation revenue may not be as straightforward a task as it first appears. Unfortunately, these pitfalls have appeared in many previous studies on recommender systems and information systems. We explicitly explain these problems and propose methods to address them. We conducted experiments to support our claims. Finally, we review previous papers and competitions that may suffer from these problems.
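
The gap between click-through rate and profitability can be illustrated with a minimal sketch. It is not the paper's experiment. The `LogSummary` class, the two algorithms "A" and "B", and every number below are hypothetical, chosen only to show that the two metrics can rank the same pair of algorithms differently.

```python
from dataclasses import dataclass


@dataclass
class LogSummary:
    """Aggregated (hypothetical) log counts for one recommendation algorithm."""
    impressions: int   # number of times recommendations were shown
    clicks: int        # number of clicks on those recommendations
    revenue: float     # income attributed to the recommended items

    @property
    def ctr(self) -> float:
        # Click-through rate: clicks per impression.
        return self.clicks / self.impressions

    @property
    def revenue_per_impression(self) -> float:
        # A simple profitability proxy: income per impression.
        return self.revenue / self.impressions


# Hypothetical numbers, for illustration only.
algorithms = {
    "A": LogSummary(impressions=10_000, clicks=500, revenue=1_000.0),
    "B": LogSummary(impressions=10_000, clicks=300, revenue=1_800.0),
}

# Pick the "best" algorithm under each metric.
best_by_ctr = max(algorithms, key=lambda name: algorithms[name].ctr)
best_by_revenue = max(
    algorithms, key=lambda name: algorithms[name].revenue_per_impression
)

print("Best by CTR:", best_by_ctr)
print("Best by revenue per impression:", best_by_revenue)
```

In this toy setting, algorithm A attracts more clicks while algorithm B brings in more income per impression, so a comparison based on click-through rate alone would select the less profitable algorithm.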

Published In

ACM SIGKDD Explorations Newsletter Volume 19, Issue 1

June 2017

59 pages

ISSN:1931-0145

EISSN:1931-0153

DOI:10.1145/3137597

Editors: Charu Aggarwal (IBM T.J. Watson), Haixun Wang (Google), Ankur Teredesai (University of Washington Tacoma), Hanghang Tong (Arizona State University)

Copyright © 2017 Authors.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

