ABSTRACT
Interactive recommendation has been widely framed as a Multi-Armed Bandit (MAB) problem: items are arms to be pulled (i.e., recommended) and the user's satisfaction is the reward to be maximized. Despite recent advances, there is still no consensus on the best practices for evaluating such solutions. Recently, two complementary frameworks were proposed to evaluate bandit solutions more accurately: iRec and OBP. The former provides a comprehensive set of offline metrics and bandit models, enabling comparisons under several evaluation policies. The latter offers a large set of bandit models that can be evaluated through several counterfactual estimators. However, there is still room to explore in joining these two frameworks. We propose and evaluate an integration between them, demonstrating the potential and richness of such a combination.
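To make the two evaluation styles concrete, the sketch below simulates logged bandit feedback of the kind an iRec-style offline loop produces (a logging policy recommends items and observes binary rewards) and then evaluates a different target policy counterfactually with an Inverse Propensity Scoring (IPS) estimator, the simplest of the counterfactual estimators implemented in OBP. This is a minimal, self-contained illustration using only NumPy under assumed uniform logging propensities; the epsilon-greedy target policy and all variable names are illustrative assumptions and do not reflect either framework's actual API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy logged bandit feedback: a uniform-random logging policy recommends one of
# n_actions items per round; rewards come from hidden per-item click rates.
n_rounds, n_actions = 10_000, 5
true_ctr = rng.uniform(0.05, 0.30, size=n_actions)    # hidden ground truth
logging_pscore = np.full(n_actions, 1.0 / n_actions)  # uniform propensities

actions = rng.integers(0, n_actions, size=n_rounds)   # logged recommendations
rewards = rng.binomial(1, true_ctr[actions])          # logged user feedback
pscores = logging_pscore[actions]                     # propensity of each logged action

# Target policy to evaluate counterfactually: epsilon-greedy around the
# empirically best item in the log (an illustrative choice).
epsilon = 0.1
empirical_ctr = np.array([
    rewards[actions == a].mean() if np.any(actions == a) else 0.0
    for a in range(n_actions)
])
best = empirical_ctr.argmax()
target_probs = np.full(n_actions, epsilon / n_actions)
target_probs[best] += 1.0 - epsilon

# Inverse Propensity Scoring (IPS): reweight logged rewards by the ratio of
# target to logging propensities for the action actually taken.
ips_value = np.mean(rewards * target_probs[actions] / pscores)

# Ground-truth value of the target policy (available only in simulation).
true_value = float(target_probs @ true_ctr)

print(f"IPS estimate of target policy value: {ips_value:.4f}")
print(f"True value of target policy:         {true_value:.4f}")
```

In an integration of the kind proposed here, the logged actions, rewards, and propensities would come from an interaction loop such as iRec's, while the hand-rolled IPS line would be replaced by the counterfactual estimators shipped with OBP.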
REFERENCES
- Marc Abeille and Alessandro Lazaric. 2017. Linear thompson sampling revisited. In Artificial Intelligence and Statistics. PMLR, 176–184. https://doi.org/10.48550/arXiv.1611.06534
- Peter Auer. 2002. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research 3, Nov (2002), 397–422. https://doi.org/10.1162/153244303321897663
- Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47, 2-3 (2002), 235–256. https://doi.org/10.1023/A:1013689704352
- Olivier Chapelle and Lihong Li. 2011. An empirical evaluation of thompson sampling. In Advances in Neural Information Processing Systems. 2249–2257.
- Jaya Kawale, Hung H Bui, Branislav Kveton, Long T Thanh, and Sanjay Chawla. 2015. Efficient thompson sampling for online matrix-factorization recommendation. Advances in Neural Information Processing Systems 28 (2015), 1297–1305. https://doi.org/10.5555/2969239.2969384
- Shuai Li, Alexandros Karatzoglou, and Claudio Gentile. 2016. Collaborative filtering bandits. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. 539–548.
- Yaxu Liu, Jui-Nan Yen, Bowen Yuan, Rundong Shi, Peng Yan, and Chih-Jen Lin. 2022. Practical Counterfactual Policy Learning for Top-K Recommendations. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 1141–1151.
- Weishen Pan, Sen Cui, Hongyi Wen, Kun Chen, Changshui Zhang, and Fei Wang. 2021. Correcting the User Feedback-Loop Bias for Recommendation Systems. arXiv preprint arXiv:2109.06037 (2021).
- Yuta Saito, Shunsuke Aihara, Megumi Matsutani, and Yusuke Narita. 2020. Open bandit dataset and pipeline: Towards realistic and reproducible off-policy evaluation. arXiv preprint arXiv:2008.07146 (2020).
- Javier Sanz-Cruzado, Pablo Castells, and Esther López. 2019. A simple multi-armed nearest-neighbor bandit for interactive recommendation. In Proceedings of the 13th ACM Conference on Recommender Systems. 358–362.
- Sulthana Shams, Daron Anderson, and Douglas Leith. 2021. Cluster-Based Bandits: Fast Cold-Start for Recommender System New Users. (2021).
- Nicollas Silva, Thiago Silva, Heitor Werneck, Leonardo Rocha, and Adriano Pereira. 2023. User Cold-Start Problem in Multi-Armed Bandits: When the First Recommendations Guide the User's Experience. ACM Trans. Recomm. Syst. 1, 1 (2023). https://doi.org/10.1145/3554819
- Nícollas Silva, Heitor Werneck, Thiago Silva, Adriano C. M. Pereira, and Leonardo Rocha. 2021. A contextual approach to improve the user's experience in interactive recommendation systems. In WebMedia '21: Brazilian Symposium on Multimedia and the Web, Belo Horizonte, Minas Gerais, Brazil, November 5-12, 2021, Adriano César Machado Pereira and Leonardo Chaves Dutra da Rocha (Eds.). ACM, 89–96. https://doi.org/10.1145/3470482.3479621
- Thiago Silva, Nícollas Silva, Carlos Mito, Adriano C. M. Pereira, and Leonardo Rocha. 2022. Interactive POI Recommendation: applying a Multi-Armed Bandit framework to characterise and create new models for this scenario. In WebMedia '22: Brazilian Symposium on Multimedia and Web, Curitiba, Brazil, November 7-11, 2022, Thiago Henrique Silva, Leyza Baldo Dorini, Jussara M. Almeida, and Humberto Torres Marques-Neto (Eds.). ACM, 211–221. https://doi.org/10.1145/3539637.3557060
- Thiago Silva, Nícollas Silva, Heitor Werneck, Carlos Mito, Adriano C. M. Pereira, and Leonardo Rocha. 2022. iRec: An interactive recommendation framework. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 3165–3175.
- Qing Wang, Chunqiu Zeng, Wubai Zhou, Tao Li, S. Sitharama Iyengar, Larisa Shwartz, and Genady Ya Grabarnik. 2018. Online interactive collaborative filtering using multi-armed bandit with dependent arms. IEEE Transactions on Knowledge and Data Engineering 31, 8 (2018), 1569–1580.
- Qingyun Wu, Naveen Iyer, and Hongning Wang. 2018. Learning contextual bandits in a non-stationary environment. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. 495–504.
- Yanming Yang, Xin Xia, David Lo, and John Grundy. 2022. A survey on deep learning for software engineering. ACM Computing Surveys (CSUR) 54, 10s (2022), 1–73.
- Xiaoxue Zhao, Weinan Zhang, and Jun Wang. 2013. Interactive collaborative filtering. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management. 1411–1420.
- Sijin Zhou, Xinyi Dai, Haokun Chen, Weinan Zhang, Kan Ren, Ruiming Tang, Xiuqiang He, and Yong Yu. 2020. Interactive recommender system via knowledge graph-enhanced reinforcement learning. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 179–188.
- Lixin Zou, Long Xia, Yulong Gu, Xiangyu Zhao, Weidong Liu, Jimmy Xiangji Huang, and Dawei Yin. 2020. Neural Interactive Collaborative Filtering. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 749–758.
Recommendations
A comparative analysis of offline and online evaluations and discussion of research paper recommender system evaluation
RepSys '13: Proceedings of the International Workshop on Reproducibility and Replication in Recommender Systems Evaluation. Offline evaluations are the most common evaluation method for research paper recommender systems. However, no thorough discussion on the appropriateness of offline evaluations has taken place, despite some voiced criticism. We conducted a study in which ...
Revisiting offline evaluation for implicit-feedback recommender systems
RecSys '19: Proceedings of the 13th ACM Conference on Recommender Systems. Recommender systems are typically evaluated in an offline setting. A subset of the available user-item interactions is sampled to serve as test set, and some model trained on the remaining data points is then evaluated on its performance to predict ...
Bridging the Gap Between User-centric and Offline Evaluation of Personalized Recommendation Systems
UMAP '18: Adjunct Publication of the 26th Conference on User Modeling, Adaptation and Personalization. In this paper, we propose to evaluate recommender systems by conducting both offline and user-centric evaluations, while considering multiple quality aspects in realistic settings. This comprehensive evaluation would provide insight on how to improve ...