DOI: 10.1145/3640457.3688055
Extended Abstract

Bridging the Gap: Unpacking the Hidden Challenges in Knowledge Distillation for Online Ranking Systems

Published: 08 October 2024

Abstract

Knowledge Distillation (KD) is a powerful approach for compressing a large model into a smaller, more efficient model, particularly beneficial for latency-sensitive applications like recommender systems. However, current KD research predominantly focuses on Computer Vision (CV) and NLP tasks, overlooking unique data characteristics and challenges inherent to recommender systems. This paper addresses these overlooked challenges, specifically: (1) mitigating data distribution shifts between teacher and student models, (2) efficiently identifying optimal teacher configurations within time and budgetary constraints, and (3) enabling computationally efficient and rapid sharing of teacher labels to support multiple students. We present a robust KD system developed and rigorously evaluated on multiple large-scale personalized video recommendation systems within Google. Our live experiment results demonstrate significant improvements in student model performance while ensuring consistent and reliable generation of high-quality teacher labels from a continuous data stream.
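For readers unfamiliar with teacher-student distillation, the sketch below illustrates the generic objective the abstract alludes to: a student ranking model is trained on a blend of logged hard labels (e.g., clicks) and soft labels produced by a larger teacher. This is a minimal, illustrative example only; the names (alpha, temperature) and the simple loss form are assumptions for exposition, not the system described in the paper.

```python
import numpy as np

def binary_cross_entropy(p, y, eps=1e-7):
    """Element-wise binary cross-entropy between predictions p and targets y."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def distillation_loss(student_logits, hard_labels, teacher_probs,
                      alpha=0.5, temperature=2.0):
    """Blend the hard-label loss with a soft-label loss against teacher predictions.

    alpha       -- weight on the distillation (teacher) term (assumed hyperparameter)
    temperature -- softens the student's logits so the teacher's probability
                   distribution is easier to match (assumed hyperparameter)
    """
    student_probs = 1.0 / (1.0 + np.exp(-student_logits))
    soft_student = 1.0 / (1.0 + np.exp(-student_logits / temperature))

    hard_loss = binary_cross_entropy(student_probs, hard_labels).mean()
    soft_loss = binary_cross_entropy(soft_student, teacher_probs).mean()
    return (1.0 - alpha) * hard_loss + alpha * soft_loss

# Example: a batch of 4 items with logged clicks and teacher soft labels.
logits = np.array([2.0, -1.0, 0.5, -2.5])
clicks = np.array([1.0, 0.0, 1.0, 0.0])
teacher = np.array([0.9, 0.2, 0.6, 0.05])
print(distillation_loss(logits, clicks, teacher))
```

In a streaming production setting such as the one the abstract describes, the teacher labels would be generated continuously and joined to the student's training examples rather than computed in-batch as in this toy example.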

Published In

RecSys '24: Proceedings of the 18th ACM Conference on Recommender Systems
October 2024
1438 pages

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Knowledge Distillation
  2. Learning to Rank
  3. Multitask Learning
  4. Recommender Systems

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited

Acceptance Rates

Overall Acceptance Rate 254 of 1,295 submissions, 20%
