DOI: 10.1145/3460231.3474606

Jointly Optimize Capacity, Latency and Engagement in Large-scale Recommendation Systems

Published: 13 September 2021

Abstract

As the recommendation systems behind commercial services scale up and apply more and more sophisticated machine learning models, it becomes important to optimize computational cost (capacity) and runtime latency, besides the traditional objective of user engagement. Caching recommended results and reusing them later is a common technique used to reduce capacity and latency. However, the standard caching approach negatively impacts user engagement. To overcome the challenge, this paper presents an approach to optimizing capacity, latency and engagement simultaneously. We propose a smart caching system including a lightweight adjuster model to refresh the cached ranking scores, achieving significant capacity savings without impacting ranking quality. To further optimize latency, we introduce a prefetching strategy which leverages the smart cache. Our production deployment on Facebook Marketplace demonstrates that the approach reduces capacity demand by 50% and p75 end-to-end latency by 35%. While Facebook Marketplace is used as a case study, the approach is applicable to other industrial recommendation systems as well.
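
The smart-cache idea described above can be illustrated with a minimal sketch. The abstract does not specify the adjuster model's features, architecture, or cache policy, so everything here is an assumption made for illustration: the class name `SmartCache`, the time-to-live policy, the `adjuster` signature, and the toy ranker and adjuster are all hypothetical. The sketch only shows the general shape: the expensive ranker runs on a cache miss or prefetch, and a cheap function re-scores the cached results on every request.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Scored = List[Tuple[str, float]]  # (item_id, score) pairs


@dataclass
class SmartCache:
    """Hypothetical sketch: cache full-ranker scores, refresh them cheaply on reuse."""
    full_ranker: Callable[[str], Scored]            # expensive model (assumed interface)
    adjuster: Callable[[str, float, float], float]  # (item_id, cached_score, age_s) -> score
    ttl_seconds: float = 600.0                      # assumed expiry policy
    clock: Callable[[], float] = time.monotonic
    _store: Dict[str, Tuple[float, Scored]] = field(default_factory=dict, init=False)
    full_calls: int = field(default=0, init=False)  # counts expensive ranker invocations

    def _refresh(self, user_id: str, now: float) -> Scored:
        # Run the expensive ranker and remember when the scores were produced.
        scored = self.full_ranker(user_id)
        self.full_calls += 1
        self._store[user_id] = (now, scored)
        return scored

    def prefetch(self, user_id: str) -> None:
        # Warm the cache ahead of the request so the request path avoids the full ranker.
        self._refresh(user_id, self.clock())

    def recommend(self, user_id: str, k: int) -> List[str]:
        now = self.clock()
        entry = self._store.get(user_id)
        if entry is None or now - entry[0] > self.ttl_seconds:
            cached_at, scored = now, self._refresh(user_id, now)
        else:
            cached_at, scored = entry
        age = now - cached_at
        # Re-score every cached item with the lightweight adjuster, then re-rank.
        adjusted = [(item, self.adjuster(item, score, age)) for item, score in scored]
        adjusted.sort(key=lambda pair: pair[1], reverse=True)
        return [item for item, _ in adjusted[:k]]


# Toy stand-ins (hypothetical): a fixed ranking and an adjuster that
# demotes items that became unavailable after the scores were cached.
SOLD_ITEMS = set()


def toy_ranker(user_id: str) -> Scored:
    return [("bike", 0.9), ("sofa", 0.8), ("lamp", 0.5)]


def toy_adjuster(item_id: str, cached_score: float, age_seconds: float) -> float:
    return 0.0 if item_id in SOLD_ITEMS else cached_score
```

In this sketch, a request that arrives while the entry is still fresh never calls `full_ranker`; only the adjuster runs, which is where capacity and latency savings would come from. In a real system the adjuster would presumably be a small learned model rather than a hand-written rule.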

Supplementary Material

MP4 File (recsys.mp4)
As recommendation systems adopt more sophisticated machine learning models and scale up to serve more users, it becomes important to optimize computational cost (capacity) and runtime latency alongside the traditional objective of user engagement. Caching and reusing recommendations is a common technique for reducing capacity and latency. However, the standard caching approach has a large negative impact on engagement. To overcome this challenge, we present an approach that jointly optimizes capacity, latency and engagement. We propose a smart caching system that includes a lightweight ML model to refresh the cached ranking scores, achieving significant capacity savings without impacting ranking quality. To further optimize latency, we introduce a prefetching technique that leverages the smart cache. In production deployment on Facebook Marketplace, capacity was reduced by 50% and p75 latency by 35%. While Facebook Marketplace is used as a case study, the approach is applicable to most other recommendation systems as well.

Information

Published In

RecSys '21: Proceedings of the 15th ACM Conference on Recommender Systems
September 2021
883 pages
ISBN:9781450384582
DOI:10.1145/3460231
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. caching
  2. multi-objective optimization
  3. transfer learning

Qualifiers

  • Abstract
  • Research
  • Refereed limited

Conference

RecSys '21: Fifteenth ACM Conference on Recommender Systems
September 27 - October 1, 2021
Amsterdam, Netherlands

Acceptance Rates

Overall Acceptance Rate 254 of 1,295 submissions, 20%

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 445
  • Downloads (Last 12 months): 20
  • Downloads (Last 6 weeks): 3

Reflects downloads up to 21 Jan 2025
