DOI: 10.1145/3575693.3575718
research-article
Public Access

EVStore: Storage and Caching Capabilities for Scaling Embedding Tables in Deep Recommendation Systems

Published: 30 January 2023

ABSTRACT

Modern recommendation systems, primarily driven by deep-learning models, depend on fast model inference to be useful. To tackle the sparsity in the input space, particularly for categorical variables, such inferences are made by storing increasingly large embedding vector (EV) tables in memory. A core challenge is that the inference operation has an all-or-nothing property: each inference requires multiple EV table lookups, and if any one of those memory accesses is slow, the whole inference request is slow. In this paper, we design, implement, and evaluate EVStore, a 3-layer EV table lookup system that harnesses both structural regularity in inference operations and domain-specific approximations to provide optimized caching, yielding up to 23% and 27% reductions in average and p90 latency, respectively, while quadrupling throughput at a 0.2% loss in accuracy. Finally, we show that at a minor cost in accuracy, EVStore can reduce Deep Recommendation System (DRS) memory usage by up to 94%, yielding potentially enormous savings for these costly, pervasive systems.
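The all-or-nothing property described in the abstract can be illustrated with a small back-of-the-envelope sketch. This is not EVStore's implementation. It assumes that lookups are issued in parallel, that each lookup hits the fast tier independently with the same probability, and that the table count and latency values are hypothetical numbers chosen only for illustration.

```python
def all_hit_probability(hit_rate: float, lookups: int) -> float:
    """Chance that every lookup in one inference hits the fast tier.

    Assumes independent lookups with an identical per-lookup hit rate,
    which is a simplification made for this sketch.
    """
    return hit_rate ** lookups


def inference_latency(lookup_latencies_us: list[float]) -> float:
    """Latency of the lookup phase when lookups run in parallel.

    The request cannot proceed until the slowest lookup returns,
    so a single slow access determines the result.
    """
    return max(lookup_latencies_us)


if __name__ == "__main__":
    tables = 26        # hypothetical number of EV tables per inference
    hit_rate = 0.99    # hypothetical per-lookup hit rate in fast memory

    # Even a 99% per-lookup hit rate leaves roughly 0.77 probability
    # that all 26 lookups are fast under the independence assumption.
    print(f"P(all lookups fast) = {all_hit_probability(hit_rate, tables):.3f}")

    # Hypothetical latencies in microseconds: 25 fast hits and one slow miss.
    latencies = [1.0] * (tables - 1) + [100.0]
    print(f"lookup-phase latency = {inference_latency(latencies)} us")
```

Under these assumptions, the per-request chance of avoiding every slow access falls off exponentially with the number of lookups, which is why per-lookup hit rate alone understates the effect on inference latency.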

Published in

ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2
January 2023, 947 pages
ISBN: 9781450399166
DOI: 10.1145/3575693

Copyright © 2023 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 535 of 2,713 submissions, 20%
