EVStore: Storage and Caching Capabilities for Scaling Embedding Tables in Deep Recommendation Systems

Authors:
Daniar H. Kurniawan, University of Chicago, USA
Ruipu Wang, Beijing University of Technology, China
Kahfi S. Zulkifli, Bandung Institute of Technology, Indonesia
Fandi A. Wiranata, Bandung Institute of Technology, Indonesia
John Bent, Seagate Technology, USA
Ymir Vigfusson, Emory University, USA
Haryadi S. Gunawi, University of Chicago, USA

ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, January 2023, Pages 281–294. https://doi.org/10.1145/3575693.3575718

Published: 30 January 2023

ABSTRACT

Modern recommendation systems, primarily driven by deep-learning models, depend on fast model inference to be useful. To handle sparsity in the input space, particularly for categorical variables, these systems keep increasingly large embedding vector (EV) tables in memory. A core challenge is that inference has an all-or-nothing property: each inference requires multiple EV table lookups, and if any one memory access is slow, the whole inference request is slow. In this paper, we design, implement, and evaluate EVStore, a 3-layer EV table lookup system that harnesses both structural regularity in inference operations and domain-specific approximations to provide optimized caching, yielding up to 23% and 27% reductions in average and p90 latency, respectively, while quadrupling throughput at a 0.2% loss in accuracy. Finally, we show that, at a minor cost in accuracy, EVStore can reduce Deep Recommendation System (DRS) memory usage by up to 94%, yielding potentially enormous savings for these costly, pervasive systems.
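
The all-or-nothing property described in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper. It assumes, purely for illustration, that lookups are independent, share a single per-lookup cache hit rate, and are issued in parallel; the function names and the example numbers (a 90% hit rate, 20 lookups, the latency values) are hypothetical.

```python
def all_hit_probability(hit_rate: float, lookups: int) -> float:
    """Probability that every lookup of one inference hits the cache.

    Assumption (not from the paper): lookups are independent and share
    the same per-lookup hit rate.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be in [0, 1]")
    if lookups < 0:
        raise ValueError("lookups must be non-negative")
    return hit_rate ** lookups


def inference_latency(lookup_latencies):
    """Latency of one inference whose lookups run in parallel.

    Assumption (not from the paper): the request completes only when
    the slowest lookup returns, so the maximum dominates.
    """
    return max(lookup_latencies)


# Hypothetical numbers: a 90% per-lookup hit rate over 20 lookups.
p_fast = all_hit_probability(0.9, 20)
print(f"chance that all 20 lookups hit: {p_fast:.3f}")

# One slow lookup (5.0, in arbitrary time units) sets the latency.
print(inference_latency([0.1, 0.1, 5.0]))
```

Under these assumptions, a per-lookup hit rate that looks high still leaves most multi-lookup requests with at least one miss, which is why a single slow access matters so much for the request as a whole.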

Published in
ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2
January 2023
947 pages
ISBN: 9781450399166
DOI: 10.1145/3575693
General Chair: Tor M. Aamodt, University of British Columbia, Canada
Program Chairs: Natalie Enright Jerger, University of Toronto, Canada; Michael Swift, University of Wisconsin-Madison, USA
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Caching systems
Deep learning
Inference systems
Performance
Recommendation Systems
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 535 of 2,713 submissions, 20%