iMARS: an in-memory-computing architecture for recommendation systems

Authors: Ann Franchesca Laguna, Michael Niemier, X. Sharon Hu

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference

Pages 463 - 468

https://doi.org/10.1145/3489517.3530478

Published: 23 August 2022

Abstract

Recommendation systems (RecSys) suggest items to users by predicting their preferences from historical data. A typical RecSys handles large embedding tables and many embedding-table-related operations. The memory size and bandwidth of conventional computer architectures restrict RecSys performance. This work proposes iMARS, an in-memory-computing (IMC) architecture for accelerating the filtering and ranking stages of deep-neural-network-based RecSys. iMARS leverages IMC-friendly embedding tables implemented inside a ferroelectric-FET-based IMC fabric. Circuit-level and system-level evaluations show that, for the MovieLens dataset, iMARS achieves a 16.8x improvement in end-to-end latency and a 713x improvement in energy compared to its GPU counterpart.
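
To make the workload in the abstract concrete, the following is a minimal software sketch of a generic two-stage RecSys pipeline: an embedding-table lookup with sum pooling, a filtering stage that shortlists items by similarity, and a ranking stage that reorders the shortlist. It is not the iMARS design and does not model any IMC circuit. The function names, the dot-product similarity, the toy tables, and `toy_score` are all assumptions made for illustration.

```python
# Illustrative sketch only: a generic filter-then-rank pipeline in plain Python.
# All names, data, and scoring choices here are hypothetical, not from the paper.

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def lookup_and_pool(table, indices):
    """Embedding-table operation: gather the rows in `indices` and sum them."""
    dim = len(table[0])
    pooled = [0.0] * dim
    for i in indices:
        for d in range(dim):
            pooled[d] += table[i][d]
    return pooled

def filter_candidates(user_vec, item_table, k):
    """Filtering stage (assumed): keep the k items most similar to the user."""
    order = sorted(range(len(item_table)),
                   key=lambda i: dot(user_vec, item_table[i]),
                   reverse=True)
    return order[:k]

def rank_candidates(user_vec, item_table, candidates, score_fn):
    """Ranking stage (assumed): reorder the shortlist with a finer score."""
    return sorted(candidates,
                  key=lambda i: score_fn(user_vec, item_table[i]),
                  reverse=True)

def toy_score(user_vec, item_vec):
    """Hypothetical stand-in for a learned ranking model."""
    return dot(user_vec, item_vec) - 1.5 * dot(item_vec, item_vec)

# Toy item embedding table: 4 items, 2 dimensions each.
item_table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]
history = [0, 2]  # indices of items the user interacted with

user_vec = lookup_and_pool(item_table, history)
shortlist = filter_candidates(user_vec, item_table, k=2)
ranked = rank_candidates(user_vec, item_table, shortlist, toy_score)
print(user_vec, shortlist, ranked)
```

Every step in this sketch reads rows of the embedding table, which is why table size and memory bandwidth dominate as the tables grow.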

Published In

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference

July 2022

1462 pages

ISBN:9781450391429

DOI:10.1145/3489517

General Chair: Rob Oshana (NXP)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGDA: ACM Special Interest Group on Design Automation
IEEE CEDA

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

DAC '22

Sponsor:

SIGDA

DAC '22: 59th ACM/IEEE Design Automation Conference

July 10 - 14, 2022

San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

