DOI: 10.1145/3649329.3657375
research-article

A High-Throughput Private Inference Engine Based on 3D Stacked Memory

Published: 07 November 2024

Abstract

Fully Homomorphic Encryption (FHE) supports unlimited computation depth, so privacy-enhanced neural network inference can run directly on ciphertext. However, existing FHE architectures are limited by a memory access bottleneck. This work proposes H3, a high-throughput FHE engine for private inference (PI) based on 3D stacked memory. H3 adopts a software-hardware co-design that dynamically adjusts the polynomial decomposition during the PI process, minimizing computation and storage overhead at a fine granularity. Using 3D hybrid bonding, H3 integrates a logic die with a multi-layer embedded DRAM and routes data to the processing unit array through an efficient broadcast mechanism. Implemented in a 28 nm logic process, H3 occupies 192 mm². It achieves 1.36 million LeNet-5 or 920 ResNet-20 private inferences per minute, surpassing existing 7 nm accelerators by 52%. These results indicate that 3D memory is a promising technology for improving FHE performance.
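The per-minute throughput figures above can be restated per second, and as an amortized time per inference, with simple arithmetic. The sketch below uses only the two numbers quoted in the abstract; the function and variable names are illustrative. The amortized time is just the reciprocal of throughput, so it should not be read as the latency of a single inference, which the abstract does not report.

```python
# Throughput figures quoted in the abstract (private inferences per minute).
THROUGHPUT_PER_MINUTE = {"LeNet-5": 1_360_000, "ResNet-20": 920}


def per_second(per_minute: float) -> float:
    """Convert a per-minute throughput into a per-second throughput."""
    return per_minute / 60.0


def amortized_seconds(per_minute: float) -> float:
    """Reciprocal of throughput: average wall-clock seconds per inference.

    This is an amortized figure, not a single-inference latency.
    """
    return 60.0 / per_minute


for model, rate in THROUGHPUT_PER_MINUTE.items():
    print(
        f"{model}: {per_second(rate):,.1f} inferences/s, "
        f"{amortized_seconds(rate) * 1e3:.3f} ms amortized per inference"
    )
```

Under these figures, LeNet-5 works out to roughly 22.7 thousand inferences per second and ResNet-20 to roughly 15 per second.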

Published In

DAC '24: Proceedings of the 61st ACM/IEEE Design Automation Conference
June 2024
2159 pages
ISBN: 9798400706011
DOI: 10.1145/3649329
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. homomorphic encryption
  2. accelerator
  3. private inference

Conference

DAC '24: 61st ACM/IEEE Design Automation Conference
June 23 - 27, 2024
San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
