research-article
DOI: 10.1145/3620666.3651364

Accelerating Multi-Scalar Multiplication for Efficient Zero Knowledge Proofs with Multi-GPU Systems

Published: 27 April 2024

Abstract

A zero-knowledge proof (ZKP) is a cryptographic primitive that allows a statement to be validated without disclosing any sensitive information, and it is foundational in applications such as verifiable outsourcing and digital currency. However, long proof generation times limit its widespread adoption. Even with GPU acceleration, proof generation can still take minutes, with Multi-Scalar Multiplication (MSM) accounting for about 78.2% of the workload. To address this, we present DistMSM, a novel MSM algorithm tailored for distributed multi-GPU systems. At the algorithmic level, DistMSM adapts Pippenger's algorithm to multi-GPU setups, identifying and addressing the bottlenecks that emerge during scaling. At the GPU kernel level, DistMSM introduces an elliptic curve arithmetic kernel tailored for contemporary GPU architectures; the kernel reduces register pressure with two innovative techniques and leverages tensor cores for specific big integer multiplications. Compared to state-of-the-art MSM implementations, DistMSM offers an average 6.39× speedup across various elliptic curves and GPU counts. An MSM task that previously took seconds on a single GPU can now be completed in tens of milliseconds. These results showcase the substantial potential and efficiency of distributed multi-GPU systems in ZKP acceleration.
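For readers unfamiliar with the starting point, the following is a minimal single-threaded sketch of the textbook Pippenger (bucket) method for computing an MSM, that is, the sum of scalars[i] * points[i]. It is not DistMSM and says nothing about how the paper distributes work across GPUs. The names `msm_pippenger` and `add_mod` and the parameters `scalar_bits` and `window_bits` are hypothetical, and the additive group of integers modulo `Q` is used only as a stand-in for an elliptic curve group so that the example stays self-contained.

```python
from typing import Callable, List, TypeVar

G = TypeVar("G")  # group element type (an elliptic curve point in a real prover)


def msm_pippenger(
    scalars: List[int],
    points: List[G],
    add: Callable[[G, G], G],
    zero: G,
    scalar_bits: int,
    window_bits: int,
) -> G:
    """Compute sum(scalars[i] * points[i]) using only group additions."""
    num_windows = (scalar_bits + window_bits - 1) // window_bits
    mask = (1 << window_bits) - 1
    window_sums = []
    for w in range(num_windows):
        # Bucket j accumulates every point whose w-th scalar digit equals j.
        buckets = [zero] * (1 << window_bits)
        for k, p in zip(scalars, points):
            digit = (k >> (w * window_bits)) & mask
            if digit:
                buckets[digit] = add(buckets[digit], p)
        # Running-sum trick: yields sum_j j * buckets[j] without multiplications.
        running = zero
        total = zero
        for j in range(len(buckets) - 1, 0, -1):
            running = add(running, buckets[j])
            total = add(total, running)
        window_sums.append(total)
    # Combine windows from most significant to least significant:
    # double window_bits times, then add the next window's sum.
    result = zero
    for total in reversed(window_sums):
        for _ in range(window_bits):
            result = add(result, result)
        result = add(result, total)
    return result


# Stand-in group (assumption): integers modulo Q under addition.
Q = 2**61 - 1


def add_mod(a: int, b: int) -> int:
    return (a + b) % Q


scalars = [5, 1023, 77, 300]
points = [11, 22, 33, 44]
result = msm_pippenger(scalars, points, add_mod, 0, scalar_bits=10, window_bits=4)
expected = sum(k * p for k, p in zip(scalars, points)) % Q
print(result == expected)
```

In this sketch the windows are independent of one another until the final combination step, and the bucket accumulation within a window is a sum over independent points, which is why the method is a natural candidate for parallel hardware.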

Cited By

  • (2025) Accelerating Number Theoretic Transform with Multi-GPU Systems for Efficient Zero Knowledge Proof. Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, pages 1-14. DOI: 10.1145/3669940.3707241. Online publication date: 3-Feb-2025
  • (2024) zkStream: a Framework for Trustworthy Stream Processing. Proceedings of the 25th International Middleware Conference, pages 252-265. DOI: 10.1145/3652892.3700763. Online publication date: 2-Dec-2024
  • (2024) High-speed batch verification for discrete-logarithm-based signatures via Multi-Scalar Multiplication Algorithm. Journal of Information Security and Applications, 87:C. DOI: 10.1016/j.jisa.2024.103898. Online publication date: 1-Dec-2024

      Published In

      ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3
      April 2024
      1106 pages
      ISBN:9798400703867
      DOI:10.1145/3620666
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. zero knowledge proof
      2. multi-scalar multiplication
      3. multi-GPU systems
      4. Pippenger's algorithm

      Qualifiers

      • Research-article

      Conference

      ASPLOS '24

      Acceptance Rates

      Overall Acceptance Rate 535 of 2,713 submissions, 20%

      Article Metrics

      • Downloads (Last 12 months): 780
      • Downloads (Last 6 weeks): 67
      Reflects downloads up to 19 Feb 2025
