DOI: 10.1145/3620666.3651364
research-article

Accelerating Multi-Scalar Multiplication for Efficient Zero Knowledge Proofs with Multi-GPU Systems

Published: 27 April 2024

ABSTRACT

A zero-knowledge proof (ZKP) is a cryptographic primitive that allows a statement to be validated without disclosing any sensitive information; it is foundational to applications such as verifiable outsourcing and digital currency. However, long proof generation times limit its widespread adoption. Even with GPU acceleration, proof generation can still take minutes, with Multi-Scalar Multiplication (MSM) accounting for about 78.2% of the workload. To address this, we present DistMSM, a novel MSM algorithm tailored for distributed multi-GPU systems. At the algorithmic level, DistMSM adapts Pippenger's algorithm for multi-GPU setups, identifying and addressing the bottlenecks that emerge during scaling. At the GPU kernel level, DistMSM introduces an elliptic curve arithmetic kernel tailored for contemporary GPU architectures. The kernel reduces register pressure with two new techniques and leverages tensor cores for specific big-integer multiplications. Compared to state-of-the-art MSM implementations, DistMSM offers an average 6.39× speedup across various elliptic curves and GPU counts. An MSM task that previously took seconds on a single GPU can now be completed in mere tens of milliseconds. This result showcases the substantial potential and efficiency of distributed multi-GPU systems in ZKP acceleration.
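For readers unfamiliar with the operation the abstract refers to, an MSM computes the sum of k_i · P_i over many scalar/point pairs, and Pippenger's algorithm (the bucket method) does this using only group additions. The following is a minimal single-threaded sketch of the textbook bucket method, not the DistMSM algorithm or its GPU kernels. To stay self-contained it assumes a toy group (integers modulo a prime under addition) as a stand-in for elliptic curve points; the names `pippenger_msm`, `toy_add`, and `TOY_MODULUS` are hypothetical and chosen for illustration only.

```python
# Toy stand-in for an elliptic curve group (an assumption for illustration):
# integers modulo a prime, with ordinary modular addition as the group law.
TOY_MODULUS = 2**61 - 1


def toy_add(a, b):
    """Group addition in the toy group."""
    return (a + b) % TOY_MODULUS


def pippenger_msm(scalars, points, scalar_bits, window_bits, add, identity):
    """Textbook bucket-method MSM: returns sum(k_i * P_i) using only `add`."""
    num_windows = (scalar_bits + window_bits - 1) // window_bits
    mask = (1 << window_bits) - 1
    window_sums = []

    for w in range(num_windows):
        # buckets[d] accumulates every point whose w-th scalar digit equals d.
        buckets = [identity] * (1 << window_bits)
        shift = w * window_bits
        for k, p in zip(scalars, points):
            digit = (k >> shift) & mask
            if digit:
                buckets[digit] = add(buckets[digit], p)

        # Running-sum trick: computes sum(d * buckets[d]) with additions only.
        running = identity
        total = identity
        for d in range(mask, 0, -1):
            running = add(running, buckets[d])
            total = add(total, running)
        window_sums.append(total)

    # Combine windows from most to least significant:
    # result = result * 2^window_bits + window_sum.
    result = identity
    for total in reversed(window_sums):
        for _ in range(window_bits):
            result = add(result, result)
        result = add(result, total)
    return result


if __name__ == "__main__":
    scalars = [5, 300, 65535, 0, 1]
    points = [7, 11, 13, 17, 19]
    expected = sum(k * p for k, p in zip(scalars, points)) % TOY_MODULUS
    got = pippenger_msm(scalars, points, 16, 4, toy_add, 0)
    print(got == expected)
```

In this sketch the per-window bucket accumulation is independent across windows, which is the kind of structure a parallel implementation can exploit; how DistMSM actually partitions the work across GPUs is described in the paper itself, not here.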


      • Published in

        ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3
        April 2024
        1106 pages
        ISBN: 9798400703867
        DOI: 10.1145/3620666

        Copyright © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        Overall Acceptance Rate: 535 of 2,713 submissions, 20%