Abstract
As quantum computers become more affordable and commonplace, existing security systems built on classical cryptographic primitives, such as RSA and Elliptic Curve Cryptography (ECC), will no longer be secure. This has driven interest in post-quantum cryptographic (PQC) schemes, such as those based on lattice-based cryptography (LBC). The potential of LBC is evidenced by the number of lattice-based schemes selected for Round 3 of the NIST PQC Standardization Process. One of them is the Crystals-Dilithium signature scheme, whose security rests on the hardness of module-lattice problems. However, no efficient implementation of the Crystals-Dilithium signature scheme is available. In this article, we therefore present a compact hardware architecture that combines modular multiplication units based on the Karatsuba algorithm with dedicated generators of address sequences and twiddle factors for the number-theoretic transform (NTT). The architecture performs polynomial addition and multiplication under the Dilithium parameter setting with a short clock period. We also propose a fast software/hardware co-design implementation of the Dilithium scheme on a Field Programmable Gate Array (FPGA) that trades off speed against resource utilization. On the Altera DE2-115 platform, our co-design implementation is 11.2 times faster for signing and 7.4 times faster for verification than a pure C implementation running on a Nios-II processor. Compared with a pure C implementation on the ARM Cortex-A9 processor of the ZYNQ-7020 platform, it is approximately 51% faster for signing and 31% faster for verification.
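The polynomial multiplication that the architecture accelerates can be illustrated in software. The following is a minimal Python sketch, not the hardware design described above: it multiplies two polynomials in Z_q[x]/(x^n + 1) using an NTT, with the modulus q = 8380417, degree n = 256, and 512th root of unity 1753 taken from the public Dilithium specification. The function names (`ntt`, `poly_mul_ntt`, `poly_mul_schoolbook`) are hypothetical, and the recursive transform is chosen for readability rather than speed.

```python
# Illustrative sketch only: negacyclic polynomial multiplication via NTT
# with the Dilithium parameters. Function names are hypothetical.
import random

Q = 8380417      # Dilithium modulus, 2**23 - 2**13 + 1
N = 256          # polynomial degree bound
PSI = 1753       # primitive 512th root of unity mod Q (PSI**N == -1 mod Q)


def ntt(a, omega):
    """Recursive radix-2 cyclic NTT of a with primitive len(a)-th root omega."""
    n = len(a)
    if n == 1:
        return a[:]
    omega_sq = omega * omega % Q
    even = ntt(a[0::2], omega_sq)
    odd = ntt(a[1::2], omega_sq)
    out = [0] * n
    w = 1  # twiddle factor omega**k
    for k in range(n // 2):
        t = w * odd[k] % Q
        out[k] = (even[k] + t) % Q            # butterfly, upper output
        out[k + n // 2] = (even[k] - t) % Q   # butterfly, lower output
        w = w * omega % Q
    return out


def poly_mul_ntt(a, b):
    """Compute a * b mod (x^N + 1, Q) using pre- and post-scaling by powers of PSI."""
    omega = PSI * PSI % Q
    # Scaling by PSI**i turns the negacyclic product into a cyclic one.
    psi_pows = [pow(PSI, i, Q) for i in range(N)]
    a_hat = ntt([x * p % Q for x, p in zip(a, psi_pows)], omega)
    b_hat = ntt([x * p % Q for x, p in zip(b, psi_pows)], omega)
    c_hat = [x * y % Q for x, y in zip(a_hat, b_hat)]  # pointwise product
    c = ntt(c_hat, pow(omega, -1, Q))                  # unnormalized inverse NTT
    n_inv = pow(N, -1, Q)
    psi_inv = pow(PSI, -1, Q)
    return [x * n_inv % Q * pow(psi_inv, i, Q) % Q for i, x in enumerate(c)]


def poly_mul_schoolbook(a, b):
    """Reference O(N^2) multiplication mod (x^N + 1, Q), used to check the NTT path."""
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                c[i + j] = (c[i + j] + ai * bj) % Q
            else:  # x^N wraps around to -1
                c[i + j - N] = (c[i + j - N] - ai * bj) % Q
    return c


if __name__ == "__main__":
    rng = random.Random(0)
    a = [rng.randrange(Q) for _ in range(N)]
    b = [rng.randrange(Q) for _ in range(N)]
    print(poly_mul_ntt(a, b) == poly_mul_schoolbook(a, b))
```

The NTT path needs on the order of n log n modular multiplications instead of n², which is why the modular multiplier and the twiddle-factor generation are natural targets for hardware acceleration. The sketch uses Python 3.8+ for `pow(x, -1, Q)`.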