Polynomial Multiplication Architecture with Integrated Modular Reduction for R-LWE Cryptosystems

Journal of Signal Processing Systems

Abstract

The ring learning with errors (R-LWE) problem is the basic building block of many ciphers that resist quantum-computing attacks and of homomorphic encryption schemes that enable computation on encrypted data. The most critical operation in these schemes is the modular multiplication of long polynomials with large coefficients. The complexity of polynomial multiplication can be reduced by the Karatsuba formula. In this work, a new method is proposed to integrate modular reduction into Karatsuba polynomial multiplication. Modular reduction is carried out on the intermediate segment products instead of the final product, which enables more substructure sharing. Moreover, this paper develops a complete architecture for modular polynomial multiplication. Computation scheduling optimizations are proposed to reduce the number of memory accesses and clock cycles needed. By taking advantage of the additional shareable substructures, the proposed scheme reduces the size of the memories, which account for the majority of the silicon area of the modular polynomial multiplier, by 20% and 12.5% when the Karatsuba decomposition factor is 2 and 3, respectively, and achieves shorter latency than prior designs.
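To make the idea of reducing segment products concrete, the following is a minimal software sketch, not the architecture proposed in the paper. It assumes the common R-LWE ring Z_q[x]/(x^n + 1) (the abstract does not state the polynomial modulus), a Karatsuba decomposition factor of 2, and an even n. The function names and the toy parameters are hypothetical. Each of the three segment products is folded into the n-coefficient result as soon as it is available, using x^n = -1, so the full product of degree 2n - 2 is never formed.

```python
def schoolbook(a, b, q):
    """Plain product of two coefficient lists, coefficients reduced mod q."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out


def karatsuba_negacyclic(a, b, q):
    """One-level, 2-way Karatsuba product in Z_q[x]/(x^n + 1).

    Assumption for illustration: the polynomial modulus is x^n + 1 and
    n = len(a) = len(b) is even. Coefficients are listed lowest degree first.
    """
    n = len(a)
    if n != len(b) or n % 2 != 0:
        raise ValueError("inputs must have the same even length")
    m = n // 2
    a0, a1 = a[:m], a[m:]
    b0, b1 = b[:m], b[m:]

    # Three segment products instead of four.
    p0 = schoolbook(a0, b0, q)
    p2 = schoolbook(a1, b1, q)
    sum_a = [(x + y) % q for x, y in zip(a0, a1)]
    sum_b = [(x + y) % q for x, y in zip(b0, b1)]
    p1 = [(s - x - y) % q
          for s, x, y in zip(schoolbook(sum_a, sum_b, q), p0, p2)]

    # Fold each segment product directly into the n-coefficient result.
    # Full product = p0 + p1 * x^m + p2 * x^n, and x^n = -1 in this ring.
    res = [0] * n
    for i in range(2 * m - 1):
        res[i] = (p0[i] - p2[i]) % q          # p2 * x^n becomes -p2
    for i, c in enumerate(p1):
        k = i + m
        if k < n:
            res[k] = (res[k] + c) % q         # low part of p1 stays at x^(i+m)
        else:
            res[k - n] = (res[k - n] - c) % q  # high part wraps with a sign flip
    return res


def negacyclic_reference(a, b, q):
    """Direct O(n^2) product in Z_q[x]/(x^n + 1), used only as a check."""
    n = len(a)
    res = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                res[k] = (res[k] + a[i] * b[j]) % q
            else:
                res[k - n] = (res[k - n] - a[i] * b[j]) % q
    return res


if __name__ == "__main__":
    a, b, q = [1, 2, 3, 4], [5, 6, 7, 8], 17   # toy parameters
    print(karatsuba_negacyclic(a, b, q))
    print(negacyclic_reference(a, b, q))
```

Both calls should print the same coefficient list. A practical design would apply the decomposition recursively and with far larger n and q; this sketch only shows where the reduction can be placed relative to the segment products.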

Author information

Corresponding author

Correspondence to Zheang Huai.

Additional information

This work is supported in part by Semiconductor Research Corporation under contract number 2020-HW-2988.

About this article

Cite this article

Zhang, X., Huai, Z. & Parhi, K.K. Polynomial Multiplication Architecture with Integrated Modular Reduction for R-LWE Cryptosystems. J Sign Process Syst 94, 799–809 (2022). https://doi.org/10.1007/s11265-022-01746-7

  • DOI: https://doi.org/10.1007/s11265-022-01746-7
