Abstract
Modular multiplication of long integers is a key component of elliptic curve cryptography and homomorphic encryption. The multiplication complexity can be reduced by applying the Karatsuba algorithm, which decomposes the operands into shorter segments. Nevertheless, for long numbers, previous designs take many clock cycles to compute the final result by adding the segment products and then carrying out modular reduction. This paper considers Solinas prime moduli and proposes to integrate modular reduction into the segment products computed in the Karatsuba multiplication process. Accordingly, the intermediate results become much shorter, and they can be added simultaneously using a Wallace-tree-based multi-input adder with small area overhead. Moduli of different formats are investigated in this paper. In addition, various optimization schemes are proposed to further reduce the latency and area requirement. Complexity analysis shows that, for multiplication with operands decomposed into 2, 3, and 4 segments and an example modulus, our design achieves on average an 18.5% reduction in latency with a 5.5% increase in area compared to a design that carries out modular reduction after the final result of the multiplication is computed.
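To make the central idea concrete, the following is a minimal software sketch of reducing each Karatsuba segment product modulo a Solinas prime before the products are summed. It is an illustration under stated assumptions, not the hardware architecture proposed in the paper: the modulus is assumed to be the NIST P-192 prime 2^192 - 2^64 - 1, the decomposition is assumed to be 2-way, and the names `fold` and `karatsuba_modmul` are hypothetical.

```python
# Illustrative sketch only; the modulus and function names are assumptions.
K = 192                              # bit length of the assumed modulus
P = (1 << K) - (1 << 64) - 1         # NIST P-192 Solinas prime: 2^192 - 2^64 - 1
MASK = (1 << K) - 1


def fold(x):
    """Reduce a non-negative x modulo P using shifts and additions.

    Because 2^192 is congruent to 2^64 + 1 (mod P), the bits above
    position 192 can be folded back as hi * 2^64 + hi.
    """
    while x >> K:
        hi, lo = x >> K, x & MASK
        x = lo + (hi << 64) + hi
    # Here x < 2^192 < 2 * P, so one conditional subtraction is enough.
    return x - P if x >= P else x


def karatsuba_modmul(a, b, h=K // 2):
    """Compute a * b mod P with a 2-way Karatsuba split at bit h.

    Each shifted segment product is reduced before the final addition,
    so every summand is already shorter than K + 1 bits.
    """
    low_mask = (1 << h) - 1
    a0, a1 = a & low_mask, a >> h
    b0, b1 = b & low_mask, b >> h

    z0 = a0 * b0                                # low segment product
    z2 = a1 * b1                                # high segment product
    z1 = (a0 + a1) * (b0 + b1) - z0 - z2        # middle term, always >= 0

    # Reduce each weighted segment product, then add the short results.
    terms = [fold(z0), fold(z1 << h), fold(z2 << (2 * h))]
    return fold(sum(terms))


if __name__ == "__main__":
    a = 0x1234567890ABCDEF1234567890ABCDEF1234567890ABCDEF
    b = P - 5
    print(karatsuba_modmul(a, b) == (a * b) % P)
```

In this sketch the three reduced terms are simply summed with Python integers; in hardware, short summands of this kind are what a multi-input adder would combine in parallel.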
Funding
This work is supported in part by Semiconductor Research Corporation under contract number 2020-HW-2988.
About this article
Cite this article
Huai, Z., Zhou, J. & Zhang, X. Efficient Hardware Implementation Architectures for Long Integer Modular Multiplication over General Solinas Prime. J Sign Process Syst 94, 1067–1082 (2022). https://doi.org/10.1007/s11265-022-01794-z
DOI: https://doi.org/10.1007/s11265-022-01794-z