Abstract
There are a variety of ways of applying the Karatsuba idea to multi-digit multiplication. These apply particularly well where digits do not use the full word-length of the computer, so that partial products can be safely accumulated without fear of overflow. Here we revisit the “arbitrary degree” version of Karatsuba and show that the cost of this little-known variant has been over-estimated in the past. We also attempt to determine definitively the cross-over point at which Karatsuba outperforms the classic method.
Notes
1 This reference was brought to our attention by an anonymous reviewer of [4].
References
Bernstein, D.J., Chuengsatiansup, C., Lange, T.: Curve41417: Karatsuba revisited. Cryptology ePrint Archive, Report 2014/526. http://eprint.iacr.org/2014/526 (2014)
Bernstein, D.J., Duif, N., Lange, T., Schwabe, P., Yang, B.-Y.: High-speed high-security signatures. Cryptology ePrint Archive, Report 2011/368. http://eprint.iacr.org/2011/368 (2011)
Brent, R., Zimmermann, P.: Modern computer arithmetic. Cambridge University Press, Cambridge (2010)
Granger, R., Scott, M.: Faster ECC over \(\mathbb{F}_{2^{521}-1}\). In: Public-Key Cryptography – PKC 2015, volume 9020 of Lecture Notes in Computer Science, pp 539–553. Springer, Berlin Heidelberg (2015)
Granger, R., Moss, A.: Generalised Mersenne numbers revisited. Math. Comput. 82, 2389–2420 (2013). arXiv:1108.3054
Granlund, T., and the GMP development team: GNU MP: The GNU Multiple Precision Arithmetic Library, 6.1.0 edn. http://gmplib.org/ (2015)
Hamburg, M.: Ed448-Goldilocks, a new elliptic curve. Cryptology ePrint Archive, Report 2015/625. http://eprint.iacr.org/2015/625 (2015)
Khachatrian, G., Kuregian, M., Ispiryan, K., Massey, J.: Faster multiplication of integers for public-key applications. In: Selected Areas in Cryptography, volume 2259 of Lecture Notes in Computer Science, pp 245–254. Springer, Berlin Heidelberg (2001)
Montgomery, P.: Modular multiplication without trial division. Math. Comput. 44(170), 519–521 (1985)
Montgomery, P.: Five, six and seven term Karatsuba-like formulae. IEEE Trans. Comput. 54(3), 362–369 (2005)
Nogami, Y., Saito, A., Morikawa, Y.: Finite extension field with modulus of all-one polynomial and representation of its elements for fast arithmetic operations. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E86-A(9), 2376–2387 (2003)
Weimerskirch, A., Paar, C.: Generalization of the Karatsuba algorithm for efficient implementations. Cryptology ePrint Archive, Report 2006/224. http://eprint.iacr.org/2006/224 (2006)
Zimmermann, P.: Personal communication, January 2015
Acknowledgments
The author would like to thank Rob Granger, Billy Bob Brumley and Paul Zimmermann for helpful comments on an earlier draft of this paper.
Additional information
This article is part of the Topical Collection on Recent Trends in Cryptography.
Appendix
1.1 More results
We carried out further tests on a variety of platforms. In all cases we used the GCC compiler tools. Where the well-known GMP library could be installed, we provide a comparison with its assembly language mpn_mul_basecase() packed-radix SB implementation. Note, however, that the GMP code is only partially unrolled, whereas ours is fully unrolled.
First up is a rather old Intel Core i5 chip running under the Ubuntu OS, and using GCC version 5.2.1 (Table 4).
Next a more modern i5 variant, running on an Apple Mac Mini (Table 5).
Finally results for an old 32-bit Intel Atom processor, using GCC version 4.8.4 (Table 6).
1.2 Example Code
Here we present an example of the loop-unrolled C code for the SB and ADK methods that we used in our tests. In this small example the number of limbs n in x and y is 5. Code for carry propagation is included. In practice this code is automatically generated by a small utility program for any value of n.
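As a rough illustration of the two methods, the following is a minimal sketch and not the generated code used in the tests. It assumes a reduced radix of b = 2^28 with limbs held in `int32_t` and column sums held in `int64_t`; the names `mul_sb`, `mul_adk` and `limbs_ok` are hypothetical. Loops are used instead of full unrolling, and no attempt is made to minimise the number of additions. The ADK routine relies on the identity x_i y_j + x_j y_i = x_i y_i + x_j y_j - (x_i - x_j)(y_i - y_j), so it needs n(n+1)/2 limb multiplications where SB needs n^2.

```c
#include <assert.h>
#include <stdint.h>

#define N 5                                  /* limbs per operand, as in the text */
#define BITS 28                              /* assumed radix b = 2^28 */
#define MASK (((int64_t)1 << BITS) - 1)

/* Returns 1 if every limb lies in [0, b), the precondition for both routines. */
static int limbs_ok(const int32_t a[N])
{
    for (int i = 0; i < N; i++)
        if (a[i] < 0 || a[i] > MASK) return 0;
    return 1;
}

/* Schoolbook (SB): N*N limb multiplications, column by column. */
static void mul_sb(const int32_t x[N], const int32_t y[N], int32_t z[2 * N])
{
    assert(limbs_ok(x) && limbs_ok(y));
    int64_t carry = 0;
    for (int k = 0; k < 2 * N - 1; k++) {
        int64_t t = carry;
        for (int i = 0; i < N; i++) {
            int j = k - i;
            if (j >= 0 && j < N)
                t += (int64_t)x[i] * y[j];
        }
        z[k] = (int32_t)(t & MASK);          /* t mod b */
        carry = t >> BITS;                   /* t div b */
    }
    z[2 * N - 1] = (int32_t)carry;
}

/* Arbitrary-degree Karatsuba (ADK): N diagonal products plus one
 * product per pair i < j, i.e. N*(N+1)/2 limb multiplications. */
static void mul_adk(const int32_t x[N], const int32_t y[N], int32_t z[2 * N])
{
    assert(limbs_ok(x) && limbs_ok(y));
    int64_t d[N];
    for (int i = 0; i < N; i++)
        d[i] = (int64_t)x[i] * y[i];         /* diagonal products x_i * y_i */

    int64_t carry = 0;
    for (int k = 0; k < 2 * N - 1; k++) {
        int64_t t = carry;
        for (int i = 0; i < N; i++) {
            int j = k - i;
            if (j <= i || j >= N) continue;  /* keep only pairs i < j with i + j = k */
            /* equals x_i*y_j + x_j*y_i, using a single new multiplication */
            t += d[i] + d[j] - (int64_t)(x[i] - x[j]) * (y[i] - y[j]);
        }
        if (k % 2 == 0)
            t += d[k / 2];                   /* the lone diagonal term of an even column */
        z[k] = (int32_t)(t & MASK);
        carry = t >> BITS;
    }
    z[2 * N - 1] = (int32_t)carry;
}
```

With 28-bit limbs and n = 5, every column sum stays well below 2^63, which is the headroom the reduced radix is meant to provide. A larger n or a wider radix would need the bounds rechecked.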

1.3 Application to Montgomery’s REDC function
This well-known method carries out reduction modulo \(m\), where field elements are first converted to n-residue form by multiplying them by \(b^{n} \bmod m\), where \(b^{n}\) is larger than, and co-prime to, \(m\). Assume that a product of a pair of n-residues is to be reduced modulo \(m\), and that the value of \(w = -1/m \bmod b\) is precalculated. The following ADK-based method carries out the reduction. This function may be tightly combined with that of algorithm (1) to provide an integrated modular multiplication/squaring function \(z = xy \bmod m\) for n-residues, where each \(z_i\) is processed as soon as it is calculated.

This implementation includes full carry propagation. Observe that divisions and remainders modulo b are carried out using simple shift and masking operations, as b is a power of 2. As is well known, the output of this algorithm may require one extra subtraction of the modulus m to get a fully reduced result. However, in many contexts field elements will not need to be immediately fully reduced. The number of muls and adds compared with a straightforward SB-based implementation is given in Table 7. In some cases the constant w may be equal to 1, which allows some extra saving.
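For orientation, the sketch below shows a plain column-wise (SB-style) REDC in a reduced radix. It is not the ADK-based variant described above, and it is only an illustration under stated assumptions: b = 2^28, n = 5, limbs in `int32_t`, an odd modulus, and an input smaller than m times b^n. The names `redc` and `redc_w` are hypothetical. Division and remainder by b appear as a shift and a mask, and the final subtraction of m is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define N 5                                  /* limbs in the modulus */
#define BITS 28                              /* assumed radix b = 2^28 */
#define MASK (((int64_t)1 << BITS) - 1)

/* w = -1/m mod b, computed from the lowest limb m0 of an odd modulus. */
static int32_t redc_w(int32_t m0)
{
    assert(m0 & 1);
    uint32_t inv = (uint32_t)m0;             /* m0*m0 = 1 mod 8: 3 correct bits */
    for (int i = 0; i < 5; i++)
        inv *= 2u - (uint32_t)m0 * inv;      /* Newton step doubles the correct bits */
    return (int32_t)((0u - inv) & (uint32_t)MASK);
}

/* Given t < m * b^N in 2N normalised limbs, writes z = t * b^(-N) mod m,
 * possibly plus m, so z < 2m. The top limb of z may slightly exceed b. */
static void redc(const int32_t t_in[2 * N], const int32_t m[N], int32_t w,
                 int32_t z[N])
{
    int32_t q[N];
    int64_t carry = 0;

    /* Columns 0..N-1: pick q[k] so that the column sum is divisible by b. */
    for (int k = 0; k < N; k++) {
        int64_t t = carry + t_in[k];
        for (int i = 0; i < k; i++)
            t += (int64_t)q[i] * m[k - i];
        q[k] = (int32_t)(((t & MASK) * w) & MASK);
        t += (int64_t)q[k] * m[0];           /* low BITS bits of t are now zero */
        carry = t >> BITS;
    }

    /* Columns N..2N-2: remaining products of q and m give the output limbs. */
    for (int k = N; k < 2 * N - 1; k++) {
        int64_t t = carry + t_in[k];
        for (int i = k - N + 1; i < N; i++)
            t += (int64_t)q[i] * m[k - i];
        z[k - N] = (int32_t)(t & MASK);
        carry = t >> BITS;
    }
    z[N - 1] = (int32_t)(carry + t_in[2 * N - 1]);   /* left unmasked: z < 2m */
}
```

One easy sanity check is to reduce the value m + c b^n for a small c: every q limb then equals b - 1, and the unreduced output is exactly m + c, which a single subtraction of m brings down to c.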
Cite this article
Scott, M. Missing a trick: Karatsuba variations. Cryptogr. Commun. 10, 5–15 (2018). https://doi.org/10.1007/s12095-017-0217-x