Skip to main content
Log in

Speeding scalar multiplication over binary elliptic curves using the new carry-less multiplication instruction

  • Regular Paper
  • Published:
Journal of Cryptographic Engineering Aims and scope Submit manuscript

Abstract

The availability of a new carry-less multiplication instruction in the latest Intel desktop processors significantly accelerates multiplication in binary fields and hence presents the opportunity for reevaluating algorithms for binary field arithmetic and scalar multiplication over elliptic curves. We describe how to best employ this instruction in field multiplication and the effect on performance of doubling and halving operations. Alternate strategies for implementing inversion and half-trace are examined to restore most of their competitiveness relative to the new multiplier. These improvements in field arithmetic are complemented by a study on serial and parallel approaches for Koblitz and random curves, where parallelization strategies are implemented and compared. The contributions are illustrated with experimental results improving the state-of-the-art performance of halving and doubling-based scalar multiplication on NIST curves at the 112- and 192-bit security levels and a new speed record for side-channel-resistant scalar multiplication in a random curve at the 128-bit security level. The algorithms presented in this work were implemented on Westmere and Sandy Bridge processors, the latest generation Intel microarchitectures.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Ahmadi O., Hankerson D., Rodríguez-Henríquez F.: Parallel formulations of scalar multiplication on Koblitz curves. J. UCS 14(3), 481–504 (2008)

    MathSciNet  MATH  Google Scholar 

  2. Aranha, D.F., López, J., Hankerson, D.: Efficient software implementation of binary field arithmetic using vector instruction sets. In: Abdalla, M., Barreto, P.S.L.M. (eds.) The First International Conference on Cryptology and Information Security (LATINCRYPT 2010). Lecture Notes in Computer Science, vol. 6212, pp. 144–161 (2010)

  3. Avanzi, R.M.: Another look at square roots (and other less common operations) in fields of even characteristic. In: Adams, C.M., Miri, A., Wiener, M.J. (eds.) 14th International Workshop on Selected Areas in Cryptography (SAC 2007). Lecture Notes in Computer Science, vol. 4876, pp. 138–154. Springer (2007)

  4. Bellare, M. (ed.): Advances in Cryptology—CRYPTO 2000. Lecture Notes in Computer Science, vol. 1880. Springer (2000)

  5. Bernstein, D., Lange, T.: Analysis and optimization of elliptic-curve single-scalar multiplication. In: Proceedings 8th International Conference on Finite Fields and Applications (Fq8), vol. 461, pp. 1–20. AMS (2008)

  6. Bernstein, D.J.: Batch Binary Edwards. In: Halevi, S. (ed.) Advances in Cryptology—CRYPTO 2009. Lecture Notes in Computer Science, vol. 5677, pp. 317–336. Springer (2009)

  7. Bernstein, D.J., Lange, T. (eds.) eBACS: ECRYPT Benchmarking of Cryptographic Systems. http://bench.cr.yp.to. Accessed 25 Aug 2011

  8. Beuchat, J.-L., Díaz, J., Mitsunari, S., Okamoto, E., Rodríguez-Henríquez, F., Teruya, T.: High-speed software implementation of the optimal ate pairing over Barreto-Naehrig curves. In: Joye, M., Miyaji, A., Otsuka, A. (eds.) Pairing-Based Cryptography—Pairing 2010. Lecture Notes in Computer Science, vol. 6487, pp. 21–39 (2010)

  9. Blake I.F., Murty V.K., Xu G.: A note on window τ-NAF algorithm. Inf. Process. Lett. 95(5), 496–502 (2005)

    Article  MathSciNet  MATH  Google Scholar 

  10. Bodrato, M.: Towards optimal Toom-Cook multiplication for univariate and multivariate polynomials in characteristic 2 and 0. In: Carlet, C., Sunar, B. (eds.) Arithmetic of Finite Fields (WAIFI 2007). Lecture Notes in Computer Science, vol. 4547, pp. 116–133. Springer (2007)

  11. Bos, J.W., Kleinjung, T., Niederhagen, R., Schwabe, P.: ECC2K-130 on Cell CPUs. In: Bernstein, D.J., Lange, T. (eds.) 3rd International Conference on Cryptology in Africa (AFRICACRYPT 2010). Lecture Notes in Computer Science, vol. 6055, pp. 225–242. Springer (2010)

  12. Comba P.G.: Exponentiation cryptosystems on the IBM PC. IBM Syst. J. 29(4), 526–538 (1990)

    Article  Google Scholar 

  13. Dahmen, E., Okeya, K., Schepers, D.: Affine precomputation with sole inversion in elliptic curve cryptography. In: Pieprzyk, J., Ghodosi, H., Dawson, E. (eds.) Information Security and Privacy (ACISP 2007). Lecture Notes in Computer Science, vol. 4586, pp. 245–258. Springer (2007)

  14. Firasta, N., Buxton, M., Jinbo, P., Nasri, K., Kuo, S.: Intel AVX: new frontiers in performance improvement and energy efficiency. White paper. http://software.intel.com/

  15. Fog, A.: Instruction tables: list of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs. http://www.agner.org/optimize/instruction_tables.pdf. Accessed 01 Mar 2011

  16. Fong K., Hankerson D., López J., Menezes A.: Field inversion and point halving revisited. IEEE Trans. Comput. 53(8), 1047–1059 (2004)

    Article  Google Scholar 

  17. Grabher, P., Großschädl, J., Page, D.: On software parallel implementation of cryptographic pairings. Cryptology ePrint Archive, Report 2008/205. http://eprint.iacr.org/ (2008)

  18. Guajardo J., Paar C.: Itoh-Tsujii inversion in standard basis and its application in cryptography and codes. Des. Codes Cryptogr. 25(2), 207–216 (2002)

    Article  MathSciNet  MATH  Google Scholar 

  19. Gueron, S.: Intel Advanced Encryption Standard (AES) Instructions Set. White paper. http://software.intel.com/

  20. Gueron, S., Kounavis, M.E.: Carry-less multiplication and its usage for computing the GCM mode. White paper. http://software.intel.com/

  21. Hankerson D., Menezes A.J., Vanstone S.: Guide to Elliptic Curve Cryptography. Springer, Secaucus (2004)

    MATH  Google Scholar 

  22. Hu, Z., Longa P., Xu, M.: Implementing 4-dimensional GLV method on GLS elliptic curves with j-invariant 0. Des. Codes Cryptogr. (to appear)

  23. Intel.: Intel SSE4 Programming Reference. Technical Report. http://software.intel.com/

  24. Itoh T., Tsujii S.: A fast algorithm for computing multiplicative inverses in GF (2m) using normal bases. Inf. Comput. 78(3), 171–177 (1988)

    Article  MathSciNet  MATH  Google Scholar 

  25. Järvinen, K., Optimized FPGA-based elliptic curve cryptography processor for high-speed applications. Integr. VLSI J. (to appear)

  26. Järvinen K.U., Skyttä J.: On parallelization of high-speed processors for elliptic curve cryptography. IEEE Trans. VLSI Syst. 16(9), 1162–1175 (2008)

    Article  Google Scholar 

  27. Järvinen K.U., Skyttä J.: Fast point multiplication on Koblitz curves: Parallelization method and implementations. Microprocess. Microsyst. Embedded Hardware Des. 33(2), 106–116 (2009)

    Article  Google Scholar 

  28. Karatsuba, A., Ofman, Y.: Multiplication of many-digital numbers by automatic computers. Doklady Akad. Nauk SSSR 145, 293–294 (1962). Translation in Physics-Doklady 7, 595–596 (1963)

    Google Scholar 

  29. Kim, K.H., Kim, S.I.: A new method for speeding up arithmetic on elliptic curves over binary fields. Cryptology ePrint Archive, Report 2007/181. http://eprint.iacr.org/ (2007)

  30. King, B. Rubin, B.: Improvements to the point halving algorithm. In: Wang, H., Pieprzyk, J., Varadharajan, V. (eds.) 9th Australasian Conference on Information Security and Privacy (ACISP 2004). Lecture Notes in Computer Science, vol. 3108, pp. 262–276. Springer (2004)

  31. Knudsen, E.: Elliptic scalar multiplication using point halving. In: Lam, K., Okamoto, E. (eds.) Advances in Cryptology—ASIACRYPT ’99. Lecture Notes in Computer Science, vol. 1716, pp. 135–149. Springer (1999)

  32. Koblitz, N.: CM-curves with good cryptographic properties. In: Feigenbaum, J. (ed.) Advances in Cryptology—CRYPTO ’91. Lecture Notes in Computer Science, vol. 576, pp. 279–287. Springer (1992)

  33. Lomont, C.: Introduction to Intel advanced vector extensions. Intel Software Network. http://software.intel.com/file/37205 (2011)

  34. Longa, P., Gebotys, C.H.: Efficient techniques for high-speed elliptic curve cryptography. In: Mangard, S., Standaert, F.-X. (eds.) Cryptographic Hardware and Embedded Systems (CHES 2010). Lecture Notes in Computer Science, vol. 6225, pp. 80–94. Springer (2010)

  35. López, J., Dahab, R.: Fast multiplication on elliptic curves over GF (2m)without precomputation. In: Koç, Ç.K., Paar, C. (eds.) First International Workshop on Cryptographic Hardware and Embedded Systems (CHES 99). Lecture Notes in Computer Science, vol. 1717, pp. 316–327. Springer (1999)

  36. López, J., Dahab, R.: High-speed software multiplication in GF(2m). In: Roy, B.K., Okamoto, E. (eds.) 1st International Conference in Cryptology in India (INDOCRYPT 2000). Lecture Notes in Computer Science, vol. 1977, pp. 203–212. Springer (2000)

  37. Lutz, J., Hasan, M.A.: High performance FPGA based elliptic curve cryptographic co-processor. In: International Conference on Information Technology: Coding and Computing (ITCC’04), vol. 2, pp. 486–492. IEEE Computer Society (2004)

  38. Montgomery P.L.: Five, six, and seven-term Karatsuba-like formulae. IEEE Trans. Comput. 54(3), 362–369 (2005)

    Article  MATH  Google Scholar 

  39. National Institute of Standards and Technology (NIST).: Recommended Elliptic Curves for Federal Government Use. NIST Special Publication. http://csrc.nist.gov/csrc/fedstandards.html. Accessed July 1999

  40. Schroeppel, R.: Elliptic curves: Twice as fast! Presentation at the CRYPTO 2000 [4] Rump Session (2000)

  41. Solinas J.A.: Efficient arithmetic on Koblitz curves. Des. Codes Cryptogr. 19(2-3), 195–249 (2000)

    Article  MathSciNet  MATH  Google Scholar 

  42. Taverne, J., Faz-Hernández, A., Aranha, D.F., Rodríguez-Henríquez, F., Hankerson, D., López, J.: Software implementation of binary elliptic curves: impact of the carry-less multiplier on scalar multiplication. In: International Workshop on Cryptographic Hardware and Embedded Systems (CHES 2011). Lecture Notes in Computer Science, vol. 6917, Springer, New York (2011)

  43. Wall, D.W.: Limits of instruction-level parallelism. In: 4th International Conference on Architectural Support for Programming Languages and Operating System (ASPLOS 91), pp. 176–188. ACM, New York (1991)

  44. Wulf W.A., McKee S.A.: Hitting the memory wall: implications of the obvious. SIGARCH Comput. Architect. News 23(1), 20–24 (1995)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Francisco Rodríguez-Henríquez.

Additional information

J. Taverne: This work was performed while the author was visiting CINVESTAV-IPN. D. F. Aranha: A portion of this work was performed while the author was visiting University of Waterloo.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Taverne, J., Faz-Hernández, A., Aranha, D.F. et al. Speeding scalar multiplication over binary elliptic curves using the new carry-less multiplication instruction. J Cryptogr Eng 1, 187–199 (2011). https://doi.org/10.1007/s13389-011-0017-8

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13389-011-0017-8

Keywords

Navigation