A GPU Parallel Implementation of the RSA Private Operation

  • Conference paper
High Performance Computing (CARLA 2016)

Abstract

The RSA private operation tends to be expensive because its computational complexity is cubic in the bit size of the private key. As a consequence, considerable effort has been put into optimizing this operation. In this work, we present a parallel implementation of the RSA private operation using the Single Instruction, Multiple Thread (SIMT) threading model of Graphics Processing Unit (GPU) platforms. The underlying modular arithmetic is performed in the Residue Number System (RNS) representation. By combining these two approaches, we obtain a GPU software library that achieves high-speed timings for the RSA private operation with 1024-, 2048- and 3072-bit secret keys.
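
To make the RNS idea concrete, the following is a minimal Python sketch, not the authors' GPU library. It represents an integer by its residues modulo a few pairwise-coprime moduli, multiplies channel by channel (each channel is independent, which is what makes the representation attractive for SIMT hardware), and recovers the result with the Chinese remainder theorem. The moduli and the function names `to_rns`, `rns_mul` and `from_rns` are assumptions chosen for illustration. The sketch does not cover reduction modulo the RSA modulus, which requires additional machinery.

```python
from math import prod

# Hypothetical pairwise-coprime moduli, chosen only for illustration.
# 251 is prime, 253 = 11 * 23, 255 = 3 * 5 * 17, 256 = 2**8.
MODULI = (251, 253, 255, 256)


def to_rns(x, moduli=MODULI):
    """Represent x by its residue in each channel."""
    return tuple(x % m for m in moduli)


def rns_mul(a, b, moduli=MODULI):
    """Multiply two RNS values; every channel is computed independently."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Recover the integer in [0, M) via the Chinese remainder theorem."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m                      # product of all other moduli
        x += r * Mi * pow(Mi, -1, m)     # pow(..., -1, m) is the inverse mod m
    return x % M


# Usage: the product must stay below M = prod(MODULI) to be recovered exactly.
a, b = 12345, 67890
assert from_rns(rns_mul(to_rns(a), to_rns(b))) == a * b
```

The per-channel loop in `rns_mul` has no carries between channels, so in a GPU setting each residue could in principle be assigned to its own thread.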

Notes

  1. Thread divergence occurs when threads do not execute the same instruction at the same time. It is an important limiting factor in exploiting the parallelism of a program and must therefore be avoided as much as possible.

  2. Henceforth, we assume that the cost of one integer squaring is the same as that of one integer multiplication.

  3. We stress that the fixed-window method requires the precomputation of up to \(2^w\) values.
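
The fixed-window method mentioned in the notes can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the function name `fixed_window_pow`, the default window size `w=4` and the returned operation counters are all hypothetical. The table holds the \(2^w\) precomputed powers, and the counters tally squarings and multiplications separately so that they can be summed under the equal-cost assumption of note 2.

```python
def fixed_window_pow(base, exponent, modulus, w=4):
    """Left-to-right fixed-window modular exponentiation (illustrative).

    Returns (result, squarings, multiplications) for the main loop.
    """
    if exponent < 0 or modulus <= 0 or w < 1:
        raise ValueError("expected exponent >= 0, modulus > 0 and w >= 1")

    # Precompute base^0, base^1, ..., base^(2^w - 1): 2^w table entries.
    table = [1 % modulus]
    for _ in range((1 << w) - 1):
        table.append(table[-1] * base % modulus)

    # Split the exponent into w-bit digits, most significant digit first.
    digits = []
    e = exponent
    while e:
        digits.append(e & ((1 << w) - 1))
        e >>= w
    digits.reverse()

    result = 1 % modulus
    squarings = multiplications = 0
    for d in digits:
        for _ in range(w):               # w squarings per digit
            result = result * result % modulus
            squarings += 1
        # Multiply even when d == 0 (table[0] == 1), so that every digit
        # follows the same sequence of operations. This is a choice made
        # in this sketch, not something stated in the source.
        result = result * table[d] % modulus
        multiplications += 1
    return result, squarings, multiplications


# Usage: compare against Python's built-in modular exponentiation.
r, s, m = fixed_window_pow(7, 0xDEADBEEF, 1000003)
assert r == pow(7, 0xDEADBEEF, 1000003)
```

Always performing the table multiplication costs a few redundant operations, but it keeps the control flow independent of the exponent digits, which is in the spirit of the remark on thread divergence in note 1.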

Author information

Corresponding author

Correspondence to Nareli Cruz-Cortés.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Cruz-Cortés, N., Ochoa-Jiménez, E., Rivera-Zamarripa, L., Rodríguez-Henríquez, F. (2017). A GPU Parallel Implementation of the RSA Private Operation. In: Barrios Hernández, C., Gitler, I., Klapp, J. (eds) High Performance Computing. CARLA 2016. Communications in Computer and Information Science, vol 697. Springer, Cham. https://doi.org/10.1007/978-3-319-57972-6_14

  • DOI: https://doi.org/10.1007/978-3-319-57972-6_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57971-9

  • Online ISBN: 978-3-319-57972-6

  • eBook Packages: Computer Science (R0)