Parallel FDFM Approach for Computing GCDs Using the FPGA

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9573)

Abstract

The main contribution of this paper is an FPGA-targeted architecture, called the hierarchical GCD cluster, that computes the GCDs of all pairs in a set of numbers. It is designed using the FDFM (Few DSP slices and Few Memory blocks) approach and consists of 1408 processors, each equipped with one block RAM and one DSP slice. The processors work in parallel and compute GCDs independently. We measured the performance of the architecture when computing the GCDs of all pairs of RSA moduli. Implementation results show that it takes 0.057 μs per GCD computation of two 1024-bit RSA moduli on a Xilinx Virtex-7 family FPGA, the XC7VX485T-2. This is 6.0 times faster than the best GPU implementation and 500 times faster than a sequential implementation on an Intel Xeon CPU.
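
The all-pairs GCD task described in the abstract can be illustrated with a short software sketch. This is not the hierarchical GCD cluster or any part of the FPGA design; it is a plain sequential Python loop that relies on the standard library's `math.gcd`. The function name `shared_factors` and the toy moduli are hypothetical, chosen only for illustration, and the moduli are far smaller than the 1024-bit values the paper measures.

```python
from itertools import combinations
from math import gcd


def shared_factors(moduli):
    """Return {(i, j): g} for every pair of moduli whose GCD g exceeds 1."""
    hits = {}
    # Visit each unordered pair of indexed moduli exactly once.
    for (i, a), (j, b) in combinations(enumerate(moduli), 2):
        g = gcd(a, b)
        if g > 1:
            # A nontrivial GCD is a factor common to both moduli.
            hits[(i, j)] = g
    return hits


# Toy moduli built from small primes (hypothetical stand-ins for RSA moduli).
# The first and third moduli share the prime 101.
toy_moduli = [101 * 103, 107 * 109, 101 * 113, 127 * 131]
print(shared_factors(toy_moduli))  # → {(0, 2): 101}
```

A set of n moduli has n(n-1)/2 pairs, so the number of GCD computations grows quadratically with the size of the set. Each pair can be processed independently of the others, which is the property that lets the work be spread across many processors.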

Corresponding author

Correspondence to Koji Nakano.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhou, X., Nakano, K., Ito, Y. (2016). Parallel FDFM Approach for Computing GCDs Using the FPGA. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K., Kitowski, J., Wiatr, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2015. Lecture Notes in Computer Science, vol 9573. Springer, Cham. https://doi.org/10.1007/978-3-319-32149-3_23

  • DOI: https://doi.org/10.1007/978-3-319-32149-3_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-32148-6

  • Online ISBN: 978-3-319-32149-3

  • eBook Packages: Computer Science, Computer Science (R0)
