GPU Acceleration of Dense Matrix and Block Operations for Lanczos Method for Systems over Large Prime Finite Field

Zamarashkin, Nikolai; Zheltkov, Dmitry

doi:10.1007/978-3-319-71255-0_2

Part of the book series: Communications in Computer and Information Science (CCIS, volume 793)

Included in the following conference series:

Russian Supercomputing Days

Abstract

GPU-based acceleration of computations with dense matrices and blocks over large prime finite fields is studied. Particular attention is paid to the following algorithms:

multiplication of rectangular \(N \times K\) blocks with \(N \gg K;\)
multiplication of \(N \times K\) blocks by square \(K \times K\) matrices;
LU-decomposition of matrices.
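
The three operations listed above can be illustrated with a minimal CPU-side reference sketch in Python. It is not the GPU implementation studied in the chapter. It assumes that the product of two \(N \times K\) blocks means the \(K \times K\) matrix \(A^T B\) (the abstract does not state the orientation), it uses a small stand-in prime instead of a modulus near \(2^{512}\), it performs LU without pivoting, and all function names are hypothetical.

```python
# Reference sketch of the three operations over Z_p using plain Python integers.
# Assumptions: P is a small stand-in prime (the chapter targets ~2^512 and larger);
# the product of two N x K blocks is taken to be A^T B; all names are illustrative.

P = 2**61 - 1  # a Mersenne prime, used only as a stand-in modulus


def block_t_block(a, b, p=P):
    """K x K product A^T B of two N x K blocks (lists of rows) modulo p."""
    k = len(a[0])
    return [[sum(ra[i] * rb[j] for ra, rb in zip(a, b)) % p for j in range(k)]
            for i in range(k)]


def block_times_matrix(a, m, p=P):
    """N x K block times K x K matrix modulo p; the result is again N x K."""
    k = len(m)
    return [[sum(row[t] * m[t][j] for t in range(k)) % p for j in range(k)]
            for row in a]


def lu_mod_p(m, p=P):
    """Doolittle LU without pivoting: returns (L, U), L unit lower triangular.

    Raises ZeroDivisionError on a zero pivot (row exchanges are not sketched).
    """
    n = len(m)
    u = [[x % p for x in row] for row in m]
    l = [[int(i == j) for j in range(n)] for i in range(n)]
    for c in range(n):
        if u[c][c] == 0:
            raise ZeroDivisionError("zero pivot; pivoting is not sketched here")
        inv = pow(u[c][c], -1, p)  # modular inverse, Python 3.8+
        for r in range(c + 1, n):
            f = u[r][c] * inv % p
            l[r][c] = f
            for j in range(c, n):
                u[r][j] = (u[r][j] - f * u[c][j]) % p
    return l, u
```

Division in the LU step is replaced by multiplication with a modular inverse, which is the main difference from the floating-point case. With \(N \gg K\), the first product reduces two tall blocks to a small \(K \times K\) matrix, while the second keeps the tall shape.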

Several approaches for optimal use of GPU resources are proposed.

An efficiency analysis of the implemented algorithms is provided for prime finite fields with about \(2^{512},\) \(2^{768},\) and \(2^{1024}\) elements, and for GPUs of different computational performance and architecture generations. Numerical experiments confirm the efficiency of the proposed solutions.
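
Field elements of these sizes do not fit into a machine word, so each one has to be stored as several words. The sketch below shows one common way to do this, assuming 32-bit limbs in little-endian order. The limb width, the schoolbook multiplication, and all names are illustrative assumptions and are not taken from the chapter.

```python
# Sketch: multi-word representation of large field elements.
# Assumptions: 32-bit little-endian limbs and schoolbook multiplication;
# names and layout are illustrative, not the chapter's data format.

W = 32                 # assumed limb width in bits
MASK = (1 << W) - 1


def limbs_for(bits):
    """Number of W-bit limbs needed to hold a value of the given bit length."""
    return -(-bits // W)  # ceiling division


def to_limbs(x, n):
    """Split a non-negative integer into n little-endian W-bit limbs."""
    return [(x >> (W * i)) & MASK for i in range(n)]


def from_limbs(limbs):
    """Inverse of to_limbs: reassemble the integer from its limbs."""
    return sum(v << (W * i) for i, v in enumerate(limbs))


def mul_limbs(a, b):
    """Schoolbook product of two limb vectors; returns len(a) + len(b) limbs."""
    out = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            t = out[i + j] + ai * bj + carry  # always fits in 2 * W bits
            out[i + j] = t & MASK
            carry = t >> W
        out[i + len(b)] = carry
    return out
```

Under this assumption, elements of fields with about \(2^{512},\) \(2^{768},\) and \(2^{1024}\) elements occupy 16, 24, and 32 limbs respectively, and a full product has twice as many limbs before reduction modulo the prime.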

The numerical results show that using GPUs accelerates the block operations and extends the range of almost linear parallel scalability of the INM RAS implementation of the Lanczos method. Moreover, a sparse system of size about 2 million, with an average of 82 nonzero elements per row, over a field with about \(2^{512}\) elements will be solved 2 times faster on 128 nodes of the Lomonosov supercomputer when GPUs are used.

Acknowledgments

The work was supported by the Russian Science Foundation, grant 14-11-00806.

Author information

Authors and Affiliations

INM RAS, Gubkina 8, Moscow, Russia
Nikolai Zamarashkin & Dmitry Zheltkov

Corresponding author

Correspondence to Nikolai Zamarashkin.

Editor information

Editors and Affiliations

Research Computing Center (RCC), Moscow State University, Moscow, Russia
Vladimir Voevodin
Moscow State University, Moscow, Russia
Sergey Sobolev

Cite this paper

Zamarashkin, N., Zheltkov, D. (2017). GPU Acceleration of Dense Matrix and Block Operations for Lanczos Method for Systems over Large Prime Finite Field. In: Voevodin, V., Sobolev, S. (eds) Supercomputing. RuSCDays 2017. Communications in Computer and Information Science, vol 793. Springer, Cham. https://doi.org/10.1007/978-3-319-71255-0_2

DOI: https://doi.org/10.1007/978-3-319-71255-0_2
Published: 15 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-71254-3
Online ISBN: 978-3-319-71255-0
eBook Packages: Computer Science, Computer Science (R0)