Abstract
This paper proposes a vectorized, cache-efficient implementation of a floating-point version of the Lenstra-Lenstra-Lovász (LLL) algorithm, a key algorithm in many fields of computer science. We propose a re-arrangement of the data structures in LLL that exposes parallelism and enables vectorization. We show that in one kernel 128-bit SIMD vectorization works better than 256-bit, while in another kernel the opposite holds. In high lattice dimensions, this re-arrangement also renders the implementation more cache-friendly, thereby further increasing performance. For lattices that require exhaustive computation with multi-precision, our floating-point LLL implementation is slightly slower than the implementation in the Number Theory Library (NTL) when not vectorized, but 10% faster when vectorized. For larger lattices, we obtain a speedup of 35% over a non-vectorized implementation.
Notes
- 4. As the number of elements vectorized in the loop decreases, fewer than the 4 elements required for 256-bit SIMD may remain.
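The effect described in this note can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes 64-bit double-precision elements (consistent with 4 elements per 256-bit register), and the helper name `split_trip_count` is hypothetical. It shows how a shrinking loop length splits into full-width vector iterations and leftover scalar elements for 256-bit and 128-bit registers.

```python
# Illustrative sketch only; assumes 64-bit doubles (an assumption, not stated in the source).
ELEMENT_BITS = 64
LANES_256 = 256 // ELEMENT_BITS  # 4 doubles per 256-bit register
LANES_128 = 128 // ELEMENT_BITS  # 2 doubles per 128-bit register


def split_trip_count(n, lanes):
    """Split a loop over n elements into full vector iterations and a scalar remainder.

    Hypothetical helper: returns (full_vector_iterations, leftover_elements).
    """
    return n // lanes, n % lanes


if __name__ == "__main__":
    # As the loop length shrinks, 256-bit SIMD stops being usable once n < 4,
    # while 128-bit SIMD can still process pairs of elements.
    for n in range(6, 0, -1):
        v256, r256 = split_trip_count(n, LANES_256)
        v128, r128 = split_trip_count(n, LANES_128)
        print(f"n={n}: 256-bit -> {v256} vector + {r256} scalar, "
              f"128-bit -> {v128} vector + {r128} scalar")
```

Under these assumptions, a loop with 3 remaining elements cannot fill a single 256-bit register, whereas a 128-bit register still covers two of the three elements in one vector iteration.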
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Mariano, A., Correia, F., Bischof, C. (2017). A Vectorized, Cache Efficient LLL Implementation. In: Dutra, I., Camacho, R., Barbosa, J., Marques, O. (eds) High Performance Computing for Computational Science – VECPAR 2016. VECPAR 2016. Lecture Notes in Computer Science, vol 10150. Springer, Cham. https://doi.org/10.1007/978-3-319-61982-8_16
DOI: https://doi.org/10.1007/978-3-319-61982-8_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61981-1
Online ISBN: 978-3-319-61982-8
eBook Packages: Computer Science (R0)