Abstract
We consider computing tall-skinny QR factorizations on a large-scale parallel machine. We present a realistic performance model and analyze the difference in parallel execution time between Householder QR and TSQR. Our analysis indicates that TSQR may become slower than Householder QR as the number of columns of the target matrix increases. We aim to estimate this difference and to select the faster algorithm by using performance models, an approach that falls under auto-tuning. Numerical experiments on the K computer support our analysis and show that the models successfully identify the faster algorithm.
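To illustrate the two algorithms compared above, the following is a minimal sequential sketch in Python with NumPy. It is not the authors' implementation and does not reproduce their performance model. The names `tsqr_r`, `normalize_signs`, and `num_blocks` are hypothetical, the row blocks merely stand in for the data held by separate processes, and the reduction is done in a single flat stage rather than the tree-shaped reduction typically used in parallel TSQR.

```python
import numpy as np


def tsqr_r(a, num_blocks):
    """Return the R factor of a tall-skinny matrix via a one-stage TSQR.

    Assumption: each row block stands in for the part of the matrix owned
    by one process; no real communication is performed here.
    """
    # Split the rows into blocks (one block per hypothetical process).
    blocks = np.array_split(a, num_blocks, axis=0)
    # Local step: factor each block independently and keep only its small R.
    local_rs = [np.linalg.qr(block, mode="r") for block in blocks]
    # Reduction step: stack the small R factors and factor them once more.
    return np.linalg.qr(np.vstack(local_rs), mode="r")


def normalize_signs(r):
    """Flip row signs so the diagonal is non-negative (R is unique up to this)."""
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return signs[:, None] * r


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((1000, 8))  # tall and skinny: many rows, few columns
    r_tsqr = normalize_signs(tsqr_r(a, num_blocks=4))
    # np.linalg.qr calls LAPACK's Householder-based QR on the whole matrix.
    r_householder = normalize_signs(np.linalg.qr(a, mode="r"))
    print(np.allclose(r_tsqr, r_householder))
```

In this sketch the reduction step factors a stacked matrix whose size depends only on the number of columns and the number of blocks, which gives some intuition for why the relative cost of TSQR can depend on the column count.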
Notes
1. Operated at the RIKEN Advanced Institute for Computational Science.
Acknowledgments
The authors would like to thank the anonymous referees for their valuable comments. The first author appreciates the fruitful discussion with Dr. Mark Hoemmen at iWAPT2014. This research was supported by JST, CREST and used computational resources of the K computer provided by the RIKEN AICS through the HPCI System Research project (Project ID: hp120170).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Fukaya, T., Imamura, T., Yamamoto, Y. (2015). Performance Analysis of the Householder-Type Parallel Tall-Skinny QR Factorizations Toward Automatic Algorithm Selection. In: Daydé, M., Marques, O., Nakajima, K. (eds) High Performance Computing for Computational Science -- VECPAR 2014. VECPAR 2014. Lecture Notes in Computer Science, vol 8969. Springer, Cham. https://doi.org/10.1007/978-3-319-17353-5_23
DOI: https://doi.org/10.1007/978-3-319-17353-5_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17352-8
Online ISBN: 978-3-319-17353-5
eBook Packages: Computer Science, Computer Science (R0)