Abstract
So-called “fast” algorithms exploit the special structure of Toeplitz matrices to, for example, solve a linear system of equations in O(n^2) flops. However, some implementations of classical algorithms that ignore this structure (O(n^3) flops) greatly reduce the time to solution when several cores are available. It is therefore necessary to keep working on “fast” algorithms so that they do not miss out on the benefits of new hardware and software. In this work, we propose a new approach to the Generalized Schur Algorithm, a well-known algorithm for the solution of Toeplitz systems, that works on a Block–Toeplitz matrix. Our algorithm is based on matrix–matrix multiplications, so it can exploit an efficient implementation of this operation where one exists. It also uses the thread-level parallelism of multicore processors to decrease execution time.
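To make the O(n^2) claim concrete, the following is a minimal sketch of the classical scalar Schur algorithm for a symmetric positive definite Toeplitz matrix, written in plain Python. It is not the block, matrix–matrix based variant proposed in this work, and it makes no use of threads; the function names `schur_cholesky` and `solve_spd_toeplitz` are hypothetical and chosen only for illustration. The sketch assumes the matrix is given by its first row `t` and is positive definite.

```python
import math


def schur_cholesky(t):
    """Return upper-triangular R with R^T R = T, where T is the symmetric
    positive definite Toeplitz matrix whose first row is t.

    Only a two-row generator (u, v) is updated, so the cost is O(n^2) flops
    instead of the O(n^3) of a dense Cholesky factorization.
    """
    n = len(t)
    s = math.sqrt(t[0])
    u = [x / s for x in t]          # first generator row
    v = [0.0] + u[1:]               # second generator row (proper form)
    R = [[0.0] * n for _ in range(n)]
    for k in range(n):
        R[k][k:] = u[k:]            # row k of the Cholesky factor
        if k == n - 1:
            break
        # Shift the first generator row one position to the right.
        u[k + 1:] = u[k:n - 1]
        u[k] = 0.0
        # Hyperbolic rotation that zeroes v[k + 1] against u[k + 1].
        rho = v[k + 1] / u[k + 1]
        c = math.sqrt(1.0 - rho * rho)  # raises ValueError if T is not SPD
        for j in range(k + 1, n):
            u[j], v[j] = (u[j] - rho * v[j]) / c, (v[j] - rho * u[j]) / c
    return R


def solve_spd_toeplitz(t, b):
    """Solve T x = b using the factor from schur_cholesky (hypothetical helper)."""
    n = len(t)
    R = schur_cholesky(t)
    # Forward substitution: R^T y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(R[j][i] * y[j] for j in range(i))) / R[i][i]
    # Back substitution: R x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))) / R[i][i]
    return x


if __name__ == "__main__":
    t = [4.0, 2.0, 1.0]             # first row of a 3x3 SPD Toeplitz matrix
    b = [11.0, 16.0, 17.0]          # equals T times [1, 2, 3]
    print(solve_spd_toeplitz(t, b))
```

Each step touches only the two generator rows, which is where the O(n^2) cost comes from. That same property is why such algorithms are dominated by vector operations rather than matrix–matrix products, which is the limitation the abstract refers to.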






Acknowledgements
PROMETEO/2009/013, Generalitat Valenciana. Projects TEC2009-13741, TIN2010-14971 and TIN2011-15734-E of the Ministerio Español de Ciencia e Innovación, and TEC2012-38142-C04 of the Ministerio Español de Economía y Competitividad.
Cite this article
Alonso, P., Argüelles, D., Ranilla, J. et al. A multicore solution to Block–Toeplitz linear systems of equations. J Supercomput 65, 999–1009 (2013). https://doi.org/10.1007/s11227-012-0824-4