Abstract
This paper proposes ILIB, a parallel numerical library that provides auto-tuning facilities for selecting among calculation kernels, inter-processor communication methods, and loop-unrolling depths. This auto-tuning methodology benefits both the usability and the performance of the library. Indeed, the performance evaluation shows that automatic tuning (auto-correction) of these parameters is crucial to attaining high performance. The parameter sets selected by the auto-tuner also yield several kinds of knowledge that are important for producing highly efficient programs, and this knowledge should help in developing other high-performance programs in general.
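As a rough illustration of the kind of parameter selection the abstract describes, the following sketch times every combination in a small search space and keeps the fastest one. It is not ILIB's implementation: the function names (`measure`, `auto_tune`), the option labels in `SEARCH_SPACE`, and the exhaustive search strategy are all assumptions made for this example.

```python
import time
from itertools import product


def measure(run, repeats=3, clock=time.perf_counter):
    """Return the shortest elapsed time of run() over `repeats` trials."""
    best = float("inf")
    for _ in range(repeats):
        start = clock()
        run()
        best = min(best, clock() - start)
    return best


def auto_tune(search_space, cost):
    """Evaluate cost(params) for every combination in search_space.

    Returns (best_params, table), where table maps each tuple of
    parameter values to its measured cost. The table is the raw
    material for the "knowledge" an auto-tuner can expose.
    """
    names = list(search_space)
    table = {}
    for values in product(*(search_space[name] for name in names)):
        params = dict(zip(names, values))
        table[values] = cost(params)
    best_values = min(table, key=table.get)
    return dict(zip(names, best_values)), table


# Hypothetical search space; the three axes follow the abstract, but the
# option labels are placeholders, not ILIB's actual parameter names.
SEARCH_SPACE = {
    "kernel": ["row_oriented", "column_oriented"],
    "communication": ["blocking", "nonblocking"],
    "unroll": [1, 2, 4, 8],
}
```

In practice the `cost` argument would wrap a real kernel, for example `lambda p: measure(lambda: solve(p))` with a user-supplied `solve`. Keeping the full cost table rather than only the winner is what lets the tuning results be inspected afterwards.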
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kuroda, H., Katagiri, T., Kanada, Y. (2002). Knowledge Discovery in Auto-tuning Parallel Numerical Library. In: Arikawa, S., Shinohara, A. (eds) Progress in Discovery Science. Lecture Notes in Computer Science, vol 2281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45884-0_48
DOI: https://doi.org/10.1007/3-540-45884-0_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43338-5
Online ISBN: 978-3-540-45884-5
eBook Packages: Springer Book Archive