Knowledge Discovery in Auto-tuning Parallel Numerical Library

  • Chapter
Progress in Discovery Science

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2281)

Abstract

This paper proposes a parallel numerical library called ILIB, which provides auto-tuning facilities with selectable calculation kernels, selectable inter-processor communication methods, and a variable depth of loop unrolling. This auto-tuning methodology improves not only the usability of the library but also its performance. In fact, the performance evaluation shows that automatic tuning, or automatic selection, of these parameters is a crucial technique for attaining high performance. The set of parameters chosen by the auto-tuning process also yields several kinds of knowledge that are important for producing highly efficient programs, and such knowledge can help in developing other high-performance programs in general.
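
To illustrate the kind of parameter selection described above, the following C sketch times a dot-product kernel at two candidate unrolling depths and keeps the fastest variant. This is a minimal, hypothetical example assuming a simple install-time timing loop; the kernel, the candidate depths, and the measurement scheme are assumptions made for illustration and are not taken from the actual ILIB code.

  #include <stdio.h>
  #include <time.h>

  #define N 1000000
  #define TRIALS 10

  static double x[N], y[N];

  /* Candidate kernels: the same dot product with different unrolling depths
     (hypothetical stand-ins for selectable calculation kernels). */
  static double dot_unroll1(void) {
      double s = 0.0;
      for (int i = 0; i < N; i++) s += x[i] * y[i];
      return s;
  }

  static double dot_unroll4(void) {
      double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
      int i;
      for (i = 0; i + 3 < N; i += 4) {
          s0 += x[i]   * y[i];
          s1 += x[i+1] * y[i+1];
          s2 += x[i+2] * y[i+2];
          s3 += x[i+3] * y[i+3];
      }
      for (; i < N; i++) s0 += x[i] * y[i];   /* remainder loop */
      return s0 + s1 + s2 + s3;
  }

  typedef double (*kernel_t)(void);

  /* Time one kernel over several trials and return the best (minimum) time. */
  static double time_kernel(kernel_t k) {
      double best = 1e30;
      for (int t = 0; t < TRIALS; t++) {
          clock_t start = clock();
          volatile double r = k();   /* volatile keeps the call from being optimized away */
          (void)r;
          double sec = (double)(clock() - start) / CLOCKS_PER_SEC;
          if (sec < best) best = sec;
      }
      return best;
  }

  int main(void) {
      for (int i = 0; i < N; i++) { x[i] = 1.0; y[i] = 2.0; }

      kernel_t kernels[] = { dot_unroll1, dot_unroll4 };
      const char *names[] = { "unroll=1", "unroll=4" };
      int best = 0;
      double best_time = 1e30;

      /* Auto-tuning step: measure every candidate and keep the fastest one. */
      for (int k = 0; k < 2; k++) {
          double t = time_kernel(kernels[k]);
          printf("%s: %.6f s\n", names[k], t);
          if (t < best_time) { best_time = t; best = k; }
      }
      printf("selected kernel: %s\n", names[best]);
      return 0;
  }

The same measure-and-select strategy extends to the other tuning dimensions mentioned in the abstract, such as the choice of communication method between processors, by timing each candidate on the target machine and recording the winner.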

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kuroda, H., Katagiri, T., Kanada, Y. (2002). Knowledge Discovery in Auto-tuning Parallel Numerical Library. In: Arikawa, S., Shinohara, A. (eds) Progress in Discovery Science. Lecture Notes in Computer Science (LNAI), vol 2281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45884-0_48

  • DOI: https://doi.org/10.1007/3-540-45884-0_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43338-5

  • Online ISBN: 978-3-540-45884-5

  • eBook Packages: Springer Book Archive
