Abstract
Dense linear algebra kernels such as matrix multiplication have been used as benchmarks to evaluate the effectiveness of many automated compiler optimizations. However, few studies have looked at applying these transformations collectively and parameterizing them for external search. In this paper, we take a detailed look at the optimization space of three dense linear algebra kernels. We use a transformation scripting language (POET) to implement each kernel-level optimization as applied by ATLAS. We then extensively parameterize these optimizations from the perspective of a general-purpose compiler and use a stand-alone empirical search engine to explore the optimization space using several different search strategies. Our exploration of the search space reveals key interactions among several transformations that compilers must consider in order to approach the level of efficiency obtained through manual tuning of kernels.
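The separation described above, a kernel whose transformations are exposed as parameters and a stand-alone search that explores them empirically, can be illustrated with a minimal sketch. It is not POET or ATLAS code: the function names, the choice of three loop-tiling factors as the only parameters, and the two search strategies (exhaustive and random sampling) are all assumptions made for illustration.

```python
import itertools
import random
import time


def blocked_matmul(a, b, n, tile_i, tile_j, tile_k):
    """Multiply two n x n matrices (lists of lists) with a tiled loop nest.

    tile_i, tile_j and tile_k are hypothetical tuning parameters: the
    blocking factors applied to the i, j and k loops respectively.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile_i):
        for jj in range(0, n, tile_j):
            for kk in range(0, n, tile_k):
                # Clamp each tile so sizes that do not divide n still work.
                for i in range(ii, min(ii + tile_i, n)):
                    row_a, row_c = a[i], c[i]
                    for j in range(jj, min(jj + tile_j, n)):
                        acc = row_c[j]
                        for k in range(kk, min(kk + tile_k, n)):
                            acc += row_a[k] * b[k][j]
                        row_c[j] = acc
    return c


def time_config(config, a, b, n, repeats=2):
    """Return the best wall-clock time of the kernel under one configuration."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        blocked_matmul(a, b, n, *config)
        best = min(best, time.perf_counter() - start)
    return best


def exhaustive_search(space, measure):
    """Evaluate every configuration and return the one with the lowest cost."""
    return min(space, key=measure)


def random_search(space, measure, budget, seed=0):
    """Evaluate a random sample of at most `budget` configurations."""
    rng = random.Random(seed)
    sample = rng.sample(space, min(budget, len(space)))
    return min(sample, key=measure)


if __name__ == "__main__":
    n = 32
    rng = random.Random(1)
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]

    # The search space is the cross product of candidate tile sizes, so the
    # search can observe how the three parameters interact.
    space = list(itertools.product([4, 8, 32], repeat=3))

    def measure(config):
        return time_config(config, a, b, n)

    print("exhaustive:", exhaustive_search(space, measure))
    print("random (budget 8):", random_search(space, measure, budget=8))
```

Because the search functions only see a list of configurations and a cost function, the same driver could be pointed at a different kernel or a different set of parameters without modification. Timings from an interpreted sketch like this say nothing about real kernel performance; the point is only the structure of the tuning loop.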
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yi, Q., Qasem, A. (2008). Exploring the Optimization Space of Dense Linear Algebra Kernels. In: Amaral, J.N. (eds) Languages and Compilers for Parallel Computing. LCPC 2008. Lecture Notes in Computer Science, vol 5335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89740-8_24
DOI: https://doi.org/10.1007/978-3-540-89740-8_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89739-2
Online ISBN: 978-3-540-89740-8
eBook Packages: Computer Science, Computer Science (R0)