DOI:10.1145/1772954.1772983

Parameterized tiling revisited

Published: 24 April 2010

Abstract

Tiling, a key transformation for optimizing programs, has been widely studied in the literature. Parameterized tiled code is important for auto-tuning systems because they often execute a large number of runs with dynamically varied tile sizes. Previous work on tiled code generation has addressed parameterized tiling in the sequential context, and the parallel case with fixed compile-time constants for tile sizes. In this paper, we revisit the problem of generating tiled code with parametric tile sizes. We develop a systematic approach to formulating tiling transformations through manipulation of linear inequalities, and a novel approach to overcoming the fundamental obstacle that previous approaches faced in generating parallel parameterized tiled code. To the best of our knowledge, the approach proposed in this paper is the first compile-time solution to the problem of parallel parameterized code generation for affine imperfectly nested loops. Experimental results demonstrate the effectiveness of the implemented system.
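
To make the term concrete, the following is a minimal sketch of what parameterized tiled code can look like in the simplest setting: a sequential, perfectly nested, rectangular loop. It is not the code generation scheme proposed in the paper, which targets the parallel case and imperfectly nested affine loops. The function name `tiled_sum`, the helper `min_long`, and the parameter names `ti` and `tj` are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical helper: the smaller of two long values. */
static long min_long(long a, long b) { return a < b ? a : b; }

/*
 * Sum an n-by-m row-major array, visiting its points tile by tile.
 * The tile sizes ti and tj are ordinary run-time arguments, so the same
 * compiled code can be rerun with different tile sizes, for example by
 * an auto-tuner. Assumption: rectangular iteration space, sequential.
 */
double tiled_sum(const double *a, long n, long m, long ti, long tj)
{
    assert(ti >= 1 && tj >= 1);  /* a zero tile size would never advance */

    double sum = 0.0;
    for (long it = 0; it < n; it += ti) {          /* tile origins along i */
        for (long jt = 0; jt < m; jt += tj) {      /* tile origins along j */
            /* Points inside one tile; min_long clips partial tiles
               at the boundary of the iteration space. */
            for (long i = it; i < min_long(it + ti, n); i++) {
                for (long j = jt; j < min_long(jt + tj, m); j++) {
                    sum += a[i * m + j];
                }
            }
        }
    }
    return sum;
}
```

Because the tile sizes appear only as loop strides and bounds, calling `tiled_sum(a, n, m, 32, 64)` and then `tiled_sum(a, n, m, 8, 8)` exercises two tilings without recompiling; only the order in which points are visited changes.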

Published In

CGO '10: Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
April 2010
300 pages
ISBN:9781605586359
DOI:10.1145/1772954
Sponsors

In-Cooperation

  • IEEE CS uArch

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. code generation
  2. compile-time optimization
  3. tiling

Qualifiers

  • Research-article

Conference

CGO '10

Acceptance Rates

Overall Acceptance Rate 312 of 1,061 submissions, 29%
