DOI: 10.1145/3489517.3530610
research-article

Optimizing parallel PREM compilation over nested loop structures

Published: 23 August 2022

Abstract

We consider automatic parallelization of a computational kernel executed according to the PRedictable Execution Model (PREM), where each thread is divided into execution and memory phases. We target a scratchpad-based architecture, where memory phases are executed by a dedicated DMA component. We employ data analysis and loop tiling to split the kernel execution into segments, and schedule them based on a DAG representation of data and execution dependencies. Our main observation is that properly selecting tile sizes is key to optimizing the makespan of the kernel. We thus propose a heuristic that efficiently searches for optimized tile sizes and core assignments over deeply nested loops, and demonstrate its applicability and performance compared to the state of the art in PREM compilation using the PolyBench-NN benchmark suite.
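To illustrate why tile size affects makespan under a memory-phase/execution-phase split, the following is a minimal sketch and not the heuristic proposed in the paper. It assumes a single core, a single DMA engine, a two-deep loop nest, double buffering in the scratchpad, and a toy cost model; the names `tiles`, `makespan`, `best_tile` and the parameters `dma_setup`, `load_cost`, `cycles_per_iter`, and `spm_elems` are all hypothetical. Write-back phases and core assignment are ignored, and the search is exhaustive rather than heuristic.

```python
from itertools import product


def tiles(n, t):
    """Split n iterations into consecutive chunks of at most t iterations."""
    return [min(t, n - start) for start in range(0, n, t)]


def footprint(a, b):
    """Assumed data footprint of an a-by-b tile: a 2D block plus two 1D slices."""
    return a * b + a + b


def makespan(n, m, ti, tj, dma_setup=50, load_cost=1, cycles_per_iter=2):
    """Finish time of an n-by-m loop nest tiled with sizes (ti, tj).

    Each tile is one segment: a memory phase (DMA load) followed by an
    execution phase. With two scratchpad buffers, the DMA may load
    segment k+1 while the core executes segment k.
    """
    exec_end = []  # finish time of each execution phase, in order
    dma_free = 0   # time at which the DMA engine becomes idle
    for a, b in product(tiles(n, ti), tiles(m, tj)):
        load = dma_setup + load_cost * footprint(a, b)
        comp = cycles_per_iter * a * b
        # The buffer for this segment was last used two segments ago.
        buffer_free = exec_end[-2] if len(exec_end) >= 2 else 0
        load_start = max(dma_free, buffer_free)
        dma_free = load_start + load
        # Execution needs the data loaded and the core idle.
        exec_start = max(dma_free, exec_end[-1] if exec_end else 0)
        exec_end.append(exec_start + comp)
    return exec_end[-1]


def best_tile(n, m, spm_elems, **cost):
    """Exhaustively pick (makespan, ti, tj) with the smallest makespan.

    Two buffers of the tile footprint must fit in the scratchpad.
    Returns None if no tile size fits.
    """
    best = None
    for ti, tj in product(range(1, n + 1), range(1, m + 1)):
        if 2 * footprint(ti, tj) > spm_elems:
            continue
        span = makespan(n, m, ti, tj, **cost)
        if best is None or span < best[0]:
            best = (span, ti, tj)
    return best
```

In this toy model, small tiles overlap DMA transfers with computation but pay the fixed DMA setup cost more often, while large tiles amortize the setup cost but serialize loading and execution and may not fit in the scratchpad. The best choice therefore shifts with the cost parameters, which is the kind of trade-off a tile-size search has to navigate.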

Published In

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference
July 2022
1462 pages
ISBN: 9781450391429
DOI: 10.1145/3489517

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

DAC '22: 59th ACM/IEEE Design Automation Conference
July 10 - 14, 2022
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 149
  • Downloads (Last 12 months): 21
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 05 Mar 2025
