Research article. DOI: 10.1145/3640537.3641562

If-Convert as Early as You Must

Published: 20 February 2024

ABSTRACT

Optimizing compilers employ a rich set of transformations that generate highly efficient code for a variety of source languages and target architectures. These transformations typically operate on general control flow constructs which trigger a range of optimization opportunities, such as moving code to less frequently executed paths, and more. Regular loop nests are specifically relevant for accelerating certain domains, leveraging architectural features including vector instructions, hardware-controlled loops and data flows, provided their internal control-flow is eliminated. Compilers typically apply predicating if-conversion late, in their backend, to remove control-flow undesired by the target. Until then, transformations triggered by control-flow constructs that are destined to be removed may end up doing more harm than good. We present an approach that leverages the existing powerful and general optimization flow of LLVM when compiling for targets without control-flow in loops. Rather than trying to teach various transformations how to avoid misoptimizing for such targets, we propose to introduce an aggressive if-conversion pass as early as possible, along with carefully addressing pass-ordering implications. This solution outperforms the traditional compilation flow with only a modest tuning effort, thereby offering a robust and promising compilation approach for branch-restricted targets.
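The transformation at the center of the abstract, if-conversion, replaces a branch in a loop body with a predicate that selects between values. The C sketch below illustrates the idea at source level only. It is not taken from the paper, the function names `clamp_branchy` and `clamp_if_converted` are hypothetical, and a real compiler would perform this rewrite on its intermediate representation rather than on source code.

```c
#include <stddef.h>

/* Branchy form: the loop body contains control flow. */
void clamp_branchy(int *a, size_t n, int limit) {
    for (size_t i = 0; i < n; ++i) {
        if (a[i] > limit) {
            a[i] = limit;
        }
    }
}

/* If-converted form (illustrative): the condition becomes a data value,
   a predicate, that selects between two results. The loop body is now a
   single straight-line block with no internal branch. */
void clamp_if_converted(int *a, size_t n, int limit) {
    for (size_t i = 0; i < n; ++i) {
        int p = a[i] > limit;        /* predicate: 0 or 1 */
        a[i] = p ? limit : a[i];     /* select; the store is now unconditional */
    }
}
```

Both functions leave the array in the same final state. The second version stores to every element, which is only a valid rewrite when the unconditional store is safe (writable memory, no concurrent readers or writers); a compiler has to establish such conditions before if-converting.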


Published in

CC 2024: Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction
February 2024, 261 pages
ISBN: 9798400705076
DOI: 10.1145/3640537

Copyright © 2024 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
