A Practical and Aggressive Loop Fission Technique

Zhao, Bo; Li, Yingying; Han, Lin; Zhao, Jie; Gao, Wei; Zhao, Rongcai; Yahyapour, Ramin

doi:10.1007/978-3-030-05234-8_9

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11338)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

Loop fission is an effective loop optimization for exploiting fine-grained parallelism, and it is widely used in existing parallelizing compilers. To fully exploit this optimization, we propose and implement a practical and aggressive loop fission technique. First, we present an aggressive dependence graph pruning method that eliminates pseudo dependences caused by the conservativeness of compilers. Second, we introduce a topological-sort-based loop fission algorithm that distributes loops correctly. Finally, to improve the performance of generated programs that are amenable to loop fission, we propose an advanced loop fission strategy. We evaluate these techniques and algorithms experimentally.
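As an illustration of what a topological-sort-based fission step can look like, the following is a minimal Python sketch of the classic textbook approach: group statements that form dependence cycles into strongly connected components, then emit one loop per component in topological order. It is not the algorithm from the paper, whose details are not given in the abstract. The function names `strongly_connected_components` and `distribute`, the statement labels, and the example dependence edges are all assumptions made for this sketch; it needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def strongly_connected_components(graph):
    """Tarjan's algorithm. `graph` maps a node to the set of its successors."""
    index, low = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop its members off the stack.
            members = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                members.append(w)
                if w == v:
                    break
            components.append(frozenset(members))

    for v in graph:
        if v not in index:
            visit(v)
    return components


def distribute(statements, deps):
    """Partition loop-body statements into an ordered list of loops.

    `statements` lists the statement labels in source order.
    `deps` maps a statement to the statements that depend on it
    (hypothetical input format: edges run from source to sink).
    Statements in a dependence cycle stay together in one loop.
    """
    graph = {s: set(deps.get(s, ())) for s in statements}
    components = strongly_connected_components(graph)
    component_of = {s: c for c in components for s in c}

    # Build the predecessor map of the condensed (acyclic) graph.
    preds = {c: set() for c in components}
    for src, sinks in graph.items():
        for dst in sinks:
            if component_of[src] != component_of[dst]:
                preds[component_of[dst]].add(component_of[src])

    order = TopologicalSorter(preds).static_order()
    return [sorted(c, key=statements.index) for c in order]


if __name__ == "__main__":
    # Assumed loop body, for i in 1..n:
    #   S1: a[i] = b[i] + 1
    #   S2: c[i] = a[i] + d[i-1]
    #   S3: d[i] = c[i] * 2
    #   S4: e[i] = d[i] + 1
    stmts = ["S1", "S2", "S3", "S4"]
    deps = {"S1": {"S2"}, "S2": {"S3"}, "S3": {"S2", "S4"}}
    print(distribute(stmts, deps))
    # prints [['S1'], ['S2', 'S3'], ['S4']]
```

In this example S2 and S3 form a cycle through the loop-carried use of `d[i-1]`, so they must remain in the same loop, while S1 and S4 can each be split into their own loop. Removing a dependence edge that turns out to be spurious, which is the role the abstract assigns to dependence graph pruning, would shrink or break such cycles and allow more loops to be separated.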

Author information

Authors and Affiliations

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen, 37077, Göttingen, Germany
Bo Zhao & Ramin Yahyapour
State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, 450001, China
Bo Zhao, Yingying Li, Lin Han, Jie Zhao, Wei Gao & Rongcai Zhao
French Institute for Research in Computer Science and Automation, Rocquencourt, France
Jie Zhao

Corresponding author

Correspondence to Bo Zhao.

Editor information

Editors and Affiliations

Memorial University, St. John’s, NL, Canada
Ting Hu
Wuhan University, Wuhan, China
Feng Wang
University of Electronic Science and Technology of China, Chengdu, China
Hongwei Li
School of Cyber Science and Engineering, Wuhan, China
Qian Wang

About this paper

Cite this paper

Zhao, B. et al. (2018). A Practical and Aggressive Loop Fission Technique. In: Hu, T., Wang, F., Li, H., Wang, Q. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2018. Lecture Notes in Computer Science, vol 11338. Springer, Cham. https://doi.org/10.1007/978-3-030-05234-8_9

DOI: https://doi.org/10.1007/978-3-030-05234-8_9
Published: 30 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05233-1
Online ISBN: 978-3-030-05234-8
eBook Packages: Computer Science, Computer Science (R0)
