Abstract
Loop fission is an effective loop optimization for exploiting fine-grained parallelism and is widely used in existing parallelizing compilers. To fully exploit this optimization, we propose and implement a practical and aggressive loop fission technique. First, we present an aggressive dependence-graph pruning method that eliminates pseudo-dependences introduced by the conservativeness of compilers. Second, we introduce a topological-sort-based loop fission algorithm that distributes loops correctly. Finally, to improve the performance of generated programs that are candidates for loop fission, we propose an advanced loop fission strategy. We evaluate these techniques and algorithms experimentally.
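To make the idea of topological-sort-based loop fission concrete, the following is a minimal sketch of the textbook approach, not the algorithm implemented in this paper. It groups statements that lie on a dependence cycle into one strongly connected component, orders the components topologically, and emits one loop per component. The statement names `S1` to `S4`, the dependence edges, and the helper functions are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def strongly_connected_components(nodes, edges):
    """Kosaraju's algorithm: return the SCCs of a small dependence graph."""
    succ = {n: [] for n in nodes}
    pred = {n: [] for n in nodes}
    for src, dst in edges:
        succ[src].append(dst)
        pred[dst].append(src)

    order, seen = [], set()

    def visit(n):  # first pass: record finishing order
        seen.add(n)
        for m in succ[n]:
            if m not in seen:
                visit(m)
        order.append(n)

    for n in nodes:
        if n not in seen:
            visit(n)

    comps, assigned = [], set()

    def collect(n, comp):  # second pass: walk the reversed graph
        assigned.add(n)
        comp.append(n)
        for m in pred[n]:
            if m not in assigned:
                collect(m, comp)

    for n in reversed(order):
        if n not in assigned:
            comp = []
            collect(n, comp)
            comps.append(sorted(comp))
    return comps


def fission_order(nodes, edges):
    """Return statement groups in an order that respects all dependences."""
    comps = strongly_connected_components(nodes, edges)
    comp_of = {n: i for i, comp in enumerate(comps) for n in comp}
    # TopologicalSorter expects: node -> set of nodes that must come first.
    must_precede = {i: set() for i in range(len(comps))}
    for src, dst in edges:
        if comp_of[src] != comp_of[dst]:
            must_precede[comp_of[dst]].add(comp_of[src])
    return [comps[i] for i in TopologicalSorter(must_precede).static_order()]


def fused(b):
    """Original loop: four statements in one body."""
    n = len(b)
    a, c, d, e = [0] * n, [0] * n, [0] * n, [0] * n
    for i in range(1, n):
        a[i] = b[i] + 1         # S1
        c[i] = a[i] * 2         # S2: reads a[i] written by S1
        d[i] = e[i - 1] + c[i]  # S3: reads c[i] (S2) and e[i-1] (S4, carried)
        e[i] = d[i] * 2         # S4: reads d[i] written by S3
    return a, c, d, e


def fissioned(b):
    """Same computation after fission: one loop per statement group."""
    n = len(b)
    a, c, d, e = [0] * n, [0] * n, [0] * n, [0] * n
    for i in range(1, n):       # group [S1]
        a[i] = b[i] + 1
    for i in range(1, n):       # group [S2]
        c[i] = a[i] * 2
    for i in range(1, n):       # group [S3, S4]: cycle, must stay together
        d[i] = e[i - 1] + c[i]
        e[i] = d[i] * 2
    return a, c, d, e


# Hypothetical dependence edges for the loop body above.
STATEMENTS = ["S1", "S2", "S3", "S4"]
DEPENDENCES = [("S1", "S2"), ("S2", "S3"), ("S3", "S4"), ("S4", "S3")]
groups = fission_order(STATEMENTS, DEPENDENCES)
```

In this sketch `S3` and `S4` form a dependence cycle through `e[i-1]`, so they remain in one loop, while `S1` and `S2` each get their own loop. Removing a spurious edge from `DEPENDENCES`, as a pruning step might, can only make the groups smaller, which is why pruning and fission are naturally combined.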
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, B. et al. (2018). A Practical and Aggressive Loop Fission Technique. In: Hu, T., Wang, F., Li, H., Wang, Q. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2018. Lecture Notes in Computer Science, vol 11338. Springer, Cham. https://doi.org/10.1007/978-3-030-05234-8_9
DOI: https://doi.org/10.1007/978-3-030-05234-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05233-1
Online ISBN: 978-3-030-05234-8
eBook Packages: Computer Science (R0)