Abstract
Loops are one of the main contributors to the execution time of computational programs, and loop parallelization is used to reduce that time. One step in parallelizing compilers is the uniformization of non-uniform loops in the wavefront method, which is considered an NP-hard problem. This paper presents a new method, called UTFLA, for uniformizing non-uniform two-level perfect nested loops using the shuffled frog-leaping algorithm (SFLA). UTFLA combines deterministic and stochastic methods, because the challenge that most loop parallelization methods face, whether older, dynamic, or recent, is high algorithm execution time. UTFLA is designed to find the best results, meaning those with the smallest basic dependency cone size, in the minimum possible time, and it gives more appropriate results in a more reasonable time than other methods.
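To make the term "non-uniform" concrete, the following minimal Python sketch enumerates dependence distance vectors for a two-level perfect nested loop by brute force. It is an illustration only and does not implement UTFLA or any method from the paper; the function name `distance_vectors` and both example subscript patterns are hypothetical choices made for this sketch. A loop is uniform when every dependence has the same distance vector, and non-uniform when the distance changes from iteration to iteration.

```python
from itertools import product


def distance_vectors(n, write_idx, read_idx):
    """Return the set of dependence distance vectors of a two-level nest.

    The nest is assumed to run i, j over 1..n, write A[write_idx(i, j)]
    and read A[read_idx(i, j)] in every iteration (hypothetical setup).
    """
    iterations = list(product(range(1, n + 1), repeat=2))

    # Map each array element to the iterations that write it.
    writers = {}
    for it in iterations:
        writers.setdefault(write_idx(*it), []).append(it)

    distances = set()
    for reader in iterations:
        for writer in writers.get(read_idx(*reader), []):
            if writer == reader:
                continue  # same iteration: not loop-carried
            d = (reader[0] - writer[0], reader[1] - writer[1])
            # Orient every vector to be lexicographically positive.
            if d < (0, 0):
                d = (-d[0], -d[1])
            distances.add(d)
    return distances


# Uniform example: A[i][j] = A[i-1][j-1] + 1
uniform = distance_vectors(
    6, lambda i, j: (i, j), lambda i, j: (i - 1, j - 1)
)

# Non-uniform example: A[2*i][j] = A[i][j] + 1
non_uniform = distance_vectors(
    6, lambda i, j: (2 * i, j), lambda i, j: (i, j)
)

print(sorted(uniform))      # a single distance vector
print(sorted(non_uniform))  # several distinct distance vectors
```

In the second loop the distance grows with `i`, so no single vector describes all dependences. Uniformization, in general terms, looks for a small set of basic vectors whose non-negative integer combinations cover every such distance; in this toy case the single vector (1, 0) would be enough.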
Acknowledgments
The authors wish to thank Islamic Azad University-Langaroud Branch for its financial support based on Grant No. 17/14/10710.
Cite this article
Mahjoub, S., Vojoudi, H. The UTFLA: uniformization of non-uniform iteration spaces in two-level perfect nested loops using SFLA. J Supercomput 72, 2221–2234 (2016). https://doi.org/10.1007/s11227-016-1725-8
DOI: https://doi.org/10.1007/s11227-016-1725-8