The UTFLA: uniformization of non-uniform iteration spaces in two-level perfect nested loops using SFLA

Mahjoub, Shabnam; Vojoudi, Hakimeh

doi:10.1007/s11227-016-1725-8

Published: 11 May 2016

The Journal of Supercomputing, volume 72, pages 2221–2234 (2016)

Abstract

Loops are one of the main contributors to the execution time of computational programs, and loop parallelization is used to reduce that time. One step performed by parallelizing compilers is the uniformization of non-uniform loops in the wavefront method, which is considered an NP-hard problem. This paper presents a new method, called UTFLA, for uniformizing non-uniform two-level perfect nested loops using the frog-leaping algorithm, a combination of deterministic and stochastic methods. The motivation is that most loop parallelization methods, whether older, dynamic, or newer, suffer from high algorithm execution time. UTFLA is designed to find the best results, with the smallest basic dependency cone size, in the shortest possible time, and it gives more appropriate results in a more reasonable time than other methods.
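
To make the term "non-uniform" concrete, the following minimal sketch enumerates dependence distance vectors for two small two-level perfect nested loops by brute force. It is an illustration only and is not taken from the paper: the function name `distance_vectors`, the loop bound of 6, and both sets of array subscripts are assumptions chosen for the example, and the sketch does not implement UTFLA or SFLA. A loop is uniform when every dependence has the same distance vector, and non-uniform when the distance changes from iteration to iteration.

```python
from itertools import product


def distance_vectors(write_idx, read_idx, n):
    """Return the distinct dependence distance vectors of the loop
    for i in 1..n: for j in 1..n: A[write_idx(i, j)] = f(A[read_idx(i, j)]).
    Hypothetical helper for illustration; brute force, small n only."""
    iterations = list(product(range(1, n + 1), repeat=2))

    # Map each written array cell to the iterations that write it.
    writers = {}
    for it in iterations:
        writers.setdefault(write_idx(*it), []).append(it)

    distances = set()
    for i2, j2 in iterations:
        # Every iteration that writes the cell read at (i2, j2) is dependent.
        for i1, j1 in writers.get(read_idx(i2, j2), []):
            d = (i2 - i1, j2 - j1)
            if d == (0, 0):
                continue  # same iteration, not a loop-carried dependence
            if d < (0, 0):
                # Orient the vector from the earlier to the later iteration.
                d = (-d[0], -d[1])
            distances.add(d)
    return distances


# Uniform example (assumed): A[i][j] = A[i-1][j-1]
uniform = distance_vectors(lambda i, j: (i, j),
                           lambda i, j: (i - 1, j - 1), 6)

# Non-uniform example (assumed): A[i+j][3i+j+3] = A[i+j+1][i+2j+4]
non_uniform = distance_vectors(lambda i, j: (i + j, 3 * i + j + 3),
                               lambda i, j: (i + j + 1, i + 2 * j + 4), 6)

print(uniform)              # {(1, 1)}
print(sorted(non_uniform))  # several different vectors
```

In the second loop the distance depends on the iteration, so no single vector describes all dependences. Uniformization, as the abstract describes it, replaces such a varying set with a small set of basic dependence vectors, and the size of the cone they span is the quantity UTFLA tries to minimize.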

Acknowledgments

The authors wish to thank Islamic Azad University-Langaroud Branch for its financial support based on Grant No. 17/14/10710.

Author information

Authors and Affiliations

Department of Computer Engineering, Langaroud Branch, Islamic Azad University, Langaroud, Iran
Shabnam Mahjoub & Hakimeh Vojoudi

Corresponding author

Correspondence to Shabnam Mahjoub.

About this article

Cite this article

Mahjoub, S., Vojoudi, H. The UTFLA: uniformization of non-uniform iteration spaces in two-level perfect nested loops using SFLA. J Supercomput 72, 2221–2234 (2016). https://doi.org/10.1007/s11227-016-1725-8

Issue Date: June 2016
