The UTFLA: uniformization of non-uniform iteration spaces in two-level perfect nested loops using SFLA

The Journal of Supercomputing

Abstract

Loops are a major contributor to the execution time of computational programs, and loop parallelization is used to reduce that time. One step in parallelizing compilers is the uniformization of non-uniform loops in the wavefront method, which is considered an NP-hard problem. This paper presents a new method, called UTFLA, for uniformizing non-uniform two-level perfect nested loops using the shuffled frog-leaping algorithm (SFLA). UTFLA combines deterministic and stochastic methods, because the challenge faced by most loop parallelization methods, whether old, dynamic, or new, is high algorithm execution time. UTFLA is designed to find the best results, meaning the smallest basic dependency cone size, in the minimum possible time, and it gives more appropriate results in a more reasonable time than other methods.
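To make the term "non-uniform" concrete, the following is a minimal illustrative sketch and is not taken from the paper. It brute-forces the dependence distance vectors of a two-level perfect nested loop over a small iteration space. The function name `dependence_distances` and both example subscript pairs are hypothetical choices made for illustration only. A loop whose dependences all share one distance vector is uniform; a loop whose distance vectors vary across the iteration space is non-uniform, which is the case uniformization methods such as UTFLA target.

```python
from itertools import product


def dependence_distances(n, write_idx, read_idx):
    """Return the set of distance vectors between distinct iterations (i, j),
    1 <= i, j <= n, where one iteration writes the array element another reads.

    write_idx and read_idx map an iteration (i, j) to the subscript tuple it
    writes or reads. Both are illustrative assumptions, not the paper's loops.
    """
    iterations = list(product(range(1, n + 1), repeat=2))

    # Index every written element by the iterations that write it.
    writers = {}
    for it in iterations:
        writers.setdefault(write_idx(*it), []).append(it)

    distances = set()
    for it in iterations:
        for src in writers.get(read_idx(*it), []):
            if src == it:
                continue  # same iteration: not a loop-carried dependence
            d = (it[0] - src[0], it[1] - src[1])
            if d < (0, 0):
                # Orient the vector so it is lexicographically positive.
                d = (-d[0], -d[1])
            distances.add(d)
    return distances


# Uniform case: A[i][j] = A[i-1][j-1] always has distance (1, 1).
uniform = dependence_distances(
    4, lambda i, j: (i, j), lambda i, j: (i - 1, j - 1)
)

# Non-uniform case: A[2*i][j] = A[i][2*j] yields several distinct distances.
non_uniform = dependence_distances(
    4, lambda i, j: (2 * i, j), lambda i, j: (i, 2 * j)
)

print(sorted(uniform))
print(sorted(non_uniform))
```

In the second loop the distance depends on where in the iteration space the dependence occurs, so no single vector describes it. Uniformization replaces such a varying set with a small set of basic dependence vectors that covers it; how UTFLA selects those vectors is described in the full article, not in this sketch.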

Acknowledgments

The authors wish to thank Islamic Azad University-Langaroud Branch for its financial support based on Grant No. 17/14/10710.

Author information

Corresponding author

Correspondence to Shabnam Mahjoub.

About this article

Cite this article

Mahjoub, S., Vojoudi, H. The UTFLA: uniformization of non-uniform iteration spaces in two-level perfect nested loops using SFLA. J Supercomput 72, 2221–2234 (2016). https://doi.org/10.1007/s11227-016-1725-8

  • DOI: https://doi.org/10.1007/s11227-016-1725-8
