
Optimal uniformization for non-uniform two-level loops using a hybrid method

The Journal of Supercomputing

This article has been updated

Abstract

This study proposes a method that combines evolutionary and fuzzy approaches to uniformize non-uniform two-level perfectly nested loops. The Shuffled Frog Leaping Algorithm (SFLA) searches for optimal solutions, while three critical factors serve as inputs for determining the basic dependence vectors. Weighting these three factors with fuzzy logic rather than fixed coefficients produces optimal results with high variability and resolves the problem concerning the existence of the main vectors. The algorithm is also designed to handle large amounts of input data, so that parallelizing compilers can apply it automatically and with low complexity. In the evaluation, compared with existing methods, the proposed method produced results very close to optimal in the least time, with the lowest Dependence Cone Size (DCS) and the highest number of input vectors.
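As a rough illustration of what "determining basic dependence vectors" means for a two-level loop, the sketch below checks whether two candidate basic vectors cover a set of dependence vectors. It assumes the usual covering condition from the dependence uniformization literature: every dependence vector must be a non-negative integer combination of the basic vectors. The function names `coefficients` and `covers` are hypothetical, and the sketch does not reproduce the paper's SFLA search, its fuzzy weighting, or its DCS measure.

```python
from fractions import Fraction


def coefficients(d, b1, b2):
    """Solve d = a*b1 + b*b2 exactly with Cramer's rule.

    Returns (a, b) as Fractions, or None when b1 and b2 are parallel.
    """
    det = b1[0] * b2[1] - b1[1] * b2[0]
    if det == 0:
        return None
    a = Fraction(d[0] * b2[1] - d[1] * b2[0], det)
    b = Fraction(b1[0] * d[1] - b1[1] * d[0], det)
    return a, b


def covers(deps, b1, b2):
    """True if every vector in deps is a non-negative integer
    combination of the candidate basic vectors b1 and b2
    (the covering condition assumed for this sketch)."""
    for d in deps:
        coeffs = coefficients(d, b1, b2)
        if coeffs is None:
            return False
        if any(c < 0 or c.denominator != 1 for c in coeffs):
            return False
    return True


if __name__ == "__main__":
    deps = [(1, 2), (2, 1), (3, 3)]
    print(covers(deps, (1, 0), (0, 1)))  # unit vectors as candidates
    print(covers(deps, (1, 1), (0, 1)))  # a narrower candidate pair
```

A search procedure such as SFLA would evaluate many candidate pairs with a check of this kind and prefer the pairs that span a narrower cone, since a narrower cone generally leaves more iterations free to run in parallel.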




Availability of data and materials

Not applicable.

Change history

  • 25 February 2024

    References 43 and 44 have been updated to the correct format.

References

  1. Gunes OG, Sima UA (2010) Parallelization of an ant-based clustering approach. Kybernetes 39:656–677


  2. Ying VA (2019) Scaling sequential code with hardware-software co-design for fine-grain speculative parallelization (Doctoral dissertation, Massachusetts Institute of Technology)

  3. Maramzin A, Vasiladiotis C, Lozano RC, Cole M, Franke B (2019) “It looks like you’re writing a parallel loop”: a machine learning based parallelization assistant. In: AI-SEPS 2019—Proceedings of the 6th ACM SIGPLAN International Workshop on AI-Inspired and Empirical Methods for Software Engineering on Parallel Computing Systems, co-located with SPLASH 2019. New York, New York, USA: Association for Computing Machinery, Inc, pp. 1–10.

  4. Arabnejad H, Bispo J, Cardoso JMP, Barbosa JG (2019) Source-to-source compilation targeting OpenMP-based automatic parallelization of C applications. J Supercomput 76:6753–6785


  5. Liu H, Xu J, Ding L (2019) Coarse-grained automatic parallelization approach for branch nested loop. Int J Performability Eng 15:2871–2881.

  6. Harel R, Mosseri I, Levin H, Alon L or, Rusanovsky M, Oren G (2020) Source-to-source parallelization compilers for scientific shared-memory multi-core and accelerated multiprocessing: analysis, pitfalls, enhancement and potential. Int J Parallel Program 48:1–31.

  7. Iwasawa K (2010) Detecting method of parallelism from nested loops with loop carried data dependences. In: Proceedings—5th international multi-conference on Computing in the Global information technology, ICCGI 2010, pp 287–92.

  8. Bakhtin VA, Krukov VA (2019) DVM-approach to the automation of the development of parallel programs for clusters. Program Comp Softw 45:121–132


  9. Bondhugula U, Hartono A (2008) JR-P of the, 2008 undefined. Pluto: A practical and fully automatic polyhedral program optimization system. researchgate.net.

  10. Bielecki W, Pałkowski M (2016) Tiling arbitrarily nested loops by means of the transitive. Int J Appl Math Comp Sci 26:919–39.

  11. Palkowski M, Bielecki W (2018) Parallel tiled code generation with loop permutation within tiles. Comput Inform 36:1261–1282


  12. Bielecki W, Skotnicki P (2019) Insight into tiles generated by means of a correction technique. J Supercomput 75:2665–2690.

  13. Prema S, Nasre R, Jehadeesan R, Panigrahi BK (2019) A study on popular auto-parallelization frameworks. Concurr Comput 31:e5168.

  14. Bielecki W, Poliwoda M (2021) Automatic Parallel Tiled Code Generation Based on Dependence Approximation. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer, Cham; 12942 LNCS, pp 260–75.

  15. Abdollahi-Kalkhoran A, Lotfi S, Izadkhah H (2022) TEA-SEA: Tiling and scheduling of non-uniform two-level perfectly nested loops using an evolutionary approach. Expert Syst Appl 191:116152


  16. Chen DK, Torrellas J, Yew PC (2002) An efficient algorithm for the run-time parallelization of DOACROSS loops. Institute of Electrical and Electronics Engineers (IEEE), pp 518–27.

  17. Mahjoub S, Lotfi S (2011) The UTLEA: Uniformization of non-uniform iteration spaces in three-level perfect nested loops using an evolutionary algorithm. Communications in Computer and Information Science. In Interna. Berlin, Springer, Heidelberg.

  18. Mahjoub S, Vojoudi H (2016) The UTFLA: uniformization of non-uniform iteration spaces in two-level perfect nested loops using SFLA. J Supercomp, 72.

  19. Tzen TH, Ni LM (1993) Dependence uniformization: a loop parallelization technique. IEEE Trans Parallel Distrib Syst 4:547–558


  20. Shang W, Hodzic E, Chen Z (1996) On uniformization of affine dependence algorithms. IEEE Trans Comput 45(7):827–840


  21. Mahjoub S, Golsorkhtabaramiri M, Salehi Amiri SS (2022) TLP: Towards three-level loop parallelisation. IET Comput Digit Tech, pp 1–13.

  22. Parsa S, Lotfi S (2007) Wave-fronts parallelization and scheduling. In: Innovations’07: 4th International Conference on Innovations in Information Technology, IIT. IEEE Computer Society, pp 382–386.

  23. Searles R, Chandrasekaran S, Joubert W, Hernandez O (2018) Abstractions and directives for adapting wavefront algorithms to future architectures. In: Proceedings of the Platform for Advanced Scientific Computing Conference, PASC 2018. New York, NY, USA: Association for Computing Machinery, Inc; pp 1–10.

  24. Li Y, Schwiebert L (2020) Memory-optimized wavefront parallelism on GPUs. Int J Parallel Program, pp 1–24.

  25. Tarhini AA (2013) Automatic loop parallelization (Doctoral dissertation)

  26. Pean DL, Chen C (2001) ODCHP: A new effective mechanism to maximize parallelism of nested loops with non-uniform dependences. J Syst Softw 56:279–297.

  27. Athanasaki M (2004) EK-12th EC, 2004 undefined. Scheduling of tiled nested loops onto a cluster with a fixed number of SMP nodes. ieeexplore.ieee.org.

  28. Athanasaki M, Sotiropoulos A, Tsoukalas G, Koziris N, Tsanakas P (2005) Hyperplane grouping and pipelined schedules: How to execute tiled loops fast on clusters of SMPs. J Supercomput 33:197–226


  29. Lee Y (2004) Software CC-J of S and, 2005 undefined. A two-level scheduling method: An effective parallelizing technique for uniform nested loops on a dsp multiprocessor. Elsevier, Amsterdam.

  30. Baskaran MM, Vydyanathan N, Bondhugula UK, Ramanujam J, Rountev A, Sadayappan P (2009) Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors. ACM SIGPLAN Notices. Association for Computing Machinery (ACM), vol 44, pp 219–228.

  31. Beletska A, Bielecki W, Cohen A, Palkowski M, Siedlecki K (2011) Coarse-grained loop parallelization: Iteration Space Slicing vs affine transformations. Parallel Comput. North-Holland, pp 479–497.

  32. Hajieskandar A, Lotfi S (2011) Using an evolutionary algorithm for scheduling of two-level nested loops. In: International conference on Information and Electronics Engineering, pp 100–105

  33. Hajieskandar A, Lotfi S, Ghahramanian S (2012) Two level nested loops tiled iteration space scheduling by changing wave-front angles approach. Int J Ad Res Comp Commun Eng, pp 126–133.

  34. Hajieskandar A, Sohafi-Bonab J, Ghahramanian S (2015) Using of cuckoo search algorithm and wave-fronts approach with changing angle for tiled iteration space scheduling of two-level nested loops. In: International conference on Advances in Software, Control and Mechanical Engineering, pp 1–9

  35. Chen DK, Yew PC (1996) On effective execution of nonuniform DOACROSS loops. IEEE Trans Parallel Distrib Syst 7:463–476


  36. Zaafrani A, Ito MR (1994) Parallel region execution of loops with irregular dependencies. In: International conference on Parallel Processing, vol 2. IEEE, pp 11–19

  37. Ju J, Chaudhary V (1997) Unique sets oriented parallelization of loops with non-uniform dependences. Comput J 40:322–339


  38. Cho CK, Lee MH (1997) A loop parallelization method for nested loops with non-uniform dependences. In: Proceedings international conference on Parallel and Distributed Systems. IEEE, pp 314–321

  39. Pean DL, Chen C (2001) An optimized three region partitioning technique to maximize parallelism of nested loops with non-uniform dependences. J Inf Sci Eng 17(3):463–489


  40. Abdi Reyhan Z, Lotfi S, Isazadeh A, Karimpour J (2021) Intra-tile parallelization for two-level perfectly nested loops with non-uniform dependences. Comput J 64(9):1358–1383


  41. Lotfi S, Parsa S (2009) Parallel loop generation and scheduling. J Supercomput 50:289–306


  42. Eusuff M, Lansey K, Pasha F (2006) Shuffled frog-leaping algorithm: A memetic meta-heuristic for discrete optimization. Eng Optim 38(2):129–154.

  43. Mortazavi A (2020) Large-scale structural optimization using a fuzzy reinforced swarm intelligence algorithm. Adv Eng Soft 142:102790.


  44. Mortazavi A (2022) Interactive fuzzy Bayesian search algorithm: A new reinforced swarm intelligence tested on engineering and mathematical optimization problems. Expert Syst Appl 187:115954.


  45. Cheng MY, Prayogo D (2017) A novel fuzzy adaptive teaching–learning-based optimization (FATLBO) for solving structural optimization problems. Eng Comput 33:55–69.


Funding

No funding.

Author information


Contributions

SM, MG, and SSSA wrote the main manuscript text. SM prepared the figures. MG is the corresponding author of this manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Mehdi Golsorkhtabaramiri.

Ethics declarations

Conflict of interest

We declare that the authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Ethical Approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Mahjoub, S., Golsorkhtabaramiri, M. & Amiri, S.S.S. Optimal uniformization for non-uniform two-level loops using a hybrid method. J Supercomput 79, 12791–12814 (2023). https://doi.org/10.1007/s11227-023-05194-3


  • DOI: https://doi.org/10.1007/s11227-023-05194-3
