Using hybrid MPI and OpenMP programming to optimize communications in parallel loop self-scheduling schemes for multicore PC clusters

Abstract

Recently, a series of parallel loop self-scheduling schemes have been proposed, especially for heterogeneous cluster systems. However, these schemes employ the MPI programming model to construct applications without considering whether a computing node has a multicore architecture. As a result, every processor core must communicate directly with the master node to request new tasks, even though the cores on the same node could exchange data through the underlying shared memory. To reduce this communication overhead, in this paper we adopt a hybrid MPI and OpenMP programming model to design two-level parallel loop self-scheduling schemes. At the first level, each computing node runs a single MPI process for inter-node communication. At the second level, each processor core runs an OpenMP thread that executes the iterations assigned to its node. Experimental results show that our method outperforms previous schemes.
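
To make the two-level idea concrete, below is a minimal sketch, not the authors' implementation, of a master-worker self-scheduling loop written with hybrid MPI and OpenMP. The names TOTAL_ITERS, CHUNK_SIZE, and compute_iteration() are hypothetical, and the fixed chunk size is an assumption made for brevity; actual self-scheduling schemes shrink the chunk size over time according to the chosen formula. Rank 0 dispatches chunks of the iteration space to one MPI process per node, and each process spreads its chunk across the node's cores with an OpenMP parallel for, so only one process per node ever contacts the master.

/*
 * Minimal hybrid MPI+OpenMP self-scheduling sketch (illustrative only).
 * Rank 0 = master, hands out [start, end) chunks on request.
 * Every other rank = one process per node; OpenMP threads run the chunk.
 */
#include <mpi.h>
#include <omp.h>

#define TOTAL_ITERS 100000L   /* hypothetical iteration count  */
#define CHUNK_SIZE  1000L     /* hypothetical fixed chunk size */
#define TAG_REQUEST 1
#define TAG_WORK    2

static void compute_iteration(long i) {
    /* placeholder for the real loop body */
    volatile double x = (double)i * 0.5;
    (void)x;
}

int main(int argc, char *argv[]) {
    int provided, rank, size;
    /* only the main thread of each process calls MPI */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                       /* first level: master dispatches chunks */
        long next = 0;
        int active = size - 1;
        while (active > 0) {
            MPI_Status st;
            long dummy;
            MPI_Recv(&dummy, 1, MPI_LONG, MPI_ANY_SOURCE, TAG_REQUEST,
                     MPI_COMM_WORLD, &st);
            long chunk[2];                 /* [start, end); end < 0 means stop */
            if (next < TOTAL_ITERS) {
                chunk[0] = next;
                chunk[1] = (next + CHUNK_SIZE < TOTAL_ITERS)
                               ? next + CHUNK_SIZE : TOTAL_ITERS;
                next = chunk[1];
            } else {
                chunk[0] = chunk[1] = -1;  /* tell this worker to finish */
                active--;
            }
            MPI_Send(chunk, 2, MPI_LONG, st.MPI_SOURCE, TAG_WORK,
                     MPI_COMM_WORLD);
        }
    } else {                               /* one worker process per node */
        for (;;) {
            long dummy = 0, chunk[2];
            MPI_Send(&dummy, 1, MPI_LONG, 0, TAG_REQUEST, MPI_COMM_WORLD);
            MPI_Recv(chunk, 2, MPI_LONG, 0, TAG_WORK, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            if (chunk[1] < 0) break;       /* no more work */

            /* second level: spread the chunk over the node's cores */
            #pragma omp parallel for schedule(dynamic)
            for (long i = chunk[0]; i < chunk[1]; i++)
                compute_iteration(i);
        }
    }

    MPI_Finalize();
    return 0;
}

Built with a command such as mpicc -fopenmp and launched with one MPI process per node, intra-node work distribution is handled entirely by OpenMP over shared memory, which is the communication saving the paper targets.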


Author information

Corresponding author

Correspondence to Chao-Chin Wu.

About this article

Cite this article

Wu, C.-C., Lai, L.-F., Yang, C.-T. et al. Using hybrid MPI and OpenMP programming to optimize communications in parallel loop self-scheduling schemes for multicore PC clusters. J Supercomput 60, 31–61 (2012). https://doi.org/10.1007/s11227-009-0271-z
