
A directive-based MPI code generator for Linux PC clusters


Abstract

Computational demands in scientific fields continue to grow. The advent of cluster systems provides an affordable alternative to expensive conventional supercomputers. However, parallel programming remains difficult for scientists who are not computing specialists. We developed the Directive-Based MPI Code Generator (DMCG), which transforms sequential C programs into parallel message-passing form. We also introduce a loop scheduling method for load balancing that relies on a message-passing analyzer and is simple and straightforward to use. This approach offers a view of loop parallelism quite different from that in the literature, which relies on dependence abstractions. Experimental results show that our approach achieves efficient performance, and that DMCG can serve as a general-purpose tool to help parallel programming beginners construct programs quickly and port existing sequential programs to PC clusters.
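To make the idea concrete, the sketch below illustrates the kind of transformation a directive-based generator performs: a sequential C loop annotated with a directive is rewritten into an MPI program in which each process handles a block of iterations and partial results are combined with a collective reduction. The pragma name, the block-partitioning scheme, and the loop body are illustrative assumptions only, not DMCG's actual directive syntax or generated output.

/* Hypothetical illustration: directive name and generated structure are
 * assumptions, not DMCG's actual syntax or output. */
#include <mpi.h>
#include <stdio.h>

#define N 1000000

int main(int argc, char *argv[])
{
    int rank, size;
    double local_sum = 0.0, global_sum = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Original sequential loop (annotated, e.g., "#pragma dmcg parallel for"):
     *     for (i = 0; i < N; i++) sum += f(i);
     * A generator could replace it with a block-distributed loop like this. */
    int chunk = (N + size - 1) / size;   /* iterations assigned per process   */
    int begin = rank * chunk;            /* first iteration of this rank      */
    int end   = begin + chunk;           /* one past the last iteration       */
    if (end > N) end = N;

    for (int i = begin; i < end; i++)
        local_sum += (double)i * 0.5;    /* stand-in for the real loop body   */

    /* Combine the partial results on rank 0 with a collective reduction. */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0,
               MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum = %f\n", global_sum);

    MPI_Finalize();
    return 0;
}

A static block partition such as this can leave processes idle when iteration costs vary; the loop scheduling method described in the paper addresses that load-balancing problem.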


References

  1. Sterling TL, Salmon J, Becker DJ, Savarese DF (1999) How to build a Beowulf: a guide to the implementation and application of PC clusters, 2nd edn. MIT Press, Cambridge

    Google Scholar 

  2. Wilkinson B, Allen M (1999) Parallel programming: techniques and applications using networked workstations and parallel computers. Prentice Hall, New York

    Google Scholar 

  3. Buyya R (1999) High performance cluster computing: architectures and systems, vol. 1. Prentice Hall, New York

    Google Scholar 

  4. Message passing interface forum. http://www.mpi-forum.org/

  5. PVM—parallel virtual machine. http://www.epm.ornl.gov/pvm/

  6. TOP500 supercomputer sites. http://www.top500.org

  7. Wolfe M (1996) Parallelizing compilers. ACM Comput Surv 28(1):261–262

    Article  MathSciNet  Google Scholar 

  8. Wolfe M (1996) High performance compilers for parallel computing. Addison-Wesley, Reading

    MATH  Google Scholar 

  9. Boulet P, Darte A, Silber G-A, Vivien F (1998) Loop parallelization algorithms: from parallelism extraction to code generation. Parallel Comput 24:421–444

    Article  MATH  MathSciNet  Google Scholar 

  10. Banerjee U (1988) An introduction to a formal theory of dependence analysis. J Supercomput 2(2):133–149

    Article  Google Scholar 

  11. Wolfe M (1989) More iteration space tiling. In: Proceedings of supercomputing, pp 655–664

  12. Bacon DF et al. (1994) Compiler transformations for high-performance computing. ACM Comput Surv 26(4):245–320

    Article  Google Scholar 

  13. Yang CT, Tseng SS, Fan YW, Tsai TK, Hsieh MH, Wu CT (2001) Using knowledge-based systems for research on portable parallelizing compilers. Concurr Comput Pract Exper 13:181–208

    Article  MATH  Google Scholar 

  14. Hummel SF, Schonberg E, Flynn LE (1992) Factoring: a method for scheduling parallel loops. Commun ACM 35(8):90–101

    Article  Google Scholar 

  15. Kruskal CP, Weiss A (1985) Allocating independent subtasks on parallel processors. IEEE Trans Softw Eng 11(10):1001–1016

    Article  Google Scholar 

  16. Polychronopoulos CD, Kuck DJ (1987) Guided self-scheduling: a practical self-scheduling scheme for parallel supercomputers. IEEE Trans Comput 36(12):1425–1439

    Article  Google Scholar 

  17. Tzen TH, Ni LM (1993) Trapezoid self-scheduling: a practical scheduling scheme for parallel compilers. IEEE Trans Parallel Distrib Syst 4(1):87–98

    Article  Google Scholar 

  18. Tang P, Yew PC (1986) Processor self-scheduling for multiple-nested parallel loops. In: International conference on parallel processing, pp 528–535

  19. Li H, Tandri S, Stumm M, Sevcik KC (1993) Locality and loop scheduling on NUMA multiprocessors. In: International conference on parallel processing, vol II, pp 140–147

  20. LAM/MPI parallel computing. http://www.lam-mpi.org/

  21. MPICH—a portable implementation of MPI. http://www-unix.mcs.anl.gov/mpi/mpich/

  22. MPI software technology. http://www.mpi-softtech.com/

  23. McGarvey B, Cicconetti R, Bushyager N, Dalton E, Tentzeris M (2001) Beowulf cluster design for scientific PDE models. In: Proceedings of the 2001 annual Linux showcase, Oakland, CA, November 2001

  24. Sedgewick R (1992) Algorithms in C++. Addison-Wesley, Reading, pp 476–478

    Google Scholar 

  25. Gorlatch S (2002) Message passing without send–receive. Future Gener Comput Syst 18:797–805

    Article  MATH  Google Scholar 

  26. Luecke GR, Raffin B, Coyle JJ (1999) The performance of the MPI collective communication routines for large messages on the Cray T3E600, the Cray Origin 2000, and the IBM SP. J Perform Eval Model Comput Syst, July 1999

  27. Beletsky V, Bagaterenco A, Chemeris A (1995) A package for automatic parallelization of serial C-programs for distributed systems. In: Proceedings of the conference on programming models for massively parallel computers, pp 184–188

  28. Zhang F, D’Hollander EH (1994) Extracting the parallelism in programs with unstructured control statements. In: Proceedings of international conference on parallel and distributed systems. IEEE, New York, pp 264–270

    Chapter  Google Scholar 

  29. The Stanford SUIF compiler group. http://suif.stanford.edu

  30. Di Martino B, Mazzeo A, Mazzoccaa N, Villano U (2001) Parallel program analysis and restructuring by detection of point-to-point interaction patterns and their transformation into collective communication constructs. Sci Comput Program 40:235–263

    Article  MATH  Google Scholar 

  31. Allen JR, Kennedy K (1987) Automatic translation of Fortran programs to vector form. ACM Trans Program Lang Syst 9(4):491–542

    Article  MATH  Google Scholar 

Author information

Correspondence to Chao-Tung Yang.

Cite this article

Yang, C.-T., Lai, K.-C. A directive-based MPI code generator for Linux PC clusters. J Supercomput 50, 177–207 (2009). https://doi.org/10.1007/s11227-008-0258-1
