Startup comparison for message passing libraries with DTM on linux clusters

The Journal of Supercomputing

Abstract

In clusters of heterogeneous systems, message passing libraries (distributed computing tools) are used to harness computing and other resources. A task is submitted to a tool, and the actual execution is carried out on the aggregated network resources. The tool handles scheduling, distribution of subtasks, and gathering of results, along with the required synchronization and message exchange. To do so, it needs initialization and synchronization routines for the submitted task. These tools also provide many other features, such as transparency, fault tolerance, and load balancing. Sometimes not all of these features, or the initialization itself, are required. Tool designers should therefore aim to provide good performance while making initialization and extra features available on request. Initialization routines and special features take time of their own on top of the core distributed computation, which adds to the overall computational cost. In this paper, a dual-purpose tool, Distributed Task Measure (DTM), is implemented. DTM is primarily used to place other distributed computing tools on a performance index by judging their startup and performance. DTM may also be used to achieve macro-level parallelization where the requirements call for it.
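The distinction the abstract draws between startup cost and core computation can be illustrated with a small timing sketch. This is not DTM and not the authors' method; the names `RunTiming` and `time_phases` are hypothetical, and the `init` and `work` callables stand in for whatever initialization and computation a given tool performs.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunTiming:
    startup_s: float  # seconds spent in the initialization phase
    compute_s: float  # seconds spent in the core computation

    @property
    def total_s(self) -> float:
        return self.startup_s + self.compute_s

    @property
    def startup_fraction(self) -> float:
        # Share of the total run time consumed by startup (0.0 for an empty run).
        return self.startup_s / self.total_s if self.total_s > 0 else 0.0


def time_phases(init: Callable[[], object],
                work: Callable[[object], object]) -> RunTiming:
    """Time a hypothetical tool's init phase and its work phase separately."""
    t0 = time.perf_counter()
    ctx = init()          # assumed: returns whatever context the work phase needs
    t1 = time.perf_counter()
    work(ctx)
    t2 = time.perf_counter()
    return RunTiming(startup_s=t1 - t0, compute_s=t2 - t1)
```

For instance, a run with 0.5 s of startup and 1.5 s of computation gives `RunTiming(0.5, 1.5).startup_fraction == 0.25`, meaning a quarter of the wall-clock time went to initialization rather than to the task itself.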



Author information

Corresponding author

Correspondence to Nirved Pandey.

About this article

Cite this article

Pandey, N., Sharma, G.K. Startup comparison for message passing libraries with DTM on linux clusters. J Supercomput 39, 59–72 (2007). https://doi.org/10.1007/s11227-006-0004-5

  • DOI: https://doi.org/10.1007/s11227-006-0004-5
