Abstract
In clusters of heterogeneous systems, message passing libraries (distributed computing tools) are employed to harness computing and other resources. A task is submitted to a tool, and the actual execution is carried out on the aggregated network resources. The tools take care of scheduling, distributing subtasks, and gathering results, along with the synchronization and message exchange these require. They need initialization and synchronization routines for the submitted task. These tools also provide many other features, such as transparency, fault tolerance, and load balancing. Sometimes these features, or the initialization itself, may not all be required. The aim of tool designers should be to provide quality performance, with initialization and extra features supplied on request as add-ons. Initialization routines and special feature provision add their own time on top of the core distributed computation, affecting the overall computational cost. In this paper a two-purpose tool, Distributed Task Measure (DTM), is implemented. DTM is primarily used to place other distributed computing tools on a performance index, judging their startup and performance. DTM may also serve to achieve macro-level parallelization where requirements call for it.
Cite this article
Pandey, N., Sharma, G.K. Startup comparison for message passing libraries with DTM on linux clusters. J Supercomput 39, 59–72 (2007). https://doi.org/10.1007/s11227-006-0004-5
DOI: https://doi.org/10.1007/s11227-006-0004-5