Performance of parallel communication and spawning primitives on a Linux cluster

Cluster Computing

Abstract

The Linux cluster considered in this paper, built from shuttle box XPC nodes with 2 GHz Athlon processors connected by dual Gb Ethernet switches, is relatively easy to construct. While effective as a throughput engine, it may give disappointing results when running explicitly parallel software if poorly performing communication mechanisms and process-spawning methods are selected. This paper carefully compares the implementations of communication and spawning primitives in MPICH-2, openMosix, Linux Remote Procedure Call, forking, and various lower-level communication mechanisms. The test selection compares the provision of both a message-passing library and a single system image software package with direct use of lower-level primitives. The information in the paper will be of interest to those considering the use of one of the well-known packages, those directly writing their own distributed applications, and those constructing a distributed language by layering on top of an existing set of parallel primitives. The results expose a ranking of process-spawning performance and a similar ranking of communication software performance. They also reveal poor performance in certain circumstances, well below the hardware specification, of which developers should be aware. In general, the paper emphasizes the importance of efficient transport software to cluster machines.
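The kind of measurement summarized above can be illustrated with a minimal sketch. This is not the paper's benchmark code: it runs on a single POSIX machine rather than across cluster nodes, and the function names (`time_fork_spawn`, `time_ping_pong`, `recv_exact`), message size, and repeat counts are all assumptions chosen for illustration. It times two of the lower-level primitives named in the abstract, process creation by forking and a round-trip message exchange over a local socket pair.

```python
import os
import socket
import time


def time_fork_spawn(repeats=20):
    """Mean seconds to fork a child that exits at once and to reap it."""
    start = time.perf_counter()
    for _ in range(repeats):
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately, skipping cleanup handlers
        os.waitpid(pid, 0)  # parent: wait for the child to finish
    return (time.perf_counter() - start) / repeats


def recv_exact(sock, nbytes):
    """Read exactly nbytes from a stream socket."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the socket early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def time_ping_pong(message_size=1024, repeats=200):
    """Mean round-trip seconds for a message echoed back by a forked child."""
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: echo every message straight back, then exit.
        parent_sock.close()
        try:
            for _ in range(repeats):
                child_sock.sendall(recv_exact(child_sock, message_size))
        finally:
            os._exit(0)
    # Parent: send a message, wait for the echo, and time the whole loop.
    child_sock.close()
    payload = b"x" * message_size
    start = time.perf_counter()
    for _ in range(repeats):
        parent_sock.sendall(payload)
        recv_exact(parent_sock, message_size)
    elapsed = time.perf_counter() - start
    parent_sock.close()
    os.waitpid(pid, 0)
    return elapsed / repeats


if __name__ == "__main__":
    print(f"fork + wait: {time_fork_spawn() * 1e6:.1f} us per spawn")
    print(f"ping-pong:   {time_ping_pong() * 1e6:.1f} us per round trip")
```

Timing a loop of many repetitions and dividing by the count reduces the influence of timer resolution on each individual measurement. A comparison in the spirit of the paper would repeat the same two measurements with each package's own spawning and messaging calls and across the network between nodes, which this local sketch does not attempt.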

Author information

Corresponding author

Correspondence to Martin Fleury.

Additional information

David J. Johnston has worked as a software engineer in research and development for 20 years, at ICL Ltd. and the Rutherford-Appleton Laboratory, UK. His strengths lie in generating and realizing algorithms for complex systems. His interests include languages and methodologies to shorten the software development process. He has recently completed a Ph.D. at the University of Essex, UK in position identification for augmented reality. He has co-authored a book on Computer Graphics.

Martin Fleury is a Senior Lecturer at the University of Essex, UK, where he was also awarded a Ph.D. in Parallel Image Processing. His first degree was from Oxford University, and he holds an MSc in Astrophysics from the University of London. He is the principal author of a book on parallel computing for embedded systems. He has authored thirty-five journal papers in the last ten years on parallel image and vision processing, performance prediction, real-time systems, reconfigurable computing, software engineering, and video and document compression.

Michael Lincoln has completed an M.Sc. and Ph.D. at the University of Essex, UK in the field of face recognition and face tracking. His work as a Senior Research Officer is concerned with radar control of aircraft landings. The cluster mentioned in the paper was constructed, configured, and commissioned by Michael.

Andrew C. Downton was educated at Southampton University, UK, where he obtained a first class honours degree in Electronic Engineering in 1974, and a Ph.D. in 1982, and where he was also a lecturer. In 1995 he was promoted to a personal Chair at the University of Essex, UK, and in 1999 he became Head of the Department of Electronic Systems Engineering at Essex. His research interests include pattern recognition and image analysis; parallel computer architectures; hardware-software co-design; handwriting recognition; and document analysis. He is a Chartered Engineer and Fellow of the Institution of Electrical Engineers (IEE) and a Senior Member of the IEEE.

About this article

Cite this article

Johnston, D.J., Fleury, M., Lincoln, M. et al. Performance of parallel communication and spawning primitives on a Linux cluster. Cluster Comput 9, 375–384 (2006). https://doi.org/10.1007/s10586-006-0007-2

  • DOI: https://doi.org/10.1007/s10586-006-0007-2
