
Communication Benchmarking and Performance Modelling of MPI Programs on Cluster Computers

The Journal of Supercomputing

Abstract

This paper gives an overview of two related tools that we have developed to provide more accurate measurement and modelling of the performance of message-passing communication and application programs on distributed memory parallel computers. MPIBench uses a very precise, globally synchronised clock to measure the performance of MPI communication routines. It can generate probability distributions of communication times, not just the average values produced by other MPI benchmarks. This provides useful insight into the MPI communication performance of parallel computers, and in particular into how performance is affected by network contention. The Performance Evaluating Virtual Parallel Machine (PEVPM) provides a simple, fast and accurate technique for modelling and predicting the performance of message-passing parallel programs. It uses a virtual parallel machine to simulate the execution of the parallel program. The effects of network contention can be accurately modelled by sampling from the probability distributions generated by MPIBench. These tools are particularly useful on clusters with commodity Ethernet networks, where relatively high latencies, network congestion and TCP problems can significantly affect communication performance, which is difficult to model accurately using other tools. Experiments with example parallel programs demonstrate that PEVPM gives accurate performance predictions on commodity clusters. We also show that modelling communication performance using average times rather than sampling from probability distributions can give misleading results, particularly for programs running on a large number of processors.
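
As a rough illustration of the measurement idea described above (this is not the authors' MPIBench code, and it omits their globally synchronised clock, using a simple barrier and MPI_Wtime instead), the following minimal MPI ping-pong records every round-trip sample so that a histogram or probability distribution of communication times can be built, rather than reporting only a mean:

/* Hypothetical sketch: two-process ping-pong that keeps every timing
 * sample. Run with at least two MPI processes, e.g. mpirun -np 2 ./pingpong */
#include <mpi.h>
#include <stdio.h>

#define REPS      1000
#define MSG_BYTES 1024

int main(int argc, char **argv)
{
    int rank, size;
    char buf[MSG_BYTES] = {0};
    double times[REPS];                       /* per-iteration samples */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "needs at least 2 processes\n");
        MPI_Finalize();
        return 1;
    }

    for (int i = 0; i < REPS; i++) {
        MPI_Barrier(MPI_COMM_WORLD);          /* crude synchronisation point */
        double t0 = MPI_Wtime();
        if (rank == 0) {
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
        times[i] = (MPI_Wtime() - t0) / 2.0;  /* one-way time estimate */
    }

    if (rank == 0)                            /* dump all samples, not just the mean */
        for (int i = 0; i < REPS; i++)
            printf("%.9f\n", times[i]);

    MPI_Finalize();
    return 0;
}

A PEVPM-style prediction would then replay the program's communication pattern and, for each message, draw a service time from such a measured distribution rather than using its mean, which is what allows contention-induced variability to show up in the predicted run time.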

Author information

Corresponding author

Correspondence to D. A. Grove.

Cite this article

Grove, D.A., Coddington, P.D. Communication Benchmarking and Performance Modelling of MPI Programs on Cluster Computers. J Supercomput 34, 201–217 (2005). https://doi.org/10.1007/s11227-005-2340-2
