Debugging point-to-point communication in MPI and PVM

  • Conference paper
Recent Advances in Parallel Virtual Machine and Message Passing Interface (EuroPVM/MPI 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1497)

Abstract

Cyclic debugging of nondeterministic parallel programs requires some form of record and replay, because successive executions may produce different results even when the same input is supplied. The NOndeterministic Program Evaluator (NOPE) is an implementation of record and replay for message-passing systems. During an initial record phase, ordering information about the events that occur is stored in traces, which are then used to enforce an equivalent execution during subsequent replay phases. Compared to other tools, NOPE incurs less overhead in time and space by relying on certain properties of MPI and PVM. The key property is the non-overtaking rule, which simplifies not only tracing and replay but also race condition detection. In addition, an automatic approach to event manipulation allows extensive investigation of nondeterministic behavior.
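
The record and replay idea in the abstract can be illustrated with a small simulation. The sketch below is not NOPE's implementation and makes no MPI or PVM calls; `Network`, `recv_any`, `recv_from`, and `run` are hypothetical names chosen for illustration. It assumes that messages from one sender are delivered in FIFO order (the non-overtaking rule), so recording only the sender of each wildcard receive is enough to reproduce the same receive order later.

```python
import random
from collections import deque


class Network:
    """Toy message layer with one FIFO queue per sender (non-overtaking)."""

    def __init__(self, seed):
        self.channels = {}               # sender id -> deque of payloads
        self.rng = random.Random(seed)   # stands in for timing nondeterminism

    def send(self, sender, payload):
        self.channels.setdefault(sender, deque()).append(payload)

    def recv_any(self):
        """Wildcard receive: any sender with a pending message may win."""
        ready = sorted(s for s, q in self.channels.items() if q)
        sender = self.rng.choice(ready)
        return sender, self.channels[sender].popleft()

    def recv_from(self, sender):
        """Directed receive: deterministic because each channel is FIFO."""
        return sender, self.channels[sender].popleft()


def run(seed, trace=None):
    """Record phase if trace is None, otherwise replay the given trace."""
    net = Network(seed)
    for sender in (1, 2, 3):
        for i in range(2):
            net.send(sender, f"p{sender}-m{i}")

    senders, payloads = [], []
    for step in range(6):
        if trace is None:
            sender, payload = net.recv_any()              # nondeterministic
        else:
            sender, payload = net.recv_from(trace[step])  # forced order
        senders.append(sender)
        payloads.append(payload)
    return senders, payloads


# Record once, then replay under a different seed (different "timing").
trace, original = run(seed=1)
_, replayed = run(seed=99, trace=trace)
assert replayed == original
```

The trace holds one sender id per receive and no message contents or sequence numbers. That is sufficient here only because of the FIFO assumption: once the sender is fixed, the next message on that channel is uniquely determined.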

Editor information

Vassil Alexandrov, Jack Dongarra

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kranzlmüller, D., Volkert, J. (1998). Debugging point-to-point communication in MPI and PVM. In: Alexandrov, V., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 1998. Lecture Notes in Computer Science, vol 1497. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0056584

  • DOI: https://doi.org/10.1007/BFb0056584

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65041-6

  • Online ISBN: 978-3-540-49705-9

  • eBook Packages: Springer Book Archive
