Deadlock prevention in incremental replay of message-passing programs

  • Track C3: Computational Science
  • Conference paper
High-Performance Computing and Networking (HPCN-Europe 1999)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1593))

Abstract

To support incremental replay of message-passing applications, processes must periodically checkpoint and must log some of the messages. The paper shows that known adaptive logging algorithms are likely to introduce deadlocks in replay and presents a new algorithm that prevents deadlocks and achieves better performance.
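As a toy illustration of the setting only (not the algorithm the paper presents), the sketch below shows a process that checkpoints, logs a subset of its incoming messages, and is later replayed from the checkpoint. The names `Process` and `replay_from_checkpoint`, and the use of a running integer sum as application state, are hypothetical choices made for this example. It assumes that a message which was not logged can only be obtained again if its sender is replayed as well, so the replay of this process stops at that point.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Toy process whose application state is a running sum of payloads."""
    name: str
    state: int = 0
    checkpoint: int = 0                       # state saved at the last checkpoint
    log: dict = field(default_factory=dict)   # message id -> payload (logged subset)

    def take_checkpoint(self):
        # Save the current state so replay can restart from here.
        self.checkpoint = self.state

    def receive(self, msg_id, payload, log_it):
        # Only some messages are logged; the choice is made by the caller.
        if log_it:
            self.log[msg_id] = payload
        self.state += payload


def replay_from_checkpoint(proc, received_ids):
    """Re-execute proc from its checkpoint using logged messages only.

    Returns (state_reached, blocked_on). blocked_on is None if every message
    was found in the log, otherwise the id of the first message that would
    have to be re-sent by a replaying sender.
    """
    state = proc.checkpoint
    for msg_id in received_ids:
        if msg_id not in proc.log:
            return state, msg_id
        state += proc.log[msg_id]
    return state, None


p = Process("P0")
p.receive("m1", 5, log_it=False)   # received before the checkpoint
p.take_checkpoint()
p.receive("m2", 3, log_it=True)    # logged, so replay can consume it locally
p.receive("m3", 4, log_it=False)   # not logged
result = replay_from_checkpoint(p, ["m2", "m3"])
print(result)
```

In this sketch the replay consumes `m2` from the log and then stops at `m3`. Which messages to log so that such waits never become circular is the question the adaptive logging algorithms discussed in the paper address.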

Editor information

Peter Sloot, Marian Bubak, Alfons Hoekstra, Bob Hertzberger

Copyright information

© 1999 Springer-Verlag

About this paper

Cite this paper

Zambonelli, F. (1999). Deadlock prevention in incremental replay of message-passing programs. In: Sloot, P., Bubak, M., Hoekstra, A., Hertzberger, B. (eds) High-Performance Computing and Networking. HPCN-Europe 1999. Lecture Notes in Computer Science, vol 1593. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0100620

  • DOI: https://doi.org/10.1007/BFb0100620

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65821-4

  • Online ISBN: 978-3-540-48933-7

  • eBook Packages: Springer Book Archive
