Abstract
To support incremental replay of message-passing applications, processes must periodically checkpoint and must log some of the messages. The paper shows that known adaptive logging algorithms are likely to introduce deadlocks in replay and presents a new algorithm that prevents deadlocks and achieves better performance.
Copyright information
© 1999 Springer-Verlag
Cite this paper
Zambonelli, F. (1999). Deadlock prevention in incremental replay of message-passing programs. In: Sloot, P., Bubak, M., Hoekstra, A., Hertzberger, B. (eds) High-Performance Computing and Networking. HPCN-Europe 1999. Lecture Notes in Computer Science, vol 1593. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0100620
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65821-4
Online ISBN: 978-3-540-48933-7