Abstract
A major source of problems when debugging message-passing programs is the nondeterministic behavior of promiscuous receive and nonblocking test operations. This nondeterminism prohibits the use of cyclic debugging techniques, because the intrusion caused by a debugger is often large enough to change the order in which processes interact. This paper describes the solutions we propose to efficiently record and replay the nondeterministic features of message passing libraries (MPL) such as MPI or PVM. It turns out that for promiscuous receive operations it is sufficient to keep track of the sender of the message, and for nonblocking test operations it is sufficient to keep track of the number of failed tests. The proposed solutions have been implemented for an existing MPI library, and performance measurements reveal that the time overhead of both record and replay executions is very low with respect to the original (nondeterministic) execution, while the size of the log files remains very small.
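The two logging rules stated in the abstract can be illustrated with a small sketch. It is not the authors' implementation and does not call any MPI library: `Mailbox`, `Recorder`, `Replayer`, and `ANY_SOURCE` are hypothetical names for a toy in-memory model, and a blocking wait is approximated by requiring the message to be present already. The sketch only shows that logging the sender of each promiscuous receive, and the number of failed tests before each successful test, is enough to reproduce the recorded order under a different arrival order.

```python
from collections import deque

ANY_SOURCE = -1  # hypothetical stand-in for a wildcard source such as MPI_ANY_SOURCE


class Mailbox:
    """Toy in-memory mailbox holding (sender, payload) pairs in arrival order."""

    def __init__(self):
        self.pending = deque()

    def deliver(self, sender, payload):
        self.pending.append((sender, payload))

    def take(self, source):
        """Remove and return the first message matching source, or None."""
        for i, (sender, payload) in enumerate(self.pending):
            if source == ANY_SOURCE or sender == source:
                del self.pending[i]
                return sender, payload
        return None


class Recorder:
    """Record phase: log senders of wildcard receives and failed-test counts."""

    def __init__(self, mailbox):
        self.mailbox = mailbox
        self.log = []
        self.failed = 0  # failed tests since the last successful test

    def recv_any(self):
        msg = self.mailbox.take(ANY_SOURCE)
        if msg is None:
            raise RuntimeError("toy model: a real receive would block here")
        self.log.append(("recv", msg[0]))  # only the sender is logged
        return msg

    def test(self, source):
        msg = self.mailbox.take(source)
        if msg is None:
            self.failed += 1
            return None
        self.log.append(("test", self.failed))  # only the count is logged
        self.failed = 0
        return msg


class Replayer:
    """Replay phase: force the logged sender and the logged number of failures."""

    def __init__(self, mailbox, log):
        self.mailbox = mailbox
        self.log = deque(log)
        self.remaining = None  # failures still to reproduce for the current test

    def _wait_for(self, source):
        msg = self.mailbox.take(source)
        if msg is None:
            raise RuntimeError("toy model: a real replay would block here")
        return msg

    def recv_any(self):
        kind, sender = self.log.popleft()
        assert kind == "recv"
        return self._wait_for(sender)  # wildcard replaced by the logged sender

    def test(self, source):
        if self.remaining is None:
            kind, count = self.log[0]
            assert kind == "test"
            self.remaining = count
        if self.remaining > 0:
            self.remaining -= 1
            return None  # reproduce a failed test even if a message is present
        self.log.popleft()
        self.remaining = None
        return self._wait_for(source)  # the test that succeeded in the record run
```

In this toy model the log holds one small entry per nondeterministic event and never the message contents, which is consistent with the abstract's remark that the log files stay small. Failed tests that are never followed by a successful test are not logged here; handling them is left out of the sketch.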
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Kergommeaux, J.C., Ronsse, M., De Bosschere, K. (1999). MPL: Efficient Record/Replay of nondeterministic features of message passing libraries. In: Dongarra, J., Luque, E., Margalef, T. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 1999. Lecture Notes in Computer Science, vol 1697. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48158-3_18
DOI: https://doi.org/10.1007/3-540-48158-3_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66549-6
Online ISBN: 978-3-540-48158-4
eBook Packages: Springer Book Archive