Abstract
Distributed applications suffer from nondeterminism and may therefore behave differently across successive executions with the same input. To make replay deterministic, the sequence of messages received by each process must be recorded. This paper compares several strategies for tracing PVM programs: centralised and distributed approaches to tracing, and techniques with and without race detection.
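The record-and-replay idea summarised in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and does not use the PVM API: the `Mailbox` class, the `run` function, and the integer sender ids are hypothetical names introduced only for this example. In record mode the process receives from any sender and logs the order of senders; in replay mode it receives selectively from the logged sender, so the same order is reproduced even when messages arrive differently. The sketch assumes all messages are already delivered before they are consumed, whereas a real replay mechanism would block until the expected message arrives.

```python
from collections import deque


class Mailbox:
    """Pending messages for one process (hypothetical model, not PVM)."""

    def __init__(self):
        self.queues = {}        # sender id -> deque of payloads
        self.arrival = deque()  # sender ids in arrival order

    def deliver(self, sender, payload):
        self.queues.setdefault(sender, deque()).append(payload)
        self.arrival.append(sender)

    def receive_any(self):
        """Wildcard receive: take whichever message arrived first."""
        sender = self.arrival.popleft()
        return sender, self.queues[sender].popleft()

    def receive_from(self, sender):
        """Selective receive: take the oldest message from one sender."""
        self.arrival.remove(sender)  # drops the first occurrence only
        return sender, self.queues[sender].popleft()


def run(mailbox, count, trace=None):
    """Consume `count` messages from `mailbox`.

    Record mode (trace is None): receive from any sender and log the order.
    Replay mode: force each receive to the sender named in the trace.
    Returns (sender order, payloads in the order they were consumed).
    """
    order, payloads = [], []
    for i in range(count):
        if trace is None:
            sender, payload = mailbox.receive_any()
        else:
            sender, payload = mailbox.receive_from(trace[i])
        order.append(sender)
        payloads.append(payload)
    return order, payloads


def make_mailbox(deliveries):
    """Build a mailbox from a list of (sender, payload) pairs."""
    box = Mailbox()
    for sender, payload in deliveries:
        box.deliver(sender, payload)
    return box


if __name__ == "__main__":
    # First execution: messages happen to arrive as 2, 1, 2.
    trace, first = run(make_mailbox([(2, "b"), (1, "a"), (2, "c")]), 3)
    # Second execution: a different arrival order, replayed with the trace.
    _, second = run(make_mailbox([(1, "a"), (2, "b"), (2, "c")]), 3, trace)
    print(trace, first, second)
```

Only the sender order is logged here, not message contents, which keeps the trace small. Logging every receive is the simplest policy; a technique with race detection would aim to log only those receives whose outcome can actually differ between executions.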
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Neyman, M. (2000). Comparison of Different Approaches to Trace PVM Program Execution. In: Dongarra, J., Kacsuk, P., Podhorszki, N. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2000. Lecture Notes in Computer Science, vol 1908. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45255-9_38
DOI: https://doi.org/10.1007/3-540-45255-9_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41010-2
Online ISBN: 978-3-540-45255-3
eBook Packages: Springer Book Archive