Abstract
MPI has emerged as a popular way to write architecture-independent parallel programs. By modifying an MPI library and associated MPI run-time environment, transparent extraction of timestamped information is possible. The wall-clock time at which specific MPI communication events begin and end can be recorded, collected, and provided to a central scheduler. The infrastructure to create and collect these events has been implemented and tested, and a future architecture that can use this information is described.
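The idea of recording begin and end times around communication calls can be illustrated with a minimal sketch. It is not the authors' implementation: the paper places the instrumentation inside the MPI library and run-time environment, whereas this sketch wraps an ordinary Python function. All names here (`CommEvent`, `EVENT_LOG`, `timestamped`, `drain_events`, `fake_send`) are hypothetical and chosen only for illustration.

```python
import functools
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CommEvent:
    """One communication event with wall-clock begin and end times."""
    name: str
    begin: float
    end: float


# Hypothetical in-process buffer; a real system would forward these
# records to a central scheduler instead of keeping them locally.
EVENT_LOG: List[CommEvent] = []


def timestamped(name: str, clock: Callable[[], float] = time.time):
    """Wrap a communication routine so each call is logged transparently.

    The caller's code does not change; only the wrapped routine does.
    """
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            begin = clock()
            try:
                return fn(*args, **kwargs)
            finally:
                # Record the event even if the call raises.
                EVENT_LOG.append(CommEvent(name, begin, clock()))
        return inner
    return wrap


def drain_events() -> List[CommEvent]:
    """Return all buffered events and clear the buffer (the 'collect' step)."""
    events = list(EVENT_LOG)
    EVENT_LOG.clear()
    return events


@timestamped("send")
def fake_send(payload: bytes, dest: int) -> int:
    """Stand-in for a send call (assumption: no real MPI is involved)."""
    return len(payload)
```

Calling `fake_send(b"abc", 1)` behaves exactly as the undecorated function would, and `drain_events()` then returns one `CommEvent` whose `begin` and `end` fields bracket the call.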
This work was funded in part by NSF Grant No. EEC-8907070 and NSF Grant No. EEC-9730381.
Copyright information
© 1999 Springer-Verlag
About this paper
Cite this paper
Russ, S.H., Jean-Baptiste, R., Kumar, T.S.K., Harmon, M. (1999). Transparent real-time monitoring in MPI. In: Rolim, J., et al. Parallel and Distributed Processing. IPPS 1999. Lecture Notes in Computer Science, vol 1586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0098012
DOI: https://doi.org/10.1007/BFb0098012
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65831-3
Online ISBN: 978-3-540-48932-0
eBook Packages: Springer Book Archive