Abstract
This paper describes the dproc (distributed /proc) kernel-level mechanisms and abstractions, which provide the building blocks for implementing efficient, cluster-wide, application-specific performance monitoring. Such monitoring functionality may be constructed at any time, both before and during application invocation, and can include dynamic run-time extensions. The paper (i) presents dproc’s implementation in a Linux-based cluster of SMP machines and (ii) evaluates its utility by constructing sample monitoring functionality. The full version of this paper can be found at: http://www.cc.gatech.edu/systems/projects/dproc/
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jancic, J., Poellabauer, C., Schwan, K., Wolf, M., Bright, N. (2002). dproc - Extensible Run-Time Resource Monitoring for Cluster Applications. In: Sloot, P.M.A., Hoekstra, A.G., Tan, C.J.K., Dongarra, J.J. (eds) Computational Science — ICCS 2002. ICCS 2002. Lecture Notes in Computer Science, vol 2330. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46080-2_94
DOI: https://doi.org/10.1007/3-540-46080-2_94
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43593-8
Online ISBN: 978-3-540-46080-0
eBook Packages: Springer Book Archive