Abstract
Debugging is an essential part of parallel and distributed processing. However, developing a parallel and distributed debugger is difficult, especially for cluster computing, where heterogeneity is present. In this paper, we first survey current debugging techniques and existing tools, and then present a client–server debugging model. Based on this model, we discuss in detail the design and development of a practical, scalable distributed debugging system for cluster computing, and give two case studies showing how the system efficiently supports debugging message-passing programs such as MPI and PVM programs. The newly developed distributed debugger is built on the sequential debuggers gdb and dbx, and it scales to handle hundreds of processes. Its interfaces are implemented entirely in Java, so its graphical user interface is the same on all computing platforms. In addition, it is portable and easy to learn and use.
Cite this article
Wu, X., Chen, Q. & Sun, XH. Design and Development of a Scalable Distributed Debugger for Cluster Computing. Cluster Computing 5, 365–375 (2002). https://doi.org/10.1023/A:1019708204283
DOI: https://doi.org/10.1023/A:1019708204283