Design and Development of a Scalable Distributed Debugger for Cluster Computing

Cluster Computing

Abstract

Debugging is an essential part of parallel and distributed processing. However, developing a parallel and distributed debugger is difficult, especially for cluster computing, where heterogeneity is present. In this paper, we first survey current debugging techniques and existing tools, and then present a client–server debugging model. Based on this model, we discuss in detail the design and development of a practical, scalable distributed debugging system for cluster computing, and give two case studies showing how the system efficiently supports debugging message-passing programs such as MPI and PVM programs. The newly developed distributed debugger is based on the sequential debuggers gdb and dbx. It can scale to handle hundreds of processes. Its interfaces are implemented entirely in Java, and its graphical user interface is the same on all computing platforms. In addition, it is portable and easy to learn and use.

Cite this article

Wu, X., Chen, Q. & Sun, XH. Design and Development of a Scalable Distributed Debugger for Cluster Computing. Cluster Computing 5, 365–375 (2002). https://doi.org/10.1023/A:1019708204283
