Abstract
A debugger is a crucial part of any programming system, especially one that supports a parallel programming paradigm such as OpenMP. A parallel, relaxed-consistency, distributed shared memory (DSM) system presents unique challenges to a debugger for several reasons: 1) the local copies of a given variable are not always consistent across the distributed machines, so having the debugger access the variable directly in local memory will not always work as expected; 2) if the DSM and the debugger both modify page protections, they will likely interfere with each other; and 3) since a large number of SIGSEGVs occur as part of the normal operation of a DSM program, a SIGSEGV produced by a genuine program error may be missed. In this paper, we discuss these problems and propose solutions.
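To make problems 1 and 3 concrete, the following is a toy sketch in Python. It is not the paper's solution and not how any particular DSM is implemented. Every name in it (`Node`, `barrier`, `naive_debugger_read`, `dsm_aware_read`, `classify_fault`) is hypothetical, and it assumes that writes become visible to other machines only at a synchronization point, that at most one machine writes a given variable between synchronization points, and that the DSM knows which address ranges it manages.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One machine in the toy DSM (hypothetical model)."""
    name: str
    local: dict = field(default_factory=dict)    # this node's possibly stale copy
    pending: dict = field(default_factory=dict)  # writes not yet propagated

    def write(self, var, value):
        # The write is visible locally at once, but only queued for other nodes.
        self.local[var] = value
        self.pending[var] = value


def barrier(nodes):
    """Synchronization point: propagate every pending write to all nodes."""
    for src in nodes:
        for var, value in src.pending.items():
            for dst in nodes:
                dst.local[var] = value
        src.pending.clear()


def naive_debugger_read(node, var):
    """Read straight from one node's local memory; the value may be stale."""
    return node.local.get(var)


def dsm_aware_read(nodes, var):
    """Prefer an unpropagated write if any node holds one (single-writer assumption)."""
    for node in nodes:
        if var in node.pending:
            return node.pending[var]
    return nodes[0].local.get(var)


def classify_fault(addr, dsm_regions):
    """Label a faulting address 'dsm' if it lies in a DSM-managed [start, end) range.

    Assumption for illustration only: faults outside every managed range are
    treated as genuine program errors.
    """
    for start, end in dsm_regions:
        if start <= addr < end:
            return "dsm"
    return "program-error"


if __name__ == "__main__":
    a, b = Node("a"), Node("b")
    a.local["x"] = b.local["x"] = 0
    a.write("x", 42)
    print(naive_debugger_read(b, "x"))   # 0: node b still holds the old copy
    print(dsm_aware_read([a, b], "x"))   # 42: the unpropagated write is found
    barrier([a, b])
    print(naive_debugger_read(b, "x"))   # 42: consistent after synchronization
    print(classify_fault(0x10, [(0x1000, 0x2000)]))  # program-error
```

In this model, a debugger attached to node `b` reports the old value of `x` until the next synchronization point, which is the kind of surprise item 1 describes. The address-range check shows one plausible way to keep a genuine fault from being lost among the expected ones in item 3.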
© 2006 Springer-Verlag Berlin Heidelberg
Olivier, J., Chen, CP., Hoeflinger, J. (2006). Debugging Distributed Shared Memory Applications. In: Guo, M., Yang, L.T., Di Martino, B., Zima, H.P., Dongarra, J., Tang, F. (eds) Parallel and Distributed Processing and Applications. ISPA 2006. Lecture Notes in Computer Science, vol 4330. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11946441_75
DOI: https://doi.org/10.1007/11946441_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68067-3
Online ISBN: 978-3-540-68070-3
eBook Packages: Computer Science, Computer Science (R0)