DOI: 10.1145/1375527.1375537
research-article

Preserving time in large-scale communication traces

Published: 07 June 2008

ABSTRACT

Analyzing the performance of large-scale scientific applications is becoming increasingly difficult due to the sheer size of performance data gathered. Recent work on scalable communication tracing applies online interprocess compression to address this problem. Yet, analysis of communication traces requires knowledge about time progression that cannot trivially be encoded in a scalable manner during compression. We develop scalable time stamp encoding schemes for communication traces.

At the same time, our work contributes novel insights into the scalable representation of time-stamped data. We show that our representations capture sufficient information to enable what-if explorations of architectural variations and analysis of path-based timing irregularities without requiring excessive disk space. We evaluate the ability of several time-stamped compressed MPI trace approaches to enable accurate timed replay of communication events. Our lossless traces are orders of magnitude smaller, if not near constant size, regardless of the number of nodes, while preserving timing information suitable for application tuning or for assessing requirements of future procurements. Our results prove that time-preserving tracing without loss of communication information can scale in the number of nodes and time steps, which is a result without precedent.
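The abstract does not spell out the encoding schemes themselves. As a purely illustrative sketch of why time stamps are hard to fold into a compressed trace, and of one generic way around it, the Python below stores the delta between consecutive events instead of absolute times and merges consecutive identical events into one entry that keeps only summary statistics of those deltas. The record layout, the names `CompressedEvent` and `compress`, and the choice of min/max/sum statistics are assumptions made for this example, not the paper's method.

```python
from dataclasses import dataclass


@dataclass
class CompressedEvent:
    """One run-length-compressed trace entry (hypothetical layout)."""
    name: str                        # e.g. the MPI call name
    count: int = 0                   # consecutive occurrences merged here
    delta_min: float = float("inf")  # smallest delta seen in the run
    delta_max: float = 0.0           # largest delta seen in the run
    delta_sum: float = 0.0           # sum of deltas, used for the average

    def add(self, delta: float) -> None:
        """Fold one more occurrence, with its delta time, into the entry."""
        self.count += 1
        self.delta_min = min(self.delta_min, delta)
        self.delta_max = max(self.delta_max, delta)
        self.delta_sum += delta

    @property
    def delta_avg(self) -> float:
        return self.delta_sum / self.count


def compress(events):
    """Compress (name, absolute_time) pairs that are sorted by time.

    Absolute times differ for every event, so they would defeat
    run-length merging; deltas to the previous event do not.
    """
    out = []
    prev_time = None
    for name, t in events:
        delta = 0.0 if prev_time is None else t - prev_time
        prev_time = t
        if not out or out[-1].name != name:
            out.append(CompressedEvent(name))
        out[-1].add(delta)
    return out


trace = [("MPI_Send", 0.0), ("MPI_Send", 1.0),
         ("MPI_Send", 2.5), ("MPI_Barrier", 3.0)]
for entry in compress(trace):
    print(entry.name, entry.count, entry.delta_min, entry.delta_max)
```

In this sketch the sequence of communication events is preserved exactly, while individual time stamps are reduced to per-entry statistics, so the trace size depends on the number of distinct runs rather than on the number of events.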

        • Published in

          ICS '08: Proceedings of the 22nd annual international conference on Supercomputing
          June 2008
          390 pages
          ISBN:9781605581583
          DOI:10.1145/1375527

          Copyright © 2008 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 584 of 2,055 submissions, 28%
