Dynamic Binary Instrumentation and Data Aggregation on Large Scale Systems

  • Special Issue on High-End Computing
  • International Journal of Parallel Programming

Dynamic binary instrumentation for performance analysis on large-scale architectures such as the IBM Blue Gene/L system (BG/L) poses unique challenges. The unprecedented scale of these systems and their often limited OS support require new mechanisms to organize binary instrumentation, to interact with the target application, and to collect the resulting data.

We describe the design and current status of a new implementation of the Dynamic Probe Class Library (DPCL) API for large-scale systems. DPCL provides an easy-to-use layer for dynamic instrumentation of parallel MPI applications, built on DynInst, a dynamic instrumentation library for sequential platforms. Our work includes modifying DynInst to control instrumentation from remote I/O nodes and porting DPCL’s communication for performance data collection to use MRNet, a tree-based overlay network (TBON) that supports scalable multicast and data reduction. We describe extensions to the DPCL API that support instrumentation of task subsets and aggregation of collected performance data.
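
The idea of data reduction over a tree-based overlay network can be illustrated with a minimal sketch. It is not MRNet's or DPCL's API: the function names `tree_reduce` and `tree_depth`, the fan-out value, and the sample counters are all hypothetical and chosen only for illustration. The sketch assumes each task contributes one numeric value and that every internal node combines the partial results of at most `fanout` children before forwarding a single value upward.

```python
from typing import Callable, List, Sequence


def tree_reduce(values: Sequence[float], fanout: int,
                combine: Callable[[List[float]], float] = sum) -> float:
    """Reduce per-task values layer by layer through a tree of the given fan-out.

    Hypothetical helper for illustration; it models the reduction pattern only,
    not any real overlay-network library.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    if not values:
        raise ValueError("need at least one value")
    level = list(values)
    # Each pass models one layer of internal overlay nodes: every node
    # combines the partial results of at most `fanout` children.
    while len(level) > 1:
        level = [combine(level[i:i + fanout])
                 for i in range(0, len(level), fanout)]
    return level[0]


def tree_depth(num_tasks: int, fanout: int) -> int:
    """Number of reduction layers between the tasks and the front end."""
    depth = 0
    while num_tasks > 1:
        num_tasks = -(-num_tasks // fanout)  # ceiling division
        depth += 1
    return depth


if __name__ == "__main__":
    # Made-up per-task counter values, one per MPI task.
    samples = [float(rank % 7) for rank in range(1024)]
    print(tree_reduce(samples, fanout=32))
    print(tree_depth(len(samples), fanout=32))
```

Under these assumptions, no node ever handles more than `fanout` inputs, and the number of layers grows only logarithmically with the task count, which is the property that makes tree-based aggregation attractive at large scale. Passing a different `combine` function (for example `max`) models other reduction filters.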

Author information

Corresponding author

Correspondence to Gregory L. Lee.

About this article

Cite this article

Lee, G.L., Schulz, M., Ahn, D.H. et al. Dynamic Binary Instrumentation and Data Aggregation on Large Scale Systems. Int J Parallel Prog 35, 207–232 (2007). https://doi.org/10.1007/s10766-007-0036-3

  • DOI: https://doi.org/10.1007/s10766-007-0036-3
