Abstract
Straightforward trace collection and processing become increasingly challenging and ultimately impractical for more complex, long-running, highly parallel applications. Accordingly, the SCALASCA project is extending the KOJAK measurement system for MPI, OpenMP and partitioned global address space (PGAS) parallel applications to incorporate runtime management and summarisation capabilities. This offers a more scalable and effective profile of parallel execution performance, both as an initial overview and as a guide for directing instrumentation and event tracing to the key functions and callpaths that need comprehensive analysis. The design and restructuring of the revised measurement system are described, highlighting the synergies possible from integrated runtime callpath summarisation and event tracing for scalable parallel execution performance diagnosis. Early results from measurements of 16,384 MPI processes on IBM BlueGene/L already demonstrate considerably improved scalability.
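The idea of runtime callpath summarisation can be illustrated with a minimal sketch. It is not the SCALASCA or KOJAK implementation: the class CallpathSummary, the function summarise, and the (kind, region, timestamp) event tuples are hypothetical names chosen for illustration. The sketch assumes a single thread of execution with properly nested enter and exit events, and shows how each event can update a per-callpath record as it occurs rather than being stored in a trace buffer.

```python
from collections import defaultdict


class CallpathSummary:
    """Hypothetical runtime summariser: one record per callpath, not per event."""

    def __init__(self):
        self._stack = []  # list of (region name, enter timestamp)
        self.visits = defaultdict(int)
        self.inclusive_time = defaultdict(float)

    def enter(self, region, timestamp):
        # Push the region; nothing is written to a trace buffer.
        self._stack.append((region, timestamp))

    def exit(self, region, timestamp):
        if not self._stack or self._stack[-1][0] != region:
            raise ValueError(f"unbalanced exit for region {region!r}")
        # The callpath is the chain of regions currently on the stack.
        callpath = tuple(name for name, _ in self._stack)
        _, entered = self._stack.pop()
        self.visits[callpath] += 1
        self.inclusive_time[callpath] += timestamp - entered


def summarise(events):
    """Fold a stream of (kind, region, timestamp) events into a profile."""
    summary = CallpathSummary()
    for kind, region, timestamp in events:
        if kind == "enter":
            summary.enter(region, timestamp)
        else:
            summary.exit(region, timestamp)
    return summary


# Illustrative event stream (made-up timestamps, in seconds).
events = [
    ("enter", "main", 0.0),
    ("enter", "solve", 1.0),
    ("enter", "MPI_Send", 2.0),
    ("exit", "MPI_Send", 3.0),
    ("enter", "MPI_Send", 4.0),
    ("exit", "MPI_Send", 6.0),
    ("exit", "solve", 7.0),
    ("exit", "main", 8.0),
]
profile = summarise(events)
for callpath, time in profile.inclusive_time.items():
    print(" -> ".join(callpath), profile.visits[callpath], time)
```

In this sketch the memory needed grows with the number of distinct callpaths rather than with the number of events, which is the property that makes a summary profile attractive as a first overview before tracing is restricted to selected callpaths.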
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wylie, B.J.N., Wolf, F., Mohr, B., Geimer, M. (2007). Integrated Runtime Measurement Summarisation and Selective Event Tracing for Scalable Parallel Execution Performance Diagnosis. In: Kågström, B., Elmroth, E., Dongarra, J., Waśniewski, J. (eds) Applied Parallel Computing. State of the Art in Scientific Computing. PARA 2006. Lecture Notes in Computer Science, vol 4699. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75755-9_55
DOI: https://doi.org/10.1007/978-3-540-75755-9_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75754-2
Online ISBN: 978-3-540-75755-9