Abstract
Scientific applications should be well balanced in order to achieve high scalability on current and future high-end massively parallel systems. However, identifying the sources of load imbalance in such applications is not a trivial exercise, and current state-of-the-art performance analysis tools do not provide an efficient mechanism to help users identify the main areas of load imbalance in an application. In this paper we discuss a new set of metrics that we defined to identify and measure application load imbalance. We then describe the extensions that were made to the Cray performance measurement and analysis infrastructure to detect application load imbalance and present it to the user in an insightful way.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
DeRose, L., Homer, B., Johnson, D. (2007). Detecting Application Load Imbalance on High End Massively Parallel Systems. In: Kermarrec, AM., Bougé, L., Priol, T. (eds) Euro-Par 2007 Parallel Processing. Euro-Par 2007. Lecture Notes in Computer Science, vol 4641. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74466-5_17
DOI: https://doi.org/10.1007/978-3-540-74466-5_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74465-8
Online ISBN: 978-3-540-74466-5
eBook Packages: Computer Science (R0)