Abstract
In parallel applications, a significant amount of communication occurs in a collective fashion to perform, for example, broadcasts, reductions, or complete exchanges. Although the MPI standard defines many convenience functions for this purpose, which not only improve code readability and maintainability but are usually also highly efficient, many application programmers still create their own manual implementations using point-to-point communication. We show how instances of such hand-crafted collectives can be automatically detected. Because it matches pre- and post-conditions of hashed message exchanges recorded in event traces, our method is independent of the specific communication pattern employed. We demonstrate that replacing detected broadcasts in the HPL benchmark can yield significant performance improvements.
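As a rough illustration of what checking pre- and post-conditions on hashed messages could look like, the following sketch detects a broadcast in a toy trace. It is not the algorithm from the paper: the `Message` record, the `detect_broadcast` function, the assumption that the trace is causally ordered, and the restriction to a single payload hash are all hypothetical simplifications. The pre-condition checked is that exactly one rank (the root) sends the payload without having received it first; the post-condition is that every rank ends up holding it.

```python
import hashlib
from typing import List, NamedTuple, Optional


class Message(NamedTuple):
    """One point-to-point message in a (hypothetical) event trace."""
    src: int
    dst: int
    payload_hash: str


def digest(payload: bytes) -> str:
    """Hash a payload so messages can be compared without storing the data."""
    return hashlib.sha1(payload).hexdigest()


def detect_broadcast(trace: List[Message], num_ranks: int) -> Optional[int]:
    """Return the root rank if the trace behaves like a broadcast, else None.

    Assumes the trace is listed in causal order (an assumption of this sketch).
    """
    if not trace:
        return None
    # All messages must carry the same data for this simple check.
    if len({m.payload_hash for m in trace}) != 1:
        return None

    holders = set()  # ranks known to hold the payload so far
    root = None
    for m in trace:
        if m.src not in holders:
            # Pre-condition: only one rank may send data it never received.
            if root is not None:
                return None
            root = m.src
            holders.add(m.src)
        holders.add(m.dst)

    # Post-condition: every rank holds the payload at the end.
    return root if len(holders) == num_ranks else None


h = digest(b"panel data")
linear = [Message(0, 1, h), Message(0, 2, h), Message(0, 3, h)]
ring = [Message(0, 1, h), Message(1, 2, h), Message(2, 3, h)]
tree = [Message(0, 1, h), Message(0, 2, h), Message(1, 3, h)]
print([detect_broadcast(t, 4) for t in (linear, ring, tree)])
```

The linear, ring, and tree traces all satisfy the same conditions even though their message patterns differ, which is the sense in which a condition-based check does not depend on the communication pattern.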
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Calotoiu, A., Siebert, C., Wolf, F. (2012). Pattern-Independent Detection of Manual Collectives in MPI Programs. In: Kaklamanis, C., Papatheodorou, T., Spirakis, P.G. (eds) Euro-Par 2012 Parallel Processing. Euro-Par 2012. Lecture Notes in Computer Science, vol 7484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32820-6_5
DOI: https://doi.org/10.1007/978-3-642-32820-6_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32819-0
Online ISBN: 978-3-642-32820-6
eBook Packages: Computer Science, Computer Science (R0)