Abstract
Total Exchange is one of the most important collective communication patterns for scientific applications. In this paper we propose an algorithm called \(\mathcal{LG}\) for the total exchange redistribution problem between two clusters. Our approach performs communications in two different phases, aiming to minimize the number of communication steps through the wide-area network. As a result, we reduce the number of messages exchanged through the backbone to only \(2 \times \max(n_1, n_2)\), against \(2 \times n_1 \times n_2\) messages with the traditional strategy (where \(n_1\) and \(n_2\) are the numbers of nodes in each cluster). Experimental results show a performance improvement of more than 50% compared to the traditional strategies.
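To give a feel for the gap between the two message counts quoted in the abstract, the short sketch below evaluates both formulas for a given pair of cluster sizes. It only computes the counts stated above and does not implement the exchange algorithm itself; the function name `backbone_messages` and the sample cluster sizes are illustrative assumptions, not taken from the paper.

```python
def backbone_messages(n1: int, n2: int) -> tuple[int, int]:
    """Return (traditional, two_phase) backbone message counts.

    Both formulas come straight from the abstract:
      traditional strategy: 2 * n1 * n2
      two-phase approach:   2 * max(n1, n2)
    The function name and signature are hypothetical helpers for illustration.
    """
    if n1 < 1 or n2 < 1:
        raise ValueError("each cluster needs at least one node")
    traditional = 2 * n1 * n2
    two_phase = 2 * max(n1, n2)
    return traditional, two_phase


if __name__ == "__main__":
    # Sample cluster sizes chosen arbitrarily for the illustration.
    for n1, n2 in [(4, 8), (16, 16), (32, 64)]:
        traditional, two_phase = backbone_messages(n1, n2)
        print(f"n1={n1:3d} n2={n2:3d}  traditional={traditional:5d}  two-phase={two_phase:4d}")
```

The traditional count grows with the product of the cluster sizes, while the two-phase count grows only with the larger cluster, so the ratio between them is \(\min(n_1, n_2)\).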
© 2007 Springer-Verlag Berlin Heidelberg
Cite this paper
Jeannot, E., Steffenel, L.A. (2007). Fast and Efficient Total Exchange on Two Clusters. In: Kermarrec, A.-M., Bougé, L., Priol, T. (eds) Euro-Par 2007 Parallel Processing. Euro-Par 2007. Lecture Notes in Computer Science, vol 4641. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74466-5_91
DOI: https://doi.org/10.1007/978-3-540-74466-5_91
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74465-8
Online ISBN: 978-3-540-74466-5