Abstract
This paper presents a scalable and efficient Message-Passing in Java (MPJ) collective communication library for parallel computing on multi-core architectures. The continuous increase in the number of cores per processor underscores the need for scalable parallel solutions. Moreover, current systems are usually deployed as multi-core clusters, a hybrid shared/distributed memory architecture that increases the complexity of communication protocols. Java is an attractive choice for developing communication middleware for these systems because it provides built-in networking and multithreading support. As the performance gap between Java and natively compiled languages has narrowed in recent years, Java is an emerging option for High Performance Computing (HPC).
Our MPJ collective communication library improves the performance of Java HPC applications on multi-core clusters by: (1) providing multi-core aware collective primitives; (2) implementing several algorithms (up to six) per collective operation, whereas publicly available MPJ libraries are usually restricted to one algorithm; (3) analyzing the efficiency of thread-based collective operations; (4) selecting at runtime the most efficient algorithm depending on the specific multi-core system architecture, the number of cores, and the message length involved in the collective operation; (5) supporting automatic performance tuning of the collectives depending on the system and communication parameters; and (6) allowing its integration into any MPJ implementation, as it is based on MPJ point-to-point primitives. A performance evaluation on a multi-core cluster with InfiniBand and Gigabit Ethernet interconnects has shown that the implemented collectives significantly outperform the original ones, and that using them yields higher speedups in Java HPC applications that make intensive use of collective communications. Finally, the presented library has been successfully integrated into MPJ Express (http://mpj-express.org) and will be distributed with its next release.
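The runtime selection described in point (4) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name BcastSelector, the three algorithm names, and both thresholds are hypothetical and are not taken from the library or from MPJ Express. A tuned library would presumably derive such thresholds from measurements on the target system rather than hard-code them.

```java
/** Hypothetical sketch: choose a broadcast algorithm from communication parameters. */
final class BcastSelector {

    /** Illustrative algorithm names; not the library's actual identifiers. */
    enum Algorithm { FLAT_TREE, BINOMIAL_TREE, SCATTER_ALLGATHER }

    // Assumed thresholds; in practice these would come from tuning runs.
    static final int SMALL_GROUP = 4;                 // processes
    static final int LARGE_MESSAGE_BYTES = 64 * 1024; // bytes

    /** Returns the algorithm this sketch would use for the given group size and message length. */
    static Algorithm select(int processes, int messageBytes) {
        if (processes < 1 || messageBytes < 0) {
            throw new IllegalArgumentException("invalid communication parameters");
        }
        if (processes <= SMALL_GROUP) {
            // Few receivers: the root sends directly to each process.
            return Algorithm.FLAT_TREE;
        }
        if (messageBytes >= LARGE_MESSAGE_BYTES) {
            // Long messages: split the data, then reassemble it on every process.
            return Algorithm.SCATTER_ALLGATHER;
        }
        // Short messages on many processes: logarithmic number of steps.
        return Algorithm.BINOMIAL_TREE;
    }

    public static void main(String[] args) {
        System.out.println(select(2, 1024));
        System.out.println(select(16, 1024));
        System.out.println(select(16, 1 << 20));
    }
}
```

Because each algorithm in such a scheme would be built only from point-to-point sends and receives, the selection step stays independent of the underlying MPJ implementation, which is the property point (6) relies on.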
Cite this article
Taboada, G.L., Ramos, S., Touriño, J. et al. Design of efficient Java message-passing collectives on multi-core clusters. J Supercomput 55, 126–154 (2011). https://doi.org/10.1007/s11227-010-0464-5
DOI: https://doi.org/10.1007/s11227-010-0464-5