ABSTRACT
Cluster computers dominate high-performance computing (HPC) today. The success of this architecture rests on the fact that it profits from the improvements in mainstream computing commonly known as Moore's law. But reaching Exascale within this decade might require additional endeavors beyond surfing this technology wave. To find possible directions for the future, we review Amdahl's and Gustafson's thoughts on scalability. Based on this analysis, we propose an advanced architecture combining a Cluster with a so-called Booster element comprising accelerators interconnected by a high-performance fabric. We argue that this architecture provides significant advantages over today's accelerated clusters and might pave the way for clusters into the era of Exascale computing. We present the DEEP project, which aims to implement this concept. Six applications from fields with the potential to exploit Exascale systems will be ported to DEEP. We analyze one application in detail and explore the consequences of the constraints of the DEEP system on its scalability.
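The two scalability views named in the abstract can be compared with a minimal sketch. It uses the standard textbook forms of Amdahl's law (fixed problem size) and Gustafson's law (problem size scaled with the processor count). The function names and the 5% serial fraction are illustrative assumptions and are not taken from the paper.

```python
def amdahl_speedup(serial_fraction: float, n: int) -> float:
    """Amdahl: speedup on n processors for a fixed problem size.

    serial_fraction is the share of the single-processor runtime
    that cannot be parallelized.
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)


def gustafson_speedup(serial_fraction: float, n: int) -> float:
    """Gustafson: scaled speedup when the problem grows with n.

    serial_fraction is the share of the parallel run's time spent
    in serial code.
    """
    return serial_fraction + (1.0 - serial_fraction) * n


if __name__ == "__main__":
    s = 0.05  # assumed serial fraction, for illustration only
    for n in (16, 1024, 65536):
        print(n, amdahl_speedup(s, n), gustafson_speedup(s, n))
```

With a fixed problem size, the Amdahl speedup is bounded by 1/serial_fraction no matter how many processors are added, whereas the Gustafson speedup keeps growing roughly linearly with the processor count. This contrast is the reason the two laws lead to different conclusions about scalability.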
Index Terms
- On the scalability of the clusters-booster concept: a critical assessment of the DEEP architecture