Abstract
The TOP 500 list is the most widely regarded ranking of modern supercomputers, based on Gflop/s measured for High Performance LINPACK (HPL). Ranking the most powerful supercomputers is important: hardware producers hone their products towards maximum benchmark performance, while nations fund huge installations, aiming at a place on the podium. However, the relevance of HPL for real-world applications is declining rapidly, because it heavily overrates the importance of available compute cycles. While relevant comparisons foster healthy competition, skewed comparisons foster developments aimed at distorted goals. Thus, in recent years, discussions on introducing a new benchmark, better aligned with real-world applications and therefore with the needs of real users, have increased, culminating in a highly regarded candidate: High Performance Conjugate Gradients (HPCG).
In this paper we present an in-depth analysis of this new benchmark. Furthermore, we present a model capable of predicting the performance of HPCG on a given architecture, based solely on two inputs: the effective bandwidth between the main memory and the CPU, and the highest occurring network latency between two compute units.
Finally, we argue that within the scope of modern supercomputers with a decent network, only the first input is required for a highly accurate prediction, effectively reducing the information content of HPCG results to that of a stream benchmark executed on one single node.
We conclude with a series of suggestions to move HPCG closer to its intended goal: a new benchmark for modern supercomputers, capable of capturing a well-balanced mixture of relevant hardware properties.
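The abstract does not reproduce the model itself, so the following is only a minimal sketch of how a two-input prediction of this kind could be structured, not the authors' formulation. It assumes that the time per solver iteration is the memory traffic divided by the effective bandwidth, plus a latency term for each synchronization point. The function name `predicted_gflops`, its parameters, and all numbers are hypothetical and chosen for illustration only.

```python
def predicted_gflops(flops_per_iter, bytes_per_iter, bandwidth_bytes_per_s,
                     latency_s=0.0, syncs_per_iter=0):
    """Estimate per-node Gflop/s for a memory-bound iterative solver.

    Hypothetical model (an assumption, not taken from the paper):
        time = bytes_per_iter / bandwidth + syncs_per_iter * latency
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    memory_time = bytes_per_iter / bandwidth_bytes_per_s   # time spent streaming data
    latency_time = syncs_per_iter * latency_s              # time spent waiting on the network
    return flops_per_iter / (memory_time + latency_time) / 1e9


if __name__ == "__main__":
    # Made-up inputs for illustration; these are not measurements.
    flops = 1.0e9      # floating-point operations per iteration
    traffic = 6.0e9    # bytes moved between memory and CPU per iteration
    bandwidth = 50.0e9 # effective memory bandwidth in bytes/s

    with_network = predicted_gflops(flops, traffic, bandwidth,
                                    latency_s=2e-6, syncs_per_iter=3)
    memory_only = predicted_gflops(flops, traffic, bandwidth)

    print(f"bandwidth + latency: {with_network:.4f} Gflop/s")
    print(f"bandwidth only:      {memory_only:.4f} Gflop/s")
```

With microsecond-scale latencies, the latency term in this sketch is several orders of magnitude smaller than the memory term, which is the kind of behaviour behind the claim that the bandwidth input alone suffices on systems with a decent network.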
Acknowledgements
The authors would like to thank Mandes Schönherr for valuable contributions. This research is partly supported by EU project POLCA (FP7-ICT-2013-10, grant agreement no. 610686).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Marjanović, V., Gracia, J., Glass, C.W. (2015). Performance Modeling of the HPCG Benchmark. In: Jarvis, S., Wright, S., Hammond, S. (eds) High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation. PMBS 2014. Lecture Notes in Computer Science, vol 8966. Springer, Cham. https://doi.org/10.1007/978-3-319-17248-4_9
DOI: https://doi.org/10.1007/978-3-319-17248-4_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17247-7
Online ISBN: 978-3-319-17248-4
eBook Packages: Computer Science (R0)