Efficiency Analysis of Intel, AMD and Nvidia 64-Bit Hardware for Memory-Bound Problems: A Case Study of Ab Initio Calculations with VASP

Stegailov, Vladimir; Vecher, Vyacheslav

doi:10.1007/978-3-319-78054-2_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10778)

Included in the following conference series:

International Conference on Parallel Processing and Applied Mathematics

Abstract

A wide spectrum of Intel Xeon processors is available today. The new Zen CPU architecture developed by AMD has extended the range of options for x86_64 HPC hardware. Moreover, Nvidia has released a custom 64-bit Denver architecture based on the ARM instruction set. With so many options, choosing the optimal CPU for prospective HPC systems is not straightforward. Such a co-design procedure should follow the requests of the end-user community. Modern computational materials science studies are among the major consumers of HPC resources worldwide, and the VASP code is perhaps the most popular tool for this research. In this work, we discuss a benchmark metric and results based on a VASP test model that make it possible to compare different hardware and to identify the best options with respect to the energy-to-solution criterion.

References

Kresse, G., Hafner, J.: Ab initio molecular dynamics for liquid metals. Phys. Rev. B 47, 558–561 (1993). http://link.aps.org/doi/10.1103/PhysRevB.47.558
Kresse, G., Hafner, J.: Ab initio molecular-dynamics simulation of the liquid-metal-amorphous-semiconductor transition in germanium. Phys. Rev. B 49, 14251–14269 (1994). http://link.aps.org/doi/10.1103/PhysRevB.49.14251
Kresse, G., Furthmüller, J.: Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 6(1), 15–50 (1996). https://doi.org/10.1016/0927-0256(96)00008-0. http://www.sciencedirect.com/science/article/pii/0927025696000080
Kresse, G., Furthmüller, J.: Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996). http://link.aps.org/doi/10.1103/PhysRevB.54.11169
Bethune, I.: Ab initio molecular dynamics. In: Introduction to Molecular Dynamics on ARCHER (2015). https://www.archer.ac.uk/training/course-material/2015/06/MolDy_Strath/AbInitioMD.pdf
Hutchinson, M.: VASP on GPUs. When and how. In: GPU Technology Theater, SC 2015 (2015). http://images.nvidia.com/events/sc15/pdfs/SC5107-vasp-gpus.pdf
Zhao, Z., Marsman, M.: Estimating the performance impact of the MCDRAM on KNL using dual-socket Ivy Bridge nodes on Cray XC30. In: Proceedings of Cray User Group – 2016 (2016). https://cug.org/proceedings/cug2016_proceedings/includes/files/pap111.pdf
Boggs, D., Brown, G., Tuck, N., Venkatraman, K.S.: Denver: Nvidia’s first 64-bit ARM processor. IEEE Micro 35(2), 46–55 (2015). https://doi.org/10.1109/MM.2015.12
Kogge, P., Shalf, J.: Exascale computing trends: adjusting to the “new normal” for computer architecture. Comput. Sci. Eng. 15(6), 16–26 (2013). https://doi.org/10.1109/MCSE.2013.95
Burtscher, M., Kim, B.D., Diamond, J., McCalpin, J., Koesterke, L., Browne, J.: PerfExpert: an easy-to-use performance diagnosis tool for HPC applications. In: Proceedings of 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010, pp. 1–11. IEEE Computer Society, Washington, DC (2010). https://doi.org/10.1109/SC.2010.41
Rane, A., Browne, J.: Enhancing performance optimization of multicore/multichip nodes with data structure metrics. ACM Trans. Parallel Comput. 1(1), 3:1–3:20 (2014). http://doi.acm.org/10.1145/2588788
Stanisic, L., Mello Schnorr, L.C., Degomme, A., Heinrich, F.C., Legrand, A., Videau, B.: Characterizing the performance of modern architectures through opaque benchmarks: pitfalls learned the hard way. In: IPDPS 2017–31st IEEE International Parallel & Distributed Processing Symposium (RepPar Workshop), Orlando, USA (2017). https://hal.inria.fr/hal-01470399
Hoefler, T., Belli, R.: Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results. In: Proceedings of International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015, pp. 73:1–73:12. ACM, New York (2015). https://doi.org/10.1145/2807591.2807644
Scogland, T., Azose, J., Rohr, D., Rivoire, S., Bates, N., Hackenberg, D.: Node variability in large-scale power measurements: perspectives from the Green500, Top500 and EEHPCWG. In: Proceedings of International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015, pp. 74:1–74:11. ACM, New York (2015). http://doi.acm.org/10.1145/2807591.2807653
Calore, E., Schifano, S.F., Tripiccione, R.: Energy-performance tradeoffs for HPC applications on low power processors. In: Hunold, S., et al. (eds.) Euro-Par 2015. LNCS, vol. 9523, pp. 737–748. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-27308-2_59
Rojek, K., Ilic, A., Wyrzykowski, R., Sousa, L.: Energy-aware mechanism for stencil-based MPDATA algorithm with constraints. Concurr. Comput.: Pract. Exp. e4016 (2016). http://dx.doi.org/10.1002/cpe.4016
Luijten, R.P., Cossale, M., Clauberg, R., Doering, A.: Power measurements and cooling of the DOME 28nm 1.8GHz 24-thread ppc64 μServer compute node. In: 2015 International Conference on IC Design Technology (ICICDT), pp. 1–4 (2015). https://doi.org/10.1109/ICICDT.2015.7165919
Nikolskiy, V., Stegailov, V.: Floating-point performance of ARM cores and their efficiency in classical molecular dynamics. J. Phys.: Conf. Ser. 681(1), 012049 (2016). http://stacks.iop.org/1742-6596/681/i=1/a=012049
Nikolskiy, V.P., Stegailov, V.V., Vecher, V.S.: Efficiency of the Tegra K1 and X1 systems-on-chip for classical molecular dynamics. In: 2016 International Conference on High Performance Computing & Simulation (HPCS), pp. 682–689 (2016). https://doi.org/10.1109/HPCSim.2016.7568401
Vecher, V., Nikolskii, V., Stegailov, V.: GPU-accelerated molecular dynamics: energy consumption and performance. In: Voevodin, V., Sobolev, S. (eds.) RuSCDays 2016. CCIS, vol. 687, pp. 78–90. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-55669-7_7
Cytowski, M.: Best Practice Guide – IBM Power 775. PRACE, November 2013. http://www.prace-ri.eu/IMG/pdf/Best-Practice-Guide-IBM-Power-775.pdf
Stegailov, V.V., Orekhov, N.D., Smirnov, G.S.: HPC hardware efficiency for quantum and classical molecular dynamics. In: Malyshkin, V. (ed.) PaCT 2015. LNCS, vol. 9251, pp. 469–473. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21909-7_45

Acknowledgment

The authors are grateful to Dr. Maciej Cytowski and Dr. Jacek Peichota (ICM, University of Warsaw) for the data on the VASP benchmark [21].

The authors acknowledge the Joint Supercomputer Centre of the Russian Academy of Sciences (http://www.jscc.ru) and the Shared Resource Center “Far Eastern Computing Resource” IACP FEB RAS (http://cc.dvo.ru) for access to the supercomputers MVS10P, MVS1P5 and IRUS17.

The work was supported by grant No. 14-50-00124 of the Russian Science Foundation. Part of the equipment used in this work was purchased with the financial support of MIPT and HSE.

Author information

Authors and Affiliations

Joint Institute for High Temperatures of RAS, Moscow, Russia
Vladimir Stegailov & Vyacheslav Vecher
Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
Vladimir Stegailov & Vyacheslav Vecher
National Research University Higher School of Economics, Moscow, Russia
Vladimir Stegailov

Corresponding author

Correspondence to Vladimir Stegailov.

Editor information

Editors and Affiliations

Czestochowa University of Technology, Czestochowa, Poland
Roman Wyrzykowski
University of Tennessee, Knoxville, Tennessee, USA
Jack Dongarra
University of Southern California, Marina Del Rey, California, USA
Ewa Deelman
Czestochowa University of Technology, Czestochowa, Poland
Konrad Karczewski

Cite this paper

Stegailov, V., Vecher, V. (2018). Efficiency Analysis of Intel, AMD and Nvidia 64-Bit Hardware for Memory-Bound Problems: A Case Study of Ab Initio Calculations with VASP. In: Wyrzykowski, R., Dongarra, J., Deelman, E., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2017. Lecture Notes in Computer Science, vol 10778. Springer, Cham. https://doi.org/10.1007/978-3-319-78054-2_8

DOI: https://doi.org/10.1007/978-3-319-78054-2_8
Published: 23 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78053-5
Online ISBN: 978-3-319-78054-2
eBook Packages: Computer Science, Computer Science (R0)
