Abstract
Performance, measured as execution time, is one of the most important features of HPC software, but energy consumption is growing in importance as applications are extended to exascale. This is the case for HPC software used in weather forecasting, in which every ounce of performance is critical for increasing the accuracy and precision of the results. In this work, we study the performance-energy balance of an OpenPOWER processor, which is designed for the heavy workloads typically seen on data servers and in HPC environments. Our results show that the OpenPOWER processor outperforms other processors commonly used in HPC on weather forecasting workloads, but at the expense of consuming more energy. Furthermore, the highest hyperthreading modes available on OpenPOWER processors do not perform well with HPC workloads and are even detrimental to performance.
Acknowledgements
Funding from projects CGL2013-48367-P and CGL2016-80609-R (Spanish Ministry of Economy and Competitiveness, Science and Innovation) is gratefully acknowledged. RM acknowledges FPI grant EEBB-I-17-12253. AN acknowledges grant FPU13/02798.
About this article
Cite this article
Moreno, R., Arias, E., Navarro, A. et al. How good is the OpenPOWER architecture for high-performance CPU-oriented weather forecasting applications? J Supercomput 75, 6178–6193 (2019). https://doi.org/10.1007/s11227-019-02844-3
DOI: https://doi.org/10.1007/s11227-019-02844-3