DOI: 10.1145/3324989.3325723
research-article

Porting the COSMO Weather Model to Manycore CPUs

Published: 12 June 2019

Abstract

Weather and climate simulations are a major application driver in high-performance computing (HPC). With the end of Dennard scaling and Moore's law, the HPC industry increasingly employs specialized accelerators to increase computational throughput. Manycore architectures, such as Intel's Knights Landing (KNL), are a representative example of future processing devices. However, software has to be modified to use these devices efficiently. In this work, we demonstrate how an existing domain-specific language (DSL) designed for CPUs and GPUs can be extended to manycore architectures such as KNL. On hand-tuned representative stencils from the dynamical core (dycore) of the COSMO weather model and from its radiation code, we achieve performance comparable to that of the NVIDIA Tesla P100 GPU. Further, for the full DSL-based, GPU-optimized COSMO dycore code, we obtain performance within a factor of two of the P100. We find that optimizing code to full performance on modern manycore architectures requires effort and hardware knowledge similar to that required for GPUs. Finally, we show limitations of the present approaches and outline lessons learned and possible design principles for future DSLs for accelerators in the weather and climate domain.

Published In

PASC '19: Proceedings of the Platform for Advanced Scientific Computing Conference
June 2019
177 pages
ISBN:9781450367707
DOI:10.1145/3324989
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. COSMO
  2. Domain-Specific Languages
  3. KNL
  4. Supercomputing
  5. Weather Forecasting

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

PASC '19

Cited By

  • (2024) A shared compilation stack for distributed-memory parallelism in stencil DSLs. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, pp. 38-56. DOI: 10.1145/3620666.3651344. Online publication date: 27-Apr-2024
  • (2024) Machine learning approaches to predict the execution time of the meteorological simulation software COSMO. Journal of Intelligent Information Systems. DOI: 10.1007/s10844-024-00880-x. Online publication date: 31-Aug-2024
  • (2023) Analysis of MURaM, a Solar Physics Application, for Scalability, Performance and Portability. Proceedings of the SC '23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 1929-1938. DOI: 10.1145/3624062.3624606. Online publication date: 12-Nov-2023
  • (2023) Casper: Accelerating Stencil Computations Using Near-Cache Processing. IEEE Access 11, pp. 22136-22154. DOI: 10.1109/ACCESS.2023.3252002. Online publication date: 2023
  • (2023) Domain-specific implementation of high-order Discontinuous Galerkin methods in spherical geometry. Computer Physics Communications, article 108993. DOI: 10.1016/j.cpc.2023.108993. Online publication date: Oct-2023
  • (2022) Scalable distributed high-order stencil computations. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp. 1-13. DOI: 10.5555/3571885.3571924. Online publication date: 13-Nov-2022
  • (2022) Enabling Large-Scale Simulation of CAM on the Sunway TaihuLight Supercomputer. IEEE Transactions on Computers 71:4, pp. 824-837. DOI: 10.1109/TC.2021.3063422. Online publication date: 1-Apr-2022
  • (2022) Scalable Distributed High-Order Stencil Computations. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-13. DOI: 10.1109/SC41404.2022.00035. Online publication date: Nov-2022
  • (2021) Resilience and fault tolerance in high-performance computing for numerical weather and climate prediction. The International Journal of High Performance Computing Applications, article 109434202199043. DOI: 10.1177/1094342021990433. Online publication date: 8-Feb-2021
  • (2021) Refactoring the MPS/University of Chicago Radiative MHD (MURaM) model for GPU/CPU performance portability using OpenACC directives. Proceedings of the Platform for Advanced Scientific Computing Conference, pp. 1-12. DOI: 10.1145/3468267.3470576. Online publication date: 5-Jul-2021
