Persistent Coarrays: Integrating MPI Storage Windows in Coarray Fortran

ABSTRACT
The integration of novel hardware and software components in upcoming HPC systems is expected to considerably reduce the Mean Time Between Failures (MTBF) observed by scientific applications, while simultaneously increasing the programming complexity of these clusters. In this work, we present the initial steps towards the integration of transparent resilience support inside Coarray Fortran. In particular, we propose persistent coarrays, an extension of OpenCoarrays that integrates MPI storage windows to leverage its MPI-based transport layer and seamlessly map coarrays to files on storage. Preliminary results indicate that our approach provides clear benefits on representative workloads, while requiring only minimal source code changes.
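To illustrate why this matters at the application level, the sketch below shows a standard Fortran 2008 coarray kernel of the kind the abstract targets. The program name, array name, and sizes are illustrative, and the mechanism by which persistence is enabled (e.g., a launch option or environment variable consumed by the runtime) is an assumption here; the point is that the coarray declarations and one-sided accesses stay unmodified standard Fortran, with the modified OpenCoarrays runtime transparently backing the coarray by an MPI storage window mapped to a file.

```fortran
! Minimal sketch of a coarray kernel that would benefit from
! transparent persistence. All syntax below is standard Fortran
! 2008; mapping "field" to a file on storage would be done by the
! runtime (hypothetically selected at launch time), not by new
! syntax in the source.
program persistent_demo
  implicit none
  integer, parameter :: n = 1024
  real    :: field(n)[*]          ! coarray: one slice per image
  integer :: me, right, step

  me    = this_image()
  right = merge(1, me + 1, me == num_images())  ! ring neighbor

  field = real(me)                ! initialize the local slice
  sync all                        ! make initial values visible

  do step = 1, 10
     field(1) = field(n)[right]   ! one-sided get from neighbor
     sync all                     ! image barrier each step
  end do

  ! With persistent coarrays, "field" would already reside in an
  ! MPI storage window mapped to a file, so its contents survive
  ! the run without an explicit checkpoint call.
end program persistent_demo
```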