SHAMan: an intelligent framework for HPC auto-tuning of I/O accelerators

Authors:
Sophie Robert (Li-PaRaD, University of Versailles, Versailles, France; Atos BDS R&D Data Management, Echirolles, France)
Soraya Zertal (Li-PaRaD, University of Versailles, Versailles, France)
Gaël Goret (Atos BDS R&D Data Management, Echirolles, France)

SITA'20: Proceedings of the 13th International Conference on Intelligent Systems: Theories and Applications
September 2020
Article No.: 31
Pages 1–6
https://doi.org/10.1145/3419604.3419775

Published: 08 November 2020

ABSTRACT

Like most modern computer systems, High Performance Computing (HPC) machines integrate many highly configurable hardware devices and software components. Finding their optimal parametrization is a complex task: the size of the parametric space and the non-linear behavior of HPC systems make hand tuning, theoretical modeling, and exhaustive sampling unsuitable in most cases. Auto-tuning methods relying on black-box optimization have emerged as a promising way to find a system's best parametrization without making any assumption about its behavior. In this paper, we present the architecture of an auto-tuning framework, called Smart HPC Application MANager (SHAMan), that integrates black-box optimization heuristics to find the optimal parametrization of an Input/Output (I/O) accelerator for an HPC application. We describe the conceptual and technical architecture of the framework and its native support for the HPC cluster ecosystem. We detail the stand-alone optimization engine and its integration as a service provided by a Web application. We deployed and tested the framework by tuning an I/O accelerator developed by Atos on an HPC cluster running in production. The tuner's performance is evaluated by optimizing 90 different I/O-oriented applications. We show a median improvement of 29% in speed-up compared to the default parametrization, and this improvement reaches 98% for a certain class of applications.
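
The black-box approach described in the abstract can be illustrated with a minimal sketch. It is not SHAMan's API and does not reproduce the heuristics used in the paper: it uses plain random search, and the parameter names (`cache_size_mb`, `prefetch_depth`), the grid, and the `toy_execution_time` function are hypothetical stand-ins for launching a real application with an I/O accelerator. The improvement metric is assumed here to be the relative reduction in execution time.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Params = Dict[str, int]


def random_search(
    objective: Callable[[Params], float],
    grid: Dict[str, Sequence[int]],
    default: Params,
    budget: int,
    seed: int = 0,
) -> Tuple[Params, float, List[Tuple[Params, float]]]:
    """Minimize `objective` by sampling the grid uniformly at random.

    The objective is treated as a black box: only its returned value
    (here, an execution time) is used. The default parametrization is
    evaluated first so the result is never worse than the default.
    """
    rng = random.Random(seed)
    best_params = dict(default)
    best_time = objective(best_params)
    history = [(dict(default), best_time)]
    for _ in range(budget):
        candidate = {name: rng.choice(list(values)) for name, values in grid.items()}
        elapsed = objective(candidate)
        history.append((candidate, elapsed))
        if elapsed < best_time:
            best_params, best_time = candidate, elapsed
    return best_params, best_time, history


def toy_execution_time(params: Params) -> float:
    """Hypothetical stand-in for running an application and timing it."""
    cache_penalty = abs(params["cache_size_mb"] - 512) / 64
    prefetch_penalty = abs(params["prefetch_depth"] - 4)
    return 10.0 + cache_penalty + prefetch_penalty


def improvement_percent(default_time: float, tuned_time: float) -> float:
    """Relative reduction in execution time, in percent (assumed metric)."""
    return 100.0 * (default_time - tuned_time) / default_time


# Hypothetical search space and default parametrization.
GRID = {"cache_size_mb": [64, 128, 256, 512, 1024], "prefetch_depth": [1, 2, 4, 8]}
DEFAULT = {"cache_size_mb": 64, "prefetch_depth": 1}

if __name__ == "__main__":
    best, best_time, history = random_search(
        toy_execution_time, GRID, DEFAULT, budget=50, seed=1
    )
    default_time = history[0][1]
    print("best parametrization:", best)
    print("improvement over default (%):", improvement_percent(default_time, best_time))
```

In a real deployment the objective would submit a job to the cluster and return its measured run time, and the random sampling step would be replaced by a more informed heuristic.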

Index Terms

1. General and reference
  1. Cross-computing tools and techniques
    1. Performance
2. Theory of computation
  1. Design and analysis of algorithms
    1. Mathematical optimization
      1. Discrete optimization
        1. Optimization with randomized search heuristics
    2. Online algorithms
      1. Online learning algorithms

Published in

SITA'20: Proceedings of the 13th International Conference on Intelligent Systems: Theories and Applications
September 2020
333 pages
ISBN:9781450377331
DOI:10.1145/3419604

Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Auto-tuning
High Performance Computing
I/O accelerators
Input/Output
Performance optimization
Randomized search heuristics
Qualifiers
- research-article
- Research
- Refereed limited