research-article

Program autotuning as a service: opportunities and challenges

Authors:
Oleg Sukhoroslov, Sergey Volkov, Alexander Afanasiev

Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia

UCC '16: Proceedings of the 9th International Conference on Utility and Cloud Computing, December 2016, Pages 148–155
https://doi.org/10.1145/2996890.2996903

Published: 06 December 2016

ABSTRACT

Program autotuning is becoming an increasingly valuable tool for improving performance portability across diverse target architectures, exploring trade-offs between several criteria, and meeting quality of service requirements. Recent work on general autotuning frameworks has enabled rapid development of domain-specific autotuners that reuse common libraries of parameter types and search techniques. In this work we explore the use of such frameworks to develop general-purpose online services for program autotuning using the Software as a Service model. Beyond the common benefits of this model, the proposed approach opens up a number of unique opportunities, such as collecting performance data and using it to improve further runs, or enabling remote online autotuning. However, the proposed autotuning as a service approach also brings several challenges, such as accessing target systems, dealing with measurement latency, and supporting execution of user-provided code. This paper presents the first step towards implementing the proposed approach and addressing these challenges. We describe an implementation of a generic autotuning service that can be used for tuning arbitrary programs on user-provided computing systems. The service is based on the OpenTuner autotuning framework and runs on the Everest platform, which enables rapid development of computational web services. In contrast to OpenTuner, the service does not require installation of the framework, allows users to avoid writing code, and supports efficient parallel execution of measurement tasks across multiple machines. The performance of the service is evaluated by using it to tune synthetic and real programs.
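To make the idea of parallel measurement tasks concrete, here is a minimal sketch of an autotuning loop in Python. It is not the service's implementation and does not use the OpenTuner or Everest APIs. It swaps in plain random search, a made-up two-parameter space (`block`, `threads`), and a synthetic cost function `measure` in place of a real program run. A local thread pool stands in for remote machines. All names are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def measure(config):
    """Hypothetical measurement task: synthetic 'run time' of one configuration."""
    # Stand-in for running the tuned program; the minimum is at block=64, threads=8.
    return (config["block"] - 64) ** 2 + 10 * (config["threads"] - 8) ** 2


def random_config(rng):
    """Draw one configuration from a small, made-up parameter space."""
    return {
        "block": rng.choice([8, 16, 32, 64, 128, 256]),
        "threads": rng.randint(1, 16),
    }


def autotune(measure_fn, rounds=20, batch_size=8, workers=4, seed=0):
    """Random search that evaluates each batch of configurations in parallel."""
    rng = random.Random(seed)
    best_config, best_cost = None, float("inf")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(rounds):
            batch = [random_config(rng) for _ in range(batch_size)]
            # Measurements within a batch are independent, so they run concurrently.
            costs = list(pool.map(measure_fn, batch))
            for config, cost in zip(batch, costs):
                if cost < best_cost:
                    best_config, best_cost = config, cost
    return best_config, best_cost
```

Calling `autotune(measure)` returns the best configuration found and its cost. Evaluating configurations in batches is one simple way to hide measurement latency: the search proposes several candidates at once instead of waiting for each result in turn.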



Published in
UCC '16: Proceedings of the 9th International Conference on Utility and Cloud Computing
December 2016, 549 pages
ISBN: 9781450346160
DOI: 10.1145/2996890
General Chairs: Changjun Jiang (Tongji University, China), Omer Rana (Cardiff University, UK), Nick Antonopoulos (University of Derby, UK)
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Publisher: Association for Computing Machinery, New York, NY, United States

Author Tags
distributed computing
program autotuning
software as a service
web services

Acceptance Rates
Overall Acceptance Rate: 38 of 125 submissions, 30%