DOI: 10.1145/3038912.3052662
Research article

BOAT: Building Auto-Tuners with Structured Bayesian Optimization

Published: 03 April 2017

Abstract

Due to their complexity, modern systems expose many configuration parameters which users must tune to maximize performance. Auto-tuning has emerged as an alternative in which a black-box optimizer iteratively evaluates configurations to find efficient ones. Unfortunately, for many systems, such as distributed systems, evaluating performance takes too long and the space of configurations is too large for the optimizer to converge within a reasonable time.
We present BOAT, a framework which allows developers to build efficient bespoke auto-tuners for their system, in situations where generic auto-tuners fail. At BOAT's core is structured Bayesian optimization (SBO), a novel extension of the Bayesian optimization algorithm. SBO leverages contextual information provided by system developers, in the form of a probabilistic model of the system's behavior, to make informed decisions about which configurations to evaluate. In a case study, we tune the scheduling of a neural network computation on a heterogeneous cluster. Our auto-tuner converges within ten iterations. The optimized configurations outperform those found by generic auto-tuners in thirty iterations by up to 2X.
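The abstract describes structured Bayesian optimization as an extension of the standard Bayesian optimization loop: an optimizer iteratively evaluates configurations, updates a probabilistic model, and uses the model to pick the next configuration. As a rough, hypothetical illustration of that underlying loop (this is not BOAT's API; the generic Gaussian-process model and lower-confidence-bound acquisition below are placeholders for the developer-written probabilistic model that SBO would use, and all function names are invented for this sketch):

```python
# Minimal sketch of a generic Bayesian optimization loop, of the kind
# that SBO extends. All names here are illustrative, not from BOAT.
import numpy as np

def rbf_kernel(a, b, length=0.25):
    # Squared-exponential covariance between two 1-D point sets.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    # Gaussian-process posterior mean and std at the query points.
    # In SBO this generic surrogate would be replaced by a structured
    # probabilistic model of the system's behavior.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_query)
    Kss = rbf_kernel(x_query, x_query)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y_train
    var = np.diag(Kss - Ks.T @ Kinv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def bayes_opt(objective, candidates, n_init=3, n_iter=10, seed=0):
    # Evaluate a few random configurations, then repeatedly pick the
    # candidate minimizing a lower confidence bound (explore where the
    # model is uncertain, exploit where predicted cost is low).
    rng = np.random.default_rng(seed)
    x = rng.choice(candidates, size=n_init, replace=False)
    y = np.array([objective(v) for v in x])
    for _ in range(n_iter):
        mu, sd = gp_posterior(x, y, candidates)
        nxt = candidates[np.argmin(mu - 2.0 * sd)]
        x = np.append(x, nxt)
        y = np.append(y, objective(nxt))
    best = x[np.argmin(y)]
    return best, y.min()

# Toy stand-in for "evaluate a configuration": a smooth cost surface
# with its minimum at 0.3.
best, val = bayes_opt(lambda c: (c - 0.3) ** 2, np.linspace(0.0, 1.0, 101))
```

The essence of SBO, per the abstract, is that the generic surrogate (`gp_posterior` here) is replaced by a model the system developer writes, so far fewer evaluations of the real system are needed to converge.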


Published In

WWW '17: Proceedings of the 26th International Conference on World Wide Web
April 2017
1678 pages
ISBN:9781450349130

Sponsors

  • IW3C2: International World Wide Web Conference Committee

Publisher

International World Wide Web Conferences Steering Committee

Republic and Canton of Geneva, Switzerland

Author Tags

  1. auto-tuning
  2. bayesian optimization
  3. distributed stochastic gradient descent
  4. distributed systems
  5. neural networks
  6. probabilistic programming

Qualifiers

  • Research-article

Funding Sources

  • University of Cambridge Computer Laboratory
  • EPSRC

Conference

WWW '17 (sponsor: IW3C2)

Acceptance Rates

WWW '17 paper acceptance rate: 164 of 966 submissions (17%).
Overall acceptance rate: 1,899 of 8,196 submissions (23%).

Cited By

  • EcoEdgeInfer: Dynamically Optimizing Latency and Sustainability for Inference on Edge Devices. 2024 IEEE/ACM Symposium on Edge Computing (SEC), pages 191-205, December 2024. DOI: 10.1109/SEC62691.2024.00023
  • Righteous: Automatic Right-Sizing for Complex Edge Deployments. 2024 IEEE/ACM Symposium on Edge Computing (SEC), pages 15-28, December 2024. DOI: 10.1109/SEC62691.2024.00010
  • Application-Agnostic Auto-Tuning of Open MPI Collectives Using Bayesian Optimization. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pages 771-781, May 2024. DOI: 10.1109/IPDPSW63119.2024.00141
  • An Exploration of Global Optimization Strategies for Autotuning OpenMP-based Codes. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pages 741-750, May 2024. DOI: 10.1109/IPDPSW63119.2024.00138
  • VDTuner: Automated Performance Tuning for Vector Data Management Systems. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 4357-4369, May 2024. DOI: 10.1109/ICDE60146.2024.00332
  • CoTuner: A Hierarchical Learning Framework for Coordinately Optimizing Resource Partitioning and Parameter Tuning. Proceedings of the 52nd International Conference on Parallel Processing, pages 317-326, August 2023. DOI: 10.1145/3605573.3605578
  • Cloud Configuration Optimization for Recurring Batch-Processing Applications. IEEE Transactions on Parallel and Distributed Systems, 34(5):1495-1507, May 2023. DOI: 10.1109/TPDS.2023.3246086
  • AutoKAD: Empowering KPI Anomaly Detection with Label-Free Deployment. 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE), pages 13-23, October 2023. DOI: 10.1109/ISSRE59848.2023.00063
  • Etune: Efficient configuration tuning for big-data software systems via configuration space reduction. Journal of Systems and Software, 111936, December 2023. DOI: 10.1016/j.jss.2023.111936
  • PTSSBench: a performance evaluation platform in support of automated parameter tuning of software systems. Automated Software Engineering, 31(1), November 2023. DOI: 10.1007/s10515-023-00402-z
