DOI: 10.1145/3038912.3052662
Research article

BOAT: Building Auto-Tuners with Structured Bayesian Optimization

Published: 03 April 2017

Abstract

Due to their complexity, modern systems expose many configuration parameters which users must tune to maximize performance. Auto-tuning has emerged as an alternative in which a black-box optimizer iteratively evaluates configurations to find efficient ones. Unfortunately, for many systems, such as distributed systems, evaluating performance takes too long and the space of configurations is too large for the optimizer to converge within a reasonable time.
We present BOAT, a framework which allows developers to build efficient bespoke auto-tuners for their system, in situations where generic auto-tuners fail. At BOAT's core is structured Bayesian optimization (SBO), a novel extension of the Bayesian optimization algorithm. SBO leverages contextual information provided by system developers, in the form of a probabilistic model of the system's behavior, to make informed decisions about which configurations to evaluate. In a case study, we tune the scheduling of a neural network computation on a heterogeneous cluster. Our auto-tuner converges within ten iterations. The optimized configurations outperform those found by generic auto-tuners in thirty iterations by up to 2X.
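The abstract describes structured Bayesian optimization as an extension of the standard Bayesian optimization loop: an optimizer iteratively evaluates configurations, updates a probabilistic model, and uses the model to pick the next configuration. As a rough, hypothetical illustration of that underlying loop (this is not BOAT's API; the generic Gaussian-process model and lower-confidence-bound acquisition below are placeholders for the developer-written probabilistic model that SBO would use, and all function names are invented for this sketch):

```python
# Minimal sketch of a generic Bayesian optimization loop, of the kind
# that SBO extends. All names here are illustrative, not from BOAT.
import numpy as np

def rbf_kernel(a, b, length=0.25):
    # Squared-exponential covariance between two 1-D point sets.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    # Gaussian-process posterior mean and std at the query points.
    # In SBO this generic surrogate would be replaced by a structured
    # probabilistic model of the system's behavior.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_query)
    Kss = rbf_kernel(x_query, x_query)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y_train
    var = np.diag(Kss - Ks.T @ Kinv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def bayes_opt(objective, candidates, n_init=3, n_iter=10, seed=0):
    # Evaluate a few random configurations, then repeatedly pick the
    # candidate minimizing a lower confidence bound (explore where the
    # model is uncertain, exploit where predicted cost is low).
    rng = np.random.default_rng(seed)
    x = rng.choice(candidates, size=n_init, replace=False)
    y = np.array([objective(v) for v in x])
    for _ in range(n_iter):
        mu, sd = gp_posterior(x, y, candidates)
        nxt = candidates[np.argmin(mu - 2.0 * sd)]
        x = np.append(x, nxt)
        y = np.append(y, objective(nxt))
    best = x[np.argmin(y)]
    return best, y.min()

# Toy stand-in for "evaluate a configuration": a smooth cost surface
# with its minimum at 0.3.
best, val = bayes_opt(lambda c: (c - 0.3) ** 2, np.linspace(0.0, 1.0, 101))
```

The essence of SBO, per the abstract, is that the generic surrogate (`gp_posterior` here) is replaced by a model the system developer writes, so far fewer evaluations of the real system are needed to converge.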


Published In

WWW '17: Proceedings of the 26th International Conference on World Wide Web
April 2017
1678 pages
ISBN:9781450349130

Sponsors

  • IW3C2: International World Wide Web Conference Committee

Publisher

International World Wide Web Conferences Steering Committee

Republic and Canton of Geneva, Switzerland

Author Tags

  1. auto-tuning
  2. bayesian optimization
  3. distributed stochastic gradient descent
  4. distributed systems
  5. neural networks
  6. probabilistic programming

Qualifiers

  • Research-article

Funding Sources

  • University of Cambridge Computer Laboratory
  • EPSRC

Conference

WWW '17 (sponsor: IW3C2)

Acceptance Rates

WWW '17 paper acceptance rate: 164 of 966 submissions (17%).
Overall acceptance rate: 1,899 of 8,196 submissions (23%).

Cited By

  • EcoEdgeInfer: Dynamically Optimizing Latency and Sustainability for Inference on Edge Devices. 2024 IEEE/ACM Symposium on Edge Computing (SEC), pages 191-205, December 2024. DOI: 10.1109/SEC62691.2024.00023
  • Righteous: Automatic Right-Sizing for Complex Edge Deployments. 2024 IEEE/ACM Symposium on Edge Computing (SEC), pages 15-28, December 2024. DOI: 10.1109/SEC62691.2024.00010
  • Application-Agnostic Auto-Tuning of Open MPI Collectives Using Bayesian Optimization. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pages 771-781, May 2024. DOI: 10.1109/IPDPSW63119.2024.00141
  • An Exploration of Global Optimization Strategies for Autotuning OpenMP-based Codes. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pages 741-750, May 2024. DOI: 10.1109/IPDPSW63119.2024.00138
  • VDTuner: Automated Performance Tuning for Vector Data Management Systems. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 4357-4369, May 2024. DOI: 10.1109/ICDE60146.2024.00332
  • CoTuner: A Hierarchical Learning Framework for Coordinately Optimizing Resource Partitioning and Parameter Tuning. Proceedings of the 52nd International Conference on Parallel Processing, pages 317-326, August 2023. DOI: 10.1145/3605573.3605578
  • Cloud Configuration Optimization for Recurring Batch-Processing Applications. IEEE Transactions on Parallel and Distributed Systems, 34(5):1495-1507, May 2023. DOI: 10.1109/TPDS.2023.3246086
  • AutoKAD: Empowering KPI Anomaly Detection with Label-Free Deployment. 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE), pages 13-23, October 2023. DOI: 10.1109/ISSRE59848.2023.00063
  • Etune: Efficient configuration tuning for big-data software systems via configuration space reduction. Journal of Systems and Software, 111936, December 2023. DOI: 10.1016/j.jss.2023.111936
  • PTSSBench: a performance evaluation platform in support of automated parameter tuning of software systems. Automated Software Engineering, 31(1), November 2023. DOI: 10.1007/s10515-023-00402-z
