Speculative Approximations for Terascale Distributed Gradient Descent Optimization

Authors:

Florin Rusu

DanaC'15: Proceedings of the Fourth Workshop on Data analytics in the Cloud

Article No.: 1, Pages 1 - 10

https://doi.org/10.1145/2799562.2799563

Published: 31 May 2015

Abstract

Model calibration is a major challenge faced by the plethora of statistical analytics packages that are increasingly used in Big Data applications. Identifying the optimal model parameters is a time-consuming process that has to be executed from scratch for every dataset/model combination, even by experienced data scientists. We argue that the inability to evaluate multiple parameter configurations simultaneously and the lack of support for quickly identifying sub-optimal configurations are the principal causes.

In this paper, we develop two database-inspired techniques for efficient model calibration. Speculative parameter testing applies advanced parallel multi-query processing methods to evaluate several configurations concurrently. Online aggregation identifies sub-optimal configurations early in the processing by incrementally sampling the training dataset and estimating the objective function corresponding to each configuration. We design concurrent online aggregation estimators and define halting conditions that stop the execution accurately and in a timely manner.
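The early-pruning idea can be illustrated with a minimal sketch. It is not the paper's estimator or halting condition: it assumes per-example losses are already available for each configuration, uses a plain large-sample (CLT) confidence interval without a finite-population correction, and drops a configuration once its lower bound exceeds the best configuration's upper bound. The function name `prune_by_sampling` and its parameters `chunk`, `z`, and `seed` are hypothetical.

```python
import math
import random


def prune_by_sampling(losses_per_config, chunk=100, z=1.96, seed=0):
    """Estimate each configuration's mean loss from a growing random sample
    and drop configurations that are confidently worse than the best one.

    losses_per_config[c][i] is the loss of example i under configuration c
    (a hypothetical, precomputed input used only for illustration).
    Returns (sorted surviving configuration indices, examples sampled).
    """
    n = len(losses_per_config[0])
    order = list(range(n))
    random.Random(seed).shuffle(order)  # one sample order shared by all configs

    k = len(losses_per_config)
    alive = set(range(k))
    sums = [0.0] * k
    sq_sums = [0.0] * k
    seen = 0

    while seen < n and len(alive) > 1:
        batch = order[seen:seen + chunk]
        for c in alive:
            for i in batch:
                v = losses_per_config[c][i]
                sums[c] += v
                sq_sums[c] += v * v
        seen += len(batch)

        # Large-sample confidence interval for each surviving configuration.
        bounds = {}
        for c in alive:
            mean = sums[c] / seen
            var = max(sq_sums[c] / seen - mean * mean, 0.0)
            half = z * math.sqrt(var / seen)
            bounds[c] = (mean - half, mean + half)

        # Halting rule (assumed): prune a configuration when even its
        # optimistic estimate is worse than the best pessimistic estimate.
        best_upper = min(upper for _, upper in bounds.values())
        alive = {c for c in alive if bounds[c][0] <= best_upper}

    return sorted(alive), seen
```

When one configuration is clearly worse, the loop stops after a small fraction of the data has been sampled, which is the effect the abstract describes.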

We apply the proposed techniques to distributed gradient descent optimization -- batch and incremental -- for support vector machines and logistic regression models. We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big Data analytics system -- and evaluate their performance over terascale-size synthetic and real datasets. The results confirm that as many as 32 configurations can be evaluated concurrently almost as fast as one, while sub-optimal configurations are detected accurately in as little as 1/20th of the time.
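As a rough single-machine illustration of speculative parameter testing for batch gradient descent, the sketch below advances several logistic regression models, one per step size, while scanning the data only once per iteration. It is an assumption-laden toy, not the GLADE PF-OLA implementation: the function `speculative_step`, the helpers, and the choice of step size as the only varied parameter are all hypothetical.

```python
import math


def _log1p_exp(t):
    """Numerically stable log(1 + exp(t))."""
    return t + math.log1p(math.exp(-t)) if t > 0 else math.log1p(math.exp(t))


def _sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def speculative_step(models, step_sizes, data):
    """Run one batch gradient descent iteration for every configuration,
    sharing a single pass over the data.

    models: one weight vector per configuration.
    data: list of (features, label) pairs with labels in {-1, +1}.
    Returns (updated models, average loss of each model before its update).
    """
    k, d = len(models), len(models[0])
    grads = [[0.0] * d for _ in range(k)]
    losses = [0.0] * k

    for x, y in data:  # single shared scan of the training data
        for c, w in enumerate(models):
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            losses[c] += _log1p_exp(-margin)
            coeff = -y * _sigmoid(-margin)  # d/d(margin) of the loss, times y
            for j in range(d):
                grads[c][j] += coeff * x[j]

    n = len(data)
    new_models = [
        [wj - step * gj / n for wj, gj in zip(w, g)]
        for w, step, g in zip(models, step_sizes, grads)
    ]
    return new_models, [loss / n for loss in losses]
```

Each data example is read once and reused for all configurations, so the scan cost is amortized across them; only the per-configuration arithmetic grows with the number of configurations.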

Published In

DanaC'15: Proceedings of the Fourth Workshop on Data analytics in the Cloud

May 2015

29 pages

ISBN:9781450337243

DOI:10.1145/2799562

Editor:
Asterios Katsifodimos
Technische Universität Berlin

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SIGMOD/PODS'15

Sponsor:

SIGMOD

SIGMOD/PODS'15: International Conference on Management of Data

May 31 - June 4, 2015

Melbourne, VIC, Australia

Acceptance Rates

DanaC'15 Paper Acceptance Rate 4 of 6 submissions, 67%;

Overall Acceptance Rate 19 of 34 submissions, 56%
