
Adaptive Sampling for Incremental Optimization Using Stochastic Gradient Descent

  • Conference paper
  • Conference: Algorithmic Learning Theory (ALT 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9355)

Abstract

A wide range of popular statistical learning methods, from K-means and Support Vector Machines to Neural Networks, can be formulated as stochastic gradient descent (SGD) in a suitable setup. In practice, the main limitation of this incremental optimization technique is the stochastic noise induced by the random selection of the data used to compute the gradient estimate at each iteration. This paper introduces a novel implementation of the SGD algorithm in which the data subset used at a given step is not picked uniformly at random among all possible subsets, but drawn from a specific adaptive sampling scheme that depends on the past iterations in a Markovian manner, so as to refine the current statistical estimate of the gradient. Beyond an algorithmic description of the proposed approach, rate bounds are established and illustrative numerical results are displayed, providing theoretical and empirical evidence of its statistical performance compared to more “naive” SGD implementations. Computational issues are also discussed at length, revealing the practical advantages of the proposed method.
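To make the sampling idea concrete, the following minimal Python sketch runs SGD on a least-squares objective with non-uniform, adaptively updated sampling probabilities instead of uniform draws. It is not the scheme analyzed in the paper: the least-squares loss, the gradient-norm update of the sampling weights, and every name and parameter (adaptive_sampling_sgd, mix, step) are illustrative assumptions. What it does illustrate is the combination of a non-uniform draw with an importance-weight correction that keeps the stochastic gradient unbiased.

    import numpy as np

    def adaptive_sampling_sgd(X, y, n_iters=1000, step=0.01, mix=0.5, seed=0):
        """SGD for least squares with adaptively reweighted example sampling (illustrative sketch)."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        scores = np.ones(n)  # running per-example importance scores, initially uniform
        for _ in range(n_iters):
            # Mix the adaptive distribution with the uniform one so that every
            # example keeps a strictly positive probability of being drawn.
            probs = mix * scores / scores.sum() + (1.0 - mix) / n
            i = rng.choice(n, p=probs)
            residual = X[i] @ w - y[i]
            grad = residual * X[i]          # gradient of 0.5 * (x_i . w - y_i)^2
            # Importance-weight correction: dividing by n * p_i keeps the
            # stochastic gradient an unbiased estimate of the full gradient.
            w -= step * grad / (n * probs[i])
            # Update the visited example's score from its gradient norm so that
            # examples with large gradients are revisited more often.
            scores[i] = 0.9 * scores[i] + 0.1 * (np.linalg.norm(grad) + 1e-12)
        return w

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        X = rng.standard_normal((200, 5))
        w_true = rng.standard_normal(5)
        y = X @ w_true + 0.01 * rng.standard_normal(200)
        w_hat = adaptive_sampling_sgd(X, y, n_iters=5000)
        print("estimation error:", np.linalg.norm(w_hat - w_true))

With mix strictly below 1 every example remains reachable, and reweighting the gradient by 1/(n * p_i) compensates for the distorted sampling distribution; this is the standard device for retaining an unbiased gradient estimate under non-uniform sampling.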



Author information

Correspondence to Pascal Bianchi.



Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Papa, G., Bianchi, P., Clémençon, S. (2015). Adaptive Sampling for Incremental Optimization Using Stochastic Gradient Descent. In: Chaudhuri, K., Gentile, C., Zilles, S. (eds) Algorithmic Learning Theory. ALT 2015. Lecture Notes in Computer Science, vol. 9355. Springer, Cham. https://doi.org/10.1007/978-3-319-24486-0_21


  • DOI: https://doi.org/10.1007/978-3-319-24486-0_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24485-3

  • Online ISBN: 978-3-319-24486-0
