ABSTRACT
The classic Stochastic Approximation (SA) method achieves optimal rates under the black-box model. This optimality, however, does not rule out better algorithms when more information about the functions and the data is available.
We present a family of Noise Adaptive Stochastic Approximation (NASA) algorithms for online convex optimization and stochastic convex optimization. NASA is an adaptive variant of Mirror Descent Stochastic Approximation, novel in its practical variation-dependent stepsizes and its stronger theoretical guarantees. We show that, compared with state-of-the-art adaptive and non-adaptive SA methods, NASA achieves lower regrets and faster convergence rates under low-variation assumptions.
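To illustrate the general idea of variation-dependent stepsizes, the sketch below runs projected online gradient descent (the Euclidean special case of mirror descent) with a stepsize that shrinks as the accumulated variation between successive gradients grows. This is a generic illustration under assumed choices (the specific variation measure, the `D / sqrt(1 + variation)` rule, and projection onto a norm ball are all assumptions), not the NASA algorithm itself:

```python
import numpy as np

def ogd_variation_stepsize(grads, D=1.0, radius=1.0):
    """Projected online gradient descent with a variation-dependent
    stepsize.  Illustrative sketch only; the stepsize rule and the
    feasible set (a Euclidean ball) are assumptions, not the paper's
    NASA algorithm."""
    d = len(grads[0])
    x = np.zeros(d)
    prev_g = np.zeros(d)
    variation = 0.0                 # accumulated squared gradient variation
    iterates = []
    for g in grads:
        variation += float(np.dot(g - prev_g, g - prev_g))
        eta = D / np.sqrt(1.0 + variation)   # low variation => larger steps
        x = x - eta * g
        norm = np.linalg.norm(x)
        if norm > radius:                    # project back onto the ball
            x = x * (radius / norm)
        iterates.append(x.copy())
        prev_g = g
    return iterates

# With identical gradients the variation stops growing, so the
# stepsize stays constant instead of decaying like 1/sqrt(t).
its = ogd_variation_stepsize([np.array([1.0, 0.0]), np.array([1.0, 0.0])])
```

When the gradient sequence has low variation, the stepsize stays large and the iterates move quickly; an adversarially fluctuating sequence drives the stepsize down, recovering the usual conservative behavior.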