DOI: 10.1145/3534678.3539379
Research article

Partial-Quasi-Newton Methods: Efficient Algorithms for Minimax Optimization Problems with Unbalanced Dimensionality

Published: 14 August 2022

Abstract

This paper studies strongly-convex-strongly-concave minimax optimization with unbalanced dimensionality. Such problems arise in several popular data science applications, such as few-shot learning and fairness-aware machine learning. The design of conventional iterative algorithms for minimax optimization typically focuses on reducing the total number of oracle calls, ignoring the unbalanced computational cost of accessing information about the two different variables. We propose a novel second-order optimization algorithm, the Partial-Quasi-Newton (PQN) method, which takes advantage of the unbalanced structure of the problem to establish the Hessian estimate efficiently. We theoretically prove that PQN converges to the saddle point faster than existing minimax optimization algorithms, and numerical experiments on real-world applications show that the proposed PQN performs significantly better than state-of-the-art methods.
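
For context, the problem class studied here can be written in the standard form

    \min_{x \in \mathbb{R}^{d_x}} \; \max_{y \in \mathbb{R}^{d_y}} \; f(x, y), \qquad d_y \ll d_x,

where f(\cdot, y) is strongly convex in x for every fixed y and f(x, \cdot) is strongly concave in y for every fixed x; the notation d_x, d_y and the choice of which variable is low-dimensional are illustrative assumptions, not a quotation from the paper. The unbalanced dimensionality refers to the gap between d_x and d_y: forming and inverting a d_y-by-d_y Hessian block is cheap when d_y is small, while the corresponding d_x-by-d_x operations are expensive, and this asymmetry is the kind of structure a partial Hessian estimate can exploit.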

Supplemental Material

MP4 File
This video introduces the motivation, algorithmic details, convergence rate, and applications of the partial-quasi-Newton methods, which are efficient for solving minimax optimization problems with unbalanced dimensionality.

Cited By

  • (2024) Electroencephalogram Emotion Recognition via AUC Maximization. Algorithms, 17(11): 489. DOI: 10.3390/a17110489. Online publication date: 1-Nov-2024.
  • (2023) Block Broyden's methods for solving nonlinear equations. Proceedings of the 37th International Conference on Neural Information Processing Systems, 47487-47499. DOI: 10.5555/3666122.3668178. Online publication date: 10-Dec-2023.
  • (2023) Communication-Efficient Distributed Minimax Optimization via Markov Compression. Neural Information Processing, 540-551. DOI: 10.1007/978-981-99-8079-6_42. Online publication date: 20-Nov-2023.

    Published In

    KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2022
    5033 pages
    ISBN: 9781450393850
    DOI: 10.1145/3534678
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. fairness
    2. few-shot learning
    3. minimax optimization
    4. quasi-newton

    Qualifiers

    • Research-article

    Conference

    KDD '22

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
