DOI: 10.1145/3534678.3539379
Research article

Partial-Quasi-Newton Methods: Efficient Algorithms for Minimax Optimization Problems with Unbalanced Dimensionality

Published: 14 August 2022

Abstract

This paper studies strongly-convex-strongly-concave minimax optimization with unbalanced dimensionality. Such problems arise in several popular data science applications, such as few-shot learning and fairness-aware machine learning. The design of conventional iterative algorithms for minimax optimization typically focuses on reducing the total number of oracle calls, ignoring the unbalanced computational cost of accessing information about the two different variables. We propose a novel second-order optimization algorithm, the Partial-Quasi-Newton (PQN) method, which takes advantage of the unbalanced structure of the problem to establish the Hessian estimate efficiently. We theoretically prove that PQN converges to the saddle point faster than existing minimax optimization algorithms, and numerical experiments on real-world applications show that the proposed PQN performs significantly better than state-of-the-art methods.
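
For context, the problem class studied here can be written in the standard form

    \min_{x \in \mathbb{R}^{d_x}} \; \max_{y \in \mathbb{R}^{d_y}} \; f(x, y), \qquad d_y \ll d_x,

where f(\cdot, y) is strongly convex in x for every fixed y and f(x, \cdot) is strongly concave in y for every fixed x; the notation d_x, d_y and the choice of which variable is low-dimensional are illustrative assumptions, not a quotation from the paper. The unbalanced dimensionality refers to the gap between d_x and d_y: forming and inverting a d_y-by-d_y Hessian block is cheap when d_y is small, while the corresponding d_x-by-d_x operations are expensive, and this asymmetry is the kind of structure a partial Hessian estimate can exploit.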

Supplemental Material

MP4 File
This video introduces the motivation, algorithmic details, convergence rate, and applications of the partial-quasi-Newton methods, which are efficient for solving minimax optimization problems with unbalanced dimensionality.

Cited By

  • (2024) Electroencephalogram Emotion Recognition via AUC Maximization. Algorithms, 17(11): 489. DOI: 10.3390/a17110489. Online publication date: 1-Nov-2024.
  • (2023) Block Broyden's methods for solving nonlinear equations. Proceedings of the 37th International Conference on Neural Information Processing Systems, 47487-47499. DOI: 10.5555/3666122.3668178. Online publication date: 10-Dec-2023.
  • (2023) Communication-Efficient Distributed Minimax Optimization via Markov Compression. Neural Information Processing, 540-551. DOI: 10.1007/978-981-99-8079-6_42. Online publication date: 20-Nov-2023.

    Published In

    KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2022
    5033 pages
    ISBN: 9781450393850
    DOI: 10.1145/3534678
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. fairness
    2. few-shot learning
    3. minimax optimization
    4. quasi-newton

    Qualifiers

    • Research-article

    Conference

    KDD '22

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
