Abstract
Forward stagewise and Frank-Wolfe are popular gradient-based, projection-free optimization algorithms, both of which require convex constraints. We propose a method that extends their applicability to problems of the form \(\min_x f(x) \ \text{s.t.} \ g(x) \le \kappa\), where \(f(x)\) is an invex objective function (invexity generalizes convexity and ensures that every local optimum is also a global optimum) and \(g(x)\) is a non-convex constraint. We provide a theorem that defines a class of monotone component-wise transformation functions \(x_i = h(z_i)\) under which the constraint function \(G(z) = g(h(z))\) becomes convex. Assuming invexity of the original objective \(f(x)\), the same transformation yields a transformed objective \(F(z) = f(h(z))\) that is again invex. For algorithms that rely on a non-zero gradient \(\nabla F\) to produce new update steps, invexity ensures that they keep making progress as long as a descent direction exists.
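To make the construction concrete, the following is a minimal numerical sketch, not the authors' code. It assumes, purely for illustration, the well-known sparsity-inducing non-convex constraint \(g(x) = \sum_i \log(1 + |x_i|)\) and a least-squares objective; the monotone, sign-preserving map \(h(z_i) = \mathrm{sign}(z_i)(e^{|z_i|} - 1)\) then turns the constraint into the convex \(\ell_1\) ball \(G(z) = \sum_i |z_i| \le \kappa\), on which standard Frank-Wolfe applies. Because \(h'(z) = e^{|z|} > 0\) everywhere, the chain rule gives \(\nabla F(z) = \nabla f(h(z)) \odot h'(z)\), so \(\nabla F\) vanishes only where \(\nabla f\) does. All variable names and the specific choices of \(g\) and \(h\) are our own assumptions.

```python
import numpy as np

# Hypothetical illustration (not from the paper): the non-convex constraint
# g(x) = sum_i log(1 + |x_i|) and the monotone, sign-preserving map
# x_i = h(z_i) = sign(z_i) * (exp(|z_i|) - 1) give the convex constraint
# G(z) = g(h(z)) = sum_i |z_i|, i.e. an l1 ball in z-space.
h = lambda z: np.sign(z) * (np.exp(np.abs(z)) - 1.0)
h_prime = lambda z: np.exp(np.abs(z))         # > 0 everywhere

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -1.5, 1.0]                 # sparse ground truth
y = A @ x_true + 0.01 * rng.standard_normal(50)

def grad_F(z):
    """Chain rule for F(z) = 0.5 * ||A h(z) - y||^2."""
    return (A.T @ (A @ h(z) - y)) * h_prime(z)

# Frank-Wolfe over G(z) = ||z||_1 <= kappa: the linear minimization oracle
# returns a signed vertex of the l1 ball (a single active coordinate).
kappa, z = 3.0, np.zeros(20)
for t in range(500):
    g = grad_F(z)
    s = np.zeros_like(z)
    i = int(np.argmax(np.abs(g)))
    s[i] = -kappa * np.sign(g[i])             # vertex minimizing <g, s>
    z += 2.0 / (t + 2.0) * (s - z)            # standard FW step size

x_hat = h(z)                                  # map back; zeros/signs match z
print(np.round(x_hat, 2))
```

Note that the linear minimization oracle over the \(\ell_1\) ball reduces to picking a single signed coordinate, which is what keeps the method projection-free and naturally sparsity-inducing.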
S. M. Keller and D. Murezzan—Equal contribution.
Notes
- 1. In the supplementary materials, we provide an analytical proof showing that for least-squares regression problems the forward-stagewise method always identifies the first active variable correctly, independently of the constraint. For other functions, the correctness of the first active variable can easily be verified empirically (see the sketch after these notes).
- 2. This is a crucial property in the context of sparse regression, since only then are the sparsity patterns in x- and z-space identical.
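Note 1 can be checked numerically. Below is a hedged sketch (our own \(\epsilon\)-stagewise variant, not the authors' implementation): on the transformed least-squares objective \(F(z) = \frac{1}{2}\|X h(z) - y\|^2\), with the illustrative map \(h\) from the sketch above, the first variable that forward stagewise activates is \(\arg\max_i |(X^\top y)_i|\), exactly the variable plain forward stagewise would pick on the untransformed problem.

```python
import numpy as np

# Empirical check of Note 1 (a sketch, not the authors' proof): run an
# epsilon-forward-stagewise loop on F(z) = 0.5 * ||X h(z) - y||^2 and
# confirm that the first active variable is argmax_i |(X^T y)_i|,
# independent of the (transformed) constraint.
h = lambda z: np.sign(z) * (np.exp(np.abs(z)) - 1.0)
h_prime = lambda z: np.exp(np.abs(z))

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 10))
y = 3.0 * X[:, 4] + 0.1 * rng.standard_normal(100)      # variable 4 dominates

grad_F = lambda z: (X.T @ (X @ h(z) - y)) * h_prime(z)  # chain rule

eps, z = 0.01, np.zeros(10)
first_active = int(np.argmax(np.abs(grad_F(z))))        # first coordinate moved
for _ in range(200):                                    # a few stagewise steps
    g = grad_F(z)
    i = int(np.argmax(np.abs(g)))
    z[i] -= eps * np.sign(g[i])

assert first_active == int(np.argmax(np.abs(X.T @ y)))  # holds; both are 4
print("first active variable:", first_active)
```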
References
Ben-Israel, A., Mond, B.: What is invexity? J. Aust. Math. Soc. Ser. B. Appl. Math. 28, 1–9 (1986)
Chechik, G., Globerson, A., Tishby, N., Weiss, Y.: Information bottleneck for Gaussian variables. J. Mach. Learn. Res. 6(1), 165–188 (2005)
Dinuzzo, F., Ong, C.S., Pillonetto, G., Gehler, P.V.: Learning output kernels with block coordinate descent. In: Proceedings of the 28th International Conference on Machine Learning (ICML-2011), pp. 49–56 (2011)
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96(456), 1348–1360 (2001)
Frank, M., Wolfe, P.: An algorithm for quadratic programming. Naval Res. Logistics (NRL) 3(1–2), 95–110 (1956)
Friedman, J.H.: Fast sparse regression and classification. Int. J. Forecast. 28(3), 722–738 (2012)
Gasso, G., Rakotomamonjy, A., Canu, S.: Solving non-convex lasso type problems with DC programming. In: IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008, pp. 450–455. IEEE (2008)
Giorgi, G.: On first order sufficient conditions for constrained optima. In: Maruyama, T., Takahashi, W. (eds.) Nonlinear and Convex Analysis in Economic Theory, pp. 53–66. Springer, Heidelberg (1995). https://doi.org/10.1007/978-3-642-48719-4_5
Giorgi, G.: On some generalizations of preinvex functions 49 (2008)
Gorodnitsky, I.F., Rao, B.D.: Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm. IEEE Trans. Signal Process. 45(3), 600–616 (1997)
Hastie, T., Taylor, J., Tibshirani, R., Walther, G.: Forward stagewise regression and the monotone lasso. Electron. J. Stat. 1, 1–29 (2007). https://doi.org/10.1214/07-EJS004
Jaggi, M.: Revisiting Frank-Wolfe: projection-free sparse convex optimization. In: ICML, vol. 1, pp. 427–435 (2013)
Lanza, A., Morigi, S., Sgallari, F.: Convex image denoising via non-convex regularization. In: Aujol, J.-F., Nikolova, M., Papadakis, N. (eds.) SSVM 2015. LNCS, vol. 9087, pp. 666–677. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-18461-6_53
Li, G., Yan, Z., Wang, J.: A one-layer recurrent neural network for constrained nonsmooth invex optimization. Neural Netw. 50, 79–89 (2014)
Li, X., Zhao, T., Zhang, T., Liu, H.: The picasso package for nonconvex regularized M-estimation in high dimensions in R. Technical report (2015)
Mazumder, R., Friedman, J.H., Hastie, T.: SparseNet: coordinate descent with nonconvex penalties. J. Am. Stat. Assoc. 106(495), 1125–1138 (2011)
Mishra, S., Giorgi, G.: Invexity and Optimization. Nonconvex Optimization and Its Applications. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78562-0
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Rey, M., Fuchs, T., Roth, V.: Sparse meta-Gaussian information bottleneck. In: Proceedings of the 31st International Conference on Machine Learning (ICML-2014), pp. 910–918 (2014)
Rey, M., Roth, V.: Meta-Gaussian information bottleneck. In: Advances in Neural Information Processing Systems (NIPS) 25 (2012)
Tibshirani, R.J.: A general framework for fast stagewise algorithms. J. Mach. Learn. Res. 16, 2543–2588 (2015)
Tishby, N., Pereira, F.C., Bialek, W.: The information bottleneck method. arXiv preprint physics/0004057 (2000)
Zhang, C.H.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38, 894–942 (2010)
Acknowledgements
This project is supported by the Swiss National Science Foundation, project CR32I2 159682.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Keller, S.M., Murezzan, D., Roth, V. (2019). Invexity Preserving Transformations for Projection Free Optimization with Sparsity Inducing Non-convex Constraints. In: Brox, T., Bruhn, A., Fritz, M. (eds) Pattern Recognition. GCPR 2018. Lecture Notes in Computer Science(), vol 11269. Springer, Cham. https://doi.org/10.1007/978-3-030-12939-2_47
DOI: https://doi.org/10.1007/978-3-030-12939-2_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12938-5
Online ISBN: 978-3-030-12939-2
eBook Packages: Computer Science, Computer Science (R0)