
VAGA: a novel viscosity-based accelerated gradient algorithm

Convergence analysis and applications

Applied Intelligence

Abstract

Proximal algorithms are popular in signal processing, image reconstruction, variational inequalities, and convex optimization because of their low per-iteration cost and their applicability to non-smooth optimization problems. Many real-world machine learning problems are solved within the non-smooth convex loss minimization framework, and a recent trend is to design new accelerated algorithms that solve such problems efficiently. In this paper, we propose a novel viscosity-based accelerated gradient algorithm (VAGA), which uses the viscosity approximation method of fixed point theory to solve learning problems. We establish the boundedness of the sequence generated by this iterative algorithm and prove its strong convergence under a few specific conditions. To assess practical performance, we apply the algorithm to the regularized multitask regression problem with sparsity-inducing regularizers, and we present a detailed comparison with several traditional proximal algorithms on three real benchmark multitask regression datasets. We also apply the proposed algorithm to the joint splice-site recognition problem from bioinformatics. The improved results demonstrate the efficacy of our algorithm over state-of-the-art proximal gradient descent algorithms. To the best of our knowledge, this is the first time a viscosity-based iterative algorithm has been applied to real-world regression and recognition problems.
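To make the underlying idea concrete, the sketch below shows the classical viscosity approximation scheme of fixed point theory, x_{k+1} = α_k f(x_k) + (1 − α_k) T(x_k), where T is a nonexpansive operator and f is a contraction, wrapped around a forward-backward (proximal gradient) operator for an ℓ1-regularized least-squares problem. This is a minimal illustration under stated assumptions, not the exact VAGA update (which is specified in the full paper); the function names, the choice of contraction, and all parameter values here are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (component-wise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def viscosity_forward_backward(A, b, lam, n_iter=500, theta=0.5):
    """Illustrative sketch, NOT the paper's exact VAGA update:
    the classical viscosity approximation scheme
        x_{k+1} = alpha_k * f(x_k) + (1 - alpha_k) * T(x_k)
    where T is the forward-backward operator for
        min_x 0.5 * ||A x - b||^2 + lam * ||x||_1
    and f is a contraction. All parameter choices are assumptions."""
    n = A.shape[1]
    x = np.zeros(n)
    anchor = np.zeros(n)                         # fixed point that f contracts toward
    L = np.linalg.norm(A, 2) ** 2                # Lipschitz constant of the smooth gradient
    eta = 1.0 / L                                # step size in (0, 2/L), so T is nonexpansive
    for k in range(1, n_iter + 1):
        alpha = 1.0 / (k + 1)                    # alpha_k -> 0 and sum_k alpha_k = infinity
        grad = A.T @ (A @ x - b)                 # forward (gradient) step on the smooth part
        Tx = soft_threshold(x - eta * grad, eta * lam)   # backward (proximal) step
        fx = theta * x + (1.0 - theta) * anchor  # contraction f with coefficient theta < 1
        x = alpha * fx + (1.0 - alpha) * Tx      # viscosity combination of f and T
    return x

# Example usage on a small random instance:
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = rng.standard_normal(100) * (rng.random(100) < 0.1)   # sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(40)
x_hat = viscosity_forward_backward(A, b, lam=0.1)
```

With α_k → 0 and Σ_k α_k = ∞, schemes of this form are known to converge strongly to a fixed point of T; the contraction f acts as a vanishing regularizer that selects a particular minimizer, which is what distinguishes viscosity-type methods from plain forward-backward splitting, whose convergence is in general only weak.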



Notes

  1. http://www.yelab.net/software/MALSAR/


Author information


Corresponding author

Correspondence to Mridula Verma.


About this article


Cite this article

Verma, M., Sahu, D.R. & Shukla, K.K. VAGA: a novel viscosity-based accelerated gradient algorithm. Appl Intell 48, 2613–2627 (2018). https://doi.org/10.1007/s10489-017-1110-1
