A novel adaptive optimization method for deep learning with application to froth floatation monitoring

Abstract

Although Adam and its variants are widely used optimization methods, they suffer from issues such as non-convergence and slow optimization speed. Research has shown that weighting past information more heavily in the second-moment estimate can benefit the optimization process, but the design of the update formula and the criterion for the ideal switching time still need to be addressed. In this paper, a novel optimization method called Adaptive learning Rate switCH (ARCH) is proposed. Through a well-designed update formula, ARCH continuously increases the weight of historical information in the second-moment estimate. In addition, the switching time is selected adaptively and automatically according to the observed training performance. A theoretical proof of the convergence of the proposed algorithm is also presented. To verify the performance of ARCH, a series of comparative experiments against other optimization methods is carried out on several classical convolutional neural networks. Experimental results show that ARCH achieves fast convergence as well as good generalization. Moreover, the proposed algorithm is applied to practical froth flotation monitoring, where the results show that ARCH also performs excellently in practice.
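The general idea can be illustrated with a short sketch. The code below is not the authors' ARCH implementation; it only mimics the behaviour described in the abstract with an Adam-style update in which, after an adaptive switch, the second-moment decay factor is pushed towards 1 so that historical gradient information is weighted more heavily. The beta2 schedule and the plateau-based switching rule are illustrative assumptions, not the paper's formulas.

import numpy as np

class ArchLikeOptimizer:
    """Illustrative sketch only; not the ARCH algorithm defined in the paper."""

    def __init__(self, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, patience=3):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.patience = patience   # assumed: evaluations without improvement before switching
        self.best_loss = float("inf")
        self.stall = 0
        self.switched = False
        self.t = 0
        self.m = None              # first-moment estimate
        self.v = None              # second-moment estimate

    def maybe_switch(self, val_loss):
        # Assumed switching rule: switch once the validation loss stops improving.
        if val_loss < self.best_loss:
            self.best_loss, self.stall = val_loss, 0
        elif not self.switched:
            self.stall += 1
            self.switched = self.stall >= self.patience

    def step(self, param, grad):
        self.t += 1
        if self.m is None:
            self.m = np.zeros_like(param)
            self.v = np.zeros_like(param)
        # After the switch, beta2 is pushed towards 1 so that the second-moment
        # estimate retains more historical gradient information (an assumed
        # schedule standing in for the paper's update formula).
        beta2 = min(1.0 - 0.001 / self.t, 0.9999) if self.switched else self.beta2
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = beta2 * self.v + (1 - beta2) * grad ** 2
        m_hat = self.m / (1 - self.beta1 ** self.t)   # bias-corrected moments
        v_hat = self.v / (1 - beta2 ** self.t)
        return param - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

In a training loop, step would be called on each mini-batch gradient and maybe_switch once per validation pass; before the switch the optimizer behaves like plain Adam.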

Data Availability

The MNIST dataset analysed during this study is available in the National Institute of Standards and Technology (NIST) repository: http://yann.lecun.com/exdb/mnist/. The CIFAR-10 dataset used in this study is available from Alex Krizhevsky’s home page: http://www.cs.toronto.edu/~kriz/cifar.html. The froth flotation photos are not publicly available due to relevant data protection laws but are available from the corresponding author on reasonable request.
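For completeness, the two public benchmarks can also be obtained programmatically; the torchvision calls below are one common way to download them and are not part of the study's code.

from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
# Files are downloaded to ./data only if they are not already present.
mnist_train = datasets.MNIST(root="./data", train=True, download=True, transform=to_tensor)
cifar_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=to_tensor)
print(len(mnist_train), len(cifar_train))  # 60000 and 50000 training images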

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61873285), the National Key Research and Development Program of China (Grant No. 2018AAA0101603), and the International Cooperation and Exchange of the National Natural Science Foundation of China (Grant No. 61860206014).

Author information

Corresponding author

Correspondence to Xiaojun Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ma, B., Du, Y., Zhou, X. et al. A novel adaptive optimization method for deep learning with application to froth floatation monitoring. Appl Intell 53, 11820–11832 (2023). https://doi.org/10.1007/s10489-022-04083-1
