Abstract
This work combines the particle filter concept with a gradient descent optimizer to reduce the loss during iteration, yielding a particle filter-based gradient descent (PF-GD) optimizer that can locate the global minimum with excellent performance. The PF-GD method is verified on four test functions. In addition, it is tested on the Modified National Institute of Standards and Technology (MNIST) database by training a logistic regression model. The results on the four functions show that the PF-GD method performs much better than the conventional gradient descent optimizer, although it has some parameters that must be set before modeling. The MNIST results demonstrate that the cross-entropy of the PF-GD method exhibits a smaller decrease than that of the conventional gradient descent optimizer, resulting in higher accuracy of the PF-GD method. The PF-GD method reaches a best training accuracy of 97.00%, and its accuracy on the test dataset is 90.37%, which is higher than the 90.08% obtained with the conventional gradient descent optimizer.
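The abstract does not spell out the PF-GD update rule, so the following is only an illustrative sketch of the general idea of pairing particle-filter-style resampling with gradient steps. It is not the authors' algorithm. The test function (1-D Rastrigin), the function names, the weighting scheme `exp(-loss)`, and all parameter values are assumptions chosen for the example.

```python
import math
import random


def loss(x):
    """1-D Rastrigin function: global minimum f(0) = 0, local minima near every integer."""
    return 10.0 + x * x - 10.0 * math.cos(2.0 * math.pi * x)


def grad(x):
    """Analytic derivative of the 1-D Rastrigin function."""
    return 2.0 * x + 20.0 * math.pi * math.sin(2.0 * math.pi * x)


def plain_gd(x, lr=0.001, steps=200):
    """Conventional gradient descent from a single starting point."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def particle_gd(n_particles=100, lr=0.001, steps=200,
                jitter=0.3, decay=0.97, seed=0):
    """Hypothetical particle-based gradient descent (all parameters are assumed)."""
    rng = random.Random(seed)
    # Initialise particles uniformly over the search interval.
    particles = [rng.uniform(-5.0, 5.0) for _ in range(n_particles)]
    best_x = min(particles, key=loss)
    for _ in range(steps):
        # Predict: move every particle one gradient step downhill.
        particles = [x - lr * grad(x) for x in particles]
        losses = [loss(x) for x in particles]
        # Keep the best particle seen so far.
        i_best = min(range(n_particles), key=losses.__getitem__)
        if losses[i_best] < loss(best_x):
            best_x = particles[i_best]
        # Update: lower loss gives higher weight (shifted for numerical stability).
        l_min = losses[i_best]
        weights = [math.exp(-(l - l_min)) for l in losses]
        # Resample in proportion to weight, then add decaying Gaussian jitter.
        resampled = rng.choices(particles, weights=weights, k=n_particles)
        particles = [x + rng.gauss(0.0, jitter) for x in resampled]
        jitter *= decay
    return best_x


if __name__ == "__main__":
    x_gd = plain_gd(3.0)
    x_pf = particle_gd()
    print("plain GD    : x = %.4f, loss = %.4f" % (x_gd, loss(x_gd)))
    print("particle GD : x = %.4f, loss = %.4f" % (x_pf, loss(x_pf)))
```

In this toy setting, a single gradient descent run started at x = 3 stays in the nearby local minimum, while the particle population concentrates on the lowest basin through resampling. The particle count, jitter scale, and decay rate are examples of the kind of parameters that must be set beforehand.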
Acknowledgements
The authors would like to thank the staff of the International Academy of Aviation Industry, King Mongkut’s Institute of Technology Ladkrabang, for their contributions to this article.
Funding
This work was supported by Research Seed Grant for New Lecturer, KMITL Research Fund, King Mongkut’s Institute of Technology Ladkrabang, Bangkok 10520, Thailand.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Kamsing, P., Torteeka, P. & Yooyen, S. An enhanced learning algorithm with a particle filter-based gradient descent optimizer method. Neural Comput & Applic 32, 12789–12800 (2020). https://doi.org/10.1007/s00521-020-04726-9
DOI: https://doi.org/10.1007/s00521-020-04726-9