Abstract
Training optimization plays a vital role in the development of convolutional neural networks (CNNs). CNNs are hard to train because the underlying optimization problem is non-convex and therefore has multiple local minima. If any of the chosen hyper-parameters is inappropriate, training can end in a poor local minimum, which leads to poor performance. Proper optimization of the training algorithm is therefore the key to converging to a good local minimum. In this paper, we introduce an evolutionary convolution neural network algorithm (ModPSO-CNN). The proposed algorithm fuses modified particle swarm optimization (ModPSO) with backpropagation (BP) and a CNN. Training the CNN with ModPSO alongside BP is intended to improve performance by avoiding premature convergence and poor local minima. ModPSO has adaptive, dynamic and improved parameters to handle the issues that arise in training a CNN. The adaptive and dynamic parameters balance the global and local search abilities, while the improved parameter maintains the diversity of the swarm. The proposed ModPSO algorithm is validated on three standard mathematical test functions and compared with three variants of the benchmark PSO algorithm. Furthermore, the performance of the proposed ModPSO-CNN is compared with other training algorithms in terms of computational cost, convergence and accuracy on standard classification problems, such as the CIFAR-10 dataset and face and skin detection datasets.
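The balance between global and local search mentioned above can be illustrated with a minimal sketch of a generic particle swarm optimizer. This is not the authors' ModPSO: the abstract does not specify its update rules, so the sketch assumes the widely used linearly decreasing inertia weight as a stand-in for an adaptive parameter, and the sphere function as a stand-in for a standard mathematical test function. All names and default values (`pso`, `sphere`, `w_start`, `w_end`, `c1`, `c2`, `bound`) are illustrative assumptions.

```python
import random


def sphere(x):
    """Sphere test function; its global minimum is 0 at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim=5, n_particles=20, iters=200,
        w_start=0.9, w_end=0.4, c1=2.0, c2=2.0, bound=5.0, seed=0):
    """Generic PSO with a linearly decreasing inertia weight (assumed scheme).

    Returns (best position, best value, best value of the initial swarm).
    """
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]

    # Global best is the best of the personal bests.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    initial_best = gbest_val

    for t in range(iters):
        # A large weight early favours global exploration;
        # a small weight late favours local refinement.
        w = w_start - (w_start - w_end) * t / max(iters - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp velocity and position to the search box.
                vel[i][d] = max(-bound, min(bound, v))
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val, initial_best


if __name__ == "__main__":
    best, best_val, start_val = pso(sphere)
    print("initial best:", start_val)
    print("final best:", best_val)
```

In a hybrid scheme of the kind the abstract describes, the objective would be the network's training loss over its weights rather than a test function, with gradient steps from backpropagation interleaved with swarm updates; how ModPSO-CNN combines the two is not stated here, so that part is left out of the sketch.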
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (No. 61801008), the National Key R&D Program of China (No. 2018YFB0803600), the Beijing Natural Science Foundation National (No. L172049), and the Beijing Science and Technology Planning Project (No. Z171100004717001).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Tu, S., Rehman, S.u., Waqas, M. et al. ModPSO-CNN: an evolutionary convolution neural network with application to visual recognition. Soft Comput 25, 2165–2176 (2021). https://doi.org/10.1007/s00500-020-05288-7
DOI: https://doi.org/10.1007/s00500-020-05288-7