Abstract
Deep learning techniques can build generalized models that solve problems beyond the reach of traditional approaches, which explains their presence across virtually every domain. However, considerable time is spent searching for the hyperparameters that let a model generalize and reach its highest accuracy. This paper investigates a proposed model incorporating hybrid layers and a novel approach to weight initialization aimed at (1) reducing the trial-and-error time spent finding the optimal number of layers by providing the necessary insights, and (2) reducing the randomness in weight initialization through a novel incremental backpropagation-based model architecture. Combined with principal component analysis-based initialization, the model provides substantially more stable initial weights, improving training and test performance and speeding convergence to an optimal solution. The proposed approach was evaluated on three data sets and outperformed state-of-the-art initialization methods on all of them.
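To make the idea concrete, the sketch below shows one common form of principal component analysis-based weight initialization, in which the leading principal components of a data sample seed a dense layer's weights. It is an illustrative example in Python (NumPy and scikit-learn) of the general technique, not the paper's exact hybrid-layer or incremental backpropagation procedure; the function name pca_init_weights and all parameter choices are assumptions made for illustration.

```python
# Minimal sketch of PCA-based weight initialization for a dense layer.
# Illustrates the general idea only; the paper's hybrid layers and
# incremental back-propagation scheme are not reproduced here.
import numpy as np
from sklearn.decomposition import PCA

def pca_init_weights(X_sample, n_units, rng=None):
    """Build an (n_features x n_units) weight matrix whose first columns
    are the top principal axes of a data sample X_sample (n_samples x n_features)."""
    rng = np.random.default_rng(rng)
    n_features = X_sample.shape[1]
    k = min(n_units, n_features, X_sample.shape[0])
    pca = PCA(n_components=k).fit(X_sample)
    # Start from small random weights, then overwrite the first k columns
    # with principal axes so the layer initially projects inputs onto
    # their principal subspace instead of a purely random one.
    W = rng.normal(0.0, 0.01, size=(n_features, n_units))
    W[:, :k] = pca.components_.T
    b = np.zeros(n_units)
    return W, b

# Example: initialize a 128-unit layer from 1,000 flattened 28x28 images.
X = np.random.rand(1000, 784)  # stand-in for a sample of Fashion MNIST images
W, b = pca_init_weights(X, n_units=128, rng=0)
```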








Data availability
The data sets generated and/or analyzed during the current study are available in the open-source Kaggle repository:
1. Fashion MNIST: https://www.kaggle.com/datasets/zalando-research/fashionmnist
2. Intel Image Classification: https://www.kaggle.com/datasets/puneet6060/intel-image-classification
3. EMNIST Letters: https://www.kaggle.com/datasets/crawford/emnist
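As a convenience, the snippet below shows one way to load the Fashion MNIST data set for experimentation; it uses the copy bundled with tf.keras (the Kaggle CSVs linked above contain the same images), and the flattening and scaling are assumptions for a simple fully connected baseline, not the paper's preprocessing pipeline. The Intel Image Classification and EMNIST Letters sets must be downloaded from the Kaggle URLs above.

```python
# Illustrative loading of Fashion MNIST via the copy bundled with tf.keras.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train = x_train.reshape(-1, 28 * 28).astype("float32") / 255.0  # 60,000 x 784
x_test = x_test.reshape(-1, 28 * 28).astype("float32") / 255.0    # 10,000 x 784
```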
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Chaudhari, R., Agarwal, D., Ravishankar, K. et al. Multi-output incremental back-propagation. Neural Comput & Applic 35, 14897–14910 (2023). https://doi.org/10.1007/s00521-023-08490-4