Abstract
In this paper, we propose a correlation projection network (CPNet) that determines its parameters analytically for pattern classification. The network consists of multiple modules, each containing two layers. We first introduce a label encoding process for each module to facilitate locally supervised learning. Within each module, the first layer performs what we call the correlation projection process for feature extraction, and the second layer determines its parameters analytically by solving a least squares problem. Through a corresponding label decoding process, the proposed CPNet achieves a multi-exit structure, the first of its kind in multilayer analytic learning. Owing to the analytic learning technique, the proposed method needs to visit the dataset only once and is hence significantly faster than the commonly used backpropagation, as verified in our experiments. Classification tasks on various benchmark datasets demonstrate competitive results compared with several state-of-the-art methods.
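To make the analytic learning idea concrete, the following is a minimal sketch of solving one classifier layer in closed form with a least-squares fit. It is not the authors' CPNet: the correlation projection feature step is replaced here by a hypothetical random-feature map, and the toy data and all variable names are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of analytic least-squares learning for one module
# (NOT the full CPNet; the correlation projection step is replaced by
# a hypothetical random ReLU feature map for illustration).
rng = np.random.default_rng(0)

# Toy data: 100 samples, 20 features, 3 classes.
X = rng.standard_normal((100, 20))
y = X[:, :3].argmax(axis=1)           # synthetic class labels
T = np.eye(3)[y]                      # one-hot label encoding

# Stand-in feature extraction layer (assumed, not the paper's method).
W1 = rng.standard_normal((20, 50))
H = np.maximum(X @ W1, 0.0)           # ReLU feature matrix

# Second layer solved analytically via the Moore-Penrose pseudoinverse:
# one pass over the data, no backpropagation, no iterative updates.
W2 = np.linalg.pinv(H) @ T

pred = (H @ W2).argmax(axis=1)        # decode labels by argmax
acc = (pred == y).mean()
print(f"training accuracy: {acc:.2f}")
```

The single `np.linalg.pinv` call is the closed-form step that replaces iterative gradient descent, which is why such methods need only one pass through the dataset.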
Notes
The MP inverse is also called pseudoinverse in many references.
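As a small worked example of the MP (pseudo)inverse, the snippet below checks that `pinv` recovers the least-squares solution of an over-determined linear system; the matrix and vector are illustrative, not taken from the paper.

```python
import numpy as np

# The Moore-Penrose (MP) inverse, a.k.a. the pseudoinverse: for a
# matrix A, x = pinv(A) @ b is the minimum-norm least-squares
# solution of the (possibly over-determined) system A x = b.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])            # 3 equations, 2 unknowns
b = np.array([1.0, 2.0, 3.0])

x = np.linalg.pinv(A) @ b             # pseudoinverse solution
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_ls))           # agrees with direct least squares
```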
Acknowledgements
This work was supported in part by the Science and Engineering Research Council, Agency of Science, Technology and Research, Singapore, through the National Robotics Program under Grant No. 1922500054. The computational work for this article was partially performed on resources of the National Supercomputing Centre, Singapore (https://www.nscc.sg). This research was also partially supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (Grant number: NRF-2018R1D1A1A09081956).
Cite this article
Zhuang, H., Lin, Z. & Toh, KA. Correlation Projection for Analytic Learning of a Classification Network. Neural Process Lett 53, 3893–3914 (2021). https://doi.org/10.1007/s11063-021-10570-2