Abstract
Convolutional neural networks (CNNs) achieve promising results through layered kernel convolution and pooling operations, yet the learning dynamics of the kernels remain obscure. We propose a continuous form that describes kernel-based convolution as integration in neural manifolds. The status of spatial expression is introduced to analyze the stability of kernel-based CNNs, and we divide CNN learning dynamics into three stages: unstable vibration, collaborative adjusting, and stabilized fluctuation. Based on the system control matrix of the kernel, we show that kernel-based CNN training proceeds through unstable and stable statuses, and we verify this behavior with numerical experiments.
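To make the two central ideas concrete, the sketch below (NumPy; the names conv_continuous, stability_status, f, k, H, and eta are illustrative assumptions, not the paper's code) shows (i) the discrete kernel convolution read as a quadrature approximation of its continuous integral form, and (ii) a standard linearized-dynamics eigenvalue test playing the role of the kernel's system control matrix: a spectral radius above one corresponds to the unstable status, below one to the stable status.

```python
import numpy as np

# --- Continuous form of kernel convolution ---------------------------------
# The familiar discrete convolution sum can be read as a quadrature
# approximation of the integral form (f * k)(t) = ∫ f(t - u) k(u) du
# taken over the kernel's support.

def conv_continuous(f, k, t, half_width=1.0, n=256):
    """Midpoint-rule approximation of ∫_{-w}^{w} f(t - u) k(u) du."""
    du = 2.0 * half_width / n
    u = -half_width + du * (np.arange(n) + 0.5)   # quadrature midpoints
    return np.sum(f(t - u) * k(u)) * du

# --- Stability through a linearized control matrix --------------------------
# Near a fixed point w*, gradient descent on a kernel w linearizes to
#     w_{t+1} - w* ≈ (I - eta * H)(w_t - w*),
# with H the local Hessian. The eigenvalues of A = I - eta * H then act as
# a control-matrix stability test: spectral radius > 1 corresponds to the
# unstable status (vibration), < 1 to the stable status (fluctuation
# decaying toward w*).

def stability_status(H, eta):
    """Classify the linearized update by the spectral radius of I - eta*H."""
    A = np.eye(H.shape[0]) - eta * H
    rho = np.max(np.abs(np.linalg.eigvals(A)))
    return ("unstable" if rho > 1.0 else "stable"), rho

if __name__ == "__main__":
    # Quadrature example: a narrow Gaussian kernel smoothing a sine signal.
    f = np.sin
    k = lambda u: np.exp(-u**2 / 0.02) / np.sqrt(0.02 * np.pi)
    print(conv_continuous(f, k, t=0.5))   # close to sin(0.5) ≈ 0.479

    # Stability example: stable for eta < 2 / lambda_max(H), unstable above.
    H = np.diag([0.5, 1.0, 4.0])
    print(stability_status(H, eta=0.1))   # ('stable', 0.95)
    print(stability_status(H, eta=0.6))   # ('unstable', 1.4)
```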
Acknowledgements
This work was supported by Key Project of National Natural Science Foundation of China (Grant No. 61933013), Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA22030301), NSFC-Key Project of General Technology Fundamental Research United Fund (Grant No. U1736211), Natural Science Foundation of Guangdong Province (Grant No. 2019A1515011076), and Key Project of Natural Science Foundation of Hubei Province (Grant No. 2018CFA024).
Cite this article
Wu, W., Jing, X., Du, W. et al. Learning dynamics of kernel-based deep neural networks in manifolds. Sci. China Inf. Sci. 64, 212103 (2021). https://doi.org/10.1007/s11432-020-3022-3