Abstract
Gradient-free neural network training is attracting increasing attention because it avoids the vanishing-gradient issue that afflicts traditional gradient-based training. State-of-the-art gradient-free methods introduce a quadratic penalty or an equivalent approximation of the activation function to train without gradients, but they struggle to extract effective signal features because the activation function is a limited nonlinear transformation. In this paper, we first formulate neural network training as a deep dictionary learning model, which enables gradient-free training of the network. To further strengthen feature extraction in gradient-free training, we introduce the logarithm function as a sparsity regularizer, which induces accurately sparse activations on every hidden layer except the last. We then employ a proximal block coordinate descent method to update the variables of each layer in a forward manner, applying the log-thresholding operator to solve the resulting non-convex, non-smooth subproblems. Finally, numerical experiments on several publicly available datasets show that sparse representation of the inputs is effective for gradient-free neural network training.
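To make the two key ingredients concrete, the following is a minimal NumPy sketch (our illustration, not the authors' released code): `log_threshold` evaluates a closed-form proximal operator of the log regularizer λ·log(ε + |x|), obtained by solving the quadratic stationarity condition and comparing the surviving root against zero, and `bcd_forward_pass` runs one forward sweep of proximal block coordinate descent over dictionaries and codes. All names, the smoothing constant ε, the ridge term, and the step-size rule are illustrative assumptions.

```python
import numpy as np

def log_threshold(y, lam, eps=1e-2):
    """Proximal operator of lam * log(eps + |x|), applied element-wise.

    Solves argmin_x 0.5*(x - y)^2 + lam*log(eps + |x|). The stationarity
    condition yields a quadratic in |x|; because the penalty is non-convex,
    the surviving root is compared against x = 0.
    """
    a = np.abs(y)
    disc = (a + eps) ** 2 - 4.0 * lam            # discriminant of the quadratic
    root = np.where(disc > 0,
                    ((a - eps) + np.sqrt(np.maximum(disc, 0.0))) / 2.0,
                    0.0)
    root = np.maximum(root, 0.0)                 # magnitude must be non-negative
    # Keep the interior root only if it attains a lower objective than x = 0.
    f_root = 0.5 * (root - a) ** 2 + lam * np.log(eps + root)
    f_zero = 0.5 * a ** 2 + lam * np.log(eps)
    return np.sign(y) * np.where(f_root < f_zero, root, 0.0)

def bcd_forward_pass(X, D, Z, lam=0.1, eps=1e-2, step=0.5):
    """One forward sweep of proximal block coordinate descent (a sketch).

    X : (d0, n) data matrix; D[l] : (d_{l-1}, d_l) dictionaries;
    Z[l] : (d_l, n) codes, assumed already initialized (e.g., randomly).
    """
    A = X
    for l in range(len(D)):
        # Dictionary update: ridge-regularized least squares fit to A.
        G = Z[l] @ Z[l].T + 1e-6 * np.eye(Z[l].shape[0])
        D[l] = np.linalg.solve(G, Z[l] @ A.T).T
        # Code update: proximal gradient step on 0.5*||A - D Z||_F^2.
        L = np.linalg.norm(D[l], 2) ** 2 + 1e-12     # Lipschitz constant
        Y = Z[l] - (step / L) * (D[l].T @ (D[l] @ Z[l] - A))
        # Sparsify hidden-layer codes only; the last layer stays unregularized.
        Z[l] = log_threshold(Y, lam * step / L, eps) if l < len(D) - 1 else Y
        A = Z[l]                                     # next layer fits this code
    return D, Z
```

Leaving the final layer's code unregularized mirrors the abstract's statement that sparse activations are imposed on every hidden layer except the last.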
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Xie, Y., Li, Z., Zhao, H. (2021). Gradient-Free Neural Network Training Based on Deep Dictionary Learning with the Log Regularizer. In: Ma, H., et al. (eds.) Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol. 13022. Springer, Cham. https://doi.org/10.1007/978-3-030-88013-2_46
DOI: https://doi.org/10.1007/978-3-030-88013-2_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88012-5
Online ISBN: 978-3-030-88013-2
eBook Packages: Computer Science, Computer Science (R0)