Abstract
Neural networks are growing wider and deeper to achieve state-of-the-art results across machine learning domains, at the cost of complex structures, large model sizes, and high computational demands. Moreover, once trained, these networks struggle to adapt to new data because they are confined to their original domain-target space. To tackle these issues, we propose a sparse learning method that trains an existing network on new classes by selecting non-crucial parameters from the network. Sparse learning also preserves the performance on existing classes with no additional network structure or memory cost by employing an effective node selection technique, which identifies unimportant parameters by applying information theory to the neuron distributions of the fully connected layers. Our method can learn up to 40% novel classes without notable loss in the accuracy of existing classes. Through experiments, we show that sparse learning competes with state-of-the-art methods in accuracy and surpasses related algorithms in memory efficiency, processing speed, and overall training time. Importantly, our method can be applied in both small and large settings, which we demonstrate on well-known networks such as LeNet, AlexNet, and VGG-16.
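The entropy-based node selection the abstract describes can be illustrated with a minimal sketch. This is an assumption-laden reconstruction of the general idea, not the paper's exact criterion: neurons in a fully connected layer whose activation distribution carries little information (low entropy) are treated as non-crucial and become candidates for relearning on new classes. The helper names (`neuron_entropy`, `select_non_crucial`) and the histogram-based entropy estimate are illustrative choices, not from the paper.

```python
import numpy as np

def neuron_entropy(activations, bins=10):
    """Per-neuron entropy (in nats) of the activation distribution.

    activations: (n_samples, n_neurons) array of FC-layer outputs
    recorded on data from the existing classes.
    """
    entropies = np.empty(activations.shape[1])
    for j in range(activations.shape[1]):
        hist, _ = np.histogram(activations[:, j], bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]  # drop empty bins; 0*log(0) is taken as 0
        entropies[j] = -(p * np.log(p)).sum()
    return entropies

def select_non_crucial(activations, fraction=0.4, bins=10):
    """Indices of the lowest-entropy (least informative) neurons.

    These are the parameters a sparse-learning step could retrain
    on novel classes while leaving high-entropy neurons untouched.
    """
    ent = neuron_entropy(activations, bins)
    k = max(1, int(fraction * activations.shape[1]))
    return np.argsort(ent)[:k]

# Toy example: neurons 0-1 produce constant outputs (zero entropy),
# neurons 2-3 vary across samples (high entropy).
rng = np.random.default_rng(0)
acts = np.column_stack([
    np.full(1000, 1.0),        # dead neuron: constant output
    np.full(1000, 0.5),        # dead neuron: constant output
    rng.uniform(0, 1, 1000),   # informative neuron
    rng.normal(0, 1, 1000),    # informative neuron
])
idx = select_non_crucial(acts, fraction=0.5)
print(sorted(idx.tolist()))  # → [0, 1]
```

With per-neuron histogram binning, a neuron with a constant output collapses into a single bin and scores zero entropy, so it is selected first; in a real pipeline the activations would come from forward passes over the existing-class data rather than synthetic arrays.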
Acknowledgments
This work was supported by Inha University Research Grant. This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2019R1A2C1008048).
Cite this article
Ibrokhimov, B., Hur, C. & Kang, S. Effective node selection technique towards sparse learning. Appl Intell 50, 3239–3251 (2020). https://doi.org/10.1007/s10489-020-01720-5