
On-Device Partial Learning Technique of Convolutional Neural Network for New Classes

Published in: Journal of Signal Processing Systems

Abstract

In general, Convolutional Neural Networks (CNNs) have a complex network structure consisting of heavy layers with a huge number of parameters, such as convolutional, pooling, ReLU-activation, and fully-connected layers. Due to this complexity and computational load, CNNs are typically trained in a cloud environment. Cloud-based learning and inference have two drawbacks: the security of personal information and dependence on the communication state. Recently, CNNs have been trained directly on mobile devices to alleviate these two drawbacks. Due to the resource limitations of mobile devices, however, the CNN structure must be compressed or the training overhead reduced. In this paper, we propose an on-device partial learning technique with the following benefits: (1) it does not require additional neural network structures, and (2) it reduces unnecessary computational overhead. We select a subset of influential weights from a trained network to accommodate a new classification class. The selection is based on the contribution of each weight to the output, which is measured using the concept of entropy. In the experimental section, we demonstrate and analyze our method with a CNN image classifier on two datasets: the Modified National Institute of Standards and Technology (MNIST) image data and the Microsoft Common Objects in Context (COCO) data. As a result, computational resources for LeNet-5 and AlexNet showed performance improvements of 1.7× and 2.3×, respectively, and memory resources showed performance improvements of 1.4× and 1.6×, respectively.
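The full text is paywalled, so the following is only an illustrative sketch of the idea the abstract describes: ranking weights by the entropy of their contribution to the layer output and retraining only the top-ranked subset for a new class. All function names, the histogram-based entropy estimate, and the selection ratio are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def weight_contribution_entropy(weights, activations, num_bins=16):
    """Estimate, for each weight of a fully-connected layer, the Shannon
    entropy of its contribution (input activation * weight) to the layer
    output over a batch.  weights: (n_in, n_out), activations: (batch, n_in).
    Histogram binning is one simple way to estimate the distribution."""
    # contributions[b, i, j] = activations[b, i] * weights[i, j]
    contributions = activations[:, :, None] * weights[None, :, :]
    entropies = np.zeros_like(weights)
    for i in range(weights.shape[0]):
        for j in range(weights.shape[1]):
            hist, _ = np.histogram(contributions[:, i, j], bins=num_bins)
            p = hist / hist.sum()
            p = p[p > 0]  # drop empty bins before taking the log
            entropies[i, j] = -np.sum(p * np.log2(p))
    return entropies

def select_influential(weights, activations, ratio=0.3):
    """Return a boolean mask marking roughly the top `ratio` fraction of
    weights by contribution entropy; only these would be updated when
    learning a new class, leaving the rest of the network frozen."""
    ent = weight_contribution_entropy(weights, activations)
    k = max(1, int(ratio * ent.size))
    threshold = np.sort(ent.ravel())[-k]
    return ent >= threshold
```

In a partial-learning loop, the mask would gate the gradient update (e.g. `W -= lr * grad * mask`), so computation and memory are spent only on the selected subset, which is the kind of saving the abstract quantifies.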



Acknowledgements

This work was supported by Inha University Grant.

Author information

Corresponding author

Correspondence to Sanggil Kang.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article


Cite this article

Hur, C., Kang, S. On-Device Partial Learning Technique of Convolutional Neural Network for New Classes. J Sign Process Syst 95, 909–920 (2023). https://doi.org/10.1007/s11265-020-01520-7

