Abstract
Training large neural network models from scratch is not feasible due to over-fitting on small datasets and too much time consumed on large datasets. To address this, transfer learning, namely utilizing the feature extracting capacity learned by large models, becomes a hot spot in neural network community. At the classifying stage of pre-trained neural network model, either a linear SVM classifier or a Softmax classifier is employed and that is the only trained part of the whole model. In this paper, inspired by transfer learning, we propose a classifier based on conceptors called Multi-label Conceptor Classifier (MCC) to deal with multi-label classification in pre-trained neural networks. When no multi-label sample exists, MCC equates to Fast Conceptor Classifier, a fast single-label classifier proposed in our previous work, thus being applicable to single-label classification. Moreover, by introducing a random search algorithm, we further improve the performance of MCC on single-label datasets Caltech-101 and Caltech-256, where it achieves state-of-the-art results. Also, its evaluations with pre-trained rather than fine-tuning neural networks are investigated on multi-label dataset PASCAL VOC-2007, where it achieves comparable results.
Similar content being viewed by others
References
Boser BE, Guyon IM, Vapnik VN (1992) A training algorithm for optimal margin classifiers. In: Proceedings of the fifth annual workshop on Computational learning theory, ACM, pp 144–152
Bruna J, Mallat S (2013) Invariant scattering convolution networks. IEEE Trans Pattern Anal Mach Intell 35(8):1872–1886
Chan TH, Jia K, Gao S, Lu J, Zeng Z, Ma Y (2015) PCANet: a simple deep learning baseline for image classification? IEEE Trans Image Process 24(12):5017–5032
Chatfield K, Simonyan K, Vedaldi A, Zisserman A (2014) Return of the devil in the details: delving deep into convolutional nets. arXiv preprint arXiv:1405.3531
Chen Q, Song Z, Hua Y, Huang Z, Yan S (2012) Hierarchical matching with side information for image classification. In: 2012 IEEE conference on computer vision and pattern recognition, pp 3426–3433. https://doi.org/10.1109/CVPR.2012.6248083
Donahue J, Jia Y, Vinyals O, Hoffman J, Zhang N, Tzeng E, Darrell T (2014) Decaf: a deep convolutional activation feature for generic visual recognition. In: ICML, pp 647–655
Dong J, Xia W, Chen Q, Feng J, Huang Z, Yan S (2013) Subcategory-aware object classification. In: 2013 IEEE conference on computer vision and pattern recognition, pp 827–834. https://doi.org/10.1109/CVPR.2013.112
Fei-Fei L, Fergus R, Perona P (2007) Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. Comput Vis Image Underst 106(1):59–70
Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587
Griffin G, Holub A, Perona P (2007) Caltech-256 object category dataset. California Institute of Technology
Guo Q, Jia J, Shen G, Zhang L, Cai L, Yi Z (2016) Learning robust uniform features for cross-media social data by using cross autoencoders. Knowl-Based Syst 102:64–75
He K, Zhang X, Ren S, Sun J (2014) Spatial pyramid pooling in deep convolutional networks for visual recognition. In: European conference on computer vision, Springer, pp 346–361
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
Hinton GE, Osindero S, Teh YW (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554
Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507
Holmes M, Gray A, Isbell C (2007) Fast svd for large-scale matrices. In: Workshop on efficient machine learning at NIPS, vol 58, pp 249–252
Jaeger H (2014) Controlling recurrent neural networks by conceptors. arXiv preprint arXiv:1403.3369
Jaeger H, Haas H (2004) Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication. Science 304(5667):78–80
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp 1097–1105
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440
Niu XN, Yang C, Wang H, Wang Y (2017) Investigation of ANN and SVM based on limited samples for performance and emissions prediction of a CRDI-assisted marine diesel engine. Appl Therm Eng 111:1353–1364
Perronnin F, Sánchez J, Mensink T (2010) Improving the Fisher kernel for large-scale image classification. Springer, Berlin, pp 143–156
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M et al (2015) Imagenet large scale visual recognition challenge. Int J Comput Vision 115(3):211–252
Qian G, Zhang L (2017) A simple feedforward convolutional conceptor neural network for classification. Appl Soft Comput. https://doi.org/10.1016/j.asoc.2017.08.016
Qian G, Zhang L, Zhang Q (2017) Fast conceptor classifier in pre-trained neural networks for visual recognition. Springer International Publishing, Cham, pp 290–298
Schapire RE, Singer Y (2000) Boostexter: a boosting-based system for text categorization. Mach Learn 39(2–3):135–168
Sermanet P, Eigen D, Zhang X, Mathieu M, Fergus R, LeCun Y (2013) Overfeat: integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229
Razavian AS, Azizpour H, Sullivan J, Carlsson S (2014) CNN features off-the-shelf: an astounding baseline for recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 806–813
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
Wan L, Zeiler M, Zhang S, Cun YL, Fergus R (2013) Regularization of neural networks using dropconnect. In: Proceedings of the 30th international conference on machine learning (ICML-13), pp 1058–1066
Wei Y, Xia W, Lin M, Huang J, Ni B, Dong J, Zhao Y, Yan S (2016) Hcp: a flexible cnn framework for multi-label image classification. IEEE Trans Pattern Anal Mach Intell 38(9):1901–1907. https://doi.org/10.1109/TPAMI.2015.2491929
Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. In: European conference on computer vision, Springer, pp 818–833
Zhang H, Li J, Ji Y, Yue H (2017) Understanding subtitles by character-level sequence-to-sequence learning. IEEE Trans Ind Inf 13(2):616–624
Zhang L, Yi Z (2011) Selectable and unselectable sets of neurons in recurrent neural networks with saturated piecewise linear transfer function. IEEE Trans Neural Netw 22(7):1021–1031
Zhang L, Yi Z, Yu J (2008) Multiperiodicity and attractivity of delayed recurrent neural networks with unsaturating piecewise linear transfer functions. IEEE Trans Neural Netw 19(1):158–167
Zhang L, Yi Z, Zhang SL, Heng PA (2009) Activity invariant sets and exponentially stable attractors of linear threshold discrete-time recurrent neural networks. IEEE Trans Autom Control 54(6):1341–1347
Acknowledgements
This work was supported by National Key Research Priorities Program of China (Grant 2016YFC0801800), Fok Ying Tung Education Foundation (Grant 151068), National Natural Science Foundation of China (Grants 61772353 and 61332002) and Foundation for Youth Science and Technology Innovation Research Team of Sichuan Province (Grant 2016TD0018). The authors would like to extend our sincere appreciation for the help of Prof. Dr Herbert Jaeger.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Rights and permissions
About this article
Cite this article
Qian, G., Zhang, L. & Wang, Y. Single-label and multi-label conceptor classifiers in pre-trained neural networks. Neural Comput & Applic 31, 6179–6188 (2019). https://doi.org/10.1007/s00521-018-3432-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00521-018-3432-2