Abstract
Recently, convolutional neural networks (CNNs) have attracted considerable attention in various computer vision tasks. Motivated by neuroscience, CNNs share several properties with the learning process of the human brain. A prominent difference is that each CNN learns independently, whereas effective interaction and communication between people can play an important role in the human visual system. Inspired by this fact, we propose a novel Coupled-learning Convolutional Neural Network (Co-CNN) for object recognition, which boosts its discriminative capability by exploiting dynamic interaction between neural networks. In contrast to existing network architectures, which pose network optimization as an isolated learning process, the intuition behind the Co-CNN framework is that the coupled-learning mechanism may keep the algorithm from over-fitting to one or more particular objective functions. The proposed Co-CNN framework has three distinguishing characteristics: (1) Co-CNN is a novel deep network learning framework that can simultaneously optimize both neural networks, whether they have the same or different structures. (2) The learned semantic information, which is gradually mined from the neural networks, is employed to guide the communication between them. (3) Co-CNN incorporates the coupled-learning mechanism into the training of the neural networks and further improves their recognition performance by adopting the learned semantic information. Comprehensive evaluations on five benchmark datasets (CIFAR-10, CIFAR-100, MNIST, SVHN and ImageNet) demonstrate that the proposed Co-CNN framework significantly outperforms existing algorithms.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant Nos. 61602244 and 61502235) and partially sponsored by the CCF-Tencent Open Research Fund.
About this article
Cite this article
Xu, C., Yang, J. & Gao, J. Coupled-learning convolutional neural networks for object recognition. Multimed Tools Appl 78, 573–589 (2019). https://doi.org/10.1007/s11042-017-5262-0
DOI: https://doi.org/10.1007/s11042-017-5262-0