
Coupled-learning convolutional neural networks for object recognition

Multimedia Tools and Applications

Abstract

Recently, convolutional neural networks (CNNs) have attracted considerable attention in various computer vision tasks. Motivated by neuroscience, CNNs share several properties with the learning process of the human brain. A prominent difference, however, is that each CNN is trained as an independent learning process, whereas effective interaction and communication between people plays an important role in human visual learning. Inspired by this observation, we propose a novel Coupled-learning Convolutional Neural Network (Co-CNN) for object recognition, which boosts discriminative capability by exploiting dynamic interaction between neural networks. In contrast to existing architectures that pose network optimization as an isolated learning process, the intuition behind the Co-CNN framework is that the coupled-learning mechanism can prevent the algorithm from over-fitting to one or more particular objective functions. The proposed Co-CNN framework has three unique characteristics: (1) Co-CNN is a novel deep network learning framework that can simultaneously optimize two neural networks with the same or different structures. (2) The semantic information gradually mined from the neural networks is employed to guide the communication between them. (3) Co-CNN incorporates the coupled-learning mechanism into the training of the neural networks and further improves their recognition performance by exploiting the learned semantic information. Comprehensive evaluations on five benchmark datasets (CIFAR-10, CIFAR-100, MNIST, SVHN and ImageNet) demonstrate the superiority of the proposed Co-CNN framework over existing algorithms.
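The abstract does not specify the exact form of the coupling term, so the following is only a minimal sketch of one plausible coupled-learning training step, written in PyTorch in a mutual-learning style. The networks `net_a` and `net_b`, the coupling weight `lam`, and the use of a KL-divergence term between the two networks' soft predictions are illustrative assumptions, not the paper's actual formulation.

```python
import torch.nn.functional as F

def coupled_step(net_a, net_b, opt_a, opt_b, images, labels, lam=0.1):
    """One joint update: each network minimizes its own classification loss
    plus a coupling term that pulls its predictions toward its partner's."""
    logits_a = net_a(images)
    logits_b = net_b(images)

    # Independent supervised losses for the two networks.
    ce_a = F.cross_entropy(logits_a, labels)
    ce_b = F.cross_entropy(logits_b, labels)

    # Coupling term (an assumption): KL divergence between each network's soft
    # predictions and its partner's, with the partner's output detached and
    # treated as a fixed target.
    kl_a = F.kl_div(F.log_softmax(logits_a, dim=1),
                    F.softmax(logits_b.detach(), dim=1),
                    reduction="batchmean")
    kl_b = F.kl_div(F.log_softmax(logits_b, dim=1),
                    F.softmax(logits_a.detach(), dim=1),
                    reduction="batchmean")

    loss_a = ce_a + lam * kl_a
    loss_b = ce_b + lam * kl_b

    # Update the two networks in turn; their graphs are independent because
    # the coupling targets are detached.
    opt_a.zero_grad()
    loss_a.backward()
    opt_a.step()

    opt_b.zero_grad()
    loss_b.backward()
    opt_b.step()

    return loss_a.item(), loss_b.item()
```

Detaching the partner's predictions keeps each network's update driven by its own objective while still letting learned semantic information (here, soft class predictions) flow between the two networks, which is one simple way to realize the communication the abstract describes.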


Notes

  1. https://github.com/mavenlin/cuda-convnet/tree/master/NIN.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant Nos. 61602244 and 61502235) and partially sponsored by the CCF-Tencent Open Research Fund.

Author information

Corresponding author

Correspondence to Chunyan Xu.

About this article

Cite this article

Xu, C., Yang, J. & Gao, J. Coupled-learning convolutional neural networks for object recognition. Multimed Tools Appl 78, 573–589 (2019). https://doi.org/10.1007/s11042-017-5262-0
