Invariant object recognition based on combination of sparse DBN and SOM with temporal trace rule

Cai, Huimin; Wang, Shulong; Liu, Eryun; Liu, Hongxia

doi:10.1007/s11042-016-3956-3

Published: 23 September 2016

Volume 76, pages 12017–12034, (2017)
Multimedia Tools and Applications

Huimin Cai (ORCID: orcid.org/0000-0002-4155-2253)¹, Shulong Wang¹, Eryun Liu² & Hongxia Liu¹

Abstract

This paper proposes a trace-rule-based self-organizing map (SOM) model built on a sparse two-stage deep belief network (DBN). Together, the SOM and the sparse DBN form a hierarchical network in which the DBN serves as a V2 feature detector, while the SOM layer, guided by the trace learning rule during training, learns to extract transformation-invariant features. The proposed method is evaluated by measuring stimulus-specific information (SSI) and by comparison with classic algorithms. The results show that the trace-rule-based SOM model produces more neurons with high SSI values, and such neurons convey more useful and discriminative information for subsequent object recognition.
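
The abstract does not give the update equations, so the following is only a minimal sketch of how a temporal trace can modulate a SOM weight update. It assumes a generic trace of the form trace = (1 - eta) * y + eta * previous trace, a one-dimensional map with a Gaussian neighbourhood, and inputs scaled to [0, 1]. The function name `train_trace_som` and all parameters (`eta`, `lr`, `sigma`, `epochs`) are hypothetical and are not taken from the paper. In the described architecture the inputs would be sparse DBN features; here they are plain lists of numbers.

```python
import math
import random


def train_trace_som(sequences, n_units, dim, eta=0.8, lr=0.1,
                    sigma=1.0, epochs=20, seed=0):
    """Train a 1-D SOM whose updates are weighted by a temporal trace.

    sequences: list of sequences; each sequence holds successive views
    (feature vectors of length `dim`) of the same object.
    All hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for _ in range(epochs):
        for seq in sequences:
            # Reset the trace when a new object's view sequence starts.
            trace = [0.0] * n_units
            for x in seq:
                # Best-matching unit: smallest squared Euclidean distance.
                dists = [sum((w - v) ** 2 for w, v in zip(unit, x))
                         for unit in weights]
                bmu = dists.index(min(dists))
                for j in range(n_units):
                    # Instantaneous activation: Gaussian neighbourhood of the BMU.
                    y = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                    # Trace mixes the current activation with its own past value,
                    # so units active for earlier views keep learning later views.
                    trace[j] = (1.0 - eta) * y + eta * trace[j]
                    for k in range(dim):
                        weights[j][k] += lr * trace[j] * (x[k] - weights[j][k])
    return weights


# Two hypothetical objects, each seen under two slightly different views.
views = [[[0.0, 0.1], [0.1, 0.0]], [[1.0, 0.9], [0.9, 1.0]]]
som = train_trace_som(views, n_units=4, dim=2)
```

Because the trace carries activation over from one view to the next, successive views of the same object tend to update the same units, which is the intuition behind learning transformation-invariant responses.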

Acknowledgments

This work was supported in part by the Project of National Natural Science Foundation of China (Grant Nos. 61076097, 61473257).

Author information

Authors and Affiliations

School of Microelectronics, Xidian University, Xi’an, Shaanxi, 710071, China
Huimin Cai, Shulong Wang & Hongxia Liu
College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou, Zhejiang, 310027, China
Eryun Liu

Corresponding author

Correspondence to Shulong Wang.

Appendix

Assuming that all stimuli are equiprobable, P(t) = 1/S, where S is the number of objects (in the sums below, t ∈ S ranges over these S objects), the maximum possible SSI of a neuron is derived as follows:

$$ P(r)=\sum\limits_{t\in S}P(t)P(r\mid t)=\frac{1}{S}\sum\limits_{t\in S}P(r\mid t) $$

(11)

$$\begin{array}{@{}rcl@{}} I(s,R)&=&\sum\limits_{r\in R}P(r\mid s)\log_{2}\frac{P(r\mid s)}{P(r)} \end{array} $$

(12)

$$\begin{array}{@{}rcl@{}} &=&\sum\limits_{r\in R}P(r\mid s)\log_{2}\frac{S\cdot P(r\mid s)}{\sum\limits_{t\in S}P(r\mid t)} \end{array} $$

(13)

$$\begin{array}{@{}rcl@{}} &=&\sum\limits_{r\in R}P(r\mid s)\left[ \log_{2}S+\log_{2}\frac{P(r\mid s)}{\sum\limits_{t\in S}P(r\mid t)} \right] \end{array} $$

(14)

$$\begin{array}{@{}rcl@{}} &=&\log_{2}S+\sum\limits_{r\in R}P(r\mid s)\log_{2}\frac{P(r\mid s)}{\sum\limits_{t\in S}P(r\mid t)} \end{array} $$

(15)

$$\begin{array}{@{}rcl@{}} &\because& 0\leq P(r\mid s)\leq \sum\limits_{t\in S}P(r\mid t) \end{array} $$

(16)

$$\begin{array}{@{}rcl@{}} &\therefore& \log_{2}\frac{P(r\mid s)}{{\sum}_{t\in S}P(r\mid t)}\leq 0 \end{array} $$

(17)

$$\begin{array}{@{}rcl@{}} &\therefore& \sum\limits_{r\in R}P(r\mid s)\log_{2}\frac{P(r\mid s)}{{\sum}_{t\in S}P(r\mid t)}\leq 0 \end{array} $$

(18)

$$\begin{array}{@{}rcl@{}} &\therefore& I(s,R)\leq \log_{2}S \end{array} $$

(19)
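
The bound can be checked numerically. The sketch below computes I(s, R) from Eqs. (11) and (12) for equiprobable stimuli and confirms that it never exceeds log2 S. The function name `ssi` and the two response tables are hypothetical illustrations, not data from the paper, and terms with P(r | s) = 0 are taken to contribute zero.

```python
import math


def ssi(cond, s):
    """Stimulus-specific information I(s, R) in bits.

    cond[t][r] is P(r | t); the S stimuli are assumed equiprobable.
    """
    n_stimuli = len(cond)
    total = 0.0
    for r in range(len(cond[0])):
        p_r_given_s = cond[s][r]
        if p_r_given_s == 0.0:
            continue  # convention: 0 * log(0) contributes nothing
        # Eq. (11): P(r) = (1/S) * sum over t of P(r | t)
        p_r = sum(cond[t][r] for t in range(n_stimuli)) / n_stimuli
        # Eq. (12): one term of the sum over responses
        total += p_r_given_s * math.log2(p_r_given_s / p_r)
    return total


# Hypothetical neurons, S = 4 objects, two response bins (low, high).
# `selective` gives the high response only to object 0.
selective = [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
# `unselective` responds identically to every object.
unselective = [[0.5, 0.5]] * 4

bound = math.log2(len(selective))
print(ssi(selective, 0), ssi(selective, 1), ssi(unselective, 0), bound)
```

For the selective neuron, object 0 attains the bound log2 4 = 2 bits, because its response occurs for no other object. The unselective neuron carries 0 bits about any object.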

About this article

Cite this article

Cai, H., Wang, S., Liu, E. et al. Invariant object recognition based on combination of sparse DBN and SOM with temporal trace rule. Multimed Tools Appl 76, 12017–12034 (2017). https://doi.org/10.1007/s11042-016-3956-3

Received: 18 April 2016
Revised: 20 August 2016
Accepted: 08 September 2016
Published: 23 September 2016
Issue Date: May 2017
DOI: https://doi.org/10.1007/s11042-016-3956-3
