Learning representation hierarchies by sharing visual features: a computational investigation of Persian character recognition with unsupervised deep learning

Sadeghi, Zahra; Testolin, Alberto

doi:10.1007/s10339-017-0796-7

Research Report
Published: 25 February 2017

Volume 18, pages 273–284 (2017)

Cognitive Processing

Abstract

In humans, efficient recognition of written symbols is thought to rely on a hierarchical processing system, where simple features are progressively combined into more abstract, high-level representations. Here, we present a computational model of Persian character recognition based on deep belief networks, where increasingly complex visual features emerge in a completely unsupervised manner by fitting a hierarchical generative model to the sensory data. Crucially, high-level internal representations emerging from unsupervised deep learning can be easily read out by a linear classifier, achieving state-of-the-art recognition accuracy. Furthermore, we tested the hypothesis that handwritten digits and letters share many common visual features: a generative model that captures the statistical structure of the letter distribution should therefore also support the recognition of written digits. To this end, deep networks trained on Persian letters were used to build high-level representations of Persian digits, which were indeed read out with high accuracy. Our simulations show that complex visual features, such as those mediating the identification of Persian symbols, can emerge from unsupervised learning in multilayered neural networks and can support knowledge transfer across related domains.
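
As a rough illustration of the pipeline the abstract describes, and not the authors' implementation, the sketch below stacks restricted Boltzmann machines trained greedily with one-step contrastive divergence on unlabeled "letter" images, then reuses the learned features to encode "digit" images for a linear read-out. Everything specific is an assumption made for illustration: the random binary 8x8 stand-in data, the layer sizes, the learning rate, the number of epochs, and the least-squares read-out. All function and variable names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary restricted Boltzmann machine trained with CD-1 (assumed setup)."""
    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one reconstruction step from sampled hidden states.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)

def train_stack(data, layer_sizes, rng, epochs=20):
    """Greedy layer-wise training; each RBM sees the layer below, no labels."""
    rbms, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)
    return rbms

def encode(rbms, data):
    """Propagate data bottom-up to obtain top-layer representations."""
    x = data
    for rbm in rbms:
        x = rbm.hidden_probs(x)
    return x

def fit_linear_readout(features, labels, n_classes):
    """Least-squares linear classifier on one-hot targets, with a bias column."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    T = np.eye(n_classes)[labels]
    W, *_ = np.linalg.lstsq(X, T, rcond=None)
    return W

def predict(W, features):
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return np.argmax(X @ W, axis=1)

rng = np.random.default_rng(0)
# Hypothetical stand-ins for binarised 8x8 images; labels are random.
letters = (rng.random((200, 64)) < 0.3).astype(float)   # unlabeled source domain
digits = (rng.random((100, 64)) < 0.3).astype(float)    # labeled target domain
digit_labels = rng.integers(0, 10, size=100)

stack = train_stack(letters, layer_sizes=[32, 16], rng=rng)
digit_features = encode(stack, digits)   # features learned on letters, applied to digits
readout = fit_linear_readout(digit_features, digit_labels, n_classes=10)
train_acc = np.mean(predict(readout, digit_features) == digit_labels)
print(f"read-out training accuracy on toy data: {train_acc:.2f}")
```

Because the toy data and labels are random, the printed accuracy carries no meaning; the point is only the order of operations: unsupervised training on one symbol set, a frozen feature hierarchy, and a linear classifier trained on top of it for a related symbol set.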

Notes

The complete letter dataset can be downloaded from http://farsiocr.ir.
http://ccnl.psy.unipd.it/research/deeplearning.

Acknowledgements

This work was partially supported through a grant to A.T. from the Italian Ministry of Research. Part of this research was performed while both authors were visiting the Parallel Distributed Processing Lab at Stanford University, California, USA. Computing resources were provided by the Stanford Center for Mind, Brain and Computation. The authors warmly thank Prof. Jay McClelland for financial support and for making it possible to access Stanford MBC resources.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
Zahra Sadeghi
Computational Cognitive Neuroscience Lab, University of Padova, Padua, Italy
Zahra Sadeghi & Alberto Testolin
Department of General Psychology, University of Padova, Via Venezia 12/2, 35131, Padua, Italy
Alberto Testolin

Corresponding author

Correspondence to Alberto Testolin.

Additional information

Handling Editor: John K. Tsotsos (York University); Reviewers: Mahdi Biparva (York University), Alireza Alaei (Griffith University).

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 233 kb)

About this article

Cite this article

Sadeghi, Z., Testolin, A. Learning representation hierarchies by sharing visual features: a computational investigation of Persian character recognition with unsupervised deep learning. Cogn Process 18, 273–284 (2017). https://doi.org/10.1007/s10339-017-0796-7

Received: 22 March 2016
Accepted: 15 February 2017
Published: 25 February 2017
Issue Date: August 2017
DOI: https://doi.org/10.1007/s10339-017-0796-7
