Abstract
Although modern deep learning approaches have achieved astounding results in most visual pattern recognition tasks, they do so by relying on large labeled datasets. In many applications such labels are costly to obtain, and the need for them is not observed in a biologically intelligent machine like the human brain. “What-Where” sets were proposed as a way to represent visual patterns in a manipulable manner, where two-dimensional geometric transformations can be exploited to increase invariance and thus reduce the need for large amounts of training data. However, the cornerstone of classification with these sets is a similarity measure whose computation is time-consuming because sets are unstructured. In this work, we propose a grid-based coding strategy that represents the sets as sparse binary vectors. This brings three main advantages. First, by leveraging pointer-coding of the active bits, we reduce the time complexity of the similarity computation from quadratic to linear in the number of elements of the smaller of the two sets being compared. Second, we use the theoretical framework of sparse representations to justify the classification robustness exhibited in the original work. Third, we bring the model under the widely accepted biological constraint that populations of neurons in the brain code sparse representations.
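To make the grid-based idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each element of a “What-Where” set is a triple of a feature identifier and a normalized position, that the grid is square, and that similarity is measured as the overlap (number of shared active bits) between two codes. The names `encode`, `overlap`, `grid`, and `n_features` are hypothetical and do not come from the paper.

```python
from typing import Iterable, Set, Tuple

# Assumed element layout: (what, x, y), with what a feature id
# and (x, y) a position normalized to [0, 1).
Element = Tuple[int, float, float]


def encode(elements: Iterable[Element], grid: int, n_features: int) -> Set[int]:
    """Map a what-where set to the indices of the active bits of a sparse
    binary vector of length n_features * grid * grid (pointer coding:
    only the positions of the ones are stored)."""
    active: Set[int] = set()
    for what, x, y in elements:
        if not (0 <= what < n_features and 0.0 <= x < 1.0 and 0.0 <= y < 1.0):
            raise ValueError(f"element out of range: {(what, x, y)}")
        row = int(y * grid)  # grid cell containing the position
        col = int(x * grid)
        active.add(what * grid * grid + row * grid + col)
    return active


def overlap(a: Set[int], b: Set[int]) -> int:
    """Count shared active bits by scanning only the smaller code.
    Each membership test on a hash set takes expected constant time,
    so the cost is linear in the size of the smaller code."""
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    return sum(1 for index in small if index in large)


if __name__ == "__main__":
    a = encode([(0, 0.10, 0.10), (1, 0.55, 0.80)], grid=4, n_features=2)
    b = encode([(0, 0.12, 0.05), (1, 0.90, 0.10), (1, 0.60, 0.76)], grid=4, n_features=2)
    print(sorted(a), sorted(b), overlap(a, b))
```

In this sketch, two elements with the same feature that fall in the same grid cell map to the same bit, so small positional differences are absorbed by the grid resolution. A pairwise comparison of unstructured sets would instead need to consider every pair of elements, which is the quadratic cost the abstract refers to.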






Funding
This work was funded by national funds from Fundação para a Ciência e Tecnologia (FCT) through a doctoral grant (SFRH/BD/144560/2019 and UIDB/50021/2020) awarded to the first author.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Sa-Couto, L., Wichert, A. “What-Where” sparse distributed invariant representations of visual patterns. Neural Comput & Applic 34, 6207–6214 (2022). https://doi.org/10.1007/s00521-021-06759-0
DOI: https://doi.org/10.1007/s00521-021-06759-0